Publications

Multimedia subjective quality assessment experiments are the most prominent and reliable way to evaluate visual quality as perceived by human observers. Along with laboratory (lab) subjective experiments, crowdsourcing (CS) experiments have become very popular in recent years; during the COVID-19 pandemic, for instance, they provided an alternative to lab tests. However, conducting subjective quality assessment tests in CS raises many challenges: internet connection quality, lack of control over participants' environments, participants' consistency and reliability, etc. In this work, we evaluate the performance of CS studies for 3D graphics quality assessment. To this end, we conducted a CS experiment based on the double stimulus impairment scale (DSIS) method, using a dataset of 80 meshes with diffuse color information corrupted by various distortions. We compared its results with those previously obtained in a lab study conducted on the same dataset in a virtual reality environment. Results show that, under controlled conditions and with appropriate participant screening strategies, a CS experiment can be as accurate as a lab experiment.
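
The abstract points to participant screening as key to CS reliability. As a minimal illustration (one common strategy, not necessarily the one used in the study), one can reject participants whose ratings correlate poorly with the mean opinion scores of the rest of the group:

```python
import numpy as np
from scipy.stats import pearsonr

def screen_observers(scores, threshold=0.75):
    """Flag unreliable participants by correlating each one's ratings with
    the mean opinion scores (MOS) computed from all other participants.

    scores: (n_participants, n_stimuli) array of raw ratings.
    Returns a boolean mask of participants to keep.
    """
    keep = np.zeros(scores.shape[0], dtype=bool)
    for i in range(scores.shape[0]):
        others = np.delete(scores, i, axis=0)   # leave participant i out
        mos = others.mean(axis=0)               # per-stimulus MOS without i
        r, _ = pearsonr(scores[i], mos)
        keep[i] = r >= threshold
    return keep

# Illustrative use: 30 participants rating 80 stimuli on a 1-5 DSIS scale
rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(30, 80))
print(screen_observers(ratings).sum(), "participants kept")
```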

Link

With the ever-growing use of extended reality (XR) technologies, researchers seek to understand the factors leading to a higher quality of experience. Measures of the quality of experience (e.g., presence, immersion, flow) are usually assessed through questionnaires completed after the experience. To cope with the shortcomings and limitations of this kind of assessment, some recent studies use physiological measures and interaction traces generated in the virtual world in place of, or in addition to, questionnaires. These physiological and behavioral measurements remain complex to implement, and existing studies are difficult to replicate, because of a lack of easy and efficient tools to collect and visualize the data produced during XR experiments. In this paper, we present XREcho, a Unity package that allows for the recording, replay, and visualization of user behavior and interactions during XR sessions. The recorded data allow the whole experience to be replayed, as they include movements of the XR device, controllers, and interacted objects, as well as eye tracking data (e.g., gaze position, pupil diameter). The capabilities of this tool are illustrated in a user study in which data from 12 participants were collected and visualized with XREcho. The source code for XREcho is publicly available on GitHub.

Link

Surface meshes associated with diffuse texture or color attributes are becoming popular multimedia content. They provide a high degree of realism and allow six degrees of freedom (6DoF) interactions in immersive virtual reality environments. Just like other types of multimedia, 3D meshes are subject to a wide range of processing, e.g., simplification and compression, which results in a loss of quality of the final rendered scene. Thus, both subjective studies and objective metrics are needed to understand and predict this visual loss. In this work, we introduce a large dataset of 480 animated meshes with diffuse color information, associated with perceived quality judgments. The stimuli were generated from 5 source models subjected to geometry and color distortions. Each stimulus was associated with 6 hypothetical rendering trajectories (HRTs): combinations of 3 viewpoints and 2 animations. A total of 11,520 quality judgments (24 per stimulus) were acquired in a subjective experiment conducted in virtual reality. The results allowed us to explore the influence of source models, animations, and viewpoints on both the quality scores and their confidence intervals. Based on these findings, we propose the first metric for quality assessment of 3D meshes with diffuse colors that works entirely in the mesh domain. This metric incorporates perceptually relevant curvature-based and color-based features. We evaluate its performance, as well as that of a number of Image Quality Metrics (IQMs), on two datasets: ours and a dataset of distorted textured meshes. Our metric demonstrates good results and better stability than the IQMs. Finally, we investigate how knowledge of the viewpoint (i.e., the visible parts of the 3D model) can improve the results of objective metrics.
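
As a toy sketch of the kind of ingredients such a metric builds on (an illustration, not the paper's actual formulation), one can compare precomputed per-vertex curvature and color between corresponding vertices of the reference and distorted meshes:

```python
import numpy as np

def curvature_term(curv_ref, curv_dist, eps=1e-6):
    """Per-vertex relative difference of mean curvature between reference
    and distorted meshes (vertices assumed in correspondence, e.g., after
    projection), averaged into a global score in [0, 1]."""
    num = np.abs(curv_ref - curv_dist)
    den = np.maximum(np.abs(curv_ref), np.abs(curv_dist)) + eps
    return float((num / den).mean())

def color_term(col_ref, col_dist):
    """Mean per-vertex color difference, here a plain Euclidean distance
    on RGB values in [0, 1] (a perceptual color space would do better)."""
    return float(np.linalg.norm(col_ref - col_dist, axis=1).mean() / np.sqrt(3))

def toy_quality_score(curv_ref, curv_dist, col_ref, col_dist, w=0.5):
    """Toy linear combination; in a real metric, w would be optimized
    against subjective quality scores."""
    return w * curvature_term(curv_ref, curv_dist) + (1 - w) * color_term(col_ref, col_dist)
```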

Link

The impact of olfactory cues on user experience in virtual reality is increasingly studied. However, results remain heterogeneous and existing studies difficult to replicate, mainly due to a lack of standardized olfactory displays. In that context, we present Nebula, a low-cost, open-source olfactory display capable of diffusing scents at different diffusion rates using a nebulization process. Nebula can be used with PC VR or autonomous head-mounted displays, making it easily transportable without the need for an external computer. The device was calibrated to diffuse at three rates: no diffusion, low, and high. For each level, the quantity of delivered odor was precisely characterized using a repeated weighing method. The corresponding perceived olfactory intensities were evaluated in a psychophysical experiment with sixteen participants. Results demonstrated the device's capability to create three significantly different perceived odor intensities (Friedman test p < 10⁻⁶, Wilcoxon tests p_adj < 10⁻³), without noticeable smell persistence and with limited noise and discomfort. For reproducibility and to stimulate further research in the area, the 3D printing files, electronic hardware schematics, and firmware/software source code are made publicly available.
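
The reported statistics correspond to a standard repeated-measures pipeline: a Friedman omnibus test followed by pairwise Wilcoxon signed-rank tests with multiple-comparison adjustment. A minimal SciPy sketch on illustrative data (not the study's measurements):

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

# Illustrative perceived-intensity ratings (0-10 scale) for the three
# diffusion rates, one value per participant -- not the study's data.
rng = np.random.default_rng(0)
n = 16  # participants, as in the study
none = rng.normal(0.5, 0.4, n).clip(0, 10)
low  = rng.normal(3.5, 1.0, n).clip(0, 10)
high = rng.normal(6.5, 1.2, n).clip(0, 10)

# Omnibus test across the three repeated conditions
stat, p = friedmanchisquare(none, low, high)
print(f"Friedman: chi2 = {stat:.2f}, p = {p:.2g}")

# Post-hoc pairwise Wilcoxon signed-rank tests, Bonferroni-adjusted
pairs = {"none-low": (none, low), "none-high": (none, high), "low-high": (low, high)}
for name, (a, b) in pairs.items():
    _, p_raw = wilcoxon(a, b)
    print(f"{name}: p_adj = {min(1.0, 3 * p_raw):.2g}")
```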

Link

Food craving is a core pathological issue in Eating Disorders (EDs) associated with binge eating, such as bulimia nervosa and binge eating disorder. Current treatments are commonly based on Cognitive and Behavioral Therapies (CBTs). However, a significant number of patients are not receptive to conventional CBTs. Virtual Reality (VR) is now used as a variant of CBTs, especially VR-based cue exposure therapy, but only food triggers are generally considered in this form of therapy. In this context, we developed the ReVBED environment: a semi-directed VR exposure scenario allowing the induction of food craving through multimodal stimuli. This paper presents the ReVBED environment and scenario, and the planned clinical study whose aim is to evaluate its effectiveness for inducing food cravings in bulimia nervosa and binge eating disorder patients, using subjective and objective measurements.

Link

This paper presents the concept of multisensory digital twins, a brief state of the art of existing techniques for both capturing and immersive rendering of multisensory stimuli, and a case study of multisensory digital twin creation for cultural heritage.

Link

Over the past decade, three-dimensional (3D) graphics have become highly detailed to mimic the real world, causing their size and complexity to explode. Certain applications and device constraints necessitate their simplification and/or lossy compression, which can degrade their visual quality. Thus, to ensure the best quality of experience, it is important to evaluate visual quality accurately in order to drive the compression and find the right compromise between visual quality and data size. In this work, we focus on subjective and objective quality assessment of textured 3D meshes. We first establish a large-scale dataset, which includes 55 source models quantitatively characterized in terms of geometric, color, and semantic complexity, corrupted by combinations of five types of compression-based distortions applied to the geometry, texture mapping, and texture image of the meshes. This dataset contains over 343k distorted stimuli. We propose an approach to select a challenging subset of 3,000 stimuli, for which we collected 148,929 quality judgments from over 4,500 participants in a large-scale crowdsourced subjective experiment. Leveraging our subject-rated dataset, we propose a learning-based quality metric for 3D graphics. Our metric demonstrates state-of-the-art results on our dataset of textured meshes and on a dataset of distorted meshes with vertex colors. Finally, we present an application of our metric and dataset to explore the influence of distortion interactions and content characteristics on the perceived quality of compressed textured meshes.
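
As a rough sketch of the general shape of such a learning-based metric (the features and regressor below are placeholders, not the paper's actual design), one regresses precomputed reference/distorted features onto crowdsourced MOS, splitting folds by source content so the metric is tested on unseen models:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)

# One row of precomputed features per distorted stimulus (hypothetical
# examples: geometric error, texture PSNR, texture-map deviation, ...)
X = rng.normal(size=(3000, 8))
w = rng.normal(size=8)
y = X @ w + rng.normal(scale=0.5, size=3000)   # placeholder "MOS"
groups = rng.integers(0, 55, size=3000)        # source-model id per stimulus

# Group-aware cross-validation: no source model appears in both train and test
model = GradientBoostingRegressor()
scores = cross_val_score(model, X, y, groups=groups,
                         cv=GroupKFold(n_splits=5), scoring="r2")
print("cross-validated R^2:", scores.mean())
```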

Link

Virtual reality has developed considerably over the last decade. With autonomous head-mounted displays appearing on the market, new uses and needs have emerged. The 3D content displayed by these devices can now be stored on distant servers rather than directly in the device's memory. In such networked immersive experiences, the 3D environment has to be streamed to the headset in real time. In that context, several recent papers have proposed utility metrics and selection strategies to schedule the streaming of the different objects composing the 3D environment, in order to minimize latency and optimize the quality of what the user visualizes at each moment. However, these frameworks are hardly comparable, since they operate on different systems and data. We therefore propose an open-source DASH-based web framework for adaptive streaming of 3D content in a six degrees of freedom (6DoF) scenario. Our framework integrates several strategies and utility metrics from the state of the art, as well as several relevant features: 3D graphics compression, levels of detail, and the use of a visual quality index. We use our software to demonstrate the relevance of these tools and provide useful hints for the community toward further improvement of 3D streaming systems.
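
As an illustration of the kind of selection strategy such frameworks compare (a simplified sketch, not a specific strategy from the paper), a greedy scheduler can rank downloadable chunks by utility per bit within each adaptation window:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    object_id: str
    level_of_detail: int
    size_bits: float   # download cost
    utility: float     # e.g., expected quality gain given distance/visibility

def greedy_schedule(candidates, budget_bits):
    """Select chunks to request within one adaptation window: highest
    utility-per-bit first, until the bandwidth budget is spent."""
    order = sorted(candidates, key=lambda c: c.utility / c.size_bits, reverse=True)
    selected, spent = [], 0.0
    for c in order:
        if spent + c.size_bits <= budget_bits:
            selected.append(c)
            spent += c.size_bits
    return selected
```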

Link

Numerous methodologies for subjective quality assessment exist in the field of image processing. In particular, Absolute Category Rating with Hidden Reference (ACR-HR), the Double Stimulus Impairment Scale (DSIS), and the Subjective Assessment Methodology for Video Quality (SAMVIQ) are considered three of the most prominent methods for assessing the visual quality of 2D images and videos. Are these methods valid and accurate for evaluating the perceived quality of 3D graphics data? Is the presence of an explicit reference necessary, given the lack of human prior knowledge of 3D graphics data compared to natural images/videos? To answer these questions, we compare these three subjective methods (ACR-HR, DSIS, and SAMVIQ) on a dataset of high-quality colored 3D models impaired with various distortions. The subjective experiments were conducted in a virtual reality environment. Our results show differences in the performance of the methods depending on the 3D contents and the types of distortions. We show that DSIS and SAMVIQ outperform ACR-HR in terms of accuracy and exhibit stable performance. With regard to time and effort, DSIS achieves the highest accuracy in the shortest assessment time. The results also yield interesting conclusions on the importance of a reference for judging the quality of 3D graphics. We finally provide recommendations regarding the influence of the number of observers on accuracy.
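
Both the accuracy comparisons and the observer-count recommendations rest on mean opinion scores and their confidence intervals. A minimal sketch of their computation for one stimulus:

```python
import numpy as np
from scipy import stats

def mos_with_ci(ratings, confidence=0.95):
    """Mean opinion score and Student-t confidence interval for one stimulus.

    ratings: 1-D array of individual opinion scores."""
    r = np.asarray(ratings, dtype=float)
    mos = r.mean()
    half = stats.t.ppf((1 + confidence) / 2, df=r.size - 1) * r.std(ddof=1) / np.sqrt(r.size)
    return mos, (mos - half, mos + half)

mos, ci = mos_with_ci([4, 5, 4, 3, 5, 4, 4, 5])
print(f"MOS = {mos:.2f}, 95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```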

Link

Efficient objective and perceptual metrics are valuable tools for evaluating the impact of compression artifacts on the visual quality of volumetric videos (VVs). In this paper, we present some of the MPEG group's efforts to create, benchmark, and calibrate objective quality assessment metrics for volumetric videos represented as textured meshes. We created a challenging dataset of 176 volumetric videos impaired with various distortions and conducted a subjective experiment to gather human opinions (more than 5896 subjective scores were collected). We adapted two state-of-the-art model-based metrics for point cloud evaluation to our context of textured mesh evaluation by selecting efficient sampling methods. We also present a new image-based metric for the evaluation of such VVs, whose purpose is to reduce the cumbersome computation times inherent to point-based metrics due to their use of multiple kd-tree searches. Each metric is calibrated (i.e., the best values are selected for parameters such as the number of views or the grid sampling density) and evaluated on our new ground-truth subjective dataset. For each metric, the optimal selection and combination of features is determined by logistic regression through cross-validation. This performance analysis, combined with MPEG experts' requirements, leads to the validation of two selected metrics and to recommendations on the most important features through learned feature weights.
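
In quality metric benchmarking, the usual calibration step fits a logistic function mapping objective scores to MOS before computing correlations; the feature selection and weighting mentioned above extend this idea to several features. A minimal single-feature sketch on illustrative data (not the MPEG dataset):

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr

def logistic(x, a, b, c, d):
    """4-parameter logistic mapping from objective score to MOS."""
    return a + (b - a) / (1 + np.exp(-c * (x - d)))

# Illustrative data: 176 stimuli, raw metric values vs. subjective MOS
rng = np.random.default_rng(0)
objective = rng.uniform(0, 1, 176)
mos = logistic(objective, 1, 5, 8, 0.5) + rng.normal(0, 0.3, 176)

p0 = [mos.min(), mos.max(), 1.0, float(np.median(objective))]
params, _ = curve_fit(logistic, objective, mos, p0=p0, maxfev=10000)
predicted = logistic(objective, *params)
print("PCC after logistic fitting:", pearsonr(predicted, mos)[0])
```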

Link

Studies into food-related behaviors and emotions are increasingly conducted with Virtual Reality (VR). Applications of VR technologies in food science include eating disorder therapies, eating behavior studies, and sensory analyses. These applications involve 3D food stimuli intended to elicit cravings, stress, and/or emotions. However, the visual quality (i.e., the realism) of the food stimuli used is heterogeneous, and the influence of this factor on the results has never been isolated and evaluated. In this context, this work studies how the visual quality of food stimuli presented in a virtual reality environment influences the resulting desire to eat.

Link

Designers know that part of the appreciation of a product comes from the properties of its materials. These materials define the object's appearance and produce emotional reactions that can influence the act of purchase. Although known and observed to be important, the affective impact of a material remains difficult to assess. While many studies have addressed material colors, here we focus on two material properties that drive how light is reflected by an object: metalness and smoothness. In this context, this work studies the influence of these properties on the induced emotional response.

Link

Many studies have investigated how interpersonal differences between users influence their experience in Virtual Reality (VR), and it is now well recognized that users' subjective experiences and responses to the same VR environment can vary widely. In this study, we focus on player traits, which correspond to users' preferences for game mechanics, arguing that players react differently when experiencing VR scenarios. We developed three scenarios in the same VR environment that rely on different game mechanics, and we evaluate the influence of the scenarios, the player traits, and the practice time in the VR environment on users' perceived flow. Our results show that 1) the type of scenario has an impact on specific dimensions of flow; 2) the scenarios have different effects on flow depending on the order in which they are performed, the flow preconditions being stronger when performed last; 3) almost all dimensions of flow are influenced by the player traits, these influences depending on the scenario; and 4) the Aesthetic trait has the most influence in the three scenarios. We finally discuss the findings and limitations of the present study, which we believe has strong implications for the design of scenarios in VR experiences.

Link

Social communication interactions are paramount in human daily life. In virtual environments, virtual agents provide users with greater social presence and can trigger social behaviours in them. However, the complexity of real-life situations, notably with external stimulus perturbations (noise, visual occlusions, etc.), is not handled by classical interaction models. We propose here a model that includes an acoustic and visual sensing system to compute and evaluate agent-dependent perceptions of other agents and their surroundings before triggering expressive actions and reactions.

Link

Human social interactions rely on multisensory cues. In this regard, visual and auditory cues are paramount during the initiation of an interaction. In this preliminary work, we propose an approach that lets Intelligent Virtual Agents (IVAs) simulate sound perception capabilities. Our model aims to control IVAs' reactive behaviour through their analysis of the sounds emitted by other agents. To this end, we explore auditory features close to those of the human auditory system.
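
As a purely illustrative ingredient of such a model (our assumption, not the paper's actual feature set), an agent could gate its reactions on a simple audibility test combining inverse-square distance attenuation with an ambient noise floor:

```python
import math

def perceived_level_db(source_db, distance_m, ref_distance_m=1.0):
    """Sound level at the listener under free-field inverse-square
    attenuation: -6 dB per doubling of distance."""
    d = max(distance_m, ref_distance_m)
    return source_db - 20 * math.log10(d / ref_distance_m)

def is_audible(source_db, distance_m, noise_floor_db=30.0):
    """Crude audibility gate: the emitted sound must rise above the
    ambient noise floor at the agent's position."""
    return perceived_level_db(source_db, distance_m) > noise_floor_db

print(is_audible(60, distance_m=5))   # conversational voice nearby -> True
print(is_audible(60, distance_m=80))  # same voice far away -> False
```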

Link

Nonverbal communication is paramount in daily life, as well as in populated virtual reality (VR) environments. In this paper, we focus on gaze behaviour, which is key to initiating and driving social interactions. Previous work on photographs and on virtual agents has shown the importance of gaze, even in the presence of multiple stimuli, by demonstrating the stare-in-the-crowd effect: humans detect gazes directed towards them faster, and observe them longer, than averted ones. While previous studies focused on static scenarios, which fail to represent the complexity of real-life social interactions, we propose to explore the stare-in-the-crowd effect in dynamic situations. To this end, we designed a within-subject experiment in which 21 users navigated a virtual street through an idle or moving crowd of virtual agents. The agents' gaze was manipulated to display averted, directed, or shifting gaze. We analysed the users' gaze (fixations, dwell time) and locomotor behaviours (path decisions, proximity to agents) as well as their social anxiety. Results showed that the stare-in-the-crowd effect is preserved when walking through both types of crowd, and that social anxiety decreases gaze interaction time and affects proximity behaviours in the case of agents with directed gazes. However, the virtual agents' gaze did not elicit significant changes in users' locomotion. These findings highlight the importance of considering virtual agents' gaze when creating VR environments, and open perspectives for future work to better understand the factors that would strengthen or weaken this effect at the gaze and locomotor levels.
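
Dwell time, one of the gaze measures analysed here, can be accumulated from timestamped gaze samples once each sample is labelled with the agent hit by the gaze ray. A minimal sketch (the labelling itself, e.g., by ray casting, is assumed done upstream):

```python
from collections import defaultdict

def dwell_times(samples):
    """Accumulate per-agent dwell time from timestamped gaze samples.

    samples: time-ordered list of (timestamp_s, agent_id) pairs, where
    agent_id is the agent hit by the gaze ray at that instant (None if none).
    """
    totals = defaultdict(float)
    for (t0, target), (t1, _) in zip(samples, samples[1:]):
        if target is not None:
            totals[target] += t1 - t0
    return dict(totals)

gaze = [(0.00, None), (0.02, "agent_3"), (0.04, "agent_3"),
        (0.06, None), (0.08, "agent_7"), (0.10, None)]
print(dwell_times(gaze))  # agent_3: ~0.04 s, agent_7: ~0.02 s
```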

Link

From education to medicine to entertainment, a wide range of industrial and academic fields now use eXtended Reality (XR) technologies. This diversity and growing use are boosting research and leading to an increasing number of XR experiments involving human subjects. The main aim of these studies is to understand the user experience in the broadest sense, including the user's cognitive and emotional states. Behavioral data collected during XR experiments, such as user movements, gaze, actions, and physiological signals, constitute valuable assets for analyzing and understanding the user experience. While they help overcome the intrinsic flaws of explicit data such as post-experiment questionnaires, the required acquisition and analysis tools are costly and challenging to develop, especially for 6DoF (Degrees of Freedom) XR experiments. Moreover, there is no common format for XR behavioral data, which restricts data sharing and thus hinders wide usage across the community, the replicability of studies, and the constitution of large datasets or meta-analyses. In this context, we present PLUME, an open-source software toolbox (PLUME Recorder, PLUME Viewer, PLUME Python) that allows for the exhaustive recording of XR behavioral data (including synchronous physiological signals), their offline interactive replay and analysis (with a standalone application), and their easy sharing thanks to our compact and interoperable data format. We believe that PLUME can greatly benefit the scientific community by making the use of behavioral and physiological data accessible to the greatest number, contributing to the reproducibility and replicability of XR user studies, enabling the creation of large datasets, and contributing to a deeper understanding of user experience.
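
PLUME's actual data format is documented in its repository; purely as a hypothetical illustration of the kind of timestamped, multi-stream record such a toolbox must serialize, consider:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Pose = Tuple[float, float, float, float, float, float, float]  # x, y, z + quaternion

@dataclass
class FrameSample:
    """Hypothetical per-frame record for a 6DoF XR session; illustrative
    only, not PLUME's actual schema."""
    timestamp_us: int                           # one clock shared by all streams
    head_pose: Pose
    left_controller_pose: Pose
    right_controller_pose: Pose
    gaze_origin: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    gaze_direction: Tuple[float, float, float] = (0.0, 0.0, 1.0)
    pupil_diameter_mm: Optional[float] = None   # eye tracking, if available
    events: List[str] = field(default_factory=list)  # interactions, markers
```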

Link