The domain of extended reality spans three types of environmental reality: the real world, augmented reality (AR), and virtual reality (VR). Information can be displayed at varying distances relative to the observer in each of these environments. Ideally, information would be perceived at an equivalent depth across all three realities. However, perceptual studies reveal that users tend to either overestimate or underestimate the depth of virtual objects in AR and VR compared to their real-world counterparts. This discrepancy in depth perception for virtual objects is influenced by multiple factors, including hardware characteristics (such as field of view, the vergence-accommodation conflict, and interpupillary distance (IPD) misalignment), the availability of depth cues, graphics quality, individual differences among users, and differences in measurement techniques, among other variables.
To examine egocentric stereoscopic depth perception, our laboratory has developed a custom optical see-through augmented reality (AR) display, the AR Haploscope. This tabletop stereoscope display allows accommodation, vergence, and interpupillary distance (IPD) to be adjusted so that controlled virtual images can be presented at varying sizes, brightness levels, and color hues. Using this custom-designed display alongside commercially available AR and VR headsets (such as the Microsoft HoloLens 2 and HTC Vive Pro Eye), we conduct foundational research on near-field egocentric stereoscopic depth perception of both real and virtual objects. Our research evaluates the effects of vergence-accommodation conflict (VAC), occlusion, and brightness, among other factors, on depth perception. From the standpoint of depth perception, there is a notable absence of literature on how the human visual system behaves when viewing real and virtual objects at varying depths. We are exploring how depth-dependent components of the human visual system, such as vergence, pupil size, and IPD, react to depth changes in real and virtual objects; these components are measured using eye-tracking technology. This body of research is essential for many applications, including military, medical, and maintenance tasks. If you want to learn more about this research, please read the following papers:
Mohammed Safayet Arefin, J. Edward Swan II, Russell A. Cohen-Hoffing, and Steven M. Thurman. Estimating Perceptual Depth Changes with Eye Vergence and Interpupillary Distance using an Eye Tracker in Virtual Reality. In ACM Symposium on Eye Tracking Research and Applications (ETRA), ACM, June 2022. DOI: 10.1145/3517031.3529632. Download: [Pre-Print]
Jaya Surya Bontha and Mohammed Safayet Arefin. Effort to Replicate Custom-made Augmented Reality Haploscope. In IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), Bellevue, WA, USA, 2024, pp. 37-41. DOI: 10.1109/ISMAR-Adjunct64951.2024.00019. Download: [Pre-Print]
Mohammed Safayet Arefin, J. Edward Swan II, Russell Cohen-Hoffing, and Steven M. Thurman. Mapping Eye Vergence Angle to the Depth of Real and Virtual Objects as an Objective Measure of Depth Perception. arXiv preprint arXiv:2311.09242, November 2023. https://arxiv.org/abs/2311.09242
The eye vergence angle (EVA) and interpupillary distance (IPD) are defined by binocular human vision as the eyes verge on a near or far target. EVA and IPD are calculated from eye-tracker data using the 3D gaze direction and 3D gaze origin vectors.
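As a minimal sketch of that calculation (the function below is illustrative, not the code from the paper), EVA can be taken as the angle between the two gaze direction vectors and IPD as the distance between the two gaze origins, assuming the tracker reports per-eye 3D gaze origins and normalized gaze directions in a common coordinate frame:

# Illustrative sketch: EVA and IPD from per-eye eye-tracker samples.
# Assumes 3D gaze origins and gaze directions in a shared frame (meters).
import numpy as np

def eva_and_ipd(origin_left, dir_left, origin_right, dir_right):
    """Return (vergence angle in degrees, IPD in the tracker's length unit)."""
    dl = np.array(dir_left, dtype=float)
    dr = np.array(dir_right, dtype=float)
    dl /= np.linalg.norm(dl)
    dr /= np.linalg.norm(dr)

    # EVA: angle between the two gaze direction vectors.
    cos_angle = np.clip(np.dot(dl, dr), -1.0, 1.0)
    eva_deg = np.degrees(np.arccos(cos_angle))

    # IPD: distance between the two 3D gaze origins.
    ipd = np.linalg.norm(np.array(origin_right, float) - np.array(origin_left, float))
    return eva_deg, ipd

# Example: eyes 64 mm apart, both verging on a point 1 m straight ahead.
target = np.array([0.0, 0.0, 1.0])
o_l, o_r = np.array([-0.032, 0.0, 0.0]), np.array([0.032, 0.0, 0.0])
print(eva_and_ipd(o_l, target - o_l, o_r, target - o_r))
# ~3.67 degrees of vergence, 0.064 m IPD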
When continuously switching eye focus from one distance to another in an augmented or virtual reality system to integrate information, users see only one piece of information (either real or virtual) in focus; the other information becomes blurred for a brief period (around 360 milliseconds). We term this situation "out-of-focus."
In our analysis, we observed that participants' performance decreased in the optical see-through (OST) AR system due to the out-of-focus issue, which underscores the importance of generating a sharper representation of out-of-focus visual information. More specifically, virtual information must be rendered to look sharper when seen out of focus or under an incorrect accommodation demand. We term this sharper rendering of virtual information "SharpView AR." Accomplishing it requires mathematically modeling out-of-focus blur (the blurred retinal image) with Zernike polynomials, which model focal deficiencies of human vision, and developing a focus-correction algorithm based on total variation optimization, which compensates for the out-of-focus blur. The research requires synthetic simulation, optical camera-based measurement, and user-based studies for validation.
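As a minimal sketch of the blur-modeling step only (not the dissertation's implementation; the total-variation pre-correction is omitted, and the grid size, wavelength, and 5mm pupil are assumptions), the dioptric defocus can be expressed as a Zernike defocus term and turned into a point spread function via the Fourier transform of the pupil function:

# Illustrative sketch: out-of-focus blur from the Zernike defocus term Z_2^0.
import numpy as np

def defocus_psf(defocus_diopters, pupil_diameter_m=5e-3,
                wavelength_m=550e-9, n=512):
    """Return a normalized PSF for the given defocus (in diopters)."""
    a = pupil_diameter_m / 2.0                      # pupil radius (m)
    # Zernike defocus coefficient (m) equivalent to the dioptric defocus.
    c20 = defocus_diopters * a**2 / (4.0 * np.sqrt(3.0))

    # Unit-disk pupil coordinates.
    x = np.linspace(-1.0, 1.0, n)
    xx, yy = np.meshgrid(x, x)
    rho2 = xx**2 + yy**2
    aperture = (rho2 <= 1.0).astype(float)

    # Wavefront error from Z_2^0 = sqrt(3) * (2*rho^2 - 1).
    wavefront = c20 * np.sqrt(3.0) * (2.0 * rho2 - 1.0)

    # Generalized pupil function and its Fraunhofer PSF.
    pupil = aperture * np.exp(1j * 2.0 * np.pi * wavefront / wavelength_m)
    psf = np.abs(np.fft.fftshift(np.fft.fft2(pupil)))**2
    return psf / psf.sum()

# Example: the +4.57D, 5mm-pupil condition used in the video described below.
psf = defocus_psf(4.57)
# A blurred retinal image can then be simulated by convolving a scene image
# with this PSF, e.g. scipy.signal.fftconvolve(image, psf, mode="same").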
If you want to learn more about this research, please read the following paper and my PhD dissertation:
Mohammed Safayet Arefin, Carlos Montalto, Alexander Plopski, and J. Edward Swan II. A SharpView Font with Enhanced Out-of-Focus Text Legibility for Augmented Reality Systems. In Proceedings of IEEE Virtual Reality and 3D User Interfaces (IEEE VR), Orlando, FL, USA, March 2024, pp. 31-41. DOI: 10.1109/VR58804.2024.00027. Download: [Pre-Print] [Video]
Mohammed Safayet Arefin. Augmented Reality Fonts with Enhanced Out-of-Focus Text Legibility. Ph.D. Dissertation, Mississippi State University, Mississippi State, Mississippi, 2022. [Download]
This video shows an example of SharpView AR. We considered textual information as the primary AR component, specifically short AR text labels. This novel AR font is termed the "SharpView font," and it promises to mitigate the effect of the out-of-focus issue. The video was captured through the optics of the optical see-through AR system. The real information (cross: X) was presented at 4.0m, and the SharpView virtual information (word: 'TEXT') was rendered at 0.25m. The SharpView virtual information (pre-corrected font) was generated for an out-of-focus aberration of +4.57D and a pupil diameter of 5mm. When the camera lens focused on the real "X," our SharpView-rendered information exhibited a sharper representation and improved visual acuity, even though it was out of focus. However, when the camera focused on our rendered virtual information, the real information (X) became blurred, as it was not rendered with our algorithm.
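As a back-of-the-envelope check using a standard geometric-optics approximation (not the paper's analysis), the angular diameter of the defocus blur disk is roughly the pupil diameter times the defocus in diopters:

# Rough geometric-optics approximation of the blur-disk size.
import math

pupil_m, defocus_d = 0.005, 4.57          # 5mm pupil, +4.57D defocus
blur_rad = pupil_m * defocus_d            # ~0.023 rad
print(math.degrees(blur_rad))             # ~1.3 degrees of angular blur

For the +4.57D, 5mm condition above, this corresponds to roughly 1.3 degrees of blur, which is why the uncorrected text becomes illegible without a sharper, pre-corrected rendering.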
Head-mounted display (HMD) technology augments human capability by supplying additional information, thereby enhancing performance across diverse situations such as military operations, surgical procedures, and assembly and maintenance tasks, among others. HMD interface clutter refers to the amount of virtual information displayed on the HMD, whereas environmental clutter refers to the information density of the user's surrounding field of view; total clutter is assessed by combining metrics from both sources. If the information in the HMD's field of view is cluttered, it may obscure the real-world scene, leading users to miss important details. The HMD interface transparency concept aims to balance the clarity of virtual information presented on the display against the visibility of the real-world scene. Low transparency makes virtual information prominent but may obscure real-world details; high transparency keeps the real world visible but may reduce the clarity and legibility of the virtual information. Examining layout features and user interactions with the system is essential to tackle these challenges. Hence, we propose enhancing user performance and minimizing cognitive load by adjusting interface transparency and clutter, a concept we have termed the Clutter and Transparency Tradeoff. Our current research evaluates the impact of HMD clutter and transparency on both human performance and cognitive load.
High HMD Display Clutter and Transparency
Low HMD Display Clutter and Transparency
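As a purely hypothetical illustration of the Clutter and Transparency Tradeoff (the metrics, weights, and mapping below are assumptions for exposition, not the measures used in our studies), a combined clutter score could be used to drive the transparency of the virtual layer:

# Hypothetical sketch: mapping a combined clutter score to interface alpha.

def total_clutter(hmd_clutter, env_clutter, w_hmd=0.5, w_env=0.5):
    """Combine interface and environmental clutter scores (each in [0, 1])."""
    return w_hmd * hmd_clutter + w_env * env_clutter

def interface_alpha(clutter, alpha_min=0.25, alpha_max=0.9):
    """More total clutter -> more transparent virtual layer (lower alpha)."""
    return alpha_max - (alpha_max - alpha_min) * min(max(clutter, 0.0), 1.0)

# Example: a moderately cluttered interface in a busy environment.
print(interface_alpha(total_clutter(0.6, 0.8)))   # ~0.45 -> fairly transparent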
In optical see-through (OST) augmented reality (AR) displays (e.g., Microsoft HoloLens, Google Glass), information is often distributed between real-world and virtual contexts and often appears at different distances from the user, so the user must repeatedly switch contexts and refocus their eyes to integrate the information. In AR, integrating information between real and virtual contexts therefore raises the issues of (1) context switching, where users must switch visual and cognitive attention between information sources, (2) focal distance switching, where users must accommodate (change the shape of the eye's lens) to see information at a new distance in sharp focus, and (3) transient focal blur, the focal blur the user perceives while switching focal distance. These problems involve the display's optical design and how it interacts with human perception and vision. If they are not handled properly, users can suffer visual fatigue and incorrect distance perception, leading to reduced performance. Currently, we are investigating the impact of these AR interface issues on human performance and eye fatigue in indoor and outdoor settings. These issues affect many OST AR applications, including medical procedures, battlefield and tactical awareness applications, and heads-up displays in cars and aircraft, among others. If you want to learn more about this research, please read the following paper:
Mohammed Safayet Arefin, Nate Phillips, Alexander Plopski, Joseph L. Gabbard, and J. Edward Swan II. The Effect of Context Switching, Focal Switching Distance, Binocular and Monocular Viewing, and Transient Focal Blur on Human Performance in Optical See-Through Augmented Reality. IEEE Transactions on Visualization and Computer Graphics, Special Issue on IEEE Virtual Reality and 3D User Interfaces (VR 2022), 28(5):2014–2025, Mar 2022. DOI: 10.1109/TVCG.2022.3150503. Download: [Pre-Print] [Appendix]
Example of perceptual focus issues in an AR system. When the user focuses on the background, the AR symbology becomes blurry, and when the user focuses on the AR symbology, the background becomes blurry. This occurs because the AR symbology and the real-world background are at different focal distances. Frames are taken from a YouTube video showing the Google Glass AR display in daily use.
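As a simple illustration of the focal distance switching demand described above (the distances below are assumed for exposition, not taken from the paper), the magnitude of a focal switch is the difference between the two focal distances expressed in diopters (D = 1 / distance in meters):

# Illustrative sketch: accommodative demand of a focal switch, in diopters.

def focal_switch_diopters(from_m, to_m):
    """Dioptric change the eye must accommodate when refocusing."""
    return abs(1.0 / from_m - 1.0 / to_m)

# E.g., switching between AR symbology at a 2 m focal distance and a real
# task object at 0.5 m requires a 1.5 D accommodative change, while switching
# to a distant outdoor background (near optical infinity) requires only ~0.5 D.
print(focal_switch_diopters(2.0, 0.5))      # 1.5
print(focal_switch_diopters(2.0, 1e9))      # ~0.5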