We evaluate MoCam-VR on all ten tasks of the Telekinesis benchmark. Each task is executed ten times consecutively, and the success rate (SR) is recorded. The results show that MoCam-VR performs strongly among vision-based, low-cost teleoperation systems. As shown in the table above, MoCam-VR achieves a perfect success rate (10/10) on eight of the ten tasks and outperforms both Telekinesis and AnyTeleop on complex tasks such as Box Rotation, Cup Stack, and Open Drawer & Pickup Cup.
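The success-rate bookkeeping behind this protocol can be sketched in a few lines. This is a minimal illustration rather than the evaluation code used in this work: the function names (`success_rate`, `summarize`, `count_perfect`) and the task keys are hypothetical, and the outcomes are placeholders, not measured results.

```python
from typing import Dict, List


def success_rate(trials: List[bool]) -> float:
    """Fraction of successful trials for one task."""
    if not trials:
        raise ValueError("at least one trial is required")
    return sum(trials) / len(trials)


def summarize(results: Dict[str, List[bool]]) -> Dict[str, str]:
    """Format each task's outcome as 'successes/trials', e.g. '10/10'."""
    return {task: f"{sum(t)}/{len(t)}" for task, t in results.items()}


def count_perfect(results: Dict[str, List[bool]]) -> int:
    """Number of tasks in which every trial succeeded."""
    return sum(1 for t in results.values() if t and all(t))


# Placeholder outcomes for two hypothetical tasks (NOT measured data).
placeholder_results = {
    "placeholder_task_a": [True] * 10,
    "placeholder_task_b": [True] * 8 + [False] * 2,
}

summary = summarize(placeholder_results)
perfect = count_perfect(placeholder_results)
```

With ten trials per task, each SR is a multiple of 0.1, so a single failed trial separates a perfect score from 9/10.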
MoCam-VR is consistently stronger on tasks that require precise dexterity and stability, such as Scissor Pickup and Two Cup Stacking, where even a minor error in hand reconstruction can cause failure. Notably, in the Open Drawer task, MoCam-VR achieves a perfect score despite the stricter requirement that the drawer be opened by grasping the handle, which is more demanding than AnyTeleop’s simpler approach.
Unlike AnyTeleop, which relies solely on fixed RGB or RGB-D cameras, MoCam-VR uses a hand-mounted camera that maintains a favorable viewpoint of the hand, reducing finger occlusion and enabling more accurate hand pose estimation. In tasks prone to environmental occlusion, such as Open Drawer & Pickup Cup, MoCam-VR achieves a 100% success rate, which we attribute to the additional perspectives provided by multiple cameras: they give a clearer view of the objects and enable more precise grasping.