Introduction: Lei Jun’s Humanoid Robot Dream Finally Moves Beyond PowerPoint

On June 8, Xiaomi officially released CyberOne Gen 2. The robot stands 170cm tall, walks at 3km/h, and runs on Xiaomi’s self-developed MiMo embodied large model. It was the only competitor to exceed a 40% success rate in the CVPR 2026 real-machine manipulation track, and it took dual championships in the ICRA 2026 full-body control track. The live demo even showed it holding a Xiaomi 17T Pro to take photos: not staged, but real-machine operation.
More critically: it has already started working at Xiaomi’s automotive factory. This is not a concept machine, not a laboratory toy, but a genuine “worker” that can tighten screws, move parts, and take quality inspection photos.
Xiaomi’s journey in humanoid robotics, counting from the 2022 debut of CyberOne Gen 1 “Tie Da,” has spanned four years. Gen 1 was technology validation; Gen 2 is scenario deployment. Lei Jun says humanoid robots are the final piece of Xiaomi’s “human-car-home full ecosystem” puzzle—phones manage people, cars manage mobility, robots manage the home. Now, this piece is finally starting to fit.
Product Overview: Evolution from “Tie Da” to “Worker”
The core upgrade of CyberOne Gen 2 is not height or weight, but the “brain”—the MiMo embodied large model.
What is MiMo? Xiaomi’s self-developed embodied intelligence large model. In February 2026, Xiaomi open-sourced the first-generation VLA large model Xiaomi-Robotics-0 (4.7 billion parameters, “brain + cerebellum” hybrid architecture), followed by the MiMo-V2.5 series in March. MiMo’s core capability bridges “language understanding” and “physical operation”: when you say “tighten that screw to the specified torque,” MiMo can decompose this into a physical action chain of “locate screw → grab tool → align with hole → rotate → detect torque → stop.”
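The action chain in that example can be pictured as an ordered list of steps that halts at the first failure. The sketch below is only an illustration of that idea, not MiMo's actual interface: every function name, the state dictionary, and the torque numbers are hypothetical, and the "rotate → detect torque" pair is simulated as a simple loop.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One link in the action chain; `run` returns True on success."""
    name: str
    run: Callable[[Dict], bool]


def locate_screw(state: Dict) -> bool:
    # Hypothetical: a real system would get this pose from perception.
    state["screw_xy"] = (0.42, 0.17)
    return True


def grab_tool(state: Dict) -> bool:
    state["tool"] = "screwdriver"
    return True


def align_with_hole(state: Dict) -> bool:
    # Alignment only makes sense once a screw and a tool are available.
    return "screw_xy" in state and state.get("tool") == "screwdriver"


def rotate_until_torque(state: Dict) -> bool:
    # Simulated "rotate -> detect torque" loop: each turn adds a fixed
    # torque increment; give up if the turn budget runs out first.
    while state["torque_nm"] < state["target_torque_nm"]:
        if state["turns"] >= state["max_turns"]:
            return False
        state["torque_nm"] += state["torque_per_turn_nm"]
        state["turns"] += 1
    return True


def stop(state: Dict) -> bool:
    state["spinning"] = False
    return True


CHAIN: List[Step] = [
    Step("locate screw", locate_screw),
    Step("grab tool", grab_tool),
    Step("align with hole", align_with_hole),
    Step("rotate until torque", rotate_until_torque),
    Step("stop", stop),
]


def run_chain(steps: List[Step], state: Dict) -> List[str]:
    """Run steps in order and return the names of those that succeeded."""
    completed = []
    for step in steps:
        if not step.run(state):
            break  # later steps depend on earlier ones, so halt here
        completed.append(step.name)
    return completed


state = {
    "torque_nm": 0.0,
    "target_torque_nm": 2.0,     # illustrative value, not a real spec
    "torque_per_turn_nm": 0.5,   # illustrative value, not a real spec
    "turns": 0,
    "max_turns": 20,
}
done = run_chain(CHAIN, state)
print(done, state["turns"], state["torque_nm"])
```

Halting at the first failed step is the design choice that matters here: a chain that kept going after a failed alignment would rotate a tool that is not seated on the screw.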
The Weight of the CVPR 2026 Result: CVPR is the top conference in computer vision. The real-machine manipulation track requires robots to complete grasping, placing, and tool usage tasks in real environments. CyberOne Gen 2 was the only competitor to exceed a 40% success rate, meaning that in complex, dynamic, unstructured real environments it completes roughly two out of every five manipulation tasks. By comparison, the 2025 champion’s success rate was still around 25%.
ICRA 2026 Full-Body Control Dual Championships: ICRA is the top conference in robotics. The full-body control track tests a robot’s ability to walk, balance, and avoid obstacles while simultaneously performing upper-body operations. CyberOne Gen 2 can tighten screws while walking, carry boxes while avoiding obstacles—this “multi-task parallel” capability is the core requirement of industrial scenarios.
Live Demo of Holding Phone for Photography: This action seems simple but is actually extremely difficult. Phones are smooth, fragile, irregular objects; grip strength requires precise control (too light and it drops, too heavy and it breaks), and photography requires stable posture. CyberOne Gen 2’s ability to complete this action demonstrates that its tactile perception and force control precision have reached commercial levels.
Specifications: How Much Technology Fits in a 170cm Body
| Spec | Details |
|---|---|
| Height | 170cm |
| Walking Speed | 3km/h |
| Drive Model | MiMo embodied large model |
| AI Architecture | VLA (Vision-Language-Action) |
| Open-Source Model | Xiaomi-Robotics-0 (4.7 billion parameters) |
| Vision System | Mi-Sense 3.0 (3D spatial perception) |
| Upper Body DoF | 21+ |
| Hand | Dexterous hand (tactile sensors) |
| Lower Body | Bipedal walking |
| Battery | Unannounced (Gen 1 reference ~3 hours) |
| Weight | ~52kg (Gen 1 reference) |
| Application Scenarios | Factory assembly, quality inspection |
| Deployment Status | Already working at Xiaomi automotive factory |
Data source: Xiaomi official launch event, CVPR/ICRA 2026 papers, IT Home
CyberOne Gen 2’s hardware platform continues the first generation’s design framework but with key upgrades:
Upper Body Dexterous Hand: Gen 1’s hand was a simple gripper; Gen 2 upgrades to a multi-DoF dexterous hand with independent finger control and palm covered with tactile sensor arrays. This enables it to hold phones, tighten screws, and plug in interfaces—actions requiring “perceive → adjust → execute” closed loops rather than simple “open → close.”
Mi-Sense 3.0 Vision System: Supports real-time 3D environment reconstruction with 10x precision improvement over Gen 1. In factory environments, it can identify part positions, poses, types, and even detect surface defects (scratches, stains, deformations).
Whole-Body Control Algorithm: Built on the MiMo large model, with hybrid training that combines reinforcement learning and imitation learning. The policy is first trained millions of times in simulation environments, then fine-tuned in real environments. The ICRA dual championships show that this algorithm performs at an internationally top-tier level in dynamic balance and multi-task coordination.
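The “perceive → adjust → execute” loop described for the dexterous hand can be illustrated with a toy grip-force regulator. This is a minimal sketch under loud assumptions: Xiaomi has not published its controller, so the proportional rule, the `SimulatedGripper` class, and every number below are hypothetical stand-ins chosen only to show the shape of a closed loop.

```python
from typing import Callable, Optional


class SimulatedGripper:
    """Hypothetical stand-in for a hand with a tactile force reading."""

    def __init__(self, max_force_n: float) -> None:
        self.force_n = 0.0
        self.max_force_n = max_force_n  # hard safety ceiling

    def read_force(self) -> float:
        # Perceive: in hardware this would come from tactile sensors.
        return self.force_n

    def apply_delta(self, delta_n: float) -> None:
        # Execute: change the grip, clamped so it never exceeds the ceiling.
        self.force_n = min(max(self.force_n + delta_n, 0.0), self.max_force_n)


def regulate_grip(
    target_n: float,
    read_force: Callable[[], float],
    apply_delta: Callable[[float], None],
    gain: float = 0.5,
    tol_n: float = 0.05,
    max_steps: int = 50,
) -> Optional[int]:
    """Proportional loop: returns the step at which the grip settled,
    or None if it never came within tolerance of the target."""
    for step in range(max_steps):
        error = target_n - read_force()   # perceive
        if abs(error) <= tol_n:
            return step
        apply_delta(gain * error)         # adjust, then execute
    return None


gripper = SimulatedGripper(max_force_n=6.0)
steps = regulate_grip(4.0, gripper.read_force, gripper.apply_delta)
print(steps, gripper.force_n)
```

The contrast with a simple “open → close” gripper is the clamp plus the feedback term: the grip approaches the target from below and cannot overshoot the ceiling, which is the “too light and it drops, too heavy and it breaks” trade-off in miniature.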
Deep Analysis: What Can Factory Deployment Actually Do?
CyberOne Gen 2’s specific work content at Xiaomi’s automotive factory includes:
Stud Insertion / Screw Tightening: At the stud insertion station in the die-casting workshop, the robot operated autonomously for 3 hours straight, with a 90.2% success rate for simultaneous installation on both sides, meeting the line’s fastest cycle time of 76 seconds. This data comes from March 2026 factory testing and is harder evidence than the launch event demos.
Part Handling: Cooperating with AGVs (Automated Guided Vehicles), grabbing part boxes from shelves, carrying them to assembly stations, and placing them at designated positions. Along the way, it needs to avoid workers and other equipment through dynamic path planning.
Quality Inspection Photography: Holding a phone (or dedicated camera) to photograph assembled components, with the AI vision system judging whether they pass. Failed parts are automatically flagged for manual re-inspection.
Simple Assembly: Aligning two parts, inserting them, and securing them. Such tasks require force control precision (preventing over-tightening or under-tightening) and visual guidance (aligning with holes).
But these tasks share a common characteristic: they are structured, repetitive, and have clear standards. CyberOne Gen 2 currently cannot handle unstructured tasks (such as “organize those messy parts”) or respond to emergencies (such as dropped parts or broken tools). Lei Jun himself admits that humanoid robots are still at the “apprentice” stage and have not yet become “formal workers.”
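To put the stud insertion figures in perspective, the 3-hour run, the 76-second cycle, and the 90.2% success rate can be combined into a rough count of attempts and failures. This is a back-of-the-envelope sketch, not reported data: it assumes the line ran at the 76-second cycle for the full 3 hours and that each cycle is one bilateral installation attempt, neither of which the source states.

```python
from typing import Tuple


def shift_estimate(
    duration_s: int, cycle_s: int, success_rate: float
) -> Tuple[int, float, float]:
    """Return (full cycles, expected successes, expected failures)."""
    cycles = duration_s // cycle_s          # only completed cycles count
    expected_ok = cycles * success_rate
    return cycles, expected_ok, cycles - expected_ok


# Figures quoted in the article; the one-attempt-per-cycle reading is assumed.
cycles, ok, failed = shift_estimate(
    duration_s=3 * 60 * 60, cycle_s=76, success_rate=0.902
)
print(f"{cycles} cycles, ~{ok:.0f} successes, ~{failed:.0f} failures")
```

Under those assumptions the run covers 142 cycles, with roughly 128 successful installations and about 14 that would need a retry or human intervention.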
Comparison: CyberOne Gen 2 vs Tesla Optimus vs Unitree G1 vs UBTECH Walker
| Feature | Xiaomi CyberOne Gen 2 | Tesla Optimus Gen 2 | Unitree G1 | UBTECH Walker S1 |
|---|---|---|---|---|
| Height | 170cm | 173cm | 127cm | 172cm |
| Weight | ~52kg | ~73kg | 35kg | ~76kg |
| Walking Speed | 3km/h | 5km/h | 7.2km/h (2m/s) | 3.5km/h |
| DoF | 21+ | 22+ | 43 | 41 |
| AI Model | MiMo (self-developed) | Tesla self-developed | Open-source/self-developed | ROSA 2.0 |
| Dexterous Hand | Yes (tactile) | Yes (11 DoF) | Optional | Yes (force control) |
| Factory Deployment | Already working (Xiaomi) | Testing phase | Research/education | Already working (BYD, etc.) |
| Price | Unannounced (target < $30,000) | Unannounced (estimated $20,000-30,000) | 99,000 yuan | Unannounced |
| Open Source | Partial (VLA model) | No | Partial | No |
| Positioning | Industrial + ecosystem | Industrial + general | Research + education | Industrial + service |
CyberOne Gen 2’s differentiation is clear: ecosystem integration. It is not the strongest (Optimus), not the most flexible (G1), not the most mature (Walker), but it is the only humanoid robot destined from birth to become part of the “human-car-home ecosystem.”
Imagine this scenario: you leave home in the morning, and your Xiaomi phone automatically syncs your schedule to CyberOne; after you get in the car, navigation data syncs to CyberOne, which turns on the home AC in advance; before you get off work, CyberOne receives the car’s location and starts preparing dinner (simple operations); when you arrive home, CyberOne reports the day’s household task completion status and syncs inspection photos to your phone album.
This “cross-device collaboration” capability is unique to Xiaomi’s ecosystem. Other manufacturers’ humanoid robots, no matter how powerful, are “islands.”
Pros and Cons
| Pros | Cons |
|---|---|
| MiMo large model driven, internationally top-tier AI capability (CVPR + ICRA dual championships) | Factory tasks currently limited to structured, repetitive work |
| Already genuinely deployed, not a laboratory concept | Weak handling of unstructured tasks and emergencies |
| Human-car-home ecosystem integration, vast scenario imagination space | Battery life and stability require long-term validation |
| Dexterous hand + tactile sensors, strong delicate operation capability | Price unannounced, expected high initial cost |
| Open-source VLA model, developer-friendly ecosystem | Deeply bound to Xiaomi ecosystem, limited value for non-Xiaomi users |
| Full-size 170cm, more natural human-robot collaboration | 52kg weight, fall safety risks require attention |
Who Should Buy
Recommended for:
- Deep Xiaomi ecosystem users expecting full-scenario “human-car-home” linkage
- Industrial manufacturing enterprises needing structured assembly/quality inspection automation
- Tech enthusiasts following humanoid robot technology frontiers
- Investors/researchers focusing on embodied intelligence and VLA model development
Not recommended for:
- Ordinary household users (currently no consumer version, and functions not suitable for home use)
- Non-Xiaomi ecosystem users (ecosystem integration advantages cannot be leveraged)
- Users needing unstructured task processing (such as organizing clutter, caring for elderly)
- Small businesses with limited budgets (expected high initial procurement costs)
FAQ
Q: Can I buy CyberOne Gen 2 for home use? A: Currently it targets only industrial scenarios and enterprise clients, with no consumer sales plans. Xiaomi’s long-term goal is to launch home service models priced at 20,000-30,000 yuan, but the timeline is unannounced.
Q: Is the MiMo large model open-source? A: In February 2026, the first-generation VLA large model Xiaomi-Robotics-0 (4.7 billion parameters) was open-sourced, but whether the complete MiMo model running on CyberOne Gen 2 is open-source has not been officially clarified.
Q: Is factory deployment hype or reality? A: According to March 2026 testing data, CyberOne operated continuously for 3 hours in the die-casting workshop of Xiaomi’s automotive factory, achieving a 90.2% success rate for bilateral nut installation and meeting the 76-second cycle. This is real deployment, but the scale may still be limited (hundreds of units).
Q: Who is stronger, CyberOne or Tesla Optimus? A: Each has advantages. Optimus leads in hardware performance (speed, strength), but CyberOne leads in AI models (CVPR/ICRA results) and ecosystem integration. Both are currently in industrial pilot phases, not yet mass-produced.
Q: What is the battery life? A: Official Gen 2 battery data has not been announced. Referencing Gen 1’s approximately 3 hours, Gen 2 may improve through battery optimization and energy management, but industrial scenarios typically require battery swap or charging solutions.
Conclusion
The release of Xiaomi CyberOne Gen 2 marks the Chinese humanoid robot industry’s formal transition from “laboratory showmanship” to “factory deployment.” The CVPR and ICRA dual championships are not the endpoint but the starting point. They show the MiMo large model’s reliability in real physical environments, but the 90.2% success rate also means the remaining 9.8% of failures still need to be conquered.
CyberOne Gen 2’s greatest value lies not in what it can do now, but in what it represents: the final piece of Xiaomi’s “human-car-home full ecosystem” puzzle. When phones, cars, and robots share the same AI brain (MiMo), the same data flow, and the same user profile, “intelligent living” transforms from marketing rhetoric into real experience.
But this road is still long. Large-scale engineering application of humanoid robots faces prominent challenges, including poor process stability, high hardware costs, and a limited number of suitable workstations. Lei Jun promises that within 5 years humanoid robots will be deployed at scale in Xiaomi factories. Whether this promise is fulfilled depends on MiMo model iteration speed, supply chain cost decline curves, and most importantly, whether users are willing to pay for “a walking AI assistant.”