Paper List
A curated collection of influential research papers in robotics, computer vision, and machine learning
OpenVLA Series
OpenVLA: An Open-Source Vision-Language-Action Model
VLA foundation model for embodied manipulation
2024-06
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success
VLA foundation model for embodied manipulation
2025-02-01
RDT Series
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Foundation model for coordinated bimanual manipulation
2024-10
H-RDT: Human Manipulation Enhanced Bimanual Robotic Manipulation
Foundation model aimed at more data-efficient coordinated bimanual manipulation
2025-07
TikTok GR Series
Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation
Model from ByteDance based on large-scale video pre-training
2023-12
GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation
Model from ByteDance based on large-scale video pre-training
2024-10
GR-3 Technical Report
Model from ByteDance based on large-scale video pre-training
2025-07-01
Google-Research RT Series
RT-1: Robotics Transformer for Real-World Control at Scale
Key VLA work in the RT series
2022-12
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Key VLA work in the RT series
2023-07
PaLM-E Series
PaLM-E: An Embodied Multimodal Language Model
Key work in the PaLM-E series
2023-03
Meta-AI Series
R3M: A Universal Visual Representation for Robot Manipulation
Key work in the Meta-AI series
2022-03
π Series
π0: A Vision-Language-Action Flow Model for General Robot Control
Key VLA work in the PI series
2024-10
π0.5: a Vision-Language-Action Model with Open-World Generalization
Key VLA work in the PI series
2025-04
Being-Beyond Series
Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos
Builds a foundation model for embodied manipulation from existing large-scale data
2025-07
Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills
Builds a foundation model for embodied manipulation from existing large-scale data
2025-03
Agibot Series
AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
Key work in the Agibot series
2025-03
Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation
Key work in the Agibot series
2025-08
Octo Series
Octo: An Open-Source Generalist Robot Policy
Key work in the Octo series
2024-05
Embodied-R1 Series
Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation
Embodied reasoning trained with an R1-style training method
2025-08
Galaxea (星海图) Series
Galaxea Open-World Dataset & G0 Dual-System VLA Model
Galaxea's first dual-system VLA model and open-source dataset
2025-08
1X Series
1X World Model: Evaluating Bits, not Atoms
World model from the 1X series
2025-08
NVIDIA GR00T Series
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
First work in the NVIDIA GR00T series; implements a "cerebrum and cerebellum" (dual-system) model architecture
2025-03
GR00T N1.5: An Improved Open Foundation Model for Generalist Humanoid Robots
Key work in the NVIDIA GR00T series
2025-06
Last updated: Aug. 21, 2025