Dynamic Grasping with a Learned Meta-Controller

* Equal Contribution    ¹Columbia University    ²University of Pennsylvania

What is the problem? Grasping moving objects is a challenging task that requires multiple submodules, such as an object pose predictor and an arm motion planner. Each submodule operates under its own set of meta-parameters: for example, how far into the future the pose predictor should look (the look-ahead time) and the maximum amount of time the motion planner can spend planning a motion (the time budget). Many previous works assign fixed values to these parameters; however, the optimal values vary from moment to moment within a single episode of dynamic grasping, depending on the current scene.
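To make the two meta-parameters concrete, here is a minimal Python sketch. It is an illustration only: the `MetaParams` container, the `predict_position` helper, and the constant-velocity motion model are assumptions made for this example, not the pose predictor or planner interface used in the work.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class MetaParams:
    """Hypothetical container for the two meta-parameters discussed above."""
    look_ahead_time: float  # seconds into the future at which to predict the target pose
    time_budget: float      # maximum seconds the motion planner may spend planning


def predict_position(position, velocity, look_ahead_time):
    """Extrapolate the target position, assuming constant velocity (illustrative only)."""
    position = np.asarray(position, dtype=float)
    velocity = np.asarray(velocity, dtype=float)
    return position + velocity * look_ahead_time


# A fixed-value baseline would keep these numbers constant for the whole episode.
fixed = MetaParams(look_ahead_time=2.0, time_budget=1.0)

# Target currently at (0.5, -0.3, 0.1) m, moving at 0.05 m/s along +y.
target = predict_position([0.5, -0.3, 0.1], [0.0, 0.05, 0.0], fixed.look_ahead_time)
```

With a fixed look-ahead time, the predicted grasp target is always the same distance ahead of the object, whether or not that point is easy for the arm to reach.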


What is our approach? In this work, we propose a dynamic grasping pipeline with a meta-controller that controls the look-ahead time and time budget dynamically. We learn the meta-controller through reinforcement learning with a sparse reward.
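The idea can be sketched as a control loop in which a policy picks both meta-parameters at every step and receives a reward only when the episode ends. The sketch below is a hedged illustration: the value ranges, the `scale` and `run_episode` helpers, and the `ToyEnv` stand-in (with its arbitrary success rule) are all assumptions for this example and do not reproduce the simulator, observation space, or training algorithm used in the work.

```python
LOOK_AHEAD_RANGE = (0.0, 4.0)   # seconds; hypothetical bounds
TIME_BUDGET_RANGE = (0.1, 2.0)  # seconds; hypothetical bounds


def scale(a, low, high):
    """Map a policy output in [-1, 1] linearly onto [low, high], clipping first."""
    a = max(-1.0, min(1.0, a))
    return low + (a + 1.0) * 0.5 * (high - low)


def sparse_reward(grasp_succeeded, episode_done):
    """Return 1.0 only when the episode ends with a successful grasp, else 0.0."""
    return 1.0 if (episode_done and grasp_succeeded) else 0.0


class ToyEnv:
    """Stand-in environment with an arbitrary success rule (not the real simulator)."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, look_ahead, time_budget):
        self.t += 1
        done = self.t >= self.horizon
        # Toy rule: the grasp "succeeds" only if the final time budget was small.
        success = done and time_budget <= 1.0
        return [float(self.t)], done, success


def run_episode(policy, env, max_steps=50):
    """Roll out one episode; the policy chooses both meta-parameters at every step."""
    obs = env.reset()
    total = 0.0
    for _ in range(max_steps):
        raw_look_ahead, raw_budget = policy(obs)
        look_ahead = scale(raw_look_ahead, *LOOK_AHEAD_RANGE)
        budget = scale(raw_budget, *TIME_BUDGET_RANGE)
        obs, done, success = env.step(look_ahead, budget)
        total += sparse_reward(success, done)
        if done:
            break
    return total
```

In a real training setup, `policy` would be a learned network updated by a reinforcement learning algorithm; here any callable that returns two numbers in [-1, 1] can be plugged in to exercise the loop.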


How well does our approach work? Our experiments show the meta-controller improves the grasping success rate (up to 28% in the most cluttered environment) and reduces grasping time, compared to the strongest baseline. Our meta-controller learns to reason about the reachable workspace and maintain the predicted pose within the reachable region. In addition, it assigns a small but sufficient time budget for the motion planner. Our method can handle different objects, trajectories, and obstacles. Despite being trained only with 3-6 random cuboidal obstacles, our meta-controller generalizes well to 7-9 obstacles and more realistic out-of-domain household setups with unseen obstacle shapes.

Experimental Setups

We design four setups, each combining different trajectories and obstacles. We train our meta-controller only in the 3-6 Random Blocks setup but evaluate it in all four. The Household and Cluttered Household setups test how well the meta-controller generalizes to out-of-domain scenes.


Demo Highlights

We show that our meta-controller, despite only being trained with 3-6 obstacles, can successfully generalize to 7-9 obstacles and to more realistic environments with unseen obstacle shapes that mimic warehouse and household scenarios. With these obstacles, the environment can be extremely cluttered.

3-6 Random Blocks
7-9 Random Blocks
Household Setup
Cluttered Household Setup

Comparison and Analysis 🧐


What does our meta-controller learn?
  • It reasons about the reachable workspace: by dynamically controlling the look-ahead time and time budget, it keeps the predicted pose and the planned motion within the most reachable region.
  • It learns to generate a small look-ahead time when the predicted trajectory is not accurate.
  • It learns to produce a small but sufficient time budget for motion planning.


Meta-controller vs. Grid-search in Household


Meta-controller vs. Grid-search in Cluttered Household


Supplementary Video

Citation


        @article{yjia2023MetaController,
          title = {Learning a Meta-Controller for Dynamic Grasping},
          author = {Jia, Yinsen and Xu, Jingxi and Jayaraman, Dinesh and Song, Shuran},
          publisher = {arXiv},
          year = {2023},
        }