Composing Complex Skills by Learning Transition Policies

Humans acquire complex skills by exploiting previously learned skills and making transitions between them. To empower machines with this ability, we propose a method that learns transition policies, which effectively connect primitive skills to perform sequential tasks without handcrafted rewards. To train the transition policies efficiently, we introduce proximity predictors, which induce rewards that gauge proximity to suitable initial states for the next skill. We evaluate the proposed method on a set of complex continuous control tasks in bipedal locomotion and robotic arm manipulation that traditional policy gradient methods struggle with.
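To make the idea of a proximity-induced reward concrete, here is a minimal sketch of one way it could be computed. It is an illustration under stated assumptions, not the authors' implementation: the names `proximity_labels` and `proximity_reward`, the discount `delta = 0.95`, the exponentially discounted labels, and the reward defined as the change in predicted proximity between consecutive states are all assumptions introduced for this example.

```python
from typing import Callable, List, Sequence

State = Sequence[float]


def proximity_labels(num_states: int, success: bool, delta: float = 0.95) -> List[float]:
    """Build training targets for a proximity predictor from one trajectory.

    Assumption: a successful transition trajectory ends in a suitable initial
    state for the next skill, so a state k steps before the end is labeled
    delta**k. Every state of a failed trajectory is labeled 0.
    """
    if not success:
        return [0.0] * num_states
    return [delta ** (num_states - 1 - t) for t in range(num_states)]


def proximity_reward(
    predictor: Callable[[State], float], state: State, next_state: State
) -> float:
    """Dense reward for the transition policy (assumed form).

    The reward is the increase in predicted proximity after one step, so the
    policy is rewarded for moving toward good initial states of the next skill.
    """
    return predictor(next_state) - predictor(state)


if __name__ == "__main__":
    # Hypothetical stand-in for a learned predictor: reads proximity directly
    # from the first state feature.
    toy_predictor = lambda s: s[0]
    print(proximity_labels(3, success=True, delta=0.5))
    print(proximity_reward(toy_predictor, [0.2], [0.5]))
```

In this sketch the predictor would be a learned regressor fit to the labels; the lambda above only stands in for it. A reward built from differences is dense, which is the property that lets a transition policy train without a handcrafted reward.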

@inproceedings{lee2019composing,
  title = {Composing Complex Skills by Learning Transition Policies},
  author = {Youngwoon Lee and Shao-Hua Sun and Sriram Somasundaram and Edward S. Hu and Joseph J. Lim},
  booktitle = {Proceedings of International Conference on Learning Representations},
  year = {2019},
  url = {https://openreview.net/forum?id=rygrBhC5tQ},
}

Highlights

Training Curves

Manipulation

Locomotion

Citation

Acknowledgements