PANDORA

Diffusion Policy Learning for Dexterous Robotic Piano Playing

Yanjia Huang, Renjie Li, Zhengzhong Tu


The pipeline starts from the robot state \(s_t\) and goal state \(g_t\), which condition a U-Net via FiLM. Using DDIM, the U-Net iteratively denoises an initial noisy action \(x_t\) into a clean action \(x_0\). The denoised action \(x_0\) is then added to the IK solver’s output \(q\) as a residual to form the final action, which is executed in MuJoCo. An Oracle Reward module, driven by a large language model, scores the resulting performance on style and accuracy.

PANDORA Pipeline
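
The denoising and residual steps of the pipeline can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: `toy_eps_model` is a hypothetical stand-in for the conditional U-Net, the weights are random, and the noise schedule, dimensions, and step count are assumptions chosen only so the example runs. What it does show is FiLM modulation (a scale and shift predicted from the conditioning vector), a deterministic DDIM update (eta = 0), and the residual sum with an IK output.

```python
import numpy as np

def film(features, cond, w_gamma, w_beta):
    """FiLM: per-feature scale (gamma) and shift (beta) predicted from cond."""
    gamma = cond @ w_gamma
    beta = cond @ w_beta
    return gamma * features + beta

def toy_eps_model(x, k, cond, params):
    """Hypothetical stand-in for the conditional U-Net noise predictor.

    The diffusion step k is accepted but ignored to keep the toy model small.
    """
    hidden = np.tanh(x @ params["w_in"])
    hidden = film(hidden, cond, params["w_gamma"], params["w_beta"])
    return hidden @ params["w_out"]

def ddim_sample(eps_model, cond, action_dim, alpha_bar, num_steps, rng):
    """Deterministic DDIM (eta = 0): denoise Gaussian noise into an action x_0."""
    x = rng.standard_normal(action_dim)
    # Evenly spaced subset of the training steps, from noisiest to cleanest.
    steps = np.linspace(len(alpha_bar) - 1, 0, num_steps).round().astype(int)
    for i, k in enumerate(steps):
        a_k = alpha_bar[k]
        # After the last step the sample is treated as noise-free (alpha_bar = 1).
        a_prev = alpha_bar[steps[i + 1]] if i + 1 < len(steps) else 1.0
        eps = eps_model(x, k, cond)
        # Estimate the clean action, then move it to the next (lower) noise level.
        x0_pred = (x - np.sqrt(1.0 - a_k) * eps) / np.sqrt(a_k)
        x = np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps
    return x

def residual_action(q_ik, x0):
    """Final action: IK solver output plus the denoised residual."""
    return q_ik + x0

# Assumed sizes and schedule, for illustration only.
rng = np.random.default_rng(0)
action_dim, hidden_dim = 8, 16
s_t = rng.standard_normal(3)          # placeholder robot state
g_t = rng.standard_normal(3)          # placeholder goal state
cond = np.concatenate([s_t, g_t])     # conditioning vector fed to FiLM
params = {
    "w_in": 0.1 * rng.standard_normal((action_dim, hidden_dim)),
    "w_gamma": 0.1 * rng.standard_normal((cond.size, hidden_dim)),
    "w_beta": 0.1 * rng.standard_normal((cond.size, hidden_dim)),
    "w_out": 0.1 * rng.standard_normal((hidden_dim, action_dim)),
}
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 100))

x0 = ddim_sample(
    lambda x, k, c: toy_eps_model(x, k, c, params),
    cond, action_dim, alpha_bar, num_steps=10, rng=rng,
)
q_ik = np.zeros(action_dim)           # placeholder for the IK solver output q
final_action = residual_action(q_ik, x0)
```

In a real system the noise predictor would be the trained U-Net and `q_ik` would come from an inverse-kinematics solver; the sampling loop and the final sum would keep the same shape.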

Abstract

We present PANDORA, a novel diffusion-based policy learning framework designed specifically for dexterous robotic piano performance. Our approach employs a conditional U-Net architecture enhanced with FiLM-based global conditioning, which iteratively denoises noisy action sequences into smooth, high-dimensional trajectories. To achieve precise key execution coupled with expressive musical performance, we design a composite reward function that integrates task-specific accuracy, audio fidelity, and high-level semantic feedback from a large language model (LLM) oracle. The LLM oracle assesses musical expressiveness and stylistic nuances, enabling dynamic, hand-specific reward adjustments. Further augmented by a residual inverse-kinematics refinement policy, PANDORA achieves state-of-the-art performance in the ROBOPIANIST environment, significantly outperforming baselines in both precision and expressiveness. Ablation studies validate the critical contributions of diffusion-based denoising and LLM-driven semantic feedback in enhancing robotic musicianship.
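
One way to picture the composite reward is as a weighted sum of its three terms. The sketch below is an assumption, not the paper's formula: the weights, the `[0, 1]` score ranges, the `left`/`right` hand keys, and the per-hand scaling scheme are all hypothetical, and the LLM oracle is represented only by the scores it would return.

```python
from dataclasses import dataclass

@dataclass
class RewardWeights:
    """Hypothetical weights for the three reward terms."""
    task: float = 1.0
    audio: float = 0.5
    semantic: float = 0.5

def composite_reward(task_accuracy, audio_fidelity, llm_scores,
                     hand_scale=None, weights=None):
    """Weighted sum of task, audio, and LLM-derived semantic terms.

    llm_scores maps a hand name to a score assumed to lie in [0, 1].
    hand_scale optionally rescales each hand's score (hand-specific adjustment).
    """
    weights = weights or RewardWeights()
    hand_scale = hand_scale or {hand: 1.0 for hand in llm_scores}
    # Average the per-hand semantic scores after applying the per-hand scale.
    semantic = sum(hand_scale[hand] * score
                   for hand, score in llm_scores.items()) / len(llm_scores)
    return (weights.task * task_accuracy
            + weights.audio * audio_fidelity
            + weights.semantic * semantic)

# Example call with made-up scores.
reward = composite_reward(
    task_accuracy=0.9,
    audio_fidelity=0.8,
    llm_scores={"left": 0.6, "right": 0.7},
    hand_scale={"left": 1.2, "right": 1.0},
)
```

Keeping the per-hand scale separate from the term weights lets the semantic feedback emphasize one hand without changing how much the semantic term counts overall.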

Results & Performance

PANDORA outperforms baselines in both precision and expressiveness within the ROBOPIANIST environment.

Results Graph

Demo

Watch the demonstration of PANDORA's robotic piano playing in action.

PANDORA (Ours)

Baseline (PianoMime)

Viva La Vida by Coldplay

Last Christmas by Wham!

Adieu by Rammstein

Lose Yourself by Eminem

Arcade by Duncan Laurence

Na Zare by Alliance