MotionAdapter: Video Motion Transfer via Content-Aware Attention Customization

  • Zhexin Zhang
    Hangzhou Dianzi University
     
  • Yifeng Zhu
    Harbin Institute of Technology (Shenzhen)
     
  • Yangyang Xu
    Harbin Institute of Technology (Shenzhen)
     
  • Long Chen
    The Hong Kong University of Science and Technology
     
  • Yong Du
    Ocean University of China
     
  • Shengfeng He
    Singapore Management University
     
  • Jun Yu
    Harbin Institute of Technology (Shenzhen)
     

Abstract

Recent advances in diffusion-based text-to-video (T2V) models, particularly those built on the diffusion transformer (DiT) architecture, have achieved remarkable progress in generating high-quality and temporally coherent videos. However, transferring complex motions between videos remains challenging. In this work, we present MotionAdapter, a content-aware motion transfer framework that enables robust and semantically aligned motion transfer within DiT-based T2V models. Our key insight is that effective motion transfer requires (i) explicit disentanglement of motion from appearance and (ii) adaptive customization of motion to target content. MotionAdapter first isolates motion by analyzing cross-frame attention within 3D full-attention modules to extract attention-derived motion fields. To bridge the semantic gap between reference and target videos, we further introduce a DINO-guided motion customization module that rearranges and refines motion fields based on content correspondences. The customized motion field is then used to guide the DiT denoising process, ensuring that the synthesized video inherits the reference motion while preserving target appearance and semantics. Extensive experiments demonstrate that MotionAdapter outperforms state-of-the-art methods in both qualitative and quantitative evaluations. Moreover, MotionAdapter naturally supports complex motion transfer and motion editing tasks such as zooming.
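To make the idea of an attention-derived motion field concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the function name `attention_motion_field`, the single-head (H*W, H*W) attention layout, and the top-k weighted-average readout are all assumptions made for illustration. Each query token in one frame is assigned a displacement equal to the attention-weighted mean position of its strongest matches in the next frame, minus its own position.

```python
import numpy as np

def attention_motion_field(attn, height, width, top_k=None):
    """Hypothetical readout of a motion field from cross-frame attention.

    attn: (H*W, H*W) array of positive attention scores from tokens of
          frame t (rows) to tokens of frame t+1 (columns).
    Returns an (H, W, 2) array of (dy, dx) displacements in token units.
    """
    n = height * width
    assert attn.shape == (n, n)

    # (row, col) grid position of every token, shape (n, 2).
    ys, xs = np.divmod(np.arange(n), width)
    coords = np.stack([ys, xs], axis=1).astype(np.float64)

    weights = attn.astype(np.float64).copy()
    if top_k is not None and top_k < n:
        # Keep only the top_k largest scores per row (ties may keep more).
        kth = np.partition(weights, n - top_k, axis=1)[:, n - top_k][:, None]
        weights = np.where(weights >= kth, weights, 0.0)

    # Renormalize each row so the kept weights sum to one.
    weights /= weights.sum(axis=1, keepdims=True)

    # Expected matched position in frame t+1, minus the query position.
    target = weights @ coords
    return (target - coords).reshape(height, width, 2)
```

With `top_k=1` each token simply follows its single strongest match; larger values average over several candidate matches, which trades sharpness for robustness to noisy attention.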

Video Results from MotionAdapter Main Paper


Figure 1 Samples

Prompt: Race car drifting in circles in a parking lot
Reference Video
SMM
MOFT
MotionClone
MotionInversion
DiTflow
FastVMT
RoPECraft
DeT
MotionAdapter
Prompt: Woman on water slide goes up and down, in the air, aerial view
Reference Video
SMM
MOFT
MotionClone
MotionInversion
DiTflow
FastVMT
RoPECraft
DeT
MotionAdapter
Prompt: A paper boat floating in a bathtub
Reference Video
SMM
MOFT
MotionClone
MotionInversion
DiTflow
FastVMT
RoPECraft
DeT
MotionAdapter
Prompt: Closeup aerial view of an ant crawling in a desert
Reference Video
SMM
MOFT
MotionClone
MotionInversion
DiTflow
FastVMT
RoPECraft
DeT
MotionAdapter


Figure 5 Samples

Prompt: Dog walking on the grass field
Reference Video
MotionAdapter w/o Motion Customization
MotionAdapter w/ Motion Customization
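The motion customization step compared in these samples is described in the abstract as rearranging the reference motion field according to content correspondences. A rough sketch of that rearrangement, under stated assumptions: per-token features (for example from a DINO encoder) are already extracted for both videos, matching is a plain cosine-similarity nearest neighbour, and the function name `rearrange_motion_by_correspondence` is hypothetical. The actual module also refines the field, which this sketch omits.

```python
import numpy as np

def rearrange_motion_by_correspondence(ref_feats, tgt_feats, ref_motion):
    """Hypothetical nearest-neighbour motion rearrangement.

    ref_feats:  (N_ref, D) per-token features of the reference frame.
    tgt_feats:  (N_tgt, D) per-token features of the target content.
    ref_motion: (N_ref, 2) motion vector attached to each reference token.
    Returns (N_tgt, 2) motion for target tokens and the (N_tgt,) match indices.
    """
    def l2_normalize(x):
        # Guard against zero-norm rows.
        norms = np.linalg.norm(x, axis=1, keepdims=True)
        return x / np.maximum(norms, 1e-8)

    # Cosine similarity between every target and reference token.
    sim = l2_normalize(tgt_feats) @ l2_normalize(ref_feats).T  # (N_tgt, N_ref)

    # Each target token borrows the motion of its most similar reference token.
    match = sim.argmax(axis=1)
    return ref_motion[match], match
```

The intent is that a target token showing, say, a dog's leg receives the motion of the semantically closest reference region rather than the motion at the same spatial location.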



Figure 6 Samples

Prompt: A swan drinking water from a puddle
Reference Video
SMM
MOFT
MotionClone
MotionInversion
DiTflow
FastVMT
RoPECraft
DeT
MotionAdapter
Prompt: Jaguar walking in a snowy forest
Reference Video
SMM
MOFT
MotionClone
MotionInversion
DiTflow
FastVMT
RoPECraft
DeT
MotionAdapter
Prompt: Goat jumping between rocks in a rocky canyon with steep cliffs
Reference Video
SMM
MOFT
MotionClone
MotionInversion
DiTflow
FastVMT
RoPECraft
DeT
MotionAdapter


Figure 7 Samples

Prompt: Biker riding past and doing a jump trick in the air
Reference Video
MotionAdapter
Prompt: Leopard running up a snowy hill in a forest
Reference Video
MotionAdapter


Figure 8 Samples

Prompt: A woman walking dog, side view.
Reference Video 1
Reference Video 2
Woman-Reference1; Dog-Reference2
Woman-Reference2; Dog-Reference1
Prompt: A robot walking in the park, side view.
Reference Video
Zoom in reference motion
Zoom out reference motion

Mentioned in Supplementary


Comparison of different backbones

Prompt: Driving motorcycle through cityscape, first person perspective
Reference Video
MotionAdapter (CogVideoX 2B-T2V)
MotionAdapter (CogVideoX 5B-T2V)
MotionAdapter (CogVideoX 5B-I2V)
Prompt: A pair of zebras walking in the savannah
Reference Video
MotionAdapter (CogVideoX 2B-T2V)
MotionAdapter (CogVideoX 5B-T2V)
MotionAdapter (CogVideoX 5B-I2V)
Prompt: Lion chasing geese in a park
Reference Video
MotionAdapter (CogVideoX 2B-T2V)
MotionAdapter (CogVideoX 5B-T2V)
MotionAdapter (CogVideoX 5B-I2V)



Additional Comparison Sample

Prompt: Aerial view of red ferrari driving on a street
Reference Video
MSE
SMM
MOFT
MotionClone
MotionInversion
DiTflow
DeT
MotionAdapter

Ablation Study Part


Ablation Study of Motion Refinement Module & Motion Customization Module

The person jumps back in the "w/o Motion Customization" and "w/o Motion Refinement" videos.
Prompt: Parkour runner vaulting over walls in a neighbourhood, side view
Reference Video
w/o Motion Customization
w/o Motion Refinement
MotionAdapter
The monkey does not jump twice in the "w/o Motion Customization" and "w/o Motion Refinement" videos.
Prompt: Monkey running and jumping over bushes in the jungle, side view
Reference Video
w/o Motion Customization
w/o Motion Refinement
MotionAdapter

Ablation Study of Top-K Parameter & Guidance Block (Supplementary)

Prompt: BMX biker jumps over a hill on mountain bike
Reference Video
Top-K k=1
Top-K k=10
7th Block
36th Block
MotionAdapter
Prompt: Dog chasing geese in a park
Reference Video
Top-K k=1
Top-K k=10
7th Block
36th Block
MotionAdapter
Prompt: Cheetah running quickly in the savannah, side view
Reference Video
Top-K k=1
Top-K k=10
7th Block
36th Block
MotionAdapter