📝 Publications
A full publication list is available on my Google Scholar page.
(*: Equal contribution; †: Corresponding authors.)
🎬 Video Generation, World Model & Multimodal Model

[arXiv 2026 Kling Team] OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data
Jiwen Liu, Shuo Li, Zhang Fang, Xiang Li, Yitong Zhou, Zijie Meng, et al.
- We propose OmniDirector, a general framework for multi-shot camera cloning that requires no cross-paired data, advancing camera control in video synthesis.

[arXiv 2026] ARGUS: Stacked Multi-View Identity Mosaic Injection for Subject-Preserving Video Generation
Zijie Meng, Jiwen Liu, Yu Liu, Chen Tong, Xiao Liu, Yingya Zhang, Yong Xu, Pengfei Wan.
[Code] (Internal/Coming soon)
- We propose ARGUS, a novel framework for subject-preserving video generation using stacked multi-view identity mosaic injection, ensuring high fidelity and temporal consistency.
- This work was conducted during my internship at Kuaishou Kling, focusing on controllable identity injection in foundation video models.

[ICASSP 2026 Oral] Make a Game: A Novel Paradigm for Interactive Game Rendering
Zijie Meng, Jian Che, Bo Wei, Xuesong Cao.
[Award] First Prize in PKU Challenge Cup
- We introduce a novel paradigm for interactive game rendering using unified tokens and lightweight plugins, enhancing controllability in video generation.
- The approach generalizes to complex interactive game scenarios, bridging generative AI and real-time game engines.
Dataset

[NeurIPS 2026] 3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks
X Gai, J Liu, Y Li, Zijie Meng, J Wu, Z Liu.
- We introduce 3D-RAD, the most comprehensive 3D radiology dataset for Medical VQA, supporting multi-temporal analysis and diverse clinical diagnostic tasks.
🎨 Image Generation, Restoration & Segmentation

[Science China Info. Sci. 2025] Orpaint: A Zero-Shot Inpainting Model for Oracle Bone Inscription Rubbings with Visual Mamba Block
Zijie Meng, Yuer Zeng, Xiang Chang, Tianyang Xu, Fei Chao, Xuesong Cao, Chun Chen, Qiang Shen.
[Journal] JCR-Q1, CCF-A
- We propose Orpaint, the first zero-shot inpainting model specifically designed for Oracle Bone Inscription (甲骨文) restoration.
- By integrating the Visual Mamba Block into the diffusion denoising network, we achieve significantly faster inference and better structural restoration of damaged ancient rubbings.

[ACM MM 2025] Robust Single Image Sand Removal by Leveraging Uncertainty-aware SAM Priors and Prompt Learning with Refined Perceptual Loss
Bo Wei, Huafeng Liu, Cheng Qian, Zizheng Li, Wenbo Wu, Zijie Meng.
CCF-A Conference
- We address the challenging task of sand-dust image restoration by leveraging uncertainty-aware SAM (Segment Anything Model) priors and prompt learning.
- My contribution focused on fine-tuning Llama 3 to generate refined perceptual instructions.

[ICME 2026 Spotlight] Decoupling Semantics from Distortions: Multi-Scale Two-Stream Vision-Language Alignment for AI-Generated Image Quality Assessment
Zijie Meng.
CCF-B | GitHub
- We propose a multi-scale two-stream vision-language alignment framework that decouples semantic understanding from distortion perception for robust AI-generated image quality assessment.
- My contribution focused on the overall framework design and vision-language alignment strategy.

[MICCAI 2025] SynPo: Boosting Training-Free Few-Shot Medical Segmentation via High-Quality Negative Prompts
Y Liu, H Xiao, J Chai, Y Zhang, R Wang, Zijie Meng, Z Luo.
CCF-B Conference / Medical AI Top Conference
- We propose SynPo, which boosts training-free medical image segmentation by utilizing high-quality negative prompts to refine few-shot boundary detection.