"A year spent in artificial intelligence is enough to make one believe in God."
Alan Perlis
Hi there! I am Zhiwei Jia. You can also call me Sean. I am an ML researcher/engineer at Zoom, working on image & video generation models for its GenAI products. I obtained my Ph.D. from UC San Diego, focusing on Embodied AI (i.e., multimodal AI agents). I worked with Prof. Hao Su and Prof. Zhuowen Tu.
Selected Work Experience
ML Researcher/Engineer @ Zoom | 2023/11~
Image and video generation with diffusion models and LLMs.
Research Intern @ Google | 2022/6~9
VLM fine-tuning for image ad understanding.
Research Intern @ Amazon | 2021/6~9
Indoor scene and human instruction understanding with multimodal Transformers.
Research Intern @ Google X | 2020/6~9
Image generation via GANs with applications to sim-to-real domain adaptation.
ML Engineer Intern @ Quora | 2019/6~9
LLM fine-tuning for fine-grained text understanding.
Selected Publications (full list here)
Multimodal Understanding & Generation
Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward (under review) [page]
Z. Jia, Y. Nan, H. Zhao, G. Liu
MetaCLUE: Towards Comprehensive Visual Metaphors Research (CVPR 2023) [page]
A. Akula, B. Driscoll, P. Narayana, S. Changpinyo, Z. Jia, S. Damle, G. Pruthi, S. Basu, L. Guibas, W. Freeman, Y. Li, V. Jampani
KAFA: Rethinking Image Ad Understanding with Knowledge-Augmented Feature Adaptation of Vision-Language Models (ACL 2023) [arXiv]
Z. Jia, B. Yuan, K. Wang, H. Wu, D. Clifford, Z. Yuan, H. Su
Semantically Robust Unpaired Image Translation for Data with Unmatched Semantics Statistics (ICCV 2021) [arXiv]
Z. Jia, B. Yuan, K. Wang, H. Wu, D. Clifford, Z. Yuan, H. Su
AI Agent & Sequential Decision-Making
Chain-of-Thought Predictive Control (ICML 2024) [page]
Z. Jia, V. Thumuluri, F. Liu, L. Chen, Z. Huang, H. Su
Learning to Act with Affordance-Aware Multimodal Neural SLAM (IROS 2022) [arXiv]
Z. Jia, K. Lin, Y. Zhao, Q. Gao, G. Thattai, G. Sukhatme
Improving Policy Optimization with Generalist-Specialist Learning (ICML 2022) [arXiv]
Z. Jia, X. Li, Z. Ling, S. Liu, Y. Wu, H. Su
Refactoring Policy for Compositional Generalizability using Self-Supervised Object Proposals (NeurIPS 2020) [page]
T. Mu, J. Gu, Z. Jia, H. Tang, H. Su
Email: sean.jia.z.w 📞 gmail.com (replace 📞 with "@")
LinkedIn / Google Scholar / Github / X (Twitter)