Yi Liu (刘熠)

Google Scholar     CV   

AIGC Algorithm Engineer at ByteDance

Email: yiliu61richard@gmail.com

Research Interests: Multimodal LLMs, Multimodal Data Synthesis, Long Video Understanding, Temporal Action Detection

About Me

I am currently an AIGC Algorithm Engineer at ByteDance (Commercial AI-AIGC), where I lead the development of advertising creative-material Agents, covering data construction, model training, and product deployment. Before that, I led the Multimodal Understanding & Generation Group at Honor Device Co., Ltd. from 2024 to 2025, managing a team of 10+ engineers working on on-device Vision-Language Models. I received my Ph.D. degree in 2024 from MMLab@SIAT, University of Chinese Academy of Sciences (UCAS), where I was supervised by Prof. Yu Qiao and Prof. Yali Wang. I was also a research intern at Shanghai AI Laboratory from 2022 to 2023. I received my B.Eng. degree from Huazhong University of Science and Technology (HUST), Wuhan, China, in 2019.

Publications

LvBench: A Benchmark for Long-form Video Understanding with Versatile Multi-modal Question Answering, International Journal of Computer Vision, 2025 (IJCV, CAS Zone 1, IF=9.3, co-first author, listed 3rd)
MLLM-TA: Leveraging Multimodal Large Language Models for Precise Temporal Video Grounding, IEEE Signal Processing Letters, 2024 (SPL, CAS Zone 2, IF=3.9, first author)
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark, Computer Vision and Pattern Recognition, 2024 (CVPR, CCF-A conference, 6th author)
F2S-Net: Learning Frame-To-Segment Prediction for Online Action Detection, Journal of Real-Time Image Processing, 2024 (JRTIP, CAS Zone 3, IF=3.0, first author)
Dual masked modeling for weakly-supervised temporal boundary discovery, IEEE Transactions on Multimedia, 2023 (TMM, CAS Zone 1, IF=9.7, co-first author, listed 2nd)
Learning Discriminative Feature Representation for Open Set Action Recognition, ACM International Conference on Multimedia, 2023 (ACM MM, CCF-A conference, co-first author, listed 2nd)
FineAction: A Fine-Grained Video Dataset for Temporal Action Localization, IEEE Transactions on Image Processing, 2022 (TIP, CAS Zone 1, IF=13.7, first author)
VideoPipe 2022 Challenge: Real-World Video Understanding for Urban Pipe Inspection, International Conference on Pattern Recognition, 2022 (ICPR, CCF-C conference, first author)

Experience

Work Experience

  • ByteDance — Commercial AI-AIGC, China Commercialization & Advertising. Creative-Material Agent Lead (AIGC Algorithm Engineer). Nov 2025 – Present
  • Honor Device Co., Ltd. — LLM Capability Platform. Lead, Multimodal Understanding & Generation Group, managing a team of 10+ engineers. Apr 2024 – Oct 2025
  • Shanghai AI Laboratory — General Vision Group. Research Intern. Mar 2022 – Dec 2023

Honors & Awards

Professional

  • 2025 — Shining Star Award, Honor AI Platform Department
  • 2024 — AI Multimodal Breakthrough Team Award, Honor R&D Management

Ph.D. @ Chinese Academy of Sciences / UCAS

  • 2023 — Outstanding Merit Student Pacesetter, UCAS
  • 2022 — 1st Place (×2), ECCV Ego4D Episodic Memory Challenge — Looking At Me Track & Moments Queries Track
  • 2020–2022 — President's Outstanding Award (×3), SIAT, Chinese Academy of Sciences
  • 2020 — Merit Student, UCAS
  • 2019 — First Prize, Huike Cup AI Application Innovation Challenge

Undergraduate @ HUST

  • 2019 — Outstanding Graduate, HUST
  • 2018 — National Second Prize & Southern-Region Runner-up, ABU Robocon
  • 2018 — Provincial First Prize, National Mechanical Innovation Design Contest
  • 2017 — Provincial Second Prize, “Challenge Cup” National Student Academic Competition
  • 2017 — Provincial Second Prize, China Undergraduate Mathematical Contest in Modeling
  • 2017 — Meritorious Winner (Honorable Mention), Mathematical Contest in Modeling (MCM/ICM)

Workshops & Challenges

  • Student organizer of ECCV 2022 DeeperAction Challenge, Track 1: Temporal Action Localization
  • Student organizer of ICPR 2022 VideoPipe Challenge, Track 2: Temporal Defect Localization
  • Student organizer of ICCV 2021 DeeperAction Challenge, Track 1: Temporal Action Localization