Senior AI Lead, Multimodal Systems
Salary: Not listed publicly. We'll share the details once we've confirmed your profile is a genuine fit for the role.
Track: Senior IC with a clear path to Chief AI Officer (CAIO)
AI domains: LLM · Image Gen · Video Gen · Sound & TTS · Agents · Model Training · Fine-Tuning · Infrastructure · Evaluation · Data Pipelines
Languages: Japanese required (business level); business-level English or Mandarin also required
About the role
We are hiring a Senior AI Lead — a technically exceptional generalist who can architect and orchestrate the full spectrum of AI modalities needed to bring a living, breathing companion game to life. This is not a narrow specialist role: you will own the strategy and execution across language, vision, audio, motion, and agent behavior, making them work as a seamless, immersive whole.
This is one of the most senior and consequential roles at Gazai. You will define the AI architecture of Anini, lead a growing AI team, and sit on the technology leadership team with a clear path toward Chief AI Officer. You will shape not just what we build, but how we think about AI at Gazai.
Responsibilities
- Architect and lead the multimodal AI system powering Anini — integrating LLM dialogue, image generation, video synthesis, sound and voice, and autonomous agents into a cohesive, real-time companion experience
- Define and own the long-term AI roadmap across all modalities; translate product vision into concrete AI research and engineering priorities
- Lead model selection, fine-tuning, and post-training across domains — including character-consistent image generation, expressive TTS, story-aware LLMs, and behavioral agents
- Design and oversee agent architectures enabling proactive, autonomous companion behavior: planning, memory, tool use, and real-world integrations (social media, smart home, IoT)
- Establish evaluation frameworks and quality standards across all AI outputs — latency, coherence, visual consistency, emotional expressiveness, and safety
- Build and manage scalable AI infrastructure: model serving, data pipelines, training compute, and cost optimization
- Grow and mentor the AI team; set engineering culture and best practices across the function
- Collaborate across product, engineering, and leadership to deliver AI innovations to customers
- Track the frontier of AI research across all relevant modalities and rapidly prototype what matters
Qualifications
- 7+ years of ML/AI engineering experience, including leadership of AI systems or teams
- Hands-on depth in at least two AI modalities (e.g. LLMs + image gen, or agents + video synthesis)
- Strong conceptual and practical understanding of modern deep learning — transformers, diffusion models, autoregressive generation
- Experience fine-tuning or post-training large models (RLHF, DPO, LoRA, etc.)
- Experience designing and shipping agentic systems using frameworks such as LangGraph, AutoGen, CrewAI, or custom-built architectures
- Proficiency in Python; comfort with model serving infrastructure (e.g. vLLM, Triton, Ray Serve)
- Strong instinct for system design: latency, reliability, and cost tradeoffs at scale
- Ability to lead cross-functional AI projects and communicate clearly across research, engineering, and product
- Japanese required (business level); business-level proficiency in English or Mandarin also required
- Eagerness to stay at the frontier — fast learner with strong research literacy
Bonus — you will stand out if…
- You have experience with character-consistent image or video generation, style LoRAs, or anime/illustration-specific fine-tuning
- You have shipped real-time or low-latency multimodal pipelines in a consumer product context
- You have experience with voice synthesis, expressive TTS, or sound generation AI
- You have research publications or open-source contributions in generative AI, agents, or multimodal systems
- You have built evaluation infrastructure for generative AI (human evals, automated evals, red-teaming)
- You have experience with vector databases (Pinecone, Qdrant, Chroma) and retrieval-augmented systems
- You have prior experience applying AI in games, entertainment, or interactive media