We’re looking for an experienced Data Scientist with a strong background in Prompt Engineering and Generative AI to join our AI research team for a limited-term project. In this role, you’ll be instrumental in advancing our Retrieval-Augmented Generation (RAG) capabilities, refining language model prompts, and implementing systems that generate insights from diverse data sources. The role suits someone who enjoys experimentation and wants to push the boundaries of large language model applications.
Core Responsibilities
- Design and refine prompt strategies to improve the quality, precision, and value of AI-generated content.
- Develop and manage RAG workflows that draw on MongoDB Atlas Vector Search, SQL-based databases, and proprietary data.
- Build and evaluate ML models with a focus on retrieval-based frameworks and agent-like architectures.
- Plan and execute A/B testing and empirical studies to assess and optimize prompt performance.
- Monitor user engagement and system metrics to guide iterative improvements.
- Work closely with teams across product, engineering, and content to ensure AI solutions align with strategic goals.
- Develop reproducible machine learning pipelines using Docker, supporting production-level deployment.
- Build intuitive dashboards and reporting tools to showcase the performance and business impact of AI solutions.
Required Skills & Experience
- Demonstrated experience crafting and tuning prompts for large language models (LLMs), especially GPT-4-class models.
- Practical knowledge of RAG architecture and semantic search with MongoDB Atlas Vector Search.
- Strong Python skills for scripting, data manipulation, and model development.
- Solid understanding of relational databases such as PostgreSQL or MySQL for data querying and integration.
- Proficiency with Docker for creating portable and consistent ML environments.
- Experience designing data-driven experiments, such as A/B tests, and interpreting their results.
- Excellent written and verbal communication skills, with the ability to explain complex AI concepts in a clear, accessible way.
Bonus Skills
- Experience with agentic AI systems or reasoning-based architectures.
- Exposure to Apache Kafka for real-time data streaming and pipeline development.
- Knowledge of Kubernetes (K8s) for scaling and orchestrating ML workloads.
- Familiarity with backend development, especially in Node.js.
- Understanding of user experience (UX) principles, and experience with dashboard and reporting tools such as Streamlit, Dash, Power BI, or Tableau.