Graduate degree in a quantitative field (CS, statistics, applied mathematics, machine learning, or related discipline)
• Good programming skills in Python, with strong working knowledge of Python’s numerical, data analysis, and AI frameworks such as NumPy, Pandas, and Scikit-learn
• Experience with language models (Llama 1/2/3, T5, Falcon) and with LangChain or a similar framework
• Familiarity with the evolution of NLP, from traditional language models to modern large language models, including training data creation, training setup, and fine-tuning
• Comfortable interpreting research papers and architecture diagrams of language models
• Comfortable with LoRA, RAG, instruction fine-tuning, quantization, and related techniques
• Predictive modelling experience in Python (time series, multivariable, causal)
• Experience applying various machine learning techniques and understanding the key parameters that affect their performance
• Experience building systems that capture and use large data sets to quantify performance via metrics or KPIs
• Excellent verbal and written communication
• Comfortable working in a dynamic, fast-paced, innovative environment with several concurrent projects
Roles & Responsibilities:
• Lead a team of data engineers, analysts, and data scientists to carry out the following activities:
• Connect with internal and external points of contact (POCs) to understand the business requirements
• Coordinate with the right POCs to gather all relevant data artifacts, anecdotes, and hypotheses
• Create the project plan and sprints for milestones and deliverables
• Spin up VMs, and create and optimize clusters for data science workflows
• Create data pipelines to ingest data effectively
• Ensure data quality with proactive checks and resolve any gaps
• Carry out EDA and feature engineering, and define performance metrics before running the relevant ML/DL algorithms
• Research whether similar solutions have already been developed before building ML models
• Create optimized data models to query relevant data efficiently
• Run the relevant ML/DL algorithms to achieve business goals
• Optimize and validate these ML/DL models so that they scale
• Create lightweight applications, simulators, and scenario builders to help the business consume the end outputs
• Create test cases, test the code before production for possible bugs, and resolve these bugs proactively
• Integrate and operationalize the models in the client ecosystem
• Document project artifacts and log failures and exceptions
• Measure and articulate the impact of DS projects on business metrics, and fine-tune the workflow based on feedback