Software AI Engineer - US

Jade Global
Full-time
On-site
San Jose, California, United States
Artificial Intelligence
Job Title: Software AI Engineer/Architect

Location: Santa Clara, CA (on-site preferred; remote candidates may be considered)

Experience: 8-10 years

Job Type: Contract/FTE

This role requires a deep, end-to-end understanding of how Large Language Models are built, trained, optimized, deployed, and operated.

Candidates must demonstrate hands-on experience beyond consuming hosted LLM APIs, with a strong grasp of the underlying ML theory, system trade-offs, and production realities of AI/ML solutions.

Mandatory Competency Areas (Non-Negotiable)

1. Foundations of LLMs (How They Actually Work)

Candidates must demonstrate a first-principles understanding of:

  • Transformer architectures (attention, embeddings, positional encoding)

  • Tokenization strategies and their impact on cost & performance

  • Training vs inference behavior

  • Loss functions, pre-training objectives, and alignment techniques (SFT, RLHF)

  • Limitations: hallucinations, bias, context collapse, long-range degradation

2. Model Development & Adaptation

Hands-on experience with:

  • Pre-training vs fine-tuning trade-offs

  • Parameter-efficient tuning (LoRA, QLoRA, adapters)

  • Quantization and pruning techniques

  • Model evaluation beyond accuracy (task fitness, safety, robustness)

  • Data curation, labeling strategies, and contamination risks

3. Inference, Serving & Optimization

Strong understanding of:

  • Inference pipelines and token generation mechanics

  • KV caching, batching, streaming responses

  • Throughput vs latency trade-offs

  • Memory constraints and GPU utilization strategies

  • Model parallelism (tensor, pipeline) and its failure modes

4. End-to-End AI/ML System Design

Ability to architect complete AI solutions, including:

  • Data ingestion and preprocessing pipelines

  • Training / fine-tuning workflows

  • Model registry, versioning, and lineage

  • Deployment strategies (canary, A/B, shadow traffic)

  • Feedback loops for continuous improvement

5. Retrieval, Memory & Tool-Augmented Systems

In-depth experience with:

  • Retrieval-Augmented Generation (RAG) design

  • Embeddings lifecycle management

  • Vector databases and hybrid retrieval

  • Prompt/tool orchestration and agentic workflows

  • Failure modes of RAG and mitigation strategies

6. MLOps, Observability & Reliability

Strong ownership mindset for production AI:

  • Monitoring model quality drift and regressions

  • Debugging hallucinations and retrieval failures

  • Logging prompts, responses, and model metadata

  • Cost tracking and optimization (token economics)

  • Incident response for AI systems

7. Security, Ethics & Governance

Clear understanding of:

  • Prompt injection and data leakage risks

  • Training data privacy and IP protection

  • Model abuse, misuse, and guardrails

  • Regulatory and compliance considerations

  • Responsible AI principles in production systems