Our Large Language Model as a Service (LLMaaS) offering gives you access to cutting-edge language models. Inference runs in France on SecNumCloud-qualified infrastructure that is HDS-certified for healthcare data hosting, which keeps processing sovereign. Benefit from high performance and optimal security for your AI applications. Your data remains strictly confidential and is neither exploited nor stored after processing.
Chat & Reasoning
Our large models offer state-of-the-art performance for the most demanding tasks. They are particularly well-suited to applications requiring a deep understanding of language, complex reasoning or the processing of long documents.
qwen3.6:27b
gpt-oss:120b
llama3.3:70b
nemotron-3-super:120b
qwen3-2507:235b
mistral-small4:119b
qwen3-2507-think:4b
Programming & Agents
Our programming and agent models are specially optimised for agentic software engineering, large-scale code generation and development workflow automation.
qwen3.6:35b
qwen-coder-next:80b
qwen3-next:80b
devstral-small-2:24b
functiongemma:270m
Vision & Multimodal
Our Vision & Multimodal models can analyse images, videos and visual documents. They excel in OCR, object detection, structured extraction and spatio-temporal reasoning.
qwen3-vl:235b
qwen3-vl:30b
qwen3-vl:4b
gemma4:31b
gemma4:12b-it-qat
Embedding
Our embedding models transform text into vector representations for semantic search, clustering and RAG (Retrieval-Augmented Generation) pipelines.
bge-m3:567m
qwen3-embedding:4b
qwen3-embedding:8b
qwen3-embedding:0.6b
granite-embedding:278m
embeddinggemma:300m
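
As an illustration of how embedding vectors support semantic search, here is a minimal sketch that ranks documents by cosine similarity to a query. The three-dimensional vectors and the names `DOCS`, `QUERY` and `semantic_search` are made up for the example; real vectors would come from one of the embedding models listed above and would have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, treat it as unrelated
    return dot / (norm_a * norm_b)

def semantic_search(query_vec, doc_vecs, top_k=2):
    """Return the ids of the top_k documents closest to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:top_k]]

# Toy vectors standing in for embedding model output (illustrative only).
DOCS = {
    "invoice": [0.9, 0.1, 0.0],
    "recipe": [0.0, 0.2, 0.9],
    "contract": [0.8, 0.3, 0.1],
}
QUERY = [1.0, 0.2, 0.0]

print(semantic_search(QUERY, DOCS))
```

In a RAG pipeline, the ids returned here would typically be passed to a reranking model before being placed in the prompt.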
Reranking
Our reranking models reorder search results by relevance to improve the quality of RAG pipelines. They are compatible with the Cohere API.
nvidia/llama-nemotron-rerank-vl-1b-v2
qwen3-reranker:4b
qwen3-reranker:0.6b
bge-reranker-large
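
Since the reranking models are described as compatible with the Cohere API, a request and response might look like the sketch below. It assumes the Cohere rerank shape (`model`, `query`, `documents`, `top_n` in the request; a `results` list of `index` and `relevance_score` in the response). The helper names and the sample response are hypothetical, and no network call is made.

```python
import json

def build_rerank_request(model, query, documents, top_n=None):
    """Serialise a Cohere-style rerank request body (assumed shape)."""
    body = {"model": model, "query": query, "documents": list(documents)}
    if top_n is not None:
        body["top_n"] = top_n
    return json.dumps(body)

def apply_rerank(documents, response):
    """Reorder documents using the indices and scores in a rerank response."""
    ranked = sorted(response["results"],
                    key=lambda r: r["relevance_score"], reverse=True)
    return [documents[r["index"]] for r in ranked]

documents = ["Opening hours", "Refund policy", "Shipping delays"]
payload = build_rerank_request("qwen3-reranker:4b",
                               "How do I get my money back?",
                               documents, top_n=2)

# Hand-written sample response, for illustration only.
sample_response = {"results": [
    {"index": 2, "relevance_score": 0.31},
    {"index": 1, "relevance_score": 0.97},
]}
print(apply_rerank(documents, sample_response))
```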
Security
Our security models specialise in detecting problematic content, preventing jailbreaks and ensuring regulatory compliance (GDPR, HDS). They can be used as pre-filters or post-filters in your workflows.
granite3-guardian:8b
granite3-guardian:2b
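
The pre-filter and post-filter pattern can be sketched as a thin wrapper around a generation call. The functions `is_unsafe` and `generate` below are stand-ins: in practice the first would call a guardian model and the second a chat model, and their real interfaces are not described in this document.

```python
def guarded_completion(prompt, generate, is_unsafe,
                       refusal="Request blocked by the safety filter."):
    """Run a safety check before and after generation."""
    if is_unsafe(prompt):      # pre-filter: screen the user input
        return refusal
    answer = generate(prompt)
    if is_unsafe(answer):      # post-filter: screen the model output
        return refusal
    return answer

# Stub implementations so the sketch runs without any model (hypothetical).
def stub_is_unsafe(text):
    return "forbidden" in text.lower()

def stub_generate(prompt):
    return f"Echo: {prompt}"

print(guarded_completion("Summarise this contract", stub_generate, stub_is_unsafe))
print(guarded_completion("Tell me something FORBIDDEN", stub_generate, stub_is_unsafe))
```

Keeping the two checks separate lets a workflow enable only the pre-filter when latency matters more than output screening.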
Translation
Our translation models offer high fidelity in 55 languages, respecting the grammar, cultural nuances and technical specificities of the documents.
translategemma:27b
Audio & Image
Our Audio & Image models enable real-time voice transcription (streaming ASR) and image generation from text descriptions, compatible with the OpenAI API.
voxtral
z-image:16b
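
Because image generation is described as OpenAI-compatible, a request could plausibly follow the OpenAI `images/generations` shape. The sketch below only builds the request object and sends nothing. The base URL is a placeholder, and the `size` value and `n` parameter are assumptions that the service may not accept as written.

```python
import json
import urllib.request

# Placeholder endpoint: replace with the base URL given in your account.
BASE_URL = "https://llmaas.example.invalid/v1"

def build_image_request(prompt, api_key, model="z-image:16b", size="1024x1024"):
    """Build (but do not send) an OpenAI-style image generation request."""
    body = {"model": model, "prompt": prompt, "n": 1, "size": size}
    return urllib.request.Request(
        url=f"{BASE_URL}/images/generations",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_image_request("a lighthouse at dusk", api_key="YOUR_API_KEY")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it once a real base URL and key are in place.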
Model comparison
This comparison table will help you choose the model best suited to your needs, based on various criteria such as context size, performance and specific use cases.
| Model | Publisher | Parameters | Context (tokens) | Vision | Agent | Reasoning | Security | Fast * | Energy efficiency * |
|---|---|---|---|---|---|---|---|---|---|
| Chat & Reasoning | |||||||||
| qwen3.6:27b | Qwen Team | 27B | 1 000 000 | ||||||
| gpt-oss:120b | OpenAI | 120B | 120 000 | ||||||
| llama3.3:70b | Meta | 70B | 132 000 | ||||||
| nemotron-3-super:120b | NVIDIA | 120B | 1 000 000 | ||||||
| qwen3-2507:235b | Qwen Team | 235B | 200 000 | ||||||
| mistral-small4:119b | Mistral AI | 119B | 262 144 | ||||||
| qwen3-2507-think:4b | Qwen Team | 4B | 250 000 | ||||||
| Programming & Agents | |||||||||
| qwen3.6:35b | Qwen Team | 35B | 1 000 000 | ||||||
| qwen-coder-next:80b | Qwen Team | 80B | 250 000 | ||||||
| qwen3-next:80b | Qwen Team | 80B | 250 000 | ||||||
| devstral-small-2:24b | Mistral AI & All Hands AI | 24B | 200 000 | ||||||
| functiongemma:270m | Google | 270M | 32 768 | ||||||
| Vision & Multimodal | |||||||||
| qwen3-vl:235b | Qwen Team | 235B | 200 000 | ||||||
| qwen3-vl:30b | Qwen Team | 30B | 250 000 | ||||||
| qwen3-vl:4b | Qwen Team | 4B | 250 000 | ||||||
| gemma4:31b | Google | 31B | 250 000 | ||||||
| gemma4:12b-it-qat | Google | 12B | 250 000 | ||||||
| Embedding | |||||||||
| bge-m3:567m | BAAI | 567M | 8 192 | ||||||
| qwen3-embedding:4b | Qwen Team | 4B | 40 000 | ||||||
| qwen3-embedding:8b | Qwen Team | 8B | 40 000 | ||||||
| qwen3-embedding:0.6b | Qwen Team | 0.6B | 32 768 | ||||||
| granite-embedding:278m | IBM | 278M | 512 | ||||||
| embeddinggemma:300m | Google | 300M | 2 048 | ||||||
| Reranking | |||||||||
| nvidia/llama-nemotron-rerank-vl-1b-v2 | NVIDIA | 1B | 4 096 | N.C. | |||||
| qwen3-reranker:4b | Qwen Team | 4B | 4 096 | N.C. | |||||
| qwen3-reranker:0.6b | Qwen Team | 0.6B | 4 096 | N.C. | |||||
| bge-reranker-large | BAAI | 335M | 512 | N.C. | |||||
| Security | |||||||||
| granite3-guardian:8b | IBM | 8B | 8 192 | ||||||
| granite3-guardian:2b | IBM | 2B | 8 192 | ||||||
| Translation | |||||||||
| translategemma:27b | Google | 27B | 120 000 | ||||||
| Audio & Image | |||||||||
| voxtral | Mistral AI | 4B | 32 768 | N.C. | |||||
| z-image:16b | Community | 16B | N.C. | N.C. | |||||
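
One way to use the context column when choosing a model is to filter out models whose window is too small for the prompt plus the expected output. The sketch below copies the context sizes of the Chat & Reasoning models from the table. The function name and the default output reserve of 4 096 tokens are assumptions made for the example.

```python
# Context windows (tokens) taken from the Chat & Reasoning rows of the table.
CONTEXT_TOKENS = {
    "qwen3.6:27b": 1_000_000,
    "gpt-oss:120b": 120_000,
    "llama3.3:70b": 132_000,
    "nemotron-3-super:120b": 1_000_000,
    "qwen3-2507:235b": 200_000,
    "mistral-small4:119b": 262_144,
    "qwen3-2507-think:4b": 250_000,
}

def models_that_fit(prompt_tokens, reserved_output_tokens=4_096):
    """List models whose context holds the prompt plus an output reserve."""
    needed = prompt_tokens + reserved_output_tokens
    return sorted(name for name, limit in CONTEXT_TOKENS.items()
                  if limit >= needed)

print(models_that_fit(255_000))
```

Context size is only a first filter. The other columns of the table still matter when picking among the models that remain.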
Recommended use cases
Here are some common use cases and the most suitable models for each. These recommendations are based on the specific performance and capabilities of each model.
Multilingual dialogue
- nemotron-3-super:120b
- qwen3.6:27b
- gpt-oss:120b
Analysis of long documents
- nemotron-3-super:120b
- qwen3.6:27b
- qwen3-2507:235b
Programming and development
- qwen3.6:35b
- qwen-coder-next:80b
- devstral-small-2:24b
- nemotron-3-super:120b
Visual analysis
- qwen3-vl:235b
- gemma4:31b
- qwen3-vl:30b
Safety and compliance
- granite4.1-guardian:8b
- granite3-guardian:8b
- granite3-guardian:2b
- mistral-small4:119b
Light deployments
RAG (Retrieval-Augmented Generation)
- bge-m3:567m
- nvidia/llama-nemotron-rerank-vl-1b-v2
- qwen3.6:27b