通义千问3（Qwen3）

更新时间：2025-05-04 17:20:28

链接直达

认领站点

手机查看

站点反馈

站点信息

站点链接：https://qwen3.app

站点标题：Qwen3: Think Deeper, Act Faster | Hybrid Thinking AI Model

收录时间：2023-05-10 16:45:38

访问次数：166次

站点关键词：Qwen3, AI model, large language model, MoE architecture, hybrid thinking, multilingual AI,Qwen3, 通义千问，通义，阿里ai

Discover the powerful capabilities that make Qwen3 stand out in the world of large language models. Advanced Pre-training Trained on 36 trillion tokens covering 119 languages and dialects, with expanded knowledge from web and PDF-like documents. Four-Stage Training Developed through long chain-of-thought cold start, reasoning-based RL, thinking mode fusion, and general RL to create a versatile AI system. Robust Model Family Eight models ranging from 0.6B to 235B parameters, including two efficient MoE models that reduce both training and inference costs. Extended Context Length Up to 128K token context length for complex document processing and analysis with no blind spots. Benchmark Excellence Superior performance in tasks like Arena-Hard, LiveBench, LiveCodeBench, GPQA-Diamond, and MMLU-Pro. AI-Ready Deployment Pre-configured for easy deployment with frameworks like SGLang, vLLM, and compatible with OpenAI-like endpoints.

探索使通义千问3（Qwen3）在大语言模型领域脱颖而出的强大能力。

### 先进的预训练

在涵盖119种语言和方言的36万亿个标记上进行训练，从网页和类似PDF的文档中扩展了知识。

### 四阶段训练

通过长思维链冷启动、基于推理的强化学习、思维模式融合和通用强化学习来开发，打造一个多功能的人工智能系统。

### 强大的模型家族

拥有从6亿（0.6B）到2350亿（235B）参数的八个模型，包括两个高效的混合专家（MoE）模型，降低了训练和推理成本。

### 扩展的上下文长度

高达12.8万（128K）标记的上下文长度，可对复杂文档进行无盲点的处理和分析。

### 卓越的基准测试表现

在诸如Arena - Hard、LiveBench、LiveCodeBench、GPQA - Diamond和MMLU - Pro等任务中表现出色。

### 易于人工智能部署