Alibaba releases Qwen3 large model, 235 billion parameters supporting 119 languages, pioneering 'Fast Thinking/Slow Thinking' hybrid reasoning, surpassing Gemini 2.5 Pro in math/code capabilities, deployable with four GPUs

1.1 Intelligent Mode Switching
Introducing dual-engine 'Fast Mode' and 'Deep Mode':
1.2 User-Defined Control
Innovative 'Thinking Budget' regulator allows developers to adjust via API parameters:
2.1 Comprehensive Benchmark Leadership
| Test Category | Qwen3-235B | DeepSeek-R1 | OpenAI-o1 |
|---|---|---|---|
| AIME25 Math Reasoning | 81.5 | 79.2 | 80.8 |
| LiveCodeBench Code | 70.7 | 68.4 | 69.9 |
| ArenaHard Alignment | 95.6 | 93.1 | 94.8 |
2.2 Hardware Cost Revolution
3.1 Mixture of Experts (MoE) System
Adopts 235B parameter MoE architecture with:
3.2 Three-Phase Training System
4.1 Model Portfolio
| Model Name | Parameters | Type | Use Case |
|---|---|---|---|
| Qwen3-235B-A22B | 235B | MoE | Enterprise AI Hub |
| Qwen3-32B | 32B | Dense | Cloud Server Deployment |
| Qwen3-4B | 4B | Dense | Mobile/Vehicle Devices |
4.2 Developer Support
5.1 Enterprise Solutions
5.2 Personal User Applications
6.1 Recommended Hardware Configuration
| Model Size | GPU Requirements | Memory Usage | Inference Speed |
|---|---|---|---|
| 235B | 4x H20 | 64GB | 45 token/s |
| 32B | 2x A100 80G | 48GB | 78 token/s |
| 4B | Snapdragon 8 Gen3/RTX4060 | 6GB | Instant Response |
6.2 Quick Access Channels
Qwen3 achieves 'elephant dance' through hybrid reasoning architecture, maintains 235B parameter scale while reducing commercial deployment costs to one-third of industry standards. Its open-source strategy and multilingual support are accelerating AI democratization globally. With terminal device adaptations progressing, this efficiency revolution led by Alibaba may become a critical turning point in the AGI era.
Official Introduction: https://qwenlm.github.io/blog/qwen3/
GitHub: https://github.com/QwenLM/Qwen3
Discover more sites in the same category
Auto GLM Meditation launched by Zhipu AI is the first desktop agent program that combines GUI operation with meditation ability. It realizes in-depth thinking and real-time execution through the self-developed base models GLM-4-AIR-0414 and GLM-Z1-Rumination. This tool can independently complete the complete workflow of search/analysis/verification/summary in the browser. It supports complex task processing such as the production of niche travel guides and the generation of professional research reports. It has the characteristics of dynamic tool invocation and self-evolving reinforcement learning and is completely free. Currently, it is in the Beta testing stage.
Chat DLM is different from autoregression. It is a language model based on Diffusion (diffusion), with a MoE architecture that takes into account both speed and quality.
**Claude 3.7 Sonnet** is Anthropic’s smartest and most transparent AI model to date. With hybrid reasoning, developer-oriented features, and agent-like capabilities, it marks a major evolution in general-purpose AI. Whether you're writing code, analyzing data, or solving tough problems, Claude 3.7 offers both speed and thoughtful depth.
Claude 4 is a suite of advanced AI models by Anthropic, including Claude Opus 4 and Claude Sonnet 4. These models are a significant leap forward, excelling in coding, complex reasoning, and agent workflows.
DeepSeek, founded in 2023, is dedicated to researching the world's leading underlying models and technologies of general artificial intelligence and challenging the cutting-edge challenges of artificial intelligence. Based on self-developed training frameworks, self-built intelligent computing clusters, and tens of thousands of computing cards and other resources, the DeepSeek team has released and open-sourced multiple large models with hundreds of billions of parameters in just half a year, such as the DeepSeek-LLM general large language model and the DeepSeek-Coder code large model. And in January 2024, it was the first to open source the first domestic MoE large model (DeepSeek-MoE). The generalization effects of each major model outside the public evaluation list and real samples have all performed outstandingly, surpassing models of the same level. Talk to DeepSeek AI and easily access the API.
Claude.ai offers efficient AI writing and conversational services, supporting multiple languages, automatic text generation, and polishing to enhance content creation efficiency. Experience the convenience of an intelligent assistant now.
Share your thoughts about this page. All fields marked with * are required.