AI API rate limits, by provider

Rate limits are the invisible ceiling every AI app eventually hits. Providers cap you on requests per minute (RPM), tokens per minute (TPM), and sometimes requests or tokens per day, and each provider counts, resets, and surfaces those limits differently. A request that sails through at 2 a.m. can return a 429 at peak traffic because you crossed a TPM line you didn't know existed.

These references lay out the actual numbers per provider and tier, explain which response headers tell you how much headroom is left (x-ratelimit-remaining-* and friends), and show how the limits scale as you move up usage tiers. Where a provider gates higher limits behind spend or time-on-platform, we spell out the checklist. Treat every number as "as of" its stamped date: providers raise and reshuffle limits frequently, so confirm against your own dashboard before you size a workload.
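As a rough illustration of reading that headroom, the sketch below collects any x-ratelimit-remaining-* headers from a response into a small dictionary. The function name `remaining_headroom` and the sample header values are hypothetical, and the exact header names and suffixes differ between providers, so treat this as a starting point rather than any provider's official client logic.

```python
from typing import Dict, Mapping

# Assumed prefix, taken from the header family mentioned above.
# Individual providers may use different names; check their docs.
PREFIX = "x-ratelimit-remaining-"


def remaining_headroom(headers: Mapping[str, str]) -> Dict[str, int]:
    """Map each x-ratelimit-remaining-<suffix> header to an integer count."""
    headroom: Dict[str, int] = {}
    for name, value in headers.items():
        key = name.lower()  # HTTP header names are case-insensitive
        if not key.startswith(PREFIX):
            continue
        try:
            headroom[key[len(PREFIX):]] = int(value)
        except ValueError:
            # Skip values that are not plain integers.
            continue
    return headroom


# Hypothetical response headers, for illustration only.
sample = {
    "Content-Type": "application/json",
    "x-ratelimit-remaining-requests": "59",
    "X-RateLimit-Remaining-Tokens": "149000",
}
print(remaining_headroom(sample))
```

In practice you would pass in the headers mapping from your HTTP client's response object and slow down or queue work when any of the returned counts gets close to zero.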

AI API Rate Limits Compared: OpenAI vs Anthropic vs Gemini vs DeepSeek

Compare AI API rate limits across OpenAI, Anthropic, Gemini, and DeepSeek: how each counts tokens, per-day caps, raising limits, and the errors they return.
Updated June 18, 2026

DeepSeek API Rate Limits: How Throttling Actually Works

DeepSeek API rate limits reference: why there are no fixed RPM/TPM tiers, how dynamic throttling and slow responses work under load, and how to engineer for it.
Updated June 18, 2026

Google Gemini Rate Limits Explained: RPM, TPM & RPD

Gemini API rate limits reference: how RPM, TPM, and the per-day (RPD) cap work, free vs paid tiers, how to raise limits, and how to engineer around them.
Updated June 18, 2026

Anthropic Claude Rate Limits Explained: RPM, ITPM & OTPM

Anthropic Claude rate limits reference: requests, input tokens, and output tokens per minute, usage tiers, the rate-limit headers, and handling 429 vs 529.
Updated June 2, 2026

OpenAI Rate Limits Explained: RPM, TPM & Tiers

OpenAI rate limits reference: how RPM and TPM work, the usage-tier table, the response headers that show your headroom, and how to engineer around the limits.
Updated June 1, 2026