MrPrompts

Definition

What are Tokens in AI?

Tokens are the basic units of text that AI models process. A token is roughly 3-4 characters, or about three-quarters of an English word. AI models do not read text as humans do; they break it into tokens, process each one mathematically, and generate new tokens as output. Token counts determine how much text fits in a context window and how much API usage costs.

How tokens work

Before an AI model can process your text, it splits the input into tokens using a tokenizer. Common English words are usually a single token ("the", "and", "work"). Longer or less common words get split into multiple tokens ("unpredictable" becomes several tokens). Punctuation, spaces, and special characters are also tokenized.
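The splitting behavior above can be sketched with a toy greedy longest-match tokenizer. This is only an illustration: real models use learned subword vocabularies (BPE or similar) with tens of thousands of entries, and the tiny vocabulary below is invented for the example.

```python
# Hypothetical mini-vocabulary for illustration only; real tokenizers learn
# their vocabularies from large text corpora.
VOCAB = {"the", "and", "work", "un", "predict", "able", " ", ".", ","}

def tokenize(text: str) -> list[str]:
    """Greedy longest-match subword tokenization over VOCAB."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that starts at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Unknown character: fall back to a single-character token.
            tokens.append(text[i])
            i += 1
    return tokens

print(tokenize("the and"))        # common words stay whole
print(tokenize("unpredictable"))  # rare word splits into subwords
```

Note how "unpredictable" splits into "un" + "predict" + "able": the model never saw the whole word as one unit, so it represents it as pieces it does know.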

A practical rule of thumb: 1 token is approximately 0.75 English words, or 1 word is approximately 1.3 tokens. A 1,000-word document is roughly 1,300 tokens. A full page of text is around 500-800 tokens depending on the writing style. These are approximations; exact counts depend on the specific tokenizer each model uses.
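The rule of thumb translates directly into a pair of estimator functions. These use only the approximate ratios stated above; for exact counts you would run the model's actual tokenizer.

```python
def estimate_tokens(word_count: int) -> int:
    """Rough token estimate: ~1.3 tokens per English word (approximation)."""
    return round(word_count * 1.3)

def estimate_words(token_count: int) -> int:
    """Inverse estimate: ~0.75 English words per token (approximation)."""
    return round(token_count * 0.75)

print(estimate_tokens(1000))  # a 1,000-word document
print(estimate_words(1300))   # back-of-envelope inverse
```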

Token counts matter in two practical ways: context window limits and pricing. If a model has a 128,000-token context window, that is the total space for your system prompt, your messages, any documents you include, and the model's responses combined. API pricing is also per-token, with input tokens (what you send) typically costing less than output tokens (what the model generates).
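A simple budget check makes the context-window arithmetic concrete. The sketch below assumes you already have token counts for each component (the 128,000 default mirrors the example above; the function and its parameter names are illustrative, not any particular API):

```python
def fits_in_context(system_tokens: int, message_tokens: int,
                    document_tokens: int, max_output_tokens: int,
                    context_window: int = 128_000) -> bool:
    """Return True if prompt, documents, and reserved output space
    all fit inside the model's context window."""
    total = system_tokens + message_tokens + document_tokens + max_output_tokens
    return total <= context_window

# A 120k-token document plus prompts and a 4k output reservation: tight fit.
print(fits_in_context(2_000, 1_000, 120_000, 4_000))
# Five thousand more document tokens pushes it over the limit.
print(fits_in_context(2_000, 1_000, 125_000, 4_000))
```

Reserving space for the model's output up front matters: a prompt that fills the entire window leaves no room for a response.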

Why it matters

Understanding tokens helps you write more efficient prompts and manage AI costs. A system prompt that is 2,000 tokens gets sent with every single message in a conversation. Over hundreds of conversations, that adds up to significant token usage. Writing concise, effective prompts is not just good communication; it is cost management.
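The system-prompt overhead described above is easy to quantify. The per-million-token rate below is a hypothetical figure for illustration; real rates vary by model and vendor.

```python
def system_prompt_overhead(system_tokens: int,
                           messages_per_conversation: int,
                           conversations: int,
                           input_rate_per_million: float) -> float:
    """Dollar cost of resending the system prompt with every message.
    input_rate_per_million is a hypothetical input-token rate in USD."""
    total_tokens = system_tokens * messages_per_conversation * conversations
    return total_tokens * input_rate_per_million / 1_000_000

# A 2,000-token system prompt, 10 messages per conversation,
# 500 conversations, at an assumed $3 per million input tokens:
print(system_prompt_overhead(2_000, 10, 500, 3.0))
```

Halving the system prompt halves this line item, which is why trimming boilerplate from prompts pays off at scale.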

Tokens also affect how much information you can give the AI in a single interaction. If you need to analyze a 50-page report, you need to know whether it fits in the context window or whether you must break it into chunks. If you are building a knowledge base, token awareness helps you decide between pasting documents directly into prompts versus using retrieval-augmented generation (RAG) to pull in only the relevant sections.

For teams evaluating AI tools, token pricing is a key cost driver. Different models charge different rates per token, and the difference between input and output pricing can be significant. Understanding tokens helps you compare vendors, estimate costs, and design workflows that deliver value without runaway spending.
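When comparing vendors, the input/output rate split drives the math. The rates below are hypothetical placeholders; plug in each vendor's published per-million-token prices.

```python
def monthly_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Monthly cost in USD, given hypothetical per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example workload: 10M input tokens, 2M output tokens per month.
# Vendor A (assumed): $3/M input, $15/M output.
print(monthly_cost(10_000_000, 2_000_000, 3.0, 15.0))
# Vendor B (assumed): $5/M input, $10/M output.
print(monthly_cost(10_000_000, 2_000_000, 5.0, 10.0))
```

Note that a workload heavy on input (long documents, short answers) favors a different vendor than one heavy on output (short prompts, long generations), even at similar headline prices.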

