LLM Core Concepts

Table of Contents

  • Temperature - A hyperparameter that controls the randomness of the model’s output. A low temperature gives deterministic, expected output; a high temperature produces more varied, creative results. For example, asked “what is 2+2”, a model at low temperature will almost always answer 4, while at a very high temperature it may occasionally sample an unexpected token. Typically ranges from 0 to 1, though some APIs allow values up to 2
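Conceptually, temperature divides the model’s logits before the softmax. The sketch below illustrates the effect with made-up token scores in plain Python (not any specific model’s API):

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature before normalizing:
    # a low temperature sharpens the distribution (near-deterministic),
    # a high temperature flattens it (more random sampling).
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next tokens
logits = [4.0, 2.0, 1.0]
print(softmax_with_temperature(logits, 0.1))  # top token dominates
print(softmax_with_temperature(logits, 1.5))  # probabilities spread out
```

At temperature 0.1 nearly all probability mass lands on the highest-scoring token; at 1.5 the lower-scoring tokens get a real chance of being sampled.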
  • Top P (nucleus sampling) - Another hyperparameter that controls the randomness of output. It sets a probability threshold: the model keeps the smallest set of most-likely tokens whose cumulative probability exceeds p, then samples only from that set. For example, if p is 0.9, the model considers only the most likely tokens that together make up 90% of the probability mass. Ranges from 0.0 to 1.0
  • Top K - Restricts sampling to the k most probable tokens at each step; their probabilities are renormalized and the model samples from that fixed-size set. A small k behaves close to greedy decoding, while a large k allows more diversity
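Both filters can be sketched in a few lines over a toy next-token distribution (an illustrative sketch, not a real decoder):

```python
def top_k_filter(probs, k):
    # Keep only the k most probable tokens, as (index, prob) pairs.
    ranked = sorted(enumerate(probs), key=lambda t: t[1], reverse=True)
    return ranked[:k]

def top_p_filter(probs, p):
    # Keep the smallest set of most-likely tokens whose cumulative
    # probability reaches the threshold p (nucleus sampling).
    ranked = sorted(enumerate(probs), key=lambda t: t[1], reverse=True)
    kept, cumulative = [], 0.0
    for idx, prob in ranked:
        kept.append((idx, prob))
        cumulative += prob
        if cumulative >= p:
            break
    return kept

# Toy distribution over five candidate next tokens
probs = [0.5, 0.25, 0.15, 0.07, 0.03]
print(top_k_filter(probs, 2))    # the two most likely tokens
print(top_p_filter(probs, 0.85)) # tokens covering at least 85% of the mass
```

Note the difference: top-k always keeps exactly k tokens, while top-p keeps a variable number depending on how concentrated the distribution is.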
  • Tokens - The units of text a model actually reads and writes. Input tokens refer to the text sent to the model; output tokens refer to the text returned by the model
  • Parameters (What does 1B parameter model mean) - Parameters are the learned weights of the network, adjusted during training. A “1B parameter model” has roughly one billion such weights; more parameters generally mean more capacity, but also higher memory and compute cost
  • Tokenization - The process of breaking text down into smaller parts known as tokens. The primary reason to do this is to turn a long text into smaller, machine-readable pieces the model can process.
  • Tokenization types -
    • Word tokenization - breaks text into words
    • Character tokenization - breaks text into individual characters
    • Subword tokenization - breaks text into subword units; for example, “chatbots” becomes “chat” and “bots” after tokenization
    • Sentence tokenization - breaks text into sentences
    • White space tokenization - breaks text based on white spaces such as spaces, tabs, and newline characters.
    • N-gram Tokenization - This method generates n-grams, contiguous sequences of n items (words or characters) from the text.
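Several of the schemes above can be illustrated with plain Python string operations (a rough sketch; real subword tokenizers such as BPE use trained vocabularies rather than simple splitting):

```python
import re

text = "chatbots are useful"

# Word / whitespace tokenization: split on spaces, tabs, newlines
words = text.split()  # ['chatbots', 'are', 'useful']

# Character tokenization: each character becomes a token
chars = list("chat")  # ['c', 'h', 'a', 't']

# Sentence tokenization: naive split on sentence-ending punctuation
sentences = re.split(r"(?<=[.!?])\s+", "Hi there. How are you?")
# ['Hi there.', 'How are you?']

# N-gram tokenization: contiguous sequences of n items
def ngrams(tokens, n):
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)]

bigrams = ngrams(words, 2)  # [['chatbots', 'are'], ['are', 'useful']]
```

Subword tokenization is the one scheme that cannot be reproduced with a one-liner, since the splits (e.g. “chatbots” into “chat” + “bots”) come from a vocabulary learned from a corpus.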
  • Hallucination - When a model generates a false, nonsensical, or completely made-up answer that sounds very convincing but is factually incorrect.
  • Difference between parameter and hyperparameter - Parameters (the model’s weights) are learned automatically during training; hyperparameters (such as temperature, top-p, or learning rate) are set by a person before training or at inference time and are not learned by the model
  • Prompting techniques
    • Zero-shot - the prompt states the task directly, with no examples of the expected output
    • Few-shot - the prompt includes a few worked input/output examples before the actual query
    • Chain-of-thought - the prompt asks the model to reason step by step before giving the final answer
    • Meta - using the model itself to generate, critique, or refine prompts
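The difference between these techniques lies entirely in how the prompt is written. The strings below are illustrative examples (the tasks and wording are hypothetical, not drawn from any benchmark):

```python
# Zero-shot: state the task with no examples
zero_shot = ("Classify the sentiment of this review as positive or "
             "negative: 'Great battery life.'")

# Few-shot: show a few worked examples before the real query
few_shot = """Classify the sentiment of each review.
Review: 'Terrible support.' -> negative
Review: 'Loved the design.' -> positive
Review: 'Great battery life.' ->"""

# Chain-of-thought: ask the model to reason before answering
chain_of_thought = ("A shirt costs $20 and is 25% off. What is the final "
                    "price? Let's think step by step.")

for name, prompt in [("zero-shot", zero_shot),
                     ("few-shot", few_shot),
                     ("chain-of-thought", chain_of_thought)]:
    print(f"--- {name} ---\n{prompt}\n")
```

Few-shot prompting tends to help when the output format matters, and chain-of-thought tends to help on multi-step reasoning tasks, at the cost of longer (and therefore more expensive) outputs.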