Normal Distribution (Gaussian) – Bell Curve

Learn about the normal distribution, also known as the Gaussian distribution, including its properties and real-world ap...
🔔

Definition

The Normal Distribution, also known as the Gaussian Distribution, is a continuous probability distribution characterized by a symmetric, bell-shaped curve. It is a fundamental concept in statistics because it appears in many natural and social phenomena. Its prevalence is largely explained by the Central Limit Theorem, which states that the suitably standardized sum of a large number of independent, identically distributed random variables with finite variance is approximately normally distributed, whatever the shape of the underlying distribution.

Symbol | Description
\[ \mu \] | Population Mean - The center of the distribution.
\[ \sigma \] | Standard Deviation - A measure of the spread or width of the distribution.
\[ \sigma^2 \] | Variance - The square of the standard deviation.
\[ f(x) \] | Probability Density Function (PDF) - The height of the bell curve at a given point x.
\[ X \] | A random variable following the distribution.
\[ Z \] | Z-Score - The standardized value indicating how many standard deviations a point is from the mean.
🔑

Key Formulas

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \]
Probability Density Function (PDF)
\[ Z = \frac{X - \mu}{\sigma} \]
Standardization (Z-Score)
\[ E[X] = \mu \]
Mean (Expected Value)
\[ \text{Var}(X) = \sigma^2 \]
Variance
\[ F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t) \, dt \]
Cumulative Distribution Function (CDF)
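
The PDF and CDF above can be evaluated directly with Python's standard library. The following is a minimal sketch, not a reference implementation: the function names `normal_pdf` and `normal_cdf` are illustrative, and the CDF uses the standard identity that expresses the integral through the error function `erf`.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density f(x) of N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """F(x) = P(X <= x), written via the error function."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(normal_pdf(0.0))  # peak height of the standard normal, about 0.3989
print(normal_cdf(0.0))  # 0.5, by symmetry about the mean
```

Both functions default to the standard normal (μ = 0, σ = 1), so passing other values of `mu` and `sigma` covers the general case.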
📊

Diagram

Figure: Normal Distribution. Symmetric bell curve centred at μ, with inflection points at μ − σ and μ + σ; the foundation of statistical inference.

The normal distribution is visualized as a symmetric, bell-shaped curve. The horizontal axis represents the values of the random variable (x), while the vertical axis represents the probability density. The peak of the curve is at the mean (μ), which is also the median and mode. The spread of the curve is determined by the standard deviation (σ). The curve has inflection points at μ - σ and μ + σ. The total area under the curve is equal to 1.
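
The location of the inflection points can be checked by differentiating the PDF twice. A short sketch of that calculation, using the same symbols as the Key Formulas section:

\[ f'(x) = -\frac{x-\mu}{\sigma^2} \, f(x) \]
First derivative
\[ f''(x) = \left( \frac{(x-\mu)^2}{\sigma^4} - \frac{1}{\sigma^2} \right) f(x) \]
Second derivative
\[ f''(x) = 0 \iff (x-\mu)^2 = \sigma^2 \iff x = \mu \pm \sigma \]
Inflection points

Because f(x) > 0 for every x, the second derivative changes sign only at these two points.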

⚙️

Properties

Symmetry: The curve is perfectly symmetric about its center, the mean (μ). The mean, median, and mode are all equal.

\[ f(\mu + x) = f(\mu - x) \]
Symmetry property

Asymptotic Tails: The curve approaches the horizontal axis asymptotically, meaning it gets closer and closer but never touches it as x approaches positive or negative infinity.

\[ \lim_{x \to \pm\infty} f(x) = 0 \]
Asymptotic behavior

Empirical Rule (68-95-99.7): Approximately 68% of the data falls within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations.

\[ P(\mu - \sigma \leq X \leq \mu + \sigma) \approx 0.68 \]
1 Standard Deviation
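
The three percentages in the Empirical Rule can be reproduced numerically. This sketch relies on the identity P(|Z| ≤ k) = erf(k/√2) for a standard normal Z; the helper name `prob_within` is illustrative.

```python
import math

def prob_within(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for any normal X."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The result does not depend on μ or σ, which is why the rule applies to every normal distribution.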

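Closure Property: A linear transformation aX + b (with a ≠ 0) of a normally distributed random variable is also normally distributed. More generally, any linear combination of independent normally distributed random variables is normally distributed.

\[ \text{If } X \sim N(\mu, \sigma^2) \text{ and } a \neq 0, \text{ then } aX + b \sim N(a\mu + b, a^2\sigma^2) \]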
Linear transformation property
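
The closure property can be illustrated with `statistics.NormalDist` from the Python standard library (Python 3.8+), which supports arithmetic with constants and addition of two independent normal distributions. The parameter values below are arbitrary choices for illustration.

```python
from statistics import NormalDist

X = NormalDist(mu=50, sigma=5)   # illustrative parameters
a, b = 2, 3                      # illustrative constants

Y = X * a + b                    # distribution of aX + b
print(Y.mean, Y.variance)        # a*mu + b and a^2 * sigma^2

W = NormalDist(mu=10, sigma=3)   # a second variable, assumed independent of X
S = X + W                        # distribution of the sum X + W
print(S.mean, S.variance)        # means add, variances add
```

Note that `NormalDist` takes the standard deviation as its second argument, whereas the notation N(μ, σ²) lists the variance.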
📈

Proof and Derivation

A full derivation of the normal distribution's probability density function (PDF) is involved; in particular, showing that the density integrates to 1 requires the Gaussian integral. Its significance, however, is best understood through the Central Limit Theorem (CLT). The CLT states that the standardized sample mean of a large number of independent, identically distributed (i.i.d.) random variables with finite variance approaches a normal distribution, regardless of the shape of the original distribution.

This means that if you repeatedly draw samples of a sufficiently large size from a population with finite variance and calculate the mean of each sample, the distribution of those means will be approximately bell-shaped. This is why the normal distribution is so common in nature and statistics: it is the limiting distribution that emerges when many small, independent random effects are added together.

\[ \text{If } X_1, X_2, \ldots, X_n \text{ are i.i.d. with } E[X_i] = \mu, \ \text{Var}(X_i) = \sigma^2 < \infty \]
Setup for Central Limit Theorem
\[ \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1) \text{ as } n \to \infty \]
Central Limit Theorem Statement
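
The CLT statement can be explored with a small simulation. The sketch below is an illustration under assumed settings, not a proof: it draws samples from an Exponential(1) distribution (strongly skewed, with μ = 1 and σ = 1), standardizes each sample mean as in the formula above, and checks that the results behave roughly like N(0, 1). The sample size, number of trials, and seed are arbitrary.

```python
import random
import statistics

def standardized_means(n, trials, seed=0):
    """Standardized sample means of `trials` samples of size n from Exponential(1)."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(1)
    out = []
    for _ in range(trials):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        out.append((xbar - mu) / (sigma / n ** 0.5))
    return out

z = standardized_means(n=100, trials=2000)
print(statistics.fmean(z), statistics.pstdev(z))  # expected to be near 0 and 1

share = sum(abs(v) <= 1.96 for v in z) / len(z)
print(share)  # expected to be near 0.95
```

Increasing `n` should bring the simulated values closer to their normal counterparts, while a very small `n` leaves the skew of the exponential distribution visible.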
🔢

Worked Example

A variable X is normally distributed with a mean (μ) of 50 and a standard deviation (σ) of 5. Calculate the Z-score for a value of X = 62.
  1. Recall the formula for the Z-score: Z = (X - μ) / σ.
  2. Substitute the given values into the formula: Z = (62 - 50) / 5.
  3. Calculate the difference in the numerator: 62 - 50 = 12.
  4. Perform the division: Z = 12 / 5 = 2.4.
The Z-score for X = 62 is 2.4. This means the value 62 is 2.4 standard deviations above the mean.
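
The same calculation as a small function, as a minimal sketch; the name `z_score` and the input check are illustrative additions.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

print(z_score(62, 50, 5))  # 2.4
```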
🏭

Applications

Quality Control & Manufacturing: The dimensions of manufactured parts, such as bolts or engine components, often follow a normal distribution. Statistical process control uses this to set tolerance limits and monitor production quality, identifying when a process deviates from its expected performance.

Finance & Economics: In finance, asset returns (more precisely, log returns) are often modeled as normally distributed. This assumption underlies many financial models, including the Black-Scholes model for option pricing and modern portfolio theory for risk management, although observed returns tend to have heavier tails than the normal distribution predicts.

Natural and Social Sciences: Many biological measurements, such as human height and, more roughly, weight and blood pressure, are approximately normally distributed within a given population. In psychology and education, test scores such as IQ or SAT scores are often scaled so that they approximately follow a normal distribution.

Statistical Inference: The normal distribution is the foundation for many hypothesis tests (like t-tests and ANOVA) and for constructing confidence intervals. The Central Limit Theorem allows statisticians to make inferences about population parameters even when the population distribution is unknown.

🌍

Real World Examples

The heights of adult males in a country are normally distributed with a mean of 177 cm and a standard deviation of 7 cm. What percentage of adult males are taller than 191 cm?
  1. Standardize the value X = 191 cm using the Z-score formula: Z = (191 - 177) / 7.
  2. Calculate the Z-score: Z = 14 / 7 = 2.0.
  3. Find the probability P(Z > 2.0). Using a standard normal table or calculator, P(Z ≤ 2.0) is approximately 0.9772.
  4. The probability of being taller is 1 - P(Z ≤ 2.0), so P(Z > 2.0) = 1 - 0.9772 = 0.0228.
Approximately 2.28% of adult males in the country are taller than 191 cm.

A coffee machine dispenses coffee into cups. The amount of coffee is normally distributed with a mean of 200 ml and a standard deviation of 5 ml. What is the probability that a cup will contain between 190 ml and 210 ml?
  1. This range (190 ml to 210 ml) is exactly μ ± 2σ, since 190 = 200 - 2*5 and 210 = 200 + 2*5.
  2. Apply the Empirical Rule (68-95-99.7), which states that approximately 95% of the data falls within 2 standard deviations of the mean.
  3. Alternatively, calculate Z-scores for both values. Z₁ = (190 - 200) / 5 = -2. Z₂ = (210 - 200) / 5 = 2.
  4. Find P(-2 ≤ Z ≤ 2). Using a Z-table, this is P(Z ≤ 2) - P(Z ≤ -2) ≈ 0.9772 - 0.0228 = 0.9544.
The probability that a cup will contain between 190 ml and 210 ml is approximately 95.44%.
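
Both examples can be checked without a Z-table using `statistics.NormalDist` (Python 3.8+). This is a sketch of one way to do it; the variable names are illustrative.

```python
from statistics import NormalDist

# Heights: P(X > 191) for X ~ N(177, 7^2)
heights = NormalDist(mu=177, sigma=7)
p_tall = 1 - heights.cdf(191)
print(round(p_tall, 4))  # 0.0228

# Coffee: P(190 <= X <= 210) for X ~ N(200, 5^2)
coffee = NormalDist(mu=200, sigma=5)
p_fill = coffee.cdf(210) - coffee.cdf(190)
print(round(p_fill, 4))  # 0.9545
```

The second result rounds to 0.9545 rather than 0.9544 because the Z-table values 0.9772 and 0.0228 are themselves rounded to four decimal places before being subtracted.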
🏞️

Real World Scenarios

Figure: IQ Scores, population IQ with μ = 100, σ = 15 (axis marks at 70, 85, 100, 115, 130)
IQ Distribution
IQ scores are scaled so that, across large populations, they approximately follow N(100, 15²). About 68% of people score between 85–115 and 95% between 70–130.
Figure: Male Heights (cm), μ ≈ 175 cm, σ ≈ 7 cm (axis marks at 155, 170, 185, 200)
Human Heights
Adult male heights within a given country approximately follow a normal distribution. This lets clothing manufacturers estimate how many of each size to produce.
Figure: Measurement Errors, random errors clustering symmetrically around the true value
Measurement Errors
When a measurement error is the sum of many small, independent effects, the Central Limit Theorem implies that it is approximately normally distributed. This is the usual justification for least-squares fitting in physics, surveying, and GPS.

Student Test Scores
On a standardized test taken by thousands of students, the distribution of scores often resembles a bell curve. Most students will score near the average, with fewer students achieving very high or very low scores, creating a natural grading curve.

Astronomy Measurement Errors
When astronomers measure the distance to a star multiple times, small, random errors from atmospheric interference and equipment limitations cause the measurements to cluster around a central value. This spread of measurements typically follows a normal distribution.

Shoe Manufacturing
A shoe manufacturer aims to produce size 9 shoes. Due to slight variations in materials and machinery, the actual shoe sizes produced will be normally distributed around the target size 9, with most being very close and fewer being slightly larger or smaller.

🗂️

Types and Classifications

The normal distribution family is defined by two parameters: the mean (μ) and the variance (σ²). While there is an infinite number of normal distributions, the most important classification is the distinction between a general normal distribution and the special case of the standard normal distribution.

Type | Mean (μ) | Standard Deviation (σ) | Description
General Normal Distribution | Any real number | Any positive real number | Represents any bell-shaped, symmetric distribution. Denoted as N(μ, σ²).
Standard Normal Distribution | 0 | 1 | A special case used as a reference. Any normal distribution can be converted to this form using Z-scores. Denoted as N(0, 1).
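
The conversion from a general normal distribution to the standard one can be illustrated as follows. This is a sketch with arbitrary example parameters (μ = 50, σ = 5, x = 62): a probability computed under N(μ, σ²) at x should match the probability computed under N(0, 1) at the corresponding Z-score.

```python
from statistics import NormalDist

general = NormalDist(mu=50, sigma=5)   # illustrative general normal
standard = NormalDist()                # N(0, 1), the default

x = 62
z = (x - general.mean) / general.stdev   # Z-score of x

# The two cumulative probabilities agree up to floating-point rounding.
print(general.cdf(x), standard.cdf(z))
```

This equivalence is what allows a single Z-table to serve every normal distribution.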
⚠️

Common Mistakes

⚠️ Confusing Standard Deviation (σ) and Variance (σ²). The variance is the standard deviation squared. Always check which parameter is given in a problem, as the PDF formula uses σ while the notation N(μ, σ²) uses the variance.
💡 Assuming all data is normally distributed. While common, the normal distribution is not universal. Always check the data's distribution (e.g., with a histogram or a normality test) before applying methods that assume normality.
⚠️ Misinterpreting the PDF value f(x). The value of the probability density function is not a probability. For a continuous distribution, the probability of any single exact value is zero. Probability is found by calculating the area under the curve over an interval.
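
Two of these mistakes are easy to demonstrate numerically. The sketch below uses illustrative values: a narrow distribution with σ = 0.1, whose density at the mean exceeds 1, and a distribution written as N(50, 25), where 25 is the variance and `NormalDist` expects the standard deviation.

```python
import math
from statistics import NormalDist

# A density value is not a probability: with a small sigma it can exceed 1.
narrow = NormalDist(mu=0, sigma=0.1)
print(narrow.pdf(0))                          # about 3.99

# A probability is an area under the curve over an interval.
print(narrow.cdf(0.05) - narrow.cdf(-0.05))   # about 0.383

# N(50, 25) lists the variance, so the standard deviation is sqrt(25) = 5.
variance = 25
dist = NormalDist(mu=50, sigma=math.sqrt(variance))
print(dist.stdev)                             # 5.0
```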
🚀

Study Strategy

1 📚 Build Your Foundation
  • Grasp the definition of the probability density function (PDF) and what each variable (μ for mean, σ for standard deviation) represents.
  • Review the properties of the Normal Distribution, such as its symmetry around the mean and the Empirical Rule (68-95-99.7).
  • Understand the concept of the Standard Normal Distribution (Z-distribution) where μ=0 and σ=1.
  • Study the provided diagram to visually connect the formula to the bell-shaped curve and the area underneath it.
2 🧠 Commit Formulas to Memory
  • Write out the full Probability Density Function (PDF) formula for the Normal Distribution from memory multiple times.
  • Memorize the Z-score transformation formula: Z = (X - μ) / σ, which is crucial for standardization.
  • Use flashcards to test your recall of both the PDF and the Z-score formulas daily.
  • Explain each component of the formulas aloud to a study partner or to yourself to solidify understanding.
3 ✍️ Solve Guided Problems
  • Follow the 'Worked Example' section step-by-step, ensuring you understand the calculation of the Z-score for a given value of X.
  • Practice calculating Z-scores for various data points (X) given a specific mean (μ) and standard deviation (σ).
  • Work through problems that require finding the probability (area under the curve) using a Z-table or statistical software.
  • Analyze the 'Common Mistakes' section and attempt to solve similar problems, actively avoiding the listed pitfalls.
4 🌍 Connect to Real-World Scenarios
  • Select a scenario from the 'Real World Examples' or 'Real World Scenarios' section (e.g., heights, test scores) and assign your own values for μ and σ.
  • Calculate the probability of a specific outcome, such as the likelihood of a student scoring above a certain grade.
  • Use the formula to determine the range of values that would contain the middle 95% of a population in a given application.
  • Formulate your own problem based on the 'Applications' section and solve it from start to finish to test your mastery.
By systematically building from concepts to application, you can confidently master the Normal Distribution and its powerful real-world uses.
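
For the practice task of finding the range that contains the middle 95% of a population, one possible approach is to invert the CDF. The sketch below uses `NormalDist.inv_cdf` from the standard library; the helper name `central_interval` and the IQ-style parameters (μ = 100, σ = 15) are illustrative.

```python
from statistics import NormalDist

def central_interval(mu, sigma, coverage=0.95):
    """Symmetric interval around mu that contains `coverage` of the probability."""
    dist = NormalDist(mu, sigma)
    tail = (1 - coverage) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)

low, high = central_interval(100, 15)
print(round(low, 1), round(high, 1))  # 70.6 129.4
```

The exact interval is μ ± 1.96σ; the Empirical Rule's μ ± 2σ (70 to 130 here) is a rounded version of the same result.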
