```mermaid
graph TD
A[Probability Distributions] --> B[Discrete Distributions]
A --> C[Continuous Distributions]
B --> B1[Binomial<br/>Distribution]
B --> B2[Cumulative Binomial<br/>Distribution]
B --> B3[Hypergeometric<br/>Distribution]
B --> B4[Poisson<br/>Distribution]
C --> C1[Exponential<br/>Distribution]
C --> C2[Uniform<br/>Distribution]
C --> C3[Normal<br/>Distribution]
C3 --> C3a[Standard Normal<br/>Distribution]
C3 --> C3b[Probability<br/>Calculation]
C3 --> C3c[Determining the<br/>Value of X]
C3 --> C3d[Approximation to<br/>Binomial Distribution]
style A fill:#000,stroke:#000,color:#fff
style B fill:#000,stroke:#000,color:#fff
style C fill:#000,stroke:#000,color:#fff
```
Probability Distributions
Modeling Business Uncertainty with Discrete and Continuous Distributions
Chapter 5: Probability Distributions
By the end of this chapter, you will be able to:
- Calculate expected values and variances for discrete probability distributions
- Apply the binomial distribution to yes/no business scenarios
- Use the hypergeometric distribution for sampling without replacement
- Model rare events with the Poisson distribution
- Analyze time-between-events using the exponential distribution
- Work with uniform distributions for equally likely outcomes
- Master the normal distribution for continuous business data
- Convert problems to standard normal (Z-scores) for probability calculations
- Approximate binomial with normal distribution for large samples
5.1 Introduction: From Probability Principles to Distributions
In Chapter 4, we learned to calculate the probability of individual events. Now we extend those principles to probability distributions - comprehensive models that describe the likelihood of all possible outcomes for a random variable.
Real-World Business Applications:
A Bradley University study of Peoria, Illinois emergency services revealed:
- 911 call response times: Uniformly distributed between 1.2 and 4.6 minutes
- Call arrival rate: Poisson distribution with average of 9 calls per hour
- Home values: Normally distributed with mean $45,750, SD $15,110
- Budget impact: 42,089 homes potentially subject to new property tax
The mayor wanted to reduce average response time to 2 minutes at a cost of $575,000 per 30-second improvement. This chapter provides the statistical toolkit to analyze such decisions.
Random Variable: A variable whose value is determined by a random experiment.
Discrete Random Variable: Can assume only specific values (usually integers) - results from counting
- Examples: Number of defects, customer arrivals, sales calls, coin flips
Continuous Random Variable: Can assume any value within a range - results from measuring
- Examples: Weight, time, temperature, income, response time
Probability Distribution: A listing of all possible outcomes and their associated probabilities.
5.1.1 Visualizing Probability Distributions
Discrete Distribution Example - Rolling a die:
| Outcome | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| P(X) | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |
{.striped .hover}
Properties of ALL probability distributions:
1. 0 \leq P(X = x_i) \leq 1 (each probability between 0 and 1)
2. \sum P(X = x_i) = 1 (all probabilities sum to 1)
5.2 Mean and Variance of Discrete Distributions
Just as we calculated mean and variance for data sets in Chapter 3, we can compute them for probability distributions.
\mu = E(X) = \sum [x_i \cdot P(x_i)]
Interpretation: The long-run average value if we repeat the experiment many times.
Variance of Discrete Distribution
\sigma^2 = \sum [(x_i - \mu)^2 \cdot P(x_i)]
Standard Deviation
\sigma = \sqrt{\sigma^2}
5.2.1 Example 5.1: Ponder Real Estate Monthly Sales
Solution - Step 1: Convert to Probability Distribution
| Houses (x) | Months | P(x) | x · P(x) | (x - μ)² · P(x) |
|---|---|---|---|---|
| 5 | 3 | 3/24 = 0.125 | 0.625 | (5-10.912)² (0.125) = 4.369 |
| 8 | 7 | 7/24 = 0.292 | 2.336 | (8-10.912)² (0.292) = 2.476 |
| 10 | 4 | 4/24 = 0.167 | 1.670 | (10-10.912)² (0.167) = 0.139 |
| 12 | 5 | 5/24 = 0.208 | 2.496 | (12-10.912)² (0.208) = 0.246 |
| 17 | 3 | 3/24 = 0.125 | 2.125 | (17-10.912)² (0.125) = 4.633 |
| 20 | 2 | 2/24 = 0.083 | 1.660 | (20-10.912)² (0.083) = 6.855 |
| Total | 24 | 1.000 | 10.912 | 18.718 |
{.striped .hover}
Step 2: Calculate Statistics
\mu = E(X) = 10.912 \text{ houses/month}
\sigma^2 = 18.718 \text{ houses}^2
\sigma = \sqrt{18.718} = 4.326 \text{ houses}
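The table arithmetic above can be reproduced with a short Python sketch using only the standard library. Note that it works with exact fractions m/24 rather than the rounded three-decimal probabilities in the table, so it gives μ = 10.917 and σ² = 18.743, a hair off the table's rounded totals of 10.912 and 18.718:

```python
from math import sqrt

# Example 5.1 data: houses sold per month -> number of months observed
data = {5: 3, 8: 7, 10: 4, 12: 5, 17: 3, 20: 2}
total = sum(data.values())  # 24 months

# Convert frequencies to probabilities: P(x) = months / total
probs = {x: m / total for x, m in data.items()}

mu = sum(x * p for x, p in probs.items())               # E(X) = sum of x * P(x)
var = sum((x - mu) ** 2 * p for x, p in probs.items())  # sigma^2
sd = sqrt(var)

print(round(mu, 3), round(var, 3), round(sd, 3))
```

The same three lines generalize to any discrete distribution given as value-probability pairs.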
Step 3: Compare to Previous Performance
| Metric | Previous | New | Change |
|---|---|---|---|
| Mean (μ) | 7.3 | 10.912 | +3.612 ✓ |
| SD (σ) | 5.7 | 4.326 | -1.374 ✓ |
{.striped .hover}
Good news for Mr. Ponder!
✅ Increased sales: Average jumped from 7.3 to 10.9 houses/month (+49.5%)
✅ Reduced variability: Standard deviation dropped from 5.7 to 4.3 (-24.1%)
Translation: More consistent, higher performance. He should stay in real estate and skip the rodeo career!
Strategic implication: Whatever changed in the past 24 months (marketing, market conditions, sales process) is working. Document and replicate the success factors.
5.3 The Binomial Distribution - Modeling Yes/No Outcomes
Many business situations involve binary outcomes repeated multiple times:
- Will this customer buy? (Yes/No) × 50 sales calls
- Is this product defective? (Yes/No) × 100 units inspected
- Did the student pass? (Yes/No) × 200 test takers
The binomial distribution is perfect for these scenarios.
Four Properties (Bernoulli Process):
- Fixed number of trials (n)
- Only two outcomes per trial (success/failure)
- Constant probability (\pi) for each trial
- Independent trials (one doesn’t affect others)
Binomial Probability Formula
P(X = x) = {}_nC_x \cdot \pi^x \cdot (1-\pi)^{n-x}
Where:
- n = number of trials
- x = number of successes desired
- \pi = probability of success on single trial
- {}_nC_x = \frac{n!}{x!(n-x)!} = combinations
Mean and Variance (Shortcuts)
\mu = n\pi
\sigma^2 = n\pi(1-\pi)
\sigma = \sqrt{n\pi(1-\pi)}
5.3.1 Example 5.2: Journal of Higher Education - Summer Jobs
Solution:
Given: n = 7 trials, \pi = 0.40 probability of success
a) Exactly 5 have jobs: P(X = 5)
P(X=5) = {}_7C_5 \cdot (0.40)^5 \cdot (0.60)^2
= \frac{7!}{5!2!} \cdot (0.01024) \cdot (0.36)
= 21 \cdot 0.01024 \cdot 0.36 = 0.0774
From Binomial Table (Appendix III, Table B):
Look up n=7, \pi=0.40, x=5 → 0.0774
b) None have jobs: P(X = 0)
P(X=0) = {}_7C_0 \cdot (0.40)^0 \cdot (0.60)^7 = 1 \cdot 1 \cdot 0.0280 = 0.0280
From table: n=7, \pi=0.40, x=0 → 0.0280
c) All 7 have jobs: P(X = 7)
P(X=7) = {}_7C_7 \cdot (0.40)^7 \cdot (0.60)^0 = 1 \cdot 0.0016 \cdot 1 = 0.0016
From table: n=7, \pi=0.40, x=7 → 0.0016
- 7.74% chance exactly 5 of 7 work (moderately likely)
- 2.80% chance none work (rare - less than 3%)
- 0.16% chance all 7 work (very rare - less than 2 in 1000)
Expected value: E(X) = n\pi = 7(0.40) = 2.8, so the most likely outcome is about 3 students working
The extremes (0 or 7) are both unlikely. We’d typically see 2-4 students with summer jobs in a sample of 7.
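The binomial formula and the table lookups above are easy to verify in code. A minimal sketch using Python's standard library (scipy.stats.binom offers the same calculation ready-made):

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example 5.2: n = 7 students, pi = 0.40
print(round(binom_pmf(5, 7, 0.40), 4))  # exactly 5 -> 0.0774
print(round(binom_pmf(0, 7, 0.40), 4))  # none      -> 0.0280
print(round(binom_pmf(7, 7, 0.40), 4))  # all 7     -> 0.0016
```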
5.3.2 Handling π > 0.50: The Complement Trick
Problem: Binomial tables only go up to \pi = 0.50. What if \pi = 0.70?
Solution: Use the complement!
5.3.3 Example 5.3: Internet Connectivity in Flatbush
Solution - The Complement Trick:
Key insight: If 70% are connected (success), then 30% are not connected (failure).
Reframe: 6 successes at \pi = 0.70 equals 4 failures at \pi = 0.30
Visual proof:
| Successes (π = 0.70) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Failures (π = 0.30) | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
Therefore:
P(X = 6 | n=10, \pi=0.70) = P(X = 4 | n=10, \pi=0.30)
From Binomial Table: n=10, \pi=0.30, x=4 → 0.2001
Rule: When \pi > 0.50, find P(X = x) by looking up P(X = n-x) with \pi' = 1 - \pi
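A quick numerical check of the complement trick, computing both sides directly from the binomial formula:

```python
from math import comb, isclose

def binom_pmf(x, n, p):
    """P(X = x) for a binomial with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# 6 successes at pi = 0.70 is the same event as 4 failures at pi = 0.30
lhs = binom_pmf(6, 10, 0.70)
rhs = binom_pmf(4, 10, 0.30)
assert isclose(lhs, rhs)     # the two probabilities agree
print(round(lhs, 4))         # 0.2001, matching the table lookup
```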
5.4 Cumulative Binomial Distributions
Often we need probability of a range rather than exact value:
- “At most 3 defects” → P(X \leq 3)
- “At least 5 sales” → P(X \geq 5)
- “Between 2 and 4 complaints” → P(2 \leq X \leq 4)
5.4.1 Using Cumulative Binomial Tables (Table C)
Table C provides: P(X \leq k) = P(X=0) + P(X=1) + ... + P(X=k)
5.4.2 Example 5.4: Student Employment (Continued)
Solution:
a) 3 or fewer: P(X \leq 3)
From Cumulative Table C: n=7, \pi=0.40, x=3 → 0.7102
b) At least 5: P(X \geq 5)
Tables give P(X \leq k), not P(X \geq k). Use complement:
P(X \geq 5) = 1 - P(X \leq 4)
From Table C: n=7, \pi=0.40, x=4 → 0.9037
P(X \geq 5) = 1 - 0.9037 = 0.0963
c) Between 3 and 5 (inclusive): P(3 \leq X \leq 5)
Strategy: P(3 \leq X \leq 5) = P(X \leq 5) - P(X \leq 2)
From Table C:
- P(X \leq 5) = 0.9812
- P(X \leq 2) = 0.4199
P(3 \leq X \leq 5) = 0.9812 - 0.4199 = 0.5613
- 71.02% probability 3 or fewer work (high - most samples will be in this range)
- 9.63% probability at least 5 work (low - only 1 in 10 samples)
- 56.13% probability between 3-5 work (moderate - slightly better than coin flip)
Central tendency: The distribution clusters around expected value \mu = 2.8, making 3-5 the most probable range.
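The cumulative probabilities can be generated without a table by summing the binomial formula term by term, as in this Python sketch:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k): cumulative binomial probability."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

n, p = 7, 0.40
p_at_most_3 = binom_cdf(3, n, p)                    # P(X <= 3)
p_at_least_5 = 1 - binom_cdf(4, n, p)               # P(X >= 5) via complement
p_3_to_5 = binom_cdf(5, n, p) - binom_cdf(2, n, p)  # P(3 <= X <= 5)

print(round(p_at_most_3, 4), round(p_at_least_5, 4), round(p_3_to_5, 4))
```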
5.5 The Hypergeometric Distribution
The binomial distribution requires constant probability across trials. But what if we sample without replacement from a small population?
Example: Drawing cards from a deck. First draw: P(Ace) = 4/52. Second draw: P(Ace) = 3/51 (if first was Ace) or 4/51 (if not).
Probability changed → Binomial doesn’t apply → Use Hypergeometric!
P(X = x) = \frac{{}_rC_x \times {}_{N-r}C_{n-x}}{{}_NC_n}
Where:
- N = population size
- r = number of successes in population
- n = sample size
- x = number of successes desired in sample
When to Use:
- Sampling without replacement
- From finite population
- Sample size is large relative to population (typically n/N > 0.05)
5.5.1 Example 5.5: Racehorse Contagious Disease
Solution:
P(X = 2) = \frac{{}_4C_2 \times {}_{10-4}C_{3-2}}{{}_{10}C_3}
Step 1: Calculate combinations
{}_4C_2 = \frac{4!}{2!2!} = \frac{24}{4} = 6 \text{ (ways to select 2 sick from 4)}
{}_{6}C_1 = \frac{6!}{1!5!} = 6 \text{ (ways to select 1 healthy from 6)}
{}_{10}C_3 = \frac{10!}{3!7!} = \frac{720}{6} = 120 \text{ (total ways to select 3 from 10)}
Step 2: Calculate probability
P(X = 2) = \frac{6 \times 6}{120} = \frac{36}{120} = 0.30
30% chance of finding exactly 2 sick horses in the sample of 3.
Expected value: E(X) = n \times \frac{r}{N} = 3 \times \frac{4}{10} = 1.2 sick horses
Finding 2 sick horses is above average but not rare. The vet should:
- Test the remaining 7 horses immediately
- Quarantine the stable
- Begin treatment protocol for confirmed cases
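A direct implementation of the hypergeometric formula confirms the result:

```python
from math import comb

def hypergeom_pmf(x, N, r, n):
    """P(X = x): x successes in a sample of n, drawn without
    replacement from a population of N containing r successes."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Example 5.5: N = 10 horses, r = 4 sick, sample n = 3
print(hypergeom_pmf(2, N=10, r=4, n=3))  # 0.30
print(3 * 4 / 10)                         # E(X) = n * r/N = 1.2 sick horses
```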
5.5.2 Example 5.6: Employment Discrimination Case - Johnson District
Solution:
We need: P(X \leq 1) = P(X=0) + P(X=1)
P(X = 1): Exactly 1 woman promoted
P(X=1) = \frac{{}_4C_1 \times {}_5C_2}{{}_9C_3} = \frac{4 \times 10}{84} = \frac{40}{84} = 0.4762
P(X = 0): No women promoted
P(X=0) = \frac{{}_4C_0 \times {}_5C_3}{{}_9C_3} = \frac{1 \times 10}{84} = \frac{10}{84} = 0.1190
Total probability:
P(X \leq 1) = 0.4762 + 0.1190 = 0.5952
59.52% chance (nearly 60%) that at most 1 woman would be promoted purely by random selection.
Court’s conclusion: This is not unusually low. Random chance could easily produce this outcome without discrimination.
Statistical standard: Courts typically look for probabilities < 5% before inferring discrimination. Here, 59.52% is far above that threshold.
Verdict: Insufficient statistical evidence of gender bias. Case dismissed.
Note: This doesn’t prove no discrimination occurred - only that the statistical evidence is weak.
5.6 The Poisson Distribution - Modeling Rare Events
The Poisson distribution models the number of rare events occurring in a fixed interval (time, space, area, volume):
- Customer arrivals per hour
- Defects per 100 units
- Website crashes per month
- Typos per page
- Accidents per quarter
P(X = x) = \frac{\mu^x \cdot e^{-\mu}}{x!}
Where:
- \mu = average number of occurrences in the interval
- x = specific number of occurrences
- e = 2.71828 (natural logarithm base)
Requirements:
1. Events are rare (low probability)
2. Events are independent
3. Average rate (\mu) is constant
Mean and Variance (Same value!)
E(X) = \mu
\text{Var}(X) = \mu
\sigma = \sqrt{\mu}
5.6.1 Example 5.7: University Tutorial Office
Solution:
a) Exactly 4 students: P(X = 4)
P(X=4) = \frac{(5.2)^4 \cdot e^{-5.2}}{4!} = \frac{731.16 \cdot 0.00552}{24} = \frac{4.036}{24} = 0.1681
From Poisson Table (Table D): \mu = 5.2, x = 4 → 0.1681
b) No students: P(X = 0)
P(X=0) = \frac{(5.2)^0 \cdot e^{-5.2}}{0!} = \frac{1 \cdot 0.00552}{1} = 0.0055
From table: \mu = 5.2, x = 0 → 0.0055
c) Exactly 8 students: P(X = 8)
From table: \mu = 5.2, x = 8 → 0.0731
- 16.81% chance of exactly 4 students (common outcome)
- 0.55% chance of zero students (very rare - empty office is unusual)
- 7.31% chance of 8 students (occasional surge)
Operational planning:
- Expect 5-6 students most hours (around the mean)
- Staff for peak: Plan capacity for 8-10 students to handle surges (95th percentile ≈ 9 students)
- Quiet hours rare: Empty office only ~1 in 200 hours
Expected variance: \sigma = \sqrt{5.2} = 2.28 students - moderate variability
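The Poisson formula is equally short in code; this sketch reproduces the three table lookups:

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson distribution with mean mu."""
    return mu**x * exp(-mu) / factorial(x)

mu = 5.2  # average students per hour
print(round(poisson_pmf(4, mu), 4))  # exactly 4 -> 0.1681
print(round(poisson_pmf(0, mu), 4))  # none      -> 0.0055
print(round(poisson_pmf(8, mu), 4))  # exactly 8 -> 0.0731
```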
5.6.2 Adjusting μ for Different Time Intervals
Critical skill: The mean must match the time period in the problem!
5.6.3 Example 5.8: Defect Rate Adjustment
Solution:
a) 200 units: Adjust μ
\mu_{200} = 4 \times \frac{200}{100} = 8 \text{ defects}
P(X=10 | \mu=8) = 0.0993 \text{ (from table)}
b) 50 units: Adjust μ
\mu_{50} = 4 \times \frac{50}{100} = 2 \text{ defects}
P(X=0 | \mu=2) = 0.1353 \text{ (from table)}
5.7 The Exponential Distribution - Time Between Events
While Poisson models how many events occur, Exponential models how long between events.
If arrivals follow Poisson → time between arrivals follows Exponential.
P(X \leq t) = 1 - e^{-\mu t}
Where:
- t = time period of interest
- \mu = average rate of occurrence
- e = 2.71828
Relationship to Poisson:
- Poisson: “3 customers per hour” → How many?
- Exponential: “Time until next customer” → How long?
Mean and Variance
E(X) = \frac{1}{\mu}
\sigma^2 = \frac{1}{\mu^2}
5.7.1 Example 5.9: Cross City Cab Company
Solution:
Time conversion: μ = 12 per 60 minutes. What fraction is 5 minutes?
t = \frac{5}{60} = \frac{1}{12}
Probability taxi arrives within 5 minutes:
P(X \leq 5 \text{ min}) = 1 - e^{-\mu t} = 1 - e^{-(12)(1/12)} = 1 - e^{-1}
From Table D (Poisson table where x=0 gives e^{-\mu}):
e^{-1} = 0.3679
P(X \leq 5) = 1 - 0.3679 = 0.6321
✅ Wait for the taxi!
63.21% probability a taxi arrives within 5 minutes (> 50% threshold).
Additional insights:
- P(X \leq 10 \text{ min}) = 1 - e^{-2} = 1 - 0.1353 = 0.8647 (86.5% within 10 min)
- P(5 < X \leq 10) = 0.8647 - 0.6321 = 0.2326 (23.3% arrive between 5-10 min)
Expected wait time: E(X) = \frac{1}{\mu} = \frac{1}{12} hour = 5 minutes
With 12 taxis/hour on average, you’ll likely wait around 5 minutes - acceptable for most business travelers.
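The exponential calculation reduces to one line; this sketch verifies the 5- and 10-minute probabilities, with time expressed in hours to match μ = 12 arrivals per hour:

```python
from math import exp

def expon_cdf(t, rate):
    """P(X <= t): probability the next event occurs within time t,
    given 'rate' events per unit time (same time units for both)."""
    return 1 - exp(-rate * t)

rate = 12                                # taxis per hour
print(round(expon_cdf(5 / 60, rate), 4))   # within 5 min  -> 0.6321
print(round(expon_cdf(10 / 60, rate), 4))  # within 10 min -> 0.8647
print(1 / rate * 60)                       # expected wait: 5 minutes
```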
5.8 The Uniform Distribution - Equally Likely Outcomes
The uniform distribution applies when all outcomes in a range are equally likely.
Think of it as a “flat” distribution - constant probability density across the range.
Continuous Uniform Distribution on [a, b]
Mean: \mu = \frac{a + b}{2}
Variance: \sigma^2 = \frac{(b-a)^2}{12}
Height (probability density): \text{Height} = \frac{1}{b-a}
Probability X falls between X_1 and X_2: P(X_1 \leq X \leq X_2) = \frac{X_2 - X_1}{b - a}
5.8.1 Example 5.10: Dow Chemical Fertilizer Bags
Solution:
Step 1: Determine a and b
Range = 2.4 pounds spreads evenly around mean of 25:
a = \mu - \frac{\text{Range}}{2} = 25 - \frac{2.4}{2} = 25 - 1.2 = 23.8 \text{ lbs}
b = \mu + \frac{\text{Range}}{2} = 25 + 1.2 = 26.2 \text{ lbs}
a) Will Harry get at least 23 pounds?
Minimum bag weight = 23.8 pounds
✅ YES - Even the lightest bag (23.8 lbs) exceeds Harry’s 23 lb requirement!
b) Probability of bag > 25.5 pounds:
P(X > 25.5) = P(25.5 < X \leq 26.2) = \frac{26.2 - 25.5}{26.2 - 23.8}
= \frac{0.7}{2.4} = 0.2917
✅ Harry has no worries!
- Guaranteed at least 23 pounds (minimum is 23.8)
- 29.17% chance of getting bonus weight (> 25.5 lbs)
- Average bag: 25 pounds (exactly what’s advertised)
Quality perspective:
- Consistent product: Uniform distribution means predictable range
- No short-weighting: Minimum 23.8 lbs protects consumers
- Bonus potential: Nearly 1 in 3 bags exceed 25.5 lbs
Compared to normal distribution: Uniform has no extreme values - all bags within narrow 2.4 lb range.
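The uniform probabilities come straight from the area formula; here is a short sketch of Example 5.10:

```python
# Example 5.10: bag weights uniform on [23.8, 26.2] pounds
a, b = 23.8, 26.2

def uniform_prob(x1, x2, a, b):
    """P(x1 <= X <= x2) for a uniform distribution on [a, b],
    clipping the interval to the support."""
    lo, hi = max(x1, a), min(x2, b)
    return max(hi - lo, 0) / (b - a)

print(round(uniform_prob(25.5, b, a, b), 4))  # P(X > 25.5) -> 0.2917
print((a + b) / 2)                             # mean -> 25.0 lbs
print(round((b - a) ** 2 / 12, 2))             # variance -> 0.48
```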
5.9 The Normal Distribution - The Crown Jewel of Statistics
Of all probability distributions, the normal distribution is the most important in statistics. We introduced its bell-shaped, symmetric form in Chapter 3 with the empirical rule. Now we’ll unlock its full analytical power.
Key characteristics:
- Continuous distribution (not discrete)
- Infinite range (theoretically from -∞ to +∞)
- Used for measured variables: heights, weights, temperatures, IQ scores, financial returns
- Completely defined by two parameters: mean (μ) and standard deviation (σ)
5.9.1 Real-World Case: ToppsWear Clothing Manufacturer
ToppsWear recognized the public was constantly changing in physical size and proportions. To produce better-fitting clothing, management commissioned a comprehensive study of customer body measurements.
Findings: Customer heights are normally distributed with:
- Mean: μ = 67 inches
- Standard deviation: σ = 2 inches
This normal distribution allows ToppsWear to:
1. Predict what percentage of customers fall into each size range
2. Optimize inventory allocation across sizes
3. Minimize both stockouts and excess inventory
Symmetry:
- 50% of observations above the mean
- 50% of observations below the mean
- Mirror image around μ
Area Under the Curve:
- Total area = 1.00 (100%)
- Area = Probability
- 50% of area to right of μ, 50% to left
Empirical Rule (68-95-99.7):
- 68.3% of observations within μ ± 1σ
- 95.5% of observations within μ ± 2σ
- 99.7% of observations within μ ± 3σ
5.9.2 Comparing Different Normal Distributions
Three scenarios for ToppsWear:
Distribution I: μ = 67 inches, σ = 2 inches (adult population)
Distribution II: μ = 79 inches, σ = 2 inches (basketball players)
Distribution III: μ = 67 inches, σ = 4 inches (more diverse population)
Key insights:
- Different means → Shift left/right (Distributions I vs II)
- Different standard deviations → Change width/flatness (Distributions I vs III)
- Same area percentages apply regardless of μ or σ (Empirical Rule)
Distribution III is flatter and more spread out because σ = 4 (double the variability of Distribution I).
Empirical Rule application:
- Distribution I: 68.3% of heights between 65-69 inches (μ ± 1σ)
- Distribution III: 68.3% of heights between 63-71 inches (wider range due to larger σ)
5.10 The Standard Normal Distribution (Z-Scores)
Since there are infinite possible normal distributions (each with different μ and σ), statisticians created a standard form for all calculations.
Z = \frac{X - \mu}{\sigma}
Where:
- Z = standard normal deviate (Z-score)
- X = original value
- μ = population mean
- σ = population standard deviation
Standard Normal Distribution:
- Mean = 0
- Standard deviation = 1
- Symmetric around Z = 0
Interpretation of Z:
“The number of standard deviations an observation is above (+) or below (-) the mean.”
5.10.1 Example 5.11: ToppsWear Customer Heights - Z-Score Conversions
Solution:
Tom Typical: X = 67
Z = \frac{67 - 67}{2} = \frac{0}{2} = 0
Tom is exactly average → Z = 0 (at the mean)
Paula Petite: X = 63
Z = \frac{63 - 67}{2} = \frac{-4}{2} = -2.00
Paula is 2 standard deviations below average → Z = -2.00
Steve Stretch: X = 70
Z = \frac{70 - 67}{2} = \frac{3}{2} = 1.50
Steve is 1.5 standard deviations above average → Z = 1.50
Z = 0: Exactly at the mean (perfectly average)
Z = +1: One standard deviation above average (taller/larger)
Z = -1: One standard deviation below average (shorter/smaller)
Z > +3: Extremely high (top 0.15%)
Z < -3: Extremely low (bottom 0.15%)
ToppsWear application:
- Paula (Z = -2) is shorter than 97.7% of customers → Small/Petite sizes
- Tom (Z = 0) is median customer → Medium/Regular sizes
- Steve (Z = 1.5) is taller than 93.3% of customers → Large/Tall sizes
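Z-scores and the corresponding percentiles can be computed without a table using the error function in Python's math module, via the identity Φ(z) = ½(1 + erf(z/√2)):

```python
from math import erf, sqrt

def z_score(x, mu, sigma):
    """Standard deviations above (+) or below (-) the mean."""
    return (x - mu) / sigma

def normal_cdf(z):
    """Phi(z): area under the standard normal curve left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 67, 2  # ToppsWear heights (inches)
for name, height in [("Tom", 67), ("Paula", 63), ("Steve", 70)]:
    z = z_score(height, mu, sigma)
    # normal_cdf(z) is the fraction of customers shorter than this person
    print(name, z, round(normal_cdf(z), 3))
```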
5.11 Calculating Probabilities with the Normal Distribution
The beauty of standardization: If you know the area, you know the probability!
Think of it like a dartboard:
- 2/3 of target painted green
- 1/3 of target painted red
- Equal chance of hitting any point
- P(hitting green) = 2/3 because 2/3 of area is green
Same logic for normal curves: Area under curve = Probability
5.11.1 Using the Standard Normal Table (Table E)
Table E provides: Area from mean (Z=0) to any Z-value
5.11.2 Example 5.12: TelCom Satellite Transmission Times
Solution:
a) P(125 ≤ X ≤ 150):
Z = \frac{125 - 150}{15} = \frac{-25}{15} = -1.67
From Table E: Z = 1.67 → Area = 0.4525
Visual: 45.25% of all transmissions fall between 125-150 seconds
b) P(X < 125):
Strategy: Total area below mean = 0.5000
Area between 125-150 = 0.4525
P(X < 125) = 0.5000 - 0.4525 = 0.0475
Only 4.75% of transmissions are shorter than 125 seconds (rare!)
c) P(145 ≤ X ≤ 155):
Step 1: Find area from 145 to 150
Z = \frac{145 - 150}{15} = -0.33
From Table E: Area = 0.1293
Step 2: By symmetry, area from 150 to 155 = 0.1293
P(145 \leq X \leq 155) = 0.1293 + 0.1293 = 0.2586
25.86% of transmissions fall in this narrow 10-second window
d) P(160 ≤ X ≤ 165):
Step 1: Area from 150 to 165
Z = \frac{165 - 150}{15} = 1.00
From Table E: Area = 0.3413
Step 2: Area from 150 to 160
Z = \frac{160 - 150}{15} = 0.67
From Table E: Area = 0.2486
Step 3: Subtract
P(160 \leq X \leq 165) = 0.3413 - 0.2486 = 0.0927
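All four TelCom probabilities follow from the normal CDF. This sketch computes them with exact Z-values, so the answers differ from the table results in the third decimal place (Table E rounds Z to two places):

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution with mean mu and SD sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma = 150, 15  # TelCom transmission times (seconds)
print(round(normal_cdf(150, mu, sigma) - normal_cdf(125, mu, sigma), 4))  # ~0.4522
print(round(normal_cdf(125, mu, sigma), 4))                               # ~0.0478
print(round(normal_cdf(155, mu, sigma) - normal_cdf(145, mu, sigma), 4))  # ~0.2611
print(round(normal_cdf(165, mu, sigma) - normal_cdf(160, mu, sigma), 4))  # ~0.0938
```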
Service demand profile:
- Most transmissions (95%) between 120-180 seconds (μ ± 2σ)
- Very short calls (<125 sec) rare at 4.75% → Premium pricing opportunity
- Mid-range calls (145-155) common at 25.86% → Standard pricing tier
- Longer calls (160-165) moderate at 9.27% → Volume discount potential
Capacity planning:
- Plan for peak at 180 seconds (μ + 2σ) to serve about 97.7% of calls
- Reserve overflow capacity for extreme cases (>180 sec)
Revenue optimization:
- Tiered pricing: <120s (premium), 120-180s (standard), >180s (discounted)
- Bundle packages targeting 145-155 second average usage (26% of market)
5.12 Finding X-Values from Known Probabilities (Inverse Normal)
Sometimes we know the desired probability and must find the corresponding X-value.
Business applications:
- “What score puts a student in top 10%?” (scholarships)
- “What income level defines the poorest 15%?” (welfare programs)
- “What response time separates best 10% from worst 10%?” (performance evaluation)
5.12.1 Example 5.13: Presidential Economic Policy - Welfare Threshold
Solution - The Inverse Process:
Step 1: Visualize the problem
- We know area = 0.15 (left tail)
- We need to find X = income threshold
Step 2: Convert to table lookup area
Table E shows area from mean to Z, not tail area.
\text{Area from mean to Z} = 0.5000 - 0.1500 = 0.3500
Step 3: Find Z from Table E
Look inside table body for area closest to 0.3500
→ Find 0.3508 at Z = 1.04
Step 4: Assign correct sign
We’re working in the left tail (below mean) → Z = -1.04
Step 5: Solve for X
Z = \frac{X - \mu}{\sigma}
-1.04 = \frac{X - 13,812}{3,550}
X = 13,812 + (-1.04)(3,550)
X = 13,812 - 3,692 = \$10,120
Income threshold: Anyone earning $10,120 or less receives government assistance (bottom 15%).
Budget impact:
- U.S. population ≈ 265 million (1996)
- 15% = 39.75 million people qualify
- At $5,000/person/year → $198.75 billion annual cost
Political considerations:
- Threshold creates “welfare cliff” at $10,121
- May discourage earning just above cutoff
- Recommend graduated phase-out from $10,120-$15,000
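The inverse lookup in Step 3 can be automated by searching for the Z whose left-tail area matches the target. A sketch using bisection on the CDF (the exact Z is -1.0364 rather than the table-rounded -1.04, so the threshold lands about $13 above the $10,120 table answer):

```python
from math import erf, sqrt

def normal_cdf(z):
    """Phi(z): standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_inv(p, lo=-10.0, hi=10.0):
    """Z such that Phi(Z) = p, found by bisection (no tables needed)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if normal_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

mu, sigma = 13812, 3550            # income distribution from Example 5.13
z = normal_inv(0.15)               # bottom 15% cutoff
threshold = mu + z * sigma
print(round(z, 2), round(threshold))  # -1.04 and roughly $10,133
```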
5.12.2 Example 5.14: Fire Department Response Times - Performance Benchmarking
Solution:
Find X₁ (bottom 10% cutoff):
Step 1: Area from mean to Z = 0.5000 - 0.1000 = 0.4000
Step 2: In Table E, find 0.3997 (closest to 0.4000) → Z = 1.28
Step 3: Left tail → Z = -1.28
-1.28 = \frac{X_1 - 12.8}{3.7}
X_1 = 12.8 + (-1.28)(3.7) = 12.8 - 4.74 = 8.06 \text{ minutes}
Find X₂ (top 10% cutoff):
Step 1: Right tail, same logic → Z = +1.28
1.28 = \frac{X_2 - 12.8}{3.7}
X_2 = 12.8 + (1.28)(3.7) = 12.8 + 4.74 = 17.54 \text{ minutes}
Excellent (Top 10%): Response time < 8.06 minutes
→ Serve as model programs for improvement initiatives
Acceptable (Middle 80%): Response time 8.06 - 17.54 minutes
→ Meet standard performance expectations
Needs Improvement (Bottom 10%): Response time > 17.54 minutes
→ Receive training, resources, and mentorship from excellent departments
Implementation strategy:
1. Pair each “needs improvement” department with “excellent” mentor
2. Analyze best practices: dispatch protocols, routing algorithms, staffing
3. Set 12-month improvement target: reduce times by 20%
4. Monthly progress reviews with state commission
Expected impact: If bottom 10% improve to average (12.8 min), estimated 45 lives saved annually statewide.
5.13 Normal Approximation to the Binomial Distribution
When n is large, calculating binomial probabilities becomes tedious:
- Tables don’t extend to large n
- Formulas involve massive factorials (100!)
- Even software can hit overflow or precision problems with naive factorial arithmetic
Solution: Use normal distribution as approximation!
Requirements:
- n\pi \geq 5 (at least 5 expected successes)
- n(1-\pi) \geq 5 (at least 5 expected failures)
- π reasonably close to 0.50 (symmetric)
Formulas:
\mu = n\pi
\sigma = \sqrt{n\pi(1-\pi)}
Continuity Correction Factor:
Because normal is continuous but binomial is discrete:
- P(X = 10) → P(9.5 ≤ X ≤ 10.5)
- P(X ≤ 10) → P(X ≤ 10.5)
- P(X ≥ 10) → P(X ≥ 9.5)
5.13.1 Example 5.15: Labor Union Strike Vote
Solution:
Binomial (Exact) - From Table B:
P(X = 10 | n = 15, \pi = 0.40) = 0.0245
Normal Approximation:
Step 1: Check requirements
n\pi = 15(0.40) = 6 \geq 5 \quad ✓
n(1-\pi) = 15(0.60) = 9 \geq 5 \quad ✓
Step 2: Calculate normal parameters
\mu = n\pi = 15(0.40) = 6
\sigma = \sqrt{15(0.40)(0.60)} = \sqrt{3.6} = 1.897
Step 3: Apply continuity correction
P(X = 10) → P(9.5 ≤ X ≤ 10.5)
Step 4: Convert to Z-scores
Z_1 = \frac{9.5 - 6}{1.897} = 1.85 \quad \text{(Area = 0.4678)}
Z_2 = \frac{10.5 - 6}{1.897} = 2.37 \quad \text{(Area = 0.4911)}
Step 5: Calculate probability
P(9.5 \leq X \leq 10.5) = 0.4911 - 0.4678 = 0.0233
Binomial (exact): 0.0245 (2.45%)
Normal (approx): 0.0233 (2.33%)
Difference: 0.0012 (0.12 percentage points)
Error: Only 4.9% relative error - excellent approximation!
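Both the exact and approximate probabilities are easy to compute side by side; the approximation here (≈0.0237) differs slightly from the 0.0233 above because the table rounds Z to two decimals:

```python
from math import comb, erf, sqrt

def binom_pmf(x, n, p):
    """Exact binomial probability P(X = x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution with mean mu and SD sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

n, p, x = 15, 0.40, 10
exact = binom_pmf(x, n, p)

mu, sigma = n * p, sqrt(n * p * (1 - p))
# Continuity correction: P(X = 10) -> P(9.5 <= X <= 10.5)
approx = normal_cdf(x + 0.5, mu, sigma) - normal_cdf(x - 0.5, mu, sigma)

print(round(exact, 4), round(approx, 4))
```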
When approximation improves:
- Larger n (n > 30)
- π closer to 0.50
- Example: n = 100, π = 0.40 → error typically < 1%
When to stick with binomial:
- Small n (n < 10)
- Extreme π (π < 0.10 or π > 0.90)
- Software available (use exact calculation)
5.14 Solved Problems
5.14.1 Problem 1: Medical Device Reliability
Solution:
a) P(X ≤ 2): Use Poisson approximation (rare events)
\mu = n\pi = 500(0.003) = 1.5 \text{ defectives}
From Poisson Table (μ = 1.5):
- P(X = 0) = 0.2231
- P(X = 1) = 0.3347
- P(X = 2) = 0.2510
P(X \leq 2) = 0.2231 + 0.3347 + 0.2510 = 0.8088
80.88% probability of 2 or fewer defects
b) Expected defectives:
E(X) = \mu = 1.5 \text{ units}
c) Risk assessment:
P(X ≥ 1) = 1 - P(X = 0) = 1 - 0.2231 = 0.7769 (77.7% chance of at least 1 defect)
Recommendation: ✅ YES, inspect all units
- Pacemaker failure = life-threatening
- 77.7% chance of defect in order is unacceptable risk
- Cost of inspection << cost of patient death/lawsuit
5.14.2 Problem 2: Quality Control - Hypergeometric Application
Solution:
P(X = 2) = \frac{{}_8C_2 \times {}_{42}C_3}{{}_{50}C_5}
Step 1: Calculate combinations
{}_8C_2 = \frac{8!}{2!6!} = \frac{8 \times 7}{2} = 28
{}_{42}C_3 = \frac{42!}{3!39!} = \frac{42 \times 41 \times 40}{6} = 11,480
{}_{50}C_5 = \frac{50!}{5!45!} = 2,118,760
Step 2: Calculate probability
P(X = 2) = \frac{28 \times 11,480}{2,118,760} = \frac{321,440}{2,118,760} = 0.1517
15.17% chance of finding exactly 2 defectives
Expected defectives in sample:
E(X) = n \times \frac{r}{N} = 5 \times \frac{8}{50} = 0.8 \text{ units}
Quality decision: Finding 2 defectives (vs expected 0.8) suggests shipment quality is worse than average → Reject entire shipment
5.14.3 Problem 3: Customer Service Call Center
Solution:
a) P(X = 25 | μ = 18):
From Poisson Table: μ = 18, X = 25 → 0.0201 (2.01%)
b) P(15 ≤ X ≤ 20):
From Cumulative Poisson Table:
- P(X ≤ 20) = 0.8355
- P(X ≤ 14) = 0.3518
P(15 \leq X \leq 20) = 0.8355 - 0.3518 = 0.4837
48.37% probability (nearly half the time)
c) Exponential - Time until next call:
μ = 18 calls/hour = 18 calls/60 min = 0.30 calls/minute
P(X \leq 2 \text{ min}) = 1 - e^{-\mu t} = 1 - e^{-(0.30)(2)} = 1 - e^{-0.6}
= 1 - 0.5488 = 0.4512
45.12% probability next call within 2 minutes
Average wait: E(X) = \frac{1}{\mu} = \frac{1}{0.30} = 3.33 minutes
5.14.4 Problem 4: Manufacturing Tolerance Analysis
Solution:
a) P(4.85 ≤ X ≤ 5.15):
Z_1 = \frac{4.85 - 5.00}{0.12} = -1.25 \quad \text{(Area = 0.3944)}
Z_2 = \frac{5.15 - 5.00}{0.12} = 1.25 \quad \text{(Area = 0.3944)}
P(4.85 \leq X \leq 5.15) = 0.3944 + 0.3944 = 0.7888
78.88% within specifications
b) Scrap rate:
\text{Scrap} = 1 - 0.7888 = 0.2112
21.12% scrap rate (unacceptable! Target: < 1%)
c) Find σ for 99% within specs:
Target: P(4.85 ≤ X ≤ 5.15) = 0.99
→ Each tail = 0.005
→ Area from mean to spec limit = 0.495
From Table E: Area 0.4950 → Z = 2.58
2.58 = \frac{5.15 - 5.00}{\sigma}
\sigma = \frac{0.15}{2.58} = 0.0581 \text{ cm}
Recommendation: Reduce variability from σ = 0.12 to σ = 0.058 (51% reduction)
→ Requires process improvement: better machines, training, quality control
5.15 Formula Reference
5.15.1 Discrete Distribution Fundamentals
Expected Value (Mean): \mu = E(X) = \sum [x_i \cdot P(x_i)]
Variance: \sigma^2 = \sum [(x_i - \mu)^2 \cdot P(x_i)]
Standard Deviation: \sigma = \sqrt{\sigma^2}
5.15.2 Binomial Distribution
Probability Formula: P(X = x) = {}_nC_x \cdot \pi^x \cdot (1-\pi)^{n-x}
Combinations: {}_nC_x = \frac{n!}{x!(n-x)!}
Mean: \mu = n\pi
Variance: \sigma^2 = n\pi(1-\pi)
Standard Deviation: \sigma = \sqrt{n\pi(1-\pi)}
5.15.3 Hypergeometric Distribution
Probability Formula: P(X = x) = \frac{{}_rC_x \times {}_{N-r}C_{n-x}}{{}_NC_n}
Mean: \mu = n \cdot \frac{r}{N}
5.15.4 Poisson Distribution
Probability Formula: P(X = x) = \frac{\mu^x \cdot e^{-\mu}}{x!}
Mean: \mu = \lambda t \quad \text{(rate per unit of time or space × interval length)}
Variance: \sigma^2 = \mu
Standard Deviation: \sigma = \sqrt{\mu}
5.15.5 Exponential Distribution
Cumulative Probability: P(X \leq t) = 1 - e^{-\mu t}
Mean: E(X) = \frac{1}{\mu}
Variance: \sigma^2 = \frac{1}{\mu^2}
5.15.6 Uniform Distribution
Mean: \mu = \frac{a + b}{2}
Variance: \sigma^2 = \frac{(b-a)^2}{12}
Probability: P(X_1 \leq X \leq X_2) = \frac{X_2 - X_1}{b - a}
Probability Density: f(x) = \frac{1}{b-a}
5.15.7 Normal Distribution
Z-Score Transformation: Z = \frac{X - \mu}{\sigma}
Inverse (Finding X from Z): X = \mu + Z\sigma
Normal Approximation to Binomial: \mu = n\pi, \quad \sigma = \sqrt{n\pi(1-\pi)}
Continuity Correction:
- P(X = a) → P(a - 0.5 ≤ X ≤ a + 0.5)
- P(X ≤ a) → P(X ≤ a + 0.5)
- P(X ≥ a) → P(X ≥ a - 0.5)
5.15.8 Empirical Rule (Normal Distributions Only)
- 68.3% within μ ± 1σ
- 95.5% within μ ± 2σ
- 99.7% within μ ± 3σ
5.16 Chapter Summary
This chapter introduced the major probability distributions used in business statistics:
Discrete Distributions:
- Binomial: Fixed trials, constant probability, independent (credit approvals, quality sampling)
- Hypergeometric: Sampling without replacement from finite population (discrimination cases, shipment inspection)
- Poisson: Rare events over time/space (customer arrivals, defects, accidents)
Continuous Distributions:
- Exponential: Time between events (taxi arrivals, equipment failure, service times)
- Uniform: Equally likely outcomes (random number generation, arrival times)
- Normal: The crown jewel - heights, weights, IQ, financial returns, measurement error
Key Skills Mastered:
✓ Calculate probabilities using distribution formulas and tables
✓ Convert normal distributions to standard normal (Z-scores)
✓ Find probabilities from Z-scores and vice versa
✓ Apply continuity correction for normal approximation to binomial
✓ Select appropriate distribution based on business context
Next Chapter: We build on these foundations to explore sampling distributions - the bridge between probability theory and statistical inference!