We want to estimate what fraction of items in a
certain population are Red.
How large a sample do we need? We use the following notation:
M = population size,
F = actual, but unknown to us, fraction of the population that is Red,
N = our sample size,
R = resulting number (a random variable) of items in our sample that are Red,
P = R/N = our estimate of the fraction of items that are Red in the population,
T = Target standard deviation for P that we specify.
Then, based on the Hypergeometric distribution:
Var(R) = N*F*M*(M-F*M)*(M-N)/(M*M*(M-1))
       = N*F*(1-F)*(M-N)/(M-1),
Var(P) = Var(R)/(N*N) = F*(1-F)*(M-N)/(N*(M-1)).
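As a sanity check on the variance formula, the following is a minimal Python sketch that compares the closed form for Var(P) with the variance computed directly from the hypergeometric probabilities. The function names and the example values (M = 50, 20 Red items, N = 10) are illustrative assumptions, not part of the original model.

```python
from math import comb

def var_p_formula(M, F, N):
    """Var(P) from the closed form F*(1-F)*(M-N)/(N*(M-1))."""
    return F * (1 - F) * (M - N) / (N * (M - 1))

def var_p_exact(M, K, N):
    """Var(R/N) summed directly over the hypergeometric distribution.

    K is the number of Red items in the population, so F = K/M.
    """
    total = comb(M, N)
    lo, hi = max(0, N - (M - K)), min(N, K)   # feasible values of R
    probs = {r: comb(K, r) * comb(M - K, N - r) / total
             for r in range(lo, hi + 1)}
    mean = sum(r * p for r, p in probs.items())
    var_r = sum((r - mean) ** 2 * p for r, p in probs.items())
    return var_r / (N * N)

# Hypothetical example: population of 50 with 20 Red items, sample of 10.
print(var_p_formula(50, 20 / 50, 10))
print(var_p_exact(50, 20, 10))
```

The two printed values should agree up to floating-point rounding.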
If we want our estimator to have a specified target standard deviation of T or less
(that is, a variance of T*T or less), we solve for the N at which the variance equals T*T:
T*T = F*(1-F)*(M-N)/(N*(M-1)), or if we multiply through by N:
N*T*T= F*(1-F)*(M-N)/(M-1), or
N*(T*T + F*(1-F)/(M-1)) = F*(1-F)*M/(M-1), or
N = (F*(1-F)*M/(M-1))/(T*T + F*(1-F)/(M-1)).
As M goes to infinity, N approaches the infinite-population value
F*(1-F)/(T*T). It approaches from below whenever T*T < F*(1-F), that is, whenever the required N exceeds 1.
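The sample size formula and its limit can be sketched in Python as follows. The function names and the example values are assumptions made for illustration. Because F is unknown in practice, the example uses F = 0.5, a conservative choice since F*(1-F) is largest there. The formula returns a real number, so the sketch rounds up to get a usable integer sample size.

```python
import math

def sample_size(M, F, T):
    """N that solves T*T = F*(1-F)*(M-N)/(N*(M-1)), before rounding."""
    v = F * (1 - F)
    return (v * M / (M - 1)) / (T * T + v / (M - 1))

def sample_size_infinite(F, T):
    """Limit of sample_size as M goes to infinity."""
    return F * (1 - F) / (T * T)

# Hypothetical example: target standard deviation T = 0.02, F = 0.5.
F, T = 0.5, 0.02
for M in (1_000, 10_000, 1_000_000):
    n = sample_size(M, F, T)
    print(M, n, math.ceil(n))   # population size, exact N, rounded-up N
print("limit:", sample_size_infinite(F, T))
```

As M grows, the printed values of N should increase toward the limit without exceeding it.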