The central limit theorem is one of the most important concepts in
statistics. The reason for this is the unmatched practical application
of the theorem.
Okay, let's get started then. Imagine that you are given a data set.
Its distribution does not matter: it could be normal, uniform, binomial,
or completely random. The first thing you want to do is start taking
subsets out of the data set or, as statisticians call it, start
sampling it. This will give you a better idea of how the entire data
set is made up, right? Okay.
Once you have taken a sufficient number of samples and calculated the
mean of each sample, you'll be able to apply the central limit theorem.
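The sampling step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `sample_means`, the exponential stand-in data, the seeds, and the choice to draw each sample without replacement are not from the lecture.

```python
import random
import statistics


def sample_means(data, num_samples, sample_size, seed=0):
    """Draw `num_samples` random samples of `sample_size` values each
    and return the mean of every sample.

    Assumption: each sample is drawn without replacement from `data`,
    which mirrors "taking subsets out of the data set".
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(data, sample_size))
        for _ in range(num_samples)
    ]


# Hypothetical, deliberately non-normal data set: 10,000 exponential draws.
data_rng = random.Random(42)
data = [data_rng.expovariate(1.0) for _ in range(10_000)]

means = sample_means(data, num_samples=200, sample_size=50)

print("number of sample means:", len(means))
print("mean of the data set:  ", round(statistics.fmean(data), 3))
print("mean of sample means:  ", round(statistics.fmean(means), 3))
```

The list `means` is the collection of sample means that the theorem talks about; the original data set is only used as the source to draw from.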
No matter the distribution of the entire data set, binomial, uniform,
or another one, the means of the samples you took from it will
approximate a normal distribution. The more samples you extract, and
the bigger they are, the closer to a normal distribution the sample
means will be. Moreover, their distribution will have the same mean as
the original data set and an n times smaller variance, where n is the
size of the samples you took from the data set.
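In symbols, the statement above can be written roughly as follows. The notation is an assumption added here for illustration, not something the lecture uses: the data set has mean \mu and variance \sigma^2, each sample holds n independently drawn observations X_1, ..., X_n, and \bar{X}_n is the sample mean.

```latex
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\mathbb{E}\left[\bar{X}_n\right] = \mu,
\qquad
\operatorname{Var}\left(\bar{X}_n\right) = \frac{\sigma^2}{n},
\qquad
\bar{X}_n \;\overset{\text{approx.}}{\sim}\; \mathcal{N}\!\left(\mu,\ \frac{\sigma^2}{n}\right)
\quad \text{for large } n .
```

The first two equalities hold for any sample size; it is the normal shape in the last expression that only emerges as n grows.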
Let's confirm the theorem with an example. We have prepared 960 random
numbers from 1 to 1,000. This is their frequency distribution, so you
can be sure that they are randomly picked. The mean of this data set is
489 and its variance is 82,805. Let's extract 30 random samples out of
the data set, each consisting of 25 numbers. Remember when we said that
the samples should be sufficiently large? A common rule of thumb is
that a sample should be bigger than 25 observations. The bigger the
sample size, the better the results you'll get. So, we have our
samples. Now we are going to calculate their means and plot them once
again. Okay, excellent.
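For anyone who wants to try a similar experiment, here is a rough sketch of the same setup: 960 random integers from 1 to 1,000, then 30 samples of 25 numbers each. The numbers generated here are a stand-in and not the lecture's actual data, the seed is arbitrary, and sampling without replacement is an assumption, so the printed figures will differ from the ones quoted in the video.

```python
import random
import statistics

rng = random.Random(365)  # arbitrary seed, only for repeatability

# Stand-in for the lecture's data: 960 random integers from 1 to 1,000.
# These are NOT the lecture's numbers, so results will not match exactly.
data = [rng.randint(1, 1000) for _ in range(960)]
pop_mean = statistics.fmean(data)
pop_var = statistics.pvariance(data)

# 30 samples of 25 numbers each (assumed: no repeats within a sample).
sample_size = 25
means = [statistics.fmean(rng.sample(data, sample_size)) for _ in range(30)]

print(f"data set:     mean={pop_mean:.1f}  variance={pop_var:.1f}")
print(f"sample means: mean={statistics.fmean(means):.1f}  "
      f"variance={statistics.variance(means):.1f}")
print(f"variance of the data set divided by {sample_size}: "
      f"{pop_var / sample_size:.1f}")

# Crude text histogram of the 30 sample means, in bins 50 wide.
bins = {}
for m in means:
    lower = int(m // 50) * 50
    bins[lower] = bins.get(lower, 0) + 1
for lower in sorted(bins):
    print(f"[{lower}, {lower + 50}) {'#' * bins[lower]}")
```

With only 30 sample means the histogram is coarse, so expect a rough bell shape at best; raising the number of samples makes the shape easier to see.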
It looks approximately normally distributed, doesn't it? Let's check if
the other part of the theorem was right. The mean of our newly acquired
data set is 492, while its variance is 3,171. Did we expect these
numbers? We anticipated a mean of 489 and a variance of 82,805 divided
by 25, so around 3,312. Well, when dealing with such big numbers, we
almost got the mean right, and the variance was not that far off
either. In the next few lectures, you will learn how to statistically
confirm whether such small differences are close enough to the actual
result we expect to obtain. Spoiler alert: they are, and we'll show you
why. So, we have
learned the main idea behind the central limit theorem. The key
takeaway from this lesson is that, as the number of samples taken tends
towards infinity, the distribution of the means starts approximating a
normal distribution.
Imagine the power of this: if your data set were made up of millions of
values and you could afford to sample just a tiny bit of them, you
could still assume normally distributed data almost all the time, and
that's extremely helpful, as you will see later on. Okay, thanks for
watching.