# T Distribution - Explained

# What is a T Distribution?

The T distribution, also known as Student's t-distribution, is a probability distribution used to estimate population parameters when the sample is small or the population variance is unknown. The T distribution shares a lot of similarities with the normal distribution (a.k.a. the bell curve): it is symmetric and bell-shaped, but its peak is lower and its tails are heavier. This means that a T distribution assigns more probability to extreme values than a normal distribution does. Tail heaviness is governed by a parameter known as the degrees of freedom: the smaller the value, the heavier the tails. Conversely, as the degrees of freedom increase, the T distribution approaches the standard normal distribution, with a mean of 0 and a standard deviation of 1; for sample sizes greater than about 30, the two are close enough that the normal distribution is commonly used as an approximation.
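To make the "heavier tails" claim concrete, the following is a minimal sketch that compares the probability of landing more than 3 units from the center under T distributions with different degrees of freedom and under the standard normal. It uses only the Python standard library; the function names (`t_pdf`, `t_two_sided_tail`, `normal_two_sided_tail`) and the cutoff of 3 are illustrative choices, not part of the original text, and the tail probability is obtained by simple numerical integration rather than an exact formula.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_two_sided_tail(c, df, steps=2000):
    """Approximate P(|T| > c) using Simpson's rule on the density over [-c, c].

    `steps` must be even for Simpson's rule.
    """
    h = 2 * c / steps
    total = t_pdf(-c, df) + t_pdf(c, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(-c + i * h, df)
    return 1 - total * h / 3

def normal_two_sided_tail(c):
    """P(|Z| > c) for a standard normal variable."""
    return math.erfc(c / math.sqrt(2))

# Illustrative comparison: probability of a value more than 3 units from 0.
for df in (2, 5, 30):
    print(f"t, df={df:>2}: {t_two_sided_tail(3, df):.4f}")
print(f"normal:   {normal_two_sided_tail(3):.4f}")
```

The printed tail probabilities should shrink as the degrees of freedom grow and move toward the normal value, which is the behavior described above.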

## How is a T Distribution Used?

The T distribution was introduced in 1908 by a chemist by the name of William Sealy Gosset. Hired by the celebrated Guinness brewery in its quest for formulating better industrial processes through the application of biochemistry, Gosset wasted no time in devising the T test as a cost-effective method of monitoring the quality of stout. However, it was company policy at Guinness during that time to forbid chemists from publishing their findings. This did not deter Gosset, and he published his statistical work under the pseudonym "Student".

To understand the basic principles behind a T distribution, let us consider a sample of n observations drawn from a normal population with a mean denoted by M and a standard deviation denoted by D. The sample mean m and the sample standard deviation d will generally differ from M and D. This difference can be attributed to the randomness of the sample. From the above considerations, it is possible to calculate a Z-score by using the following formula:

$$Z = \frac{m - M}{D / \sqrt{n}}$$

The Z-score calculated using the above formula follows a normal distribution with a mean of 0 and a standard deviation of 1. Now, let us consider the same score calculated with the estimated standard deviation d in place of D:

$$T = \frac{m - M}{d / \sqrt{n}}$$

Because d is itself an estimate that varies from sample to sample, replacing D with d turns the normal distribution into a T distribution with (n - 1) degrees of freedom.
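As a small worked illustration of the two formulas, the sketch below computes both scores for one sample using the Python standard library. The sample values, the hypothesized mean M = 4.8, and the "known" population standard deviation D = 0.25 are made-up numbers chosen for illustration, and the function names are hypothetical.

```python
import math
import statistics

def z_score(sample, M, D):
    """Z = (m - M) / (D / sqrt(n)), using the known population std dev D."""
    n = len(sample)
    m = statistics.fmean(sample)
    return (m - M) / (D / math.sqrt(n))

def t_statistic(sample, M):
    """T = (m - M) / (d / sqrt(n)), using the sample std dev d.

    Returns the statistic together with its degrees of freedom, n - 1.
    """
    n = len(sample)
    m = statistics.fmean(sample)
    d = statistics.stdev(sample)  # sample std dev (divides by n - 1)
    return (m - M) / (d / math.sqrt(n)), n - 1

# Hypothetical measurements; not taken from the article.
sample = [4.9, 5.1, 5.0, 5.3, 4.7]

z = z_score(sample, M=4.8, D=0.25)
t, df = t_statistic(sample, M=4.8)
print(f"Z = {z:.3f}")
print(f"T = {t:.3f} with {df} degrees of freedom")
```

The only difference between the two functions is which standard deviation sits in the denominator; the Z-score would be compared against the standard normal distribution, while the T value would be compared against a T distribution with n - 1 degrees of freedom.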

## The Rationale behind Using a T Distribution

The central limit theorem states that for large sample sizes, the sampling distribution of the sample mean tends to follow a normal distribution. As such, it is a straightforward process to calculate a z-score, as long as the standard deviation of the population is known. This, in turn, allows statisticians to evaluate probabilities for the sample mean by using the normal distribution. However, in the case of smaller samples, the standard deviation of the population is often unknown and must be estimated from the sample, and as such statisticians use the distribution of the t statistic. In short, the t distribution allows statisticians to perform statistical analyses on smaller data sets that cannot reliably be analyzed using the normal distribution.

According to the statisticians Frederick Mosteller and John Tukey, the value of Student's work lay not in great numerical change, but rather in the recognition that, if appropriate assumptions hold, it is possible to make allowances for the uncertainties of small samples, even in problems that differ greatly from Student's original problem. According to Mosteller and Tukey, the value of Student's work also lay in providing a numerical assessment of how small the necessary adjustments of confidence points were in Student's problem and how they depended on the extremeness of the probabilities involved. Lastly, the value of Student's work lay in the presentation of tables that could be used to assess the uncertainty associated with even very small samples.
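One way to see why the normal distribution falls short for small samples is a quick simulation. The sketch below is an illustration under stated assumptions, not a method from the original text: it draws many samples of size n = 5 from a normal population, builds a nominal 95% interval around each sample mean using the sample standard deviation, and checks how often the interval contains the true mean. The critical values 1.960 (standard normal) and 2.776 (t with 4 degrees of freedom) are the usual table values; the function name, seed, and trial count are arbitrary.

```python
import math
import random

Z_CRIT = 1.960      # 97.5th percentile of the standard normal
T_CRIT_DF4 = 2.776  # 97.5th percentile of t with 4 degrees of freedom

def coverage(crit, n=5, trials=20000, M=0.0, D=1.0, seed=0):
    """Fraction of simulated intervals m +/- crit * d / sqrt(n) that contain M."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(M, D) for _ in range(n)]
        m = sum(sample) / n
        # Sample standard deviation d, with the n - 1 denominator.
        d = math.sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
        if abs(m - M) <= crit * d / math.sqrt(n):
            hits += 1
    return hits / trials

print(f"normal critical value: {coverage(Z_CRIT):.3f}")
print(f"t critical value:      {coverage(T_CRIT_DF4):.3f}")
```

With the normal critical value the intervals should cover the true mean noticeably less often than the nominal 95%, while the t critical value should bring coverage back to roughly 95%. This is the allowance for small-sample uncertainty that the t distribution provides.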

## Limitations of the T Distribution

According to Mosteller and Tukey, the t distribution also brought certain drawbacks and limitations. To begin with, statisticians using the t distribution are prone to neglecting the proviso that its results hold if and only if the appropriate assumptions are met. Secondly, reliance on the t distribution tends to overemphasize the exactness of Student's solution for his idealized problem. Lastly, the t distribution helped to divert the attention of theoretical statisticians toward the development of exact ways of treating other problems.