Statistical learning refers to a vast set of mathematical and computational tools for understanding data. These tools are often classified as supervised or unsupervised. In supervised statistical learning, you build a statistical model for predicting or estimating an output based on one or more inputs. Problems of this kind arise in economics, medicine, astrophysics, public policy, and many other fields. In unsupervised statistical learning, by contrast, there are inputs but no supervising output; you do not build a model that presumes a relationship between input and output. Nevertheless, one can still learn relationships and structure from such data.



Examples of statistical learning:

  • Wage Data - predicting a quantitative output from one or more inputs, such as the number of sales as a function of advertising budget, or life expectancy in years as a function of a country's GDP.
  • Stock Market Data - predicting an output value that may be categorical (qualitative), for example whether a given stock index will rise or fall at some point in the future, or quantitative, for example the value of the US dollar at the beginning of next month.
  • Clustering Problems - here we observe only input data, with no corresponding output. For example, demographic data about customers (age, sex, education level, ...) can be used to discover which groups of customers are similar to each other. There is no output variable to predict (a short sketch contrasting a supervised fit with a clustering fit follows this list).
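
The contrast between the supervised and unsupervised examples above can be made concrete with a minimal sketch. The snippet below assumes NumPy and scikit-learn are available; the toy numbers (advertising budgets, sales, customer ages and education levels) are hypothetical and only illustrate the two problem types, not any real dataset.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.cluster import KMeans

    # Supervised example: predict sales (output) from advertising budget (input).
    budget = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])  # inputs
    sales = np.array([25.0, 41.0, 58.0, 79.0, 96.0])             # known outputs

    reg = LinearRegression().fit(budget, sales)
    print("Predicted sales for a budget of 35:", reg.predict([[35.0]]))

    # Unsupervised example: group customers by (age, years of education).
    # There is no output variable; we only look for structure in the inputs.
    customers = np.array([
        [22, 12], [25, 14], [23, 13],   # one apparent group
        [55, 18], [60, 16], [58, 17],   # another apparent group
    ])

    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
    print("Cluster assignment per customer:", kmeans.labels_)

In the supervised case the model is judged by how well its predictions match the known outputs; in the unsupervised case there are no outputs to compare against, and we instead inspect the discovered groups.
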
The term statistical learning is fairly new, but the concepts underlying the field were developed long ago. At the beginning of the 19th century, Legendre and Gauss published papers on the method of least squares, an early form of what is now called linear regression; the approach was first applied successfully in astronomy. The 20th century brought further linear methods, but only with increasing computational power did non-linear methods emerge and help establish statistical learning as a powerful tool for a broad community.
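
The method of least squares mentioned above fits a line by choosing the intercept and slope that minimize the sum of squared residuals. A minimal sketch using NumPy follows; the data points are hypothetical and chosen only to lie roughly on a line.

    import numpy as np

    # Toy data: a single input x and a noisy, roughly linear output y.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    # Design matrix [1, x] so the model is y ~ b0 + b1 * x.
    X = np.column_stack([np.ones_like(x), x])

    # Least squares picks (b0, b1) minimizing the sum of squared residuals.
    (b0, b1), *_ = np.linalg.lstsq(X, y, rcond=None)
    print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
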