Bias? Variance? A Trade-Off? What's all that?
- Image by Alex Marchenko
Alright, before we jump into answering your questions, let's get something out of the way: "generalization error."
According to Wikipedia: In supervised learning applications in machine learning and statistical learning theory, generalization error is a measure of how accurately an algorithm is able to predict outcome values for previously unseen data.
Now that we have that out of the way, let's see how generalization error comes into the topic.
Alright, What is Bias?
Bias is the part of the generalization error that comes from wrong assumptions made by your model. A classic example of such a wrong assumption is a model that treats the training data as linear when it is actually quadratic.
A highly biased model is likely to "underfit" the training data.
Now you see why we talked about generalization error initially?
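Here's a minimal sketch of what high bias looks like in practice (assuming NumPy and scikit-learn are installed; they're not part of this article): we fit a plain straight line to data that is really quadratic, and the model underfits, so its error stays high even on the very data it was trained on.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(200, 1))                              # one input feature
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + rng.normal(0, 0.5, size=200)    # quadratic truth + noise

# The model "assumes" the data is linear -- a wrong assumption, i.e. high bias.
straight_line = LinearRegression().fit(X, y)

# The error stays high even on the training data itself: classic underfitting.
print("Training MSE of the straight line:", mean_squared_error(y, straight_line.predict(X)))
```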
Ok, What is Variance?
Variance is the part of the generalization error that comes from the model's excessive sensitivity to small variations in the training data. (Oh, interesting!) A model with a high degree of sensitivity is likely to have high variance and thus "overfit" the training data.
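And here's the flip side, again just a sketch under the same NumPy/scikit-learn assumptions (the degree-15 polynomial is an arbitrary choice for illustration): a very flexible model chases every little wiggle in the training data, so it scores great on the training set and much worse on held-out data.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + rng.normal(0, 0.5, size=60)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# A degree-15 polynomial is flexible enough to memorize the noise in the training set.
wiggly_model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
wiggly_model.fit(X_train, y_train)

print("Train MSE:", mean_squared_error(y_train, wiggly_model.predict(X_train)))  # very low
print("Test  MSE:", mean_squared_error(y_test, wiggly_model.predict(X_test)))    # noticeably higher
```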
So then, What is the Bias-Variance Trade-Off?
Well, increasing a model's complexity will typically increase its variance and reduce its bias. Conversely, reducing a model's complexity increases its bias and reduces its variance. This is why it is called a "trade-off".
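To see the trade-off in one place, here's a small sketch (same NumPy/scikit-learn assumptions, and the specific degrees are just illustrative) that sweeps model complexity and prints training vs. validation error: training error keeps falling as complexity grows, while validation error drops and then climbs back up.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(120, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + rng.normal(0, 0.5, size=120)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=1)

# Sweep complexity: low degrees underfit (high bias), high degrees overfit (high variance).
for degree in (1, 2, 5, 10, 15):
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    val_mse = mean_squared_error(y_val, model.predict(X_val))
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  validation MSE={val_mse:.3f}")

# Training error keeps dropping as complexity grows, but validation error
# bottoms out near the true degree (2) and then climbs: that's the trade-off.
```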
Hope this has answered some of your questions.
See you next time!