They are very different statistical models from a mathematical point of
view. See the online scikit-learn documentation or reference textbooks
such as "The Elements of Statistical Learning" for more details.

In practice, linear models tend to be faster to fit on large data,
especially when the number of features is large (although this depends on
the solver, loss, penalty, data scaling...).

Linear models cannot fit prediction tasks where the data is not linearly
separable (by definition), while tree-based models do not have this
restriction.
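This can be sketched with a toy example (my own illustration, not from the
original thread): on the XOR problem, which is famously not linearly
separable, a logistic regression cannot do better than chance while a
decision tree fits it exactly.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# The XOR problem: no straight line separates the two classes.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

linear = LogisticRegression().fit(X, y)
tree = DecisionTreeClassifier(random_state=0).fit(X, y)

print("linear accuracy:", linear.score(X, y))  # near chance level
print("tree accuracy:", tree.score(X, y))      # perfect fit: 1.0
```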

Tree-based models can still underfit in some cases, but for different
reasons (e.g. when we limit the depth of the trees).
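For instance (a hypothetical sketch of my own): a depth-1 tree, which can
only make a single split, underfits a sine wave that a deeper tree follows
closely.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 6, size=(200, 1)), axis=0)
y = np.sin(X).ravel()

# A single split yields only two constant predictions: underfitting.
stump = DecisionTreeRegressor(max_depth=1).fit(X, y)
# A deeper tree has enough leaves to track the curve.
deep = DecisionTreeRegressor(max_depth=8).fit(X, y)

print("R^2 of depth-1 tree:", stump.score(X, y))  # clearly below 1
print("R^2 of depth-8 tree:", deep.score(X, y))   # close to 1
```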

Linear models can be made more expressive via feature engineering (e.g.
k-bins discretization, polynomial feature expansion, Nystroem kernel
approximation...) and can then sometimes be competitive with tree-based
models even on tasks that were originally not linearly separable.
However, this is not guaranteed either: cross-validation and parameter
tuning are still required to tell which class of model works best for a
specific task.
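A minimal sketch of the feature-engineering idea (my own toy example,
assuming noisy XOR-like labels): a degree-2 polynomial expansion adds the
x0*x1 interaction term, which makes the task linearly separable for a
logistic regression, as cross-validation confirms.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, size=(500, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)  # XOR-like labels

raw = LogisticRegression()
# degree=2 adds the x0*x1 product feature, which encodes the label rule.
expanded = make_pipeline(PolynomialFeatures(degree=2), LogisticRegression())

print("raw features:", cross_val_score(raw, X, y).mean())
print("polynomial features:", cross_val_score(expanded, X, y).mean())
```

The same pipeline pattern works with KBinsDiscretizer or Nystroem in place
of PolynomialFeatures.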

As you said, tree-based models "cannot extrapolate" in the sense that
their decision function is piecewise constant, while the decision function
of a linear model is a hyperplane. Depending on the task, the lack of
extrapolation can be considered either a limitation or a benefit (for
instance, to avoid unrealistic extrapolations such as people with a
negative age or height, predicting negative mechanical energy loss via
heat dissipation, fractions larger than 100%, 6-stars-out-of-5
recommendations...).
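The piecewise-constant behavior is easy to see on a 1D regression sketch
(a hypothetical example of mine): outside the training range, the tree
keeps predicting the constant value of its last leaf, while the linear
model keeps following its slope.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Train on x in [0, 9] with the exact relation y = 2x.
X_train = np.arange(0, 10, dtype=float).reshape(-1, 1)
y_train = 2.0 * X_train.ravel()

tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
linear = LinearRegression().fit(X_train, y_train)

# Query far outside the training range.
X_new = np.array([[20.0], [30.0]])
print("tree:", tree.predict(X_new))      # same constant for both inputs
print("linear:", linear.predict(X_new))  # follows the trend: 40, 60
```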

-- 
Olivier
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn