They are very different statistical models from a mathematical point of view. See the online scikit-learn documentation or reference textbooks such as "The Elements of Statistical Learning" for more details.
In practice, linear models tend to be faster to fit on large data, especially when the number of features is large (although this depends on the solver, loss, penalty, data scaling...). By definition, linear models cannot fit prediction tasks where the data is not linearly separable, while tree-based models do not have this restriction. Tree-based models can still underfit in some cases, but for different reasons (e.g. when the depth of the trees is limited).

Linear models can be made more expressive via feature engineering (e.g. k-bins discretization, polynomial feature expansion, Nystroem kernel approximation...) and can thereby sometimes be competitive with tree-based models even on tasks that were originally not linearly separable. However, this is not guaranteed either: cross-validation and parameter tuning are still required to tell which class of model works best for a specific task.

As you said, tree-based models "cannot extrapolate" in the sense that their decision function is piecewise constant, while the decision function of a linear model is a hyperplane. Depending on the task, the lack of extrapolation can be considered either a limitation or a benefit (for instance, to avoid unrealistic extrapolations such as people with a negative age or size, predicting negative mechanical energy loss via heat dissipation, fractions larger than 100%, or 6-star ratings on a 5-star scale...).

-- Olivier
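[Editor's illustration, not part of the original message] The feature-engineering point above can be sketched with a minimal scikit-learn example: on the non-linearly-separable "two moons" toy dataset, a plain logistic regression underfits, while the same linear model preceded by a Nystroem kernel approximation becomes competitive with a tree-based ensemble. The dataset, hyperparameters, and model choices here are illustrative assumptions, not taken from the message.

```python
# Sketch: feature engineering (Nystroem kernel approximation) can make a
# linear model competitive with a tree-based model on data that is not
# linearly separable. All settings below are illustrative.
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

models = {
    "raw linear": LogisticRegression(),
    "Nystroem + linear": make_pipeline(
        Nystroem(gamma=1.0, n_components=100, random_state=0),
        LogisticRegression(),
    ),
    "random forest": RandomForestClassifier(random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```

As the message notes, this improvement is not guaranteed in general; cross-validation (as above) is still needed to compare model classes on a given task.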
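[Editor's illustration, not part of the original message] The "cannot extrapolate" point can also be made concrete: a regression tree's prediction is piecewise constant, so for inputs beyond the training range it returns the value of the nearest leaf, while a linear model extrapolates along its fitted hyperplane. The synthetic data below (a noiseless line y = 2x) is an assumption chosen to make the contrast obvious.

```python
# Sketch: tree predictions are piecewise constant and do not extrapolate
# outside the training range, while a linear model does.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

X_train = np.linspace(0, 10, 50).reshape(-1, 1)
y_train = 2 * X_train.ravel()  # noiseless line y = 2x

tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
linear = LinearRegression().fit(X_train, y_train)

X_out = np.array([[20.0]])  # well outside the [0, 10] training range
print(tree.predict(X_out))    # -> [20.]  (value of the last leaf)
print(linear.predict(X_out))  # -> [40.]  (extrapolates along the line)
```

Whether the flat prediction at 20 or the extrapolated 40 is preferable depends on the task, which is exactly the trade-off described above.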
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn