Thank you both for the paper references. @ Andreas, what is your take? And what are you implying?
The Breiman (2001) paper contrasts the black-box approach with the statistical approach. I call them black box vs. open box. He advocates the black box in the paper.

Black box: y <--- nature <--- x
Open box:  y <--- linear regression <--- x

Decision trees and neural nets are black-box models. They require a large amount of data to train, and they skip the part where you try to understand nature. Because the model is a black box, you can't open it up to see what's inside.

Linear regression is a very simple model that you can use to approximate nature, but the key thing is that you need to know how the data are generated.

@ Brown, I know nothing about molecular modeling. The paper you linked, "Beware of q2!", raises some interesting points. As far as I can see, the score for linear regression in sklearn is R^2.

On Wed, Jun 5, 2019 at 9:11 AM Andreas Mueller <t3k...@gmail.com> wrote:

>
> On 6/4/19 8:44 PM, C W wrote:
> > Thank you all for the replies.
> >
> > I agree that prediction accuracy is great for evaluating black-box ML
> > models. Especially advanced models like neural networks, or
> > not-so-black models like LASSO, because they are NP-hard to solve.
> >
> > Linear regression is not a black-box. I view prediction accuracy as an
> > overkill on interpretable models. Especially when you can use
> > R-squared, coefficient significance, etc.
> >
> > Prediction accuracy also does not tell you which feature is important.
> >
> > What do you guys think? Thank you!
> >
> Did you read the paper that I sent? ;)
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
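On the R^2 point above, here is a minimal sketch that checks what `LinearRegression.score` returns. The data are synthetic and the variable names are made up for illustration; nothing here comes from the papers discussed in this thread. It compares `score` against `sklearn.metrics.r2_score` and against R^2 computed by hand as 1 - SS_res / SS_tot, all on a held-out split.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data (an assumption for illustration): y = 3*x0 - 2*x1 + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Simple hold-out split: first 150 rows for fitting, last 50 for evaluation
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# score() returns the coefficient of determination R^2 on the data passed to it
score_test = model.score(X_test, y_test)

# The same quantity via the metrics module
r2_test = r2_score(y_test, y_pred)

# And by hand: 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum((y_test - y_pred) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print(score_test, r2_test, r2_manual)
```

The three numbers should agree up to floating-point error. Since `score` is evaluated on whatever data you pass in, calling it on held-out data gives an out-of-sample R^2, which is itself a measure of prediction accuracy rather than an alternative to it.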