Thanks, Chris. I will look into your recommendations. I have also tried an artificial neural network, and it gave good results on the test set as well.
Regards
Waseem

On Thu, Jun 23, 2016 at 12:00 PM, chris brew <cb...@acm.org> wrote:

> It is probably a good idea to start by separating off part of your
> training data into a held-out development set that is not used for
> training, which you can use to create learning curves and estimate probable
> performance on unseen data. I really recommend Andrew Ng's machine learning
> course material from Stanford and Coursera. It shows you how to use
> learning curves to understand your problem and also the way that different
> estimators behave.
>
> There are many estimators that will achieve an extremely good fit to
> typical training data, but the differences between estimators show up
> mostly in what happens with unseen test data. Personally I always start by
> seeing how well simple classifiers or regressors do (Naive Bayes, linear
> regression, etc.), then try regularized linear models like ElasticNets, then
> try SVMs, then try random forests or other ensemble models. That way, I
> finish up using the powerful and complex models only when the data demands
> it.
>
> On 23 June 2016 at 10:20, muhammad waseem <m.waseem.ah...@gmail.com>
> wrote:
>
>> Hi All,
>> I am trying to use random forests for a regression problem, with 10 input
>> variables and one output variable. I am getting a very good fit even with
>> default parameters and low n_estimators. Even with n_estimators = 10, I get
>> an R^2 value of 0.95 on the testing dataset (MSE=23) and a value of 0.99
>> for the training set. I was wondering if this is common with random
>> forests or whether I am missing something. Could you please share your
>> experience? The total number of samples (training + testing) is 10971.
>> Also, what are the most important parameters (max_depth, bootstrap,
>> max_leaf_nodes etc.) that I need to play with to tune my model even
>> further? Lastly, is there a way I can visualise a single tree of my
>> forest (just for demonstration purposes)?
>> Please see a figure below to demonstrate how well it is fitting with
>> default values.
>>
>> [image: Inline image 1]
>> Thanks
>> Kindest Regards
>> Waseem
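
To make the held-out development set and learning-curve advice quoted above a bit more concrete, here is a minimal sketch. It is deliberately library-free so it runs anywhere. The data is synthetic (it is not the dataset from this thread), the baseline is a one-variable least-squares line rather than a random forest, and the helper names fit_line and r_squared are made up for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (a deliberately simple baseline)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic data (an assumption for illustration): y = 3x + 2 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(1000)]
ys = [3 * x + 2 + rng.gauss(0, 2) for x in xs]

# Hold out 200 samples as a development set that is never used for fitting.
n_dev = 200
x_dev, y_dev = xs[:n_dev], ys[:n_dev]
x_train, y_train = xs[n_dev:], ys[n_dev:]

# Learning curve: refit on growing training subsets, score on both sets.
curve = []
for n in (10, 50, 200, 800):
    a, b = fit_line(x_train[:n], y_train[:n])
    train_r2 = r_squared(y_train[:n], [a * x + b for x in x_train[:n]])
    dev_r2 = r_squared(y_dev, [a * x + b for x in x_dev])
    curve.append((n, train_r2, dev_r2))
    print(f"n_train={n:4d}  train R^2={train_r2:.3f}  dev R^2={dev_r2:.3f}")
```

A large, persistent gap between the training and development scores as n grows would suggest overfitting, while two curves that converge at a high value suggest the fit is genuine. With scikit-learn you would normally not hand-roll this: train_test_split and learning_curve cover the same steps for any estimator, including RandomForestRegressor.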
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn