Hi All,

I am trying to use random forests for a regression problem with 10 input variables and one output variable. I am getting a very good fit even with default parameters and a low n_estimators. Even with n_estimators=10, I get an R^2 value of 0.95 on the testing dataset (MSE=23) and 0.99 on the training set. The total number of samples (training + testing) is 10971. The figure below shows how well it fits with default values.

Is this common with random forests, or am I missing something? Could you please share your experience?

Also, which parameters (max_depth, bootstrap, max_leaf_nodes, etc.) are the most important to adjust in order to tune my model further?

Lastly, is there a way to visualise a single tree of my forest (just for demonstration purposes)?
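For anyone who wants to reproduce the setup, here is a minimal sketch of the kind of check I have in mind. It is only an illustration under assumptions: the data is synthetic (from make_regression, standing in for my real dataset), the split ratio, noise level and feature names x0..x9 are made up, and the numbers it prints will not match the ones quoted above. It compares training and testing R^2, adds a 5-fold cross-validation as a sanity check on the single split, and prints the top of one tree from the forest as text.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import export_text

# Synthetic stand-in for the real data (assumption): 10 inputs, 1 output.
X, y = make_regression(
    n_samples=2000, n_features=10, n_informative=10, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Default parameters apart from a small number of trees.
forest = RandomForestRegressor(n_estimators=10, random_state=0)
forest.fit(X_train, y_train)

# score() returns R^2 for regressors.
train_r2 = forest.score(X_train, y_train)
test_r2 = forest.score(X_test, y_test)

# Cross-validated R^2 on the full data, to see whether one lucky
# train/test split is flattering the result.
cv_r2 = cross_val_score(
    RandomForestRegressor(n_estimators=10, random_state=0),
    X, y, cv=5, scoring="r2",
)

# Each fitted tree is available in forest.estimators_; export_text
# prints its split rules (truncated here to the first levels).
feature_names = [f"x{i}" for i in range(X.shape[1])]  # hypothetical names
first_tree = forest.estimators_[0]
tree_text = export_text(first_tree, feature_names=feature_names, max_depth=2)

print(f"train R^2: {train_r2:.3f}")
print(f"test R^2:  {test_r2:.3f}")
print(f"5-fold CV R^2: mean={cv_r2.mean():.3f}, std={cv_r2.std():.3f}")
print(tree_text)
```

If matplotlib is available, sklearn.tree.plot_tree(first_tree, max_depth=2) should give a graphical version of the same tree. Parameters such as max_depth, min_samples_leaf and max_features can be passed to RandomForestRegressor in the same way as n_estimators when experimenting with tuning.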
[image: Inline image 1]

Thanks,
Kindest Regards,
Waseem
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn