Thanks, Chris. I will look into your recommendations. I have also tried an
artificial neural network, and it gave good results on the test set as
well.

Regards
Waseem

On Thu, Jun 23, 2016 at 12:00 PM, chris brew <cb...@acm.org> wrote:

> It is probably a good idea to start by separating off part of your
> training data into a held-out development set that is not used for
> training, which you can use to create learning curves and estimate probable
> performance on unseen data. I really recommend Andrew Ng's machine learning
> course material from Stanford and Coursera. It shows you how to use
> learning curves to understand your problem and also the way that different
> estimators behave.
>
>
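A minimal sketch of the held-out split and learning curve described above. It is only an illustration: the data is synthetic (`make_regression` stands in for the real 10-variable problem), the split ratio and number of curve points are arbitrary, and it assumes a scikit-learn release that provides `sklearn.model_selection`.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import learning_curve, train_test_split

# Synthetic stand-in for the real data: 10 input variables, one output.
X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=0)

# Set aside a development set that is never used for fitting.
X_train, X_dev, y_train, y_dev = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=10, random_state=0)

# Cross-validated learning curve, computed on the training portion only.
train_sizes, train_scores, val_scores = learning_curve(
    model,
    X_train,
    y_train,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
    scoring="r2",
)

mean_train = train_scores.mean(axis=1)
mean_val = val_scores.mean(axis=1)
for n, tr, va in zip(train_sizes, mean_train, mean_val):
    print(f"n={int(n):4d}  train R^2={tr:.3f}  validation R^2={va:.3f}")

# Final estimate of performance on data the model has not seen.
model.fit(X_train, y_train)
dev_r2 = model.score(X_dev, y_dev)
print(f"development-set R^2={dev_r2:.3f}")
```

A persistent gap between the training and validation curves suggests overfitting; curves that converge at a low score suggest the model is too simple for the data.
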
> There are many estimators that will achieve an extremely good fit to
> typical training data, but the differences between estimators show up
> mostly in what happens with unseen test data. Personally, I always start by
> seeing how well simple classifiers or regressors do (Naive Bayes, linear
> regression, etc.), then try regularized linear models such as ElasticNet,
> then SVMs, then random forests or other ensemble models. That way, I
> end up using the powerful and complex models only when the data demands
> it.
>
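The simple-to-complex progression above could be sketched as follows for a regression problem. This is an assumption-laden illustration, not a recommendation of specific settings: the data is synthetic, the hyperparameter values (`alpha=0.1`, `C=10.0`, `n_estimators=50`) are arbitrary, and Naive Bayes is left out because it is a classifier.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the real data: 10 input variables, one output.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# Candidates ordered from simplest to most complex.
# Scaling is added where the estimator is sensitive to feature scale.
candidates = [
    ("linear regression", LinearRegression()),
    ("elastic net", make_pipeline(StandardScaler(), ElasticNet(alpha=0.1))),
    ("SVR (RBF kernel)", make_pipeline(StandardScaler(), SVR(C=10.0))),
    ("random forest", RandomForestRegressor(n_estimators=50, random_state=0)),
]

results = {}
for name, estimator in candidates:
    # 5-fold cross-validated R^2, so every score comes from unseen folds.
    scores = cross_val_score(estimator, X, y, cv=5, scoring="r2")
    results[name] = scores.mean()
    print(f"{name:20s} mean R^2={scores.mean():.3f} (std {scores.std():.3f})")
```

If a simple model already scores close to the complex ones on the held-out folds, the extra complexity is probably not buying much.
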
> On 23 June 2016 at 10:20, muhammad waseem <m.waseem.ah...@gmail.com>
> wrote:
>
>> Hi All,
>> I am trying to use random forests for a regression problem with 10 input
>> variables and one output variable. I am getting a very good fit even with
>> default parameters and a low n_estimators. Even with n_estimators = 10, I
>> get an R^2 value of 0.95 on the test set (MSE=23) and a value of 0.99 on
>> the training set. Is this common with random forests, or am I missing
>> something? Could you please share your experience? The total number of
>> samples (training + testing) is 10971.
>> Also, what are the most important parameters (max_depth, bootstrap,
>> max_leaf_nodes, etc.) that I need to adjust to tune my model even
>> further? Lastly, is there a way I can visualise a single tree of my
>> forest (just for demonstration purposes)?
>> Please see the figure below, which demonstrates how well it is fitting
>> with default values.
>>
>>
>>
>> [image: Inline image 1]
>> Thanks
>> Kindest Regards
>> Waseem
>>
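One possible way to approach the two questions above (tuning, and visualising a single tree) is sketched below. It is a hedged illustration only: the data is synthetic, the grid values are arbitrary placeholders rather than recommended settings, and the feature names `x0`..`x9` are made up. It assumes a scikit-learn release where `sklearn.model_selection` exists and `export_graphviz` returns a string when `out_file=None`.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import export_graphviz

# Synthetic stand-in for the real data: 10 input variables, one output.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Small illustrative grid over parameters that control tree complexity.
param_grid = {
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 5],
    "max_features": [0.5, 1.0],
}
search = GridSearchCV(
    RandomForestRegressor(n_estimators=20, random_state=0),
    param_grid,
    cv=3,
    scoring="r2",
)
search.fit(X_train, y_train)
print("best parameters:", search.best_params_)
print(f"test R^2 of best model: {search.score(X_test, y_test):.3f}")

# A fitted forest keeps its individual trees in `estimators_`.
first_tree = search.best_estimator_.estimators_[0]
feature_names = [f"x{i}" for i in range(X.shape[1])]  # hypothetical names

# Export the top levels of one tree as Graphviz DOT text.
dot_source = export_graphviz(
    first_tree,
    out_file=None,
    feature_names=feature_names,
    max_depth=2,
    filled=True,
)
print(dot_source[:200])
```

The DOT text can be rendered to an image with the Graphviz `dot` command-line tool, for example after saving it to a file.
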
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn@python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>>
>