Hi Jason,

It looks like you are evaluating your model on the training data, is that right? That gives a (very) poor, overly optimistic estimate of the generalization error of your model. Instead, evaluate your model on an independent part of your dataset (in particular, one that has not been used to fit your forest); that should give you a better estimate. You can also evaluate your model within a cross-validation loop.
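To make the held-out evaluation and the cross-validation loop concrete, here is a minimal sketch. It assumes a recent scikit-learn release (where the splitting helpers live in `sklearn.model_selection`) and uses a synthetic dataset from `make_classification` as a stand-in for your data; the split ratio, number of trees, and number of folds are arbitrary choices for illustration, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic, deliberately noisy data (assumption: replace with your own X, y).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, flip_y=0.2, random_state=0
)

# Hold out 30% of the samples; the forest never sees them during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Accuracy on the data used for fitting versus accuracy on unseen data.
train_acc = forest.score(X_train, y_train)
test_acc = forest.score(X_test, y_test)

# 5-fold cross-validation: each fold is scored by a forest fit on the other four.
cv_scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0), X, y, cv=5
)

print("train accuracy:", train_acc)
print("test accuracy: ", test_acc)
print("CV accuracy:   ", cv_scores.mean(), "+/-", cv_scores.std())
```

If the training accuracy is close to 100% while the held-out and cross-validated accuracies are clearly lower, the near-perfect score mostly reflects the forest having memorised its training samples, not how well it generalizes.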
Best,
Gilles

On 15 August 2013 09:12, Robert Layton <[email protected]> wrote:
> The first thing I'd do is publish the result (just kidding!).
>
> Try it with another data set first, especially one that has an example in
> the docs.
> If you are still getting top marks, it may be your "framework" around the
> code. (are you doing proper test/train splits, etc)
> If it drops, consider that you may have a dataset that can get high
> accuracies. Random Forests are good methods...
>
>
> On 15 August 2013 17:03, Jason Williams <[email protected]> wrote:
>>
>> I ran a few test based on Random Forest Classifier
>> (http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html)
>> with default setting. The classification (repeated the classification
>> procedure several times) is nearly 100% correct. That seems to be
>> overfitting. Is there any points (e.g. dataset, etc.) I can check to see if
>> I did something wrong?
>>
>> Thanks
>>
>> _______________________________________________
>> Scikit-learn-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
> --
>
> Public key at: http://pgp.mit.edu/ Search for this email address and select
> the key from "2011-08-19" (key id: 54BA8735)
