Hi Deneche,

Just tested it. With the KDD dataset, everything works fine. When I try to use my own dataset, the BuildForest class throws an exception:

Error: null

I attached my dataset. It has 283 numerical features, and the last column is the class label (1 or 0). Do you know why I got this exception?

Thanks,
Yang

On Fri, Mar 19, 2010 at 12:10 AM, deneche abdelhakim <[email protected]> wrote:

> Hi Yang,
>
> The changes will be available in Mahout 0.4, but they are already
> committed, so you could just get the code from svn.
> By the way, I updated the Wiki to explain how to use the new code, take a
> look here:
> http://cwiki.apache.org/MAHOUT/partial-implementation.html
>
> Hope you find it useful
>
> Deneche
>
> --- On Fri 19.3.10, Yang Sun <[email protected]> wrote:
>
> > From: Yang Sun <[email protected]>
> > Subject: Re: New to mahout
> > To: [email protected]
> > Date: Friday, March 19, 2010, 0:11
> >
> > Hi deneche,
> >
> > I noticed that Mahout 0.3 is released. Is the random forest class ready
> > to output? Is it still called org.apache.mahout.df.BreimanExample?
> >
> > Thanks,
> >
> > On Fri, Mar 12, 2010 at 3:01 AM, deneche abdelhakim <[email protected]> wrote:
> >
> > > Yes, there is still a lot of work to do =P
> > > As Ted said, Decision Forests classifier should ultimately have a
> > > similar interface to all Mahout classifiers.
> > >
> > > > 1. how can I output the model and how can I use the trained model to
> > > > predict
> > >
> > > I should commit a patch really soon (this Saturday ?) that will allow
> > > you to save the trained model and use it to classify new data
> > >
> > > > 2. There is also no option to specify number of random features for
> > > > each tree. How can I adjust that parameter?
> > >
> > > the -sl parameter allows you to specify the number of random features
> > > the trainer will randomly select for each tree node. Is this what you
> > > are looking for ?
> > >
> > > > I think there are still not enough parameter options to use, at
> > > > least not as many as R's
> > >
> > > I'd love to hear any suggestion/addition you want me to make. I already
> > > have a lot of features I want to add, but I could use your (the users)
> > > feedback to know which feature I should start working on first. =D
> > >
> > > --- On Fri 12.3.10, Cui tony <[email protected]> wrote:
> > >
> > > > From: Cui tony <[email protected]>
> > > > Subject: Re: New to mahout
> > > > To: [email protected]
> > > > Date: Friday, March 12, 2010, 2:43
> > > >
> > > > 1. You can check the example java code in trunk : BuildForest.java
> > > > and TestForest.java.
> > > >
> > > > 2. I think there are still not enough parameter options to use, at
> > > > least not as many as R's
> > > >
> > > > 2010/3/12 Yang Sun <[email protected]>
> > > >
> > > > > Thanks for the reply. The trunk version runs without any problem.
> > > > > I still have a couple questions about the method.
> > > > >
> > > > > 1. how can I output the model and how can I use the trained model
> > > > > to predict classes of new data? I saw the options of the class:
> > > > >
> > > > > Options
> > > > >   --data (-d) path                   Data path
> > > > >   --dataset (-ds) dataset            Dataset path
> > > > >   --iterations (-i) numIterations    Number of times to repeat the test
> > > > >   --nbtrees (-t) nbtrees             Number of trees to grow, each iteration
> > > > >   --help (-h)                        Print out help
> > > > >
> > > > > It seems no option for specifying output directory on Hadoop.
> > > > >
> > > > > 2. There is also no option to specify number of random features
> > > > > for each tree. How can I adjust that parameter?
> > > > >
> > > > > Thanks
> > > > >
> > > > > On Thu, Mar 11, 2010 at 10:29 AM, Ted Dunning <[email protected]> wrote:
> > > > >
> > > > > > Try using the trunk version. We are about to release 0.3 and it
> > > > > > has significant improvements.
> > > > > >
> > > > > > On Thu, Mar 11, 2010 at 9:40 AM, Yang Sun <[email protected]> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > It's the first time I try to use mahout. But the Breiman
> > > > > > > example gave me the following exception:
> > > > > > >
> > > > > > > [localhost]$ hadoop jar examples/target/mahout-examples-0.2.job org.apache.mahout.df.BreimanExample -d test_data/glass.data -ds test_data/glass.info -i 10 -t 100
> > > > > > > 10/03/11 09:26:07 INFO df.BreimanExample: Iteration 0
> > > > > > > 10/03/11 09:26:07 INFO df.BreimanExample: Growing a forest with m=4
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 10%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 20%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 30%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 40%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 50%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 60%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 70%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 80%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 90%
> > > > > > > 10/03/11 09:26:07 INFO ref.SequentialBuilder: Building 100%
> > > > > > > Exception in thread "main" java.lang.IllegalArgumentException: labels.length != predictions.length
> > > > > > >     at org.apache.mahout.df.ErrorEstimate.errorRate(ErrorEstimate.java:29)
> > > > > > >     at org.apache.mahout.df.BreimanExample.runIteration(BreimanExample.java:108)
> > > > > > >     at org.apache.mahout.df.BreimanExample.run(BreimanExample.java:214)
> > > > > > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > > > > >     at org.apache.mahout.df.BreimanExample.main(BreimanExample.java:143)
> > > > > > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > > > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > > > > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > > > > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > > > > > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> > > > > > >
> > > > > > > Can any one help me get through it?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Yang
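
Note on the first exception in this thread: the message `labels.length != predictions.length` indicates that the error-rate computation received a different number of true labels and predictions. The thread does not say why the counts differed in the 0.2 build. The snippet below is only a rough sketch of that kind of check. `ErrorRateSketch` and its `errorRate` method are hypothetical names written for illustration, not Mahout's `ErrorEstimate` code.

```java
// Hypothetical illustration only; this is NOT Mahout's ErrorEstimate class.
final class ErrorRateSketch {

    /**
     * Returns the fraction of instances whose prediction differs from the label.
     * Rejects inputs of different length, since instance i of one array must
     * correspond to instance i of the other.
     */
    public static double errorRate(int[] labels, int[] predictions) {
        if (labels.length != predictions.length) {
            throw new IllegalArgumentException("labels.length != predictions.length");
        }
        if (labels.length == 0) {
            throw new IllegalArgumentException("no instances to evaluate");
        }
        int errors = 0;
        for (int i = 0; i < labels.length; i++) {
            if (labels[i] != predictions[i]) {
                errors++;
            }
        }
        return (double) errors / labels.length;
    }

    public static void main(String[] args) {
        int[] labels = {1, 0, 1, 1};
        int[] predictions = {1, 1, 1, 0};
        // Two of the four predictions differ from the labels.
        System.out.println(errorRate(labels, predictions));

        try {
            // Two predictions for four labels: the guard rejects the call.
            errorRate(labels, new int[] {1, 0});
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A guard like this fails fast instead of silently comparing misaligned arrays, which is why a mismatch shows up as an exception at the end of the run rather than as a wrong error rate.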
