The current code does not support the ARFF format directly. You need to remove all the lines at the start of the train and test sets that begin with '@'; what's left should be the data tuples.
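A minimal sketch of that cleanup, assuming a POSIX shell with sed available. The function name strip_arff_header and the .data output file name are only illustrative; they are not part of Mahout. It deletes every line that begins with '@' and leaves all other lines untouched:

```shell
# Hypothetical helper (not part of Mahout): copy an ARFF file to a new
# file, dropping every line that begins with '@' (the ARFF header lines
# such as @relation, @attribute and @data).
strip_arff_header() {
  # $1 = input ARFF file, $2 = output file containing only the data tuples
  sed '/^@/d' "$1" > "$2"
}

# Example call (the output name is an assumption):
#   strip_arff_header testdata/KDDTrain+.arff testdata/KDDTrain+.data
```

The stripped file, rather than the original .arff, would then be the one passed to the -p option. If your files also contain blank lines or '%' comment lines, those may need to be removed the same way.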
The following error message says that it actually found such a line at the start of the training set:

> 10/03/15 10:07:00 ERROR data.DataLoader: 0: @relation 'KDDTrain'

--- On Mon 15.3.10, Cui tony <[email protected]> wrote:

> From: Cui tony <[email protected]>
> Subject: Re: decision forest
> To: [email protected]
> Date: Monday, 15 March 2010, 3:09
>
> I followed the wiki, and get error message on step of "generate a file descriptor"
>
> hadoop jar trunk/core/target/mahout-core-0.4-SNAPSHOT.job org.apache.mahout.df.tools.Describe -p testdata/KDDTrain+.arff -f testdata/KDDTrain+.info -d N 3 C 2 N C 4 N C 8 N 2 C 19 N L
>
> 10/03/15 10:07:00 INFO tools.Describe: Generating the descriptor...
> 10/03/15 10:07:00 INFO tools.Describe: generating the dataset...
> 10/03/15 10:07:00 ERROR data.DataLoader: 0: @relation 'KDDTrain'
> Exception in thread "main" java.lang.IllegalArgumentException: Wrong number of attributes in the string
>     at org.apache.mahout.df.data.DataLoader.parseString(DataLoader.java:67)
>     at org.apache.mahout.df.data.DataLoader.generateDataset(DataLoader.java:222)
>     at org.apache.mahout.df.tools.Describe.generateDataset(Describe.java:120)
>     at org.apache.mahout.df.tools.Describe.runTool(Describe.java:109)
>     at org.apache.mahout.df.tools.Describe.main(Describe.java:94)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
> 2010/3/14 deneche abdelhakim <[email protected]>
>
> > I committed the patch. Take the look at the wiki
> > (http://cwiki.apache.org/MAHOUT/partial-implementation.html) to see how to
> > classify new data.
> >
> > --- On Fri 12.3.10, deneche abdelhakim <[email protected]> wrote:
> >
> > > From: deneche abdelhakim <[email protected]>
> > > Subject: Re: decision forest
> > > To: [email protected]
> > > Date: Friday, 12 March 2010, 10:53
> > >
> > > I'll have some free time this Saturday, I'll take a look at this issue.
> > > Hopefully I will be able to commit the patch.
> > >
> > > --- On Thu 11.3.10, Cui tony <[email protected]> wrote:
> > >
> > > > From: Cui tony <[email protected]>
> > > > Subject: Re: decision forest
> > > > To: [email protected]
> > > > Date: Thursday, 11 March 2010, 2:55
> > > >
> > > > Is that a bug or my compiling problem?
> > > >
> > > > 2010/3/9 Cui tony <[email protected]>
> > > >
> > > > > I did the following :
> > > > >
> > > > > rm -rf ~/.m2
> > > > > and in the trunk folder to mvn install :
> > > > >
> > > > > [...@master trunk]$ pwd
> > > > > /data/hadoop/mahout/trunk
> > > > > [...@master trunk]$ mvn clean install -DskipTests=true
> > > > >
> > > > > Everything is fine, including install core and utils, but when come to
> > > > > examples, I meet the same error as I described last mail.
> > > > >
> > > > > 2010/3/9 Jake Mannix <[email protected]>
> > > > >
> > > > > On Mon, Mar 8, 2010 at 9:59 PM, Robin Anil <[email protected]> wrote:
> > > > >>
> > > > >> > I suspect your maven dependencies are screwed up, I would suggest the
> > > > >> > following.
> > > > >> >
> > > > >> > #clear the maven cache
> > > > >> > rm -rf ~/.m2
> > > > >> > #do a clean install
> > > > >> > mvn clean install -DskipTests=true
> > > > >> >
> > > > >>
> > > > >> ... from the top level mahout checkout directory
