One important clarification: for now, only TestForest can handle directory
input paths; BuildForest won't work with input directories.
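
As a rough illustration of what directory support involves, here is a
minimal sketch against Hadoop's 0.20-era FileSystem API. The class and
method names are illustrative; this is an assumption about the general
approach, not the code of the committed patch:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Sketch only: expand an input argument to the data files it denotes.
    // A plain file maps to itself; a directory maps to every file inside it.
    public final class InputPaths {
      public static Path[] resolveInputs(Path input, Configuration conf) throws IOException {
        FileSystem fs = input.getFileSystem(conf);
        if (!fs.getFileStatus(input).isDir()) {  // isDir() in the 0.20-era API
          return new Path[] { input };
        }
        FileStatus[] children = fs.listStatus(input);
        Path[] files = new Path[children.length];
        for (int i = 0; i < children.length; i++) {
          files[i] = children[i].getPath();
        }
        return files;
      }
    }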

--- On Sat 27.3.10, deneche abdelhakim <[email protected]> wrote:

> From: deneche abdelhakim <[email protected]>
> Subject: Re: Question about mahout Describe
> To: [email protected]
> Date: Saturday, 27 March 2010, 07:43
> Wasn't possible, but it is now :)
> I just committed a patch that allows the input path to be a
> directory. Check out the latest version of Mahout and run
> TestForest like this:
> 
> [localhost]$ hjar examples/target/mahout-examples-0.4-SNAPSHOT.job \
>     org.apache.mahout.df.mapreduce.TestForest \
>     -i /user/fulltestdata -ds rf/testdata.info \
>     -m rf-testmodel-5-100 -a -o rf/fulltestprediction
> 
> For every file in fulltestdata (e.g. fulltestdata/file1.data) you'll
> get a prediction file in fulltestprediction
> (e.g. fulltestprediction/file1.data.out).
> 
> Hope it helps you
> 
> 
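The naming scheme described in the message above (input file name plus a
.out suffix, placed under the output directory) can be sketched with
Hadoop's Path API. The class and method names are illustrative, and the
mapping is inferred from the example names, not taken from the Mahout
source:

    import org.apache.hadoop.fs.Path;

    // Inferred from the examples above: fulltestdata/file1.data maps to
    // fulltestprediction/file1.data.out. A guess at the convention only.
    public final class PredictionNames {
      public static Path predictionPath(Path outputDir, Path inputFile) {
        return new Path(outputDir, inputFile.getName() + ".out");
      }
    }
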
> --- On Fri 26.3.10, Yang Sun <[email protected]> wrote:
> 
> > From: Yang Sun <[email protected]>
> > Subject: Question about mahout Describe
> > To: [email protected]
> > Date: Friday, 26 March 2010, 22:16
> > I was testing Mahout recently. It runs great on small test
> > datasets. However, when I tried to expand to a big dataset
> > directory, I got the following error message:
> > 
> > [localhost]$ hjar examples/target/mahout-examples-0.4-SNAPSHOT.job \
> >     org.apache.mahout.df.mapreduce.TestForest \
> >     -i /user/fulltestdata/* -ds rf/testdata.info \
> >     -m rf-testmodel-5-100 -a -o rf/fulltestprediction
> > 
> > Exception in thread "main" java.io.IOException: Cannot open filename /user/fulltestdata/*
> >         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1474)
> >         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1465)
> >         at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:372)
> >         at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
> >         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:351)
> >         at org.apache.mahout.df.mapreduce.TestForest.testForest(TestForest.java:190)
> >         at org.apache.mahout.df.mapreduce.TestForest.run(TestForest.java:137)
> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >         at org.apache.mahout.df.mapreduce.TestForest.main(TestForest.java:228)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> > My question is: can I use Mahout on directories instead of
> > single files, and how?
> > 
> > Thanks,
> > 

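The stack trace above points at the actual problem: FileSystem.open() is
handed the literal string /user/fulltestdata/*, and open() does no
wildcard expansion, which is why HDFS reports "Cannot open filename
/user/fulltestdata/*". In Hadoop, glob expansion is an explicit step via
FileSystem.globStatus. A minimal sketch of that distinction, with
illustrative names (this is not the Mahout fix; with the committed patch,
passing the directory itself with no wildcard is the supported route):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // open() treats its argument as a literal path, so a wildcard has to
    // be expanded explicitly first. Sketch only.
    public final class GlobOpen {
      public static void openAll(String pattern, Configuration conf) throws IOException {
        Path glob = new Path(pattern);               // e.g. /user/fulltestdata/*
        FileSystem fs = glob.getFileSystem(conf);
        FileStatus[] matches = fs.globStatus(glob);  // expands the wildcard
        if (matches == null) {                       // a non-glob path that doesn't exist
          return;
        }
        for (FileStatus match : matches) {
          fs.open(match.getPath()).close();          // each matched path is concrete
        }
      }
    }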

