Wasn't possible, but it is now :) I just committed a patch that allows the input
path to be a directory. Check out the latest version of Mahout and run TestForest
like this:

[localhost]$ hjar examples/target/mahout-examples-0.4-SNAPSHOT.job 
org.apache.mahout.df.mapreduce.TestForest -i /user/fulltestdata -ds 
rf/testdata.info -m rf-testmodel-5-100 -a -o rf/fulltestprediction

For every file in fulltestdata (e.g. fulltestdata/file1.data) you'll get a
prediction file in fulltestprediction (e.g. fulltestprediction/file1.data.out).
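
In case you're curious how the directory handling works conceptually: the idea is
simply to list the files in the input directory and classify each one into its own
output file. Below is a rough sketch using the Hadoop FileSystem API, not the actual
TestForest code; the classify() call is only a hypothetical placeholder:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Rough sketch (not the actual TestForest patch): expand a directory input
// into one prediction file per data file using the Hadoop FileSystem API.
public class DirectoryInputSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path input = new Path("/user/fulltestdata");      // example input directory
    Path output = new Path("rf/fulltestprediction");  // example output directory

    FileSystem fs = input.getFileSystem(conf);
    if (fs.getFileStatus(input).isDir()) {
      for (FileStatus status : fs.listStatus(input)) {
        Path dataFile = status.getPath();
        // e.g. fulltestdata/file1.data -> fulltestprediction/file1.data.out
        Path outFile = new Path(output, dataFile.getName() + ".out");
        System.out.println(dataFile + " -> " + outFile);
        // classify(dataFile, outFile);  // hypothetical per-file classification step
      }
    }
  }
}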

Hope it helps.


--- On Fri, 26.3.10, Yang Sun <[email protected]> wrote:

> From: Yang Sun <[email protected]>
> Subject: Question about mahout Describe
> To: [email protected]
> Date: Friday, March 26, 2010, 22:16
> I was testing mahout recently. It runs great on small testing datasets.
> However, when I tried to expand the dataset to a big dataset directory, I got
> the following error message:
> 
> [localhost]$ hjar examples/target/mahout-examples-0.4-SNAPSHOT.job
> org.apache.mahout.df.mapreduce.TestForest -i /user/fulltestdata/* -ds
> rf/testdata.info -m rf-testmodel-5-100 -a -o rf/fulltestprediction
> 
> Exception in thread "main" java.io.IOException: Cannot open filename /user/fulltestdata/*
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1474)
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1465)
>         at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:372)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:351)
>         at org.apache.mahout.df.mapreduce.TestForest.testForest(TestForest.java:190)
>         at org.apache.mahout.df.mapreduce.TestForest.run(TestForest.java:137)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.mahout.df.mapreduce.TestForest.main(TestForest.java:228)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> My question is: can I use mahout on directories instead of single files?
> And if so, how?
> 
> Thanks,
> 


