None of the clustering implementations hard-code the filesystem. The
file names are constructed from the input and output file path arguments.
Jeff
Grant Ingersoll wrote:
I seem to recall this being something you have to set in your Hadoop
configuration. Or, let me double check that we aren't hard-coding the
FS in our Job.
-Grant
On Apr 15, 2009, at 1:27 PM, Stephen Green wrote:
On Apr 14, 2009, at 6:54 PM, Stephen Green wrote:
On Apr 14, 2009, at 5:17 PM, Grant Ingersoll wrote:
I would be concerned about the fact that EMR is using 0.18 and
Mahout is on 0.19 (which of course raises another concern expressed
by Owen O'Malley to me at ApacheCon: No one uses 0.19)
Well, I did run Mahout locally on a 0.18.3 install, but that was
writing to and reading from HDFS. I can build a custom
mahout-examples that has the 0.18.3 Hadoop jars (or perhaps no
Hadoop jar at all...). I'm guessing that if EMR is on 0.18.3 and it
gets popular, then you're going to have to deal with that problem.
More fun today. I checked out the mahout-0.1 release and rebuilt
Mahout. I took the mahout-examples job, removed the Hadoop jar, and
then tried to run the KMeans clustering against the synthetic control
data. This failed with the same exception that I was originally
getting yesterday:
java.lang.IllegalArgumentException: Wrong FS: s3n://mahout-output/, expected: hdfs://domU-12-31-38-01-C5-22.compute-1.internal:9000
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:320)
    at org.apache.hadoop.dfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:84)
    at org.apache.hadoop.dfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:140)
    at org.apache.hadoop.dfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:408)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:667)
    at org.apache.mahout.clustering.syntheticcontrol.kmeans.Job.runJob(Job.java:77)
    at org.apache.mahout.clustering.syntheticcontrol.kmeans.Job.main(Job.java:43)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
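
[Editorial sketch] The trace above shows FileSystem.exists being invoked on a DistributedFileSystem instance with an s3n:// path. That pattern would be consistent with the Job asking for the cluster's default filesystem and then handing it a path on a different filesystem, although what Job.java line 77 actually does is an assumption here, not something the thread confirms. The following is a simplified, stdlib-only Java model of that scheme/authority check, not Hadoop's own code: the class name WrongFsSketch, the methods checkPath and resolveFs, and the host namenode.example are all hypothetical.

```java
import java.net.URI;

/**
 * Simplified, stdlib-only model of the "Wrong FS" check.
 * This is NOT Hadoop's code; all names here are hypothetical.
 */
final class WrongFsSketch {

    /** Rejects a path whose scheme or authority differs from the filesystem's own URI. */
    static void checkPath(URI fsUri, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            return; // no scheme: the path is taken as belonging to this filesystem
        }
        if (scheme.equalsIgnoreCase(fsUri.getScheme())) {
            String authority = path.getAuthority();
            if (authority == null || authority.equalsIgnoreCase(fsUri.getAuthority())) {
                return;
            }
        }
        throw new IllegalArgumentException("Wrong FS: " + path + ", expected: " + fsUri);
    }

    /** Picks the filesystem URI from the path itself, falling back to the default. */
    static URI resolveFs(URI defaultFs, URI path) {
        if (path.getScheme() == null) {
            return defaultFs;
        }
        String authority = path.getAuthority();
        return URI.create(path.getScheme() + "://" + (authority == null ? "/" : authority));
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://namenode.example:9000"); // hypothetical cluster default
        URI output = URI.create("s3n://mahout-output/");

        // Pattern 1: use the default filesystem for every path; the check rejects the s3n path.
        try {
            checkPath(defaultFs, output);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }

        // Pattern 2: resolve the filesystem from the path's own scheme; the check passes.
        URI resolved = resolveFs(defaultFs, output);
        checkPath(resolved, output);
        System.out.println("Resolved filesystem: " + resolved);
    }
}
```

In Hadoop itself, the second pattern roughly corresponds to obtaining the filesystem with Path.getFileSystem(conf) or FileSystem.get(path.toUri(), conf) rather than FileSystem.get(conf), which returns the configured default. Whether that is the right change for the Mahout Job would need to be checked against its source.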
Steve
--
Stephen Green // [email protected]
Principal Investigator \\ http://blogs.sun.com/searchguy
Aura Project // Voice: +1 781-442-0926
Sun Microsystems Labs \\ Fax: +1 781-442-1692
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search