Hi,

How did you mount 'testdata' on HDFS? If you want Mahout to access data from HDFS, I suppose HADOOP_HOME has to be set?
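The "no HADOOP_HOME set, running locally" line in the log below suggests the job never reached the cluster at all. A minimal sketch of what to export before calling bin/mahout -- the paths are assumptions for a typical CDH install, adjust to wherever Whirr put Hadoop on your node:

```shell
# Assumed CDH location -- adjust to your actual Hadoop install directory.
export HADOOP_HOME=/usr/lib/hadoop
# The conf dir must contain the core-site.xml that names the cluster's namenode.
export HADOOP_CONF_DIR=$HADOOP_HOME/conf
```

With those set, bin/mahout should submit the job to the cluster instead of falling back to a local job against the local filesystem.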
Regards,
Lokendra

On Tue, Feb 15, 2011 at 11:01 PM, Sean Owen <sro...@gmail.com> wrote:
> I could be wrong -- I thought that also controlled what Hadoop assumes
> the file system to be for non-absolute paths. Though I now also see an
> "fs.defaultFS" parameter that sounds a little more like it.
>
> If setting these resolves the problem, at least it's clear what's going
> on. Whether or not things ought to be smarter about assuming a certain
> file system is another question.
>
> On Tue, Feb 15, 2011 at 5:23 PM, Jeffrey Rodgers <jjrodg...@gmail.com> wrote:
> > Hm, my understanding has always been that fs.default.name should point
> > to your namenode, e.g.:
> >
> > <property>
> >   <name>fs.default.name</name>
> >   <value>hdfs://ec2-50-16-170-221.compute-1.amazonaws.com:8020</value>
> > </property>
> >
> > On Mon, Feb 14, 2011 at 5:37 PM, Sean Owen <sro...@gmail.com> wrote:
> >> I think you're not setting your fs.default.name appropriately in the
> >> Hadoop config? This should control the base from which paths are
> >> resolved, so if this is not where you think it should be looking,
> >> check that setting.
> >>
> >> On Mon, Feb 14, 2011 at 10:34 PM, Jeffrey Rodgers <jjrodg...@gmail.com> wrote:
> >> > Hello,
> >> >
> >> > My test environment is using Cloudera's Hadoop (CDH beta 3), using
> >> > Whirr to spawn the EC2 cluster. I am spawning the cluster from
> >> > another EC2 instance.
> >> >
> >> > I'm attempting to use the Kmeans example following the instructions
> >> > from the Quickstart guide. I mount my testdata on the HDFS and see:
> >> >
> >> > drwxr-xr-x - ubuntu supergroup 0 2011-02-14 21:48 /user/ubuntu/Mahout-trunk
> >> >
> >> > Within Mahout-trunk is /testdata/. Note the usage of /user/ubuntu/.
> >> >
> >> > When I run the examples, they seem to be looking for /home/ (see the
> >> > error log below). Looking through the code, it looks like there are
> >> > functions for getInput, so I assume there is a configuration setting
> >> > of sorts, but it is not apparent to me.
> >> >
> >> > no HADOOP_HOME set, running locally
> >> > Feb 14, 2011 10:05:14 PM org.slf4j.impl.JCLLoggerAdapter warn
> >> > WARNING: No org.apache.mahout.clustering.syntheticcontrol.canopy.Job.props
> >> > found on classpath, will use command-line arguments only
> >> > Feb 14, 2011 10:05:14 PM org.slf4j.impl.JCLLoggerAdapter info
> >> > INFO: Running with default arguments
> >> > Feb 14, 2011 10:05:14 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
> >> > INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
> >> > Feb 14, 2011 10:05:14 PM org.apache.hadoop.mapred.JobClient configureCommandLineOptions
> >> > WARNING: Use GenericOptionsParser for parsing the arguments.
> >> > Applications should implement Tool for the same.
> >> > Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
> >> > Input path does not exist: file:/home/ubuntu/Mahout-trunk/testdata
> >> > <trimmed>
> >> >
> >> > Thanks in advance,
> >> > Jeff
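The failure mode in that log is the default-filesystem rule at work: a non-absolute input path is joined to the working directory of whatever filesystem the config names (the local process cwd for file://, /user/<name> for HDFS). A toy shell illustration of that rule, using the URIs from the messages above -- this only mimics Hadoop's qualification behavior, it does not call Hadoop:

```shell
# Mimic how Hadoop qualifies an input path against the default filesystem
# and the user's working directory (purely illustrative, not Hadoop code).
qualify() {
  fs="$1"; workdir="$2"; input="$3"
  case "$input" in
    /*) echo "${fs}${input}" ;;             # absolute path: only the scheme is prepended
    *)  echo "${fs}${workdir}/${input}" ;;  # relative path: joined to the working dir
  esac
}

# No fs.default.name: local FS, working dir is where the job was launched.
qualify "file:" "/home/ubuntu" "Mahout-trunk/testdata"
# fs.default.name pointing at the namenode: working dir is /user/<name> on HDFS.
qualify "hdfs://ec2-50-16-170-221.compute-1.amazonaws.com:8020" "/user/ubuntu" "Mahout-trunk/testdata"
```

The first call reproduces exactly the path in the exception (file:/home/ubuntu/Mahout-trunk/testdata); the second is where the data actually lives, which is why setting fs.default.name should make the example find it.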