Hi,

How did you mount 'testdata' on HDFS? If you want Mahout to access data from HDFS, I suppose HADOOP_HOME has to be set?
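The "no HADOOP_HOME set, running locally" line in the log below suggests the job never reached the cluster at all. A minimal sketch of what to export before calling bin/mahout -- the paths are assumptions for a typical CDH install, adjust to wherever Whirr put Hadoop on your node:

```shell
# Assumed CDH location -- adjust to your actual Hadoop install directory.
export HADOOP_HOME=/usr/lib/hadoop
# The conf dir must contain the core-site.xml that names the cluster's namenode.
export HADOOP_CONF_DIR=$HADOOP_HOME/conf
```

With those set, bin/mahout should submit the job to the cluster instead of falling back to a local job against the local filesystem.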
Regards,
Lokendra

On Tue, Feb 15, 2011 at 11:01 PM, Sean Owen <sro...@gmail.com> wrote:
> I could be wrong -- I thought that also controlled what Hadoop assumes
> the file system to be for non-absolute paths. Though I now also see an
> "fs.defaultFS" parameter that sounds a little more like it.
>
> If setting these resolves the problem, at least it's clear what's going
> on. Whether or not things ought to be smarter about assuming a certain
> file system is another question.
>
> On Tue, Feb 15, 2011 at 5:23 PM, Jeffrey Rodgers <jjrodg...@gmail.com> wrote:
> > Hm, my understanding has always been that fs.default.name should point
> > to your namenode, e.g.:
> >
> > <property>
> >   <name>fs.default.name</name>
> >   <value>hdfs://ec2-50-16-170-221.compute-1.amazonaws.com:8020</value>
> > </property>
> >
> > On Mon, Feb 14, 2011 at 5:37 PM, Sean Owen <sro...@gmail.com> wrote:
> >> I think you're not setting your fs.default.name appropriately in the
> >> Hadoop config? This should control the base from which paths are
> >> resolved, so if this is not where you think it should be looking,
> >> check that setting.
> >>
> >> On Mon, Feb 14, 2011 at 10:34 PM, Jeffrey Rodgers <jjrodg...@gmail.com> wrote:
> >> > Hello,
> >> >
> >> > My test environment is using Cloudera's Hadoop (CDH beta 3), using
> >> > Whirr to spawn the EC2 cluster. I am spawning the cluster from
> >> > another EC2 instance.
> >> >
> >> > I'm attempting to use the Kmeans example following the instructions
> >> > from the Quickstart guide. I mount my testdata on the HDFS and see:
> >> >
> >> > drwxr-xr-x - ubuntu supergroup 0 2011-02-14 21:48 /user/ubuntu/Mahout-trunk
> >> >
> >> > Within Mahout-trunk is /testdata/. Note the usage of /user/ubuntu/.
> >> >
> >> > When I run the examples, they seem to be looking for /home/ (see the
> >> > error log below). Looking through the code, it looks like there are
> >> > functions for getInput, so I assume there is a configuration setting
> >> > of sorts, but it is not apparent to me.
> >> >
> >> > no HADOOP_HOME set, running locally
> >> > Feb 14, 2011 10:05:14 PM org.slf4j.impl.JCLLoggerAdapter warn
> >> > WARNING: No org.apache.mahout.clustering.syntheticcontrol.canopy.Job.props
> >> > found on classpath, will use command-line arguments only
> >> > Feb 14, 2011 10:05:14 PM org.slf4j.impl.JCLLoggerAdapter info
> >> > INFO: Running with default arguments
> >> > Feb 14, 2011 10:05:14 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
> >> > INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
> >> > Feb 14, 2011 10:05:14 PM org.apache.hadoop.mapred.JobClient configureCommandLineOptions
> >> > WARNING: Use GenericOptionsParser for parsing the arguments.
> >> > Applications should implement Tool for the same.
> >> > Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
> >> > Input path does not exist: file:/home/ubuntu/Mahout-trunk/testdata
> >> > <trimmed>
> >> >
> >> > Thanks in advance,
> >> > Jeff
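The failure mode in that log is the default-filesystem rule at work: a non-absolute input path is joined to the working directory of whatever filesystem the config names (the local process cwd for file://, /user/<name> for HDFS). A toy shell illustration of that rule, using the URIs from the messages above -- this only mimics Hadoop's qualification behavior, it does not call Hadoop:

```shell
# Mimic how Hadoop qualifies an input path against the default filesystem
# and the user's working directory (purely illustrative, not Hadoop code).
qualify() {
  fs="$1"; workdir="$2"; input="$3"
  case "$input" in
    /*) echo "${fs}${input}" ;;             # absolute path: only the scheme is prepended
    *)  echo "${fs}${workdir}/${input}" ;;  # relative path: joined to the working dir
  esac
}

# No fs.default.name: local FS, working dir is where the job was launched.
qualify "file:" "/home/ubuntu" "Mahout-trunk/testdata"
# fs.default.name pointing at the namenode: working dir is /user/<name> on HDFS.
qualify "hdfs://ec2-50-16-170-221.compute-1.amazonaws.com:8020" "/user/ubuntu" "Mahout-trunk/testdata"
```

The first call reproduces exactly the path in the exception (file:/home/ubuntu/Mahout-trunk/testdata); the second is where the data actually lives, which is why setting fs.default.name should make the example find it.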