On Fri, Jan 28, 2011 at 6:28 AM, <praveen.pe...@nokia.com> wrote:
> Thanks Tom. Could you elaborate a little more on the second option?
>
> What is the HADOOP_CONF_DIR here, after launching the cluster?
~/.whirr/<cluster-name>

> When you said run in new process, did you mean using command line Whirr tool?

I meant that you could launch Whirr using the CLI, or Java. Then run the job in another process, with HADOOP_CONF_DIR set. The MR jobs you are running can, I assume, be run against an arbitrary cluster, so you should be able to point them at a cluster started by Whirr.

Tom

> I may finally end up writing my own driver for running external mapred jobs
> so I can have more control, but I was just curious to know if option #2 is
> better than writing my own driver.
>
> Praveen
>
> -----Original Message-----
> From: ext Tom White [mailto:t...@cloudera.com]
> Sent: Thursday, January 27, 2011 4:01 PM
> To: whirr-user@incubator.apache.org
> Subject: Re: Running Mapred jobs after launching cluster
>
> If they implement the Tool interface then you can set configuration on them.
> Failing that, you could set HADOOP_CONF_DIR and run them in a new process.
>
> Cheers,
> Tom
>
> On Thu, Jan 27, 2011 at 12:52 PM, <praveen.pe...@nokia.com> wrote:
>> Hmm...
>> I am running some of the map reduce jobs written by me, but some of them are
>> in external libraries (e.g. Mahout) which I don't have control over. Since I
>> can't modify the code in external libraries, is there any other way to make
>> this work?
>>
>> Praveen
>>
>> -----Original Message-----
>> From: ext Tom White [mailto:tom.e.wh...@gmail.com]
>> Sent: Thursday, January 27, 2011 3:42 PM
>> To: whirr-user@incubator.apache.org
>> Subject: Re: Running Mapred jobs after launching cluster
>>
>> You don't need to add anything to the classpath, but you do need to use the
>> configuration in the org.apache.whirr.service.Cluster object to populate
>> your Hadoop Configuration object so that your code knows which cluster to
>> connect to. See the getConfiguration() method in HadoopServiceController for
>> how to do this.
>>
>> Cheers,
>> Tom
>>
>> On Thu, Jan 27, 2011 at 12:21 PM, <praveen.pe...@nokia.com> wrote:
>>> Hello all,
>>> I wrote a Java class HadoopLauncher that is very similar to
>>> HadoopServiceController. I was successfully able to launch a cluster
>>> programmatically from my application using Whirr. Now I want to copy
>>> files to HDFS and also run a job programmatically.
>>>
>>> When I copy a file to HDFS it is copying to the local file system, not
>>> HDFS. Here is the code I used:
>>>
>>> Configuration conf = new Configuration();
>>> FileSystem hdfs = FileSystem.get(conf);
>>> hdfs.copyFromLocalFile(false, true, new Path(localFilePath),
>>>     new Path(hdfsFileDirectory));
>>>
>>> Do I need to add anything else to the classpath so the Hadoop libraries
>>> know that they need to talk to the dynamically launched cluster? When
>>> running Whirr from the command line I know it uses HADOOP_CONF_DIR to
>>> find the Hadoop config files, but when doing the same from Java I am
>>> wondering how to solve this issue.
>>>
>>> Praveen
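[Editor's note] Tom's "option #2" (set HADOOP_CONF_DIR and run the job in a new process) can be sketched as a shell session. The cluster name "mycluster", the jar, and the driver class are placeholders for illustration, not real artifacts from this thread; Whirr writes the client-side Hadoop config under ~/.whirr/<cluster-name> at launch time.

```shell
# Point external, unmodifiable jobs (e.g. Mahout drivers) at the
# Whirr-launched cluster without touching their code: the hadoop CLI
# picks up the cluster's config from HADOOP_CONF_DIR.
export HADOOP_CONF_DIR=$HOME/.whirr/mycluster   # written by Whirr at launch
hadoop jar my-external-job.jar com.example.MyDriver   # placeholder jar/class
```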
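[Editor's note] The reason Praveen's snippet wrote to the local file system is that a bare `new Configuration()` knows nothing about the dynamically launched cluster, so `FileSystem.get()` falls back to the default (local) file system. A minimal sketch of the fix Tom describes, assuming you extract the namenode and jobtracker hosts from the `org.apache.whirr.service.Cluster` object yourself (as `HadoopServiceController.getConfiguration()` does); the host names, paths, and ports 8020/8021 below are placeholder values, not part of the original thread:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HadoopLauncherSketch {

    // Build a Configuration that targets the Whirr-launched cluster.
    // In practice the two hosts would come from the Cluster object
    // returned when the cluster was launched.
    static Configuration clusterConf(String namenodeHost, String jobtrackerHost) {
        Configuration conf = new Configuration();
        // Without these properties, FileSystem.get(conf) returns the
        // local file system, which is why copyFromLocalFile() appeared
        // to copy locally instead of to HDFS.
        conf.set("fs.default.name", "hdfs://" + namenodeHost + ":8020/");
        conf.set("mapred.job.tracker", jobtrackerHost + ":8021");
        return conf;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder hosts for illustration only.
        Configuration conf = clusterConf("nn.example.com", "jt.example.com");
        FileSystem hdfs = FileSystem.get(conf); // now an HDFS client, not LocalFileSystem
        hdfs.copyFromLocalFile(false, true,
                new Path("/tmp/input.txt"), new Path("/user/praveen/input"));
    }
}
```

The same `conf` can then be handed to a `JobConf`, or to `ToolRunner.run(conf, tool, args)` for external jobs that implement the `Tool` interface, which is Tom's first option.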