Hi,
I would still like to use the new API. What I am trying to do now is to
submit a job from Java code rather than through the command line
interface. How do I do this? This is what I do at the moment:
* Clean start up of Hadoop (formatted file system and all)
* Using the standard WordCount Mapper and Reducer, I wrote this main method:
import java.io.IOException;
import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Cluster;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public static void main(String[] args) throws IOException,
        InterruptedException, ClassNotFoundException {
    Configuration configuration = new Configuration();
    InetSocketAddress socket = new InetSocketAddress("localhost", 9001);
    Cluster cluster = new Cluster(socket, configuration);
    FileSystem fs = cluster.getFileSystem();
    Path homeDirectory = fs.getHomeDirectory();
    // INPUT and OUTPUT are String constants defined elsewhere in the class.
    Path input = new Path(homeDirectory, INPUT);
    Path output = new Path(homeDirectory, OUTPUT);
    fs.delete(output, true);
    fs.copyFromLocalFile(
            new Path("resources/test/wordcount/data/ipsum.txt"),
            new Path(input, "input.txt"));
    Job job = Job.getInstance(cluster);
    //1 job.addArchiveToClassPath(new Path("release/test.jar"));
    //2 job.addFileToClassPath(new Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount.class"));
    //  job.addFileToClassPath(new Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount$Map.class"));
    //  job.addFileToClassPath(new Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount$Reduce.class"));
    job.setJarByClass(WordCount.class);
    job.setMapperClass(Map.class);
    job.setCombinerClass(Reduce.class);
    job.setReducerClass(Reduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, input);
    FileOutputFormat.setOutputPath(job, output);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
* I tried to run this code as is in Eclipse.
* Obviously, I guess, Hadoop needed the WordCount classes, so I got this
error:
java.lang.RuntimeException: java.lang.ClassNotFoundException:
de.fstyle.hadoop.tutorial.wordcount.WordCount$Map
* Putting everything into a jar and adding the following line did not
help:
job.addArchiveToClassPath(new Path("release/test.jar"));
* Adding each class separately throws the same exception:
job.addFileToClassPath(new Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount.class"));
job.addFileToClassPath(new Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount$Map.class"));
job.addFileToClassPath(new Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount$Reduce.class"));
* Using
job.setJar("release/test.jar");
will get me:
java.io.FileNotFoundException: File
/tmp/hadoop-martin/mapred/staging/martin/.staging/job_201009221802_0033/job.jar
does not exist.
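As a sanity check I am now thinking of something like the following
helper (my own sketch; the setJobJar name is mine, and whether handing
Hadoop an absolute path instead of a relative one makes any difference
is just a guess on my part):

```java
import java.io.File;
import java.io.FileNotFoundException;

import org.apache.hadoop.mapreduce.Job;

public class JarCheck {
    // Hypothetical helper: fail fast if the job jar is missing on the
    // local file system, and pass Hadoop an absolute path so the result
    // does not depend on the Eclipse working directory.
    static void setJobJar(Job job, String jarPath) throws FileNotFoundException {
        File jar = new File(jarPath);
        if (!jar.exists()) {
            throw new FileNotFoundException(
                    "Job jar not found: " + jar.getAbsolutePath());
        }
        job.setJar(jar.getAbsolutePath());
    }
}
```

I would call this as setJobJar(job, "release/test.jar") right before
waitForCompletion.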
So how would I set this up and use it correctly? Sorry, I did not find
any tutorials or examples anywhere.
Martin
On 22.09.2010 18:29, Tom White wrote:
Note that JobClient, along with the rest of the "old" API in
org.apache.hadoop.mapred, has been undeprecated in Hadoop 0.21.0 so
you can continue to use it without warnings.
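A minimal old-API submission then looks roughly like this (a sketch,
untested; the WordCountOld class name is just for illustration, and the
mapper and reducer would be the org.apache.hadoop.mapred
implementations):

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class WordCountOld {
    public static void main(String[] args) throws Exception {
        // JobConf takes the class whose containing jar should be shipped.
        JobConf conf = new JobConf(WordCountOld.class);
        conf.setJobName("wordcount");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        // conf.setMapperClass(...) / conf.setReducerClass(...) with the
        // old-API (org.apache.hadoop.mapred) mapper and reducer classes.
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf); // submits the job and blocks until it finishes
    }
}
```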
Tom
On Wed, Sep 22, 2010 at 2:43 AM, Amareshwari Sri Ramadasu
<amar...@yahoo-inc.com> wrote:
In 0.21, JobClient methods are available in org.apache.hadoop.mapreduce.Job
and org.apache.hadoop.mapreduce.Cluster classes.
On 9/22/10 3:07 PM, "Martin Becker"<_martinbec...@web.de> wrote:
Hello,
I am using Hadoop MapReduce version 0.20.2 and will soon move to 0.21.
I wanted to use the JobClient class to circumvent the use of the command
line interface.
I noticed that JobClient still uses the deprecated JobConf class for
job submissions.
Are there any alternatives to JobClient not using the deprecated JobConf
class?
Thanks in advance,
Martin