Hi,
I would still like to use the new API. What I am trying to do now is to submit a job not via the command line interface, but from Java code. How do I do this? This is what I do at the moment:
* Clean start up of Hadoop (formatted file system and all)
* Using the standard WordCount Mapper and Reducer I wrote this main method:

    public static void main(String[] args) throws IOException,
            InterruptedException, ClassNotFoundException {

        Configuration configuration = new Configuration();
        InetSocketAddress socket = new InetSocketAddress("localhost", 9001);
        Cluster cluster = new Cluster(socket, configuration);

        FileSystem fs = cluster.getFileSystem();
        Path homeDirectory = fs.getHomeDirectory();

        Path input = new Path(homeDirectory, INPUT);
        Path output = new Path(homeDirectory, OUTPUT);

        fs.delete(output, true);
        fs.copyFromLocalFile(
                new Path("resources/test/wordcount/data/ipsum.txt"),
                new Path(input, "input.txt"));

        Job job = Job.getInstance(cluster);

        // 1) job.addArchiveToClassPath(new Path("release/test.jar"));

        // 2) job.addFileToClassPath(new Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount.class"));
        //    job.addFileToClassPath(new Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount$Map.class"));
        //    job.addFileToClassPath(new Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount$Reduce.class"));

        job.setJarByClass(WordCount.class);
        job.setMapperClass(Map.class);
        job.setCombinerClass(Reduce.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, input);
        FileOutputFormat.setOutputPath(job, output);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
* I tried to run this code as-is in Eclipse.
* As expected, I guess, Hadoop needed the WordCount classes to work, so I got this error:
java.lang.RuntimeException: java.lang.ClassNotFoundException: de.fstyle.hadoop.tutorial.wordcount.WordCount$Map
* Putting everything into a jar and adding the following line did not do any good:
job.addArchiveToClassPath(new Path("release/test.jar"));
* Adding each class separately throws the same exception:
job.addFileToClassPath(new Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount.class"));
job.addFileToClassPath(new Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount$Map.class"));
job.addFileToClassPath(new Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount$Reduce.class"));
* Using
job.setJar("release/test.jar");
will get me
java.io.FileNotFoundException: File /tmp/hadoop-martin/mapred/staging/martin/.staging/job_201009221802_0033/job.jar does not exist.
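For what it is worth, my guess (an assumption on my part, not something I found documented) is that setJarByClass can only ship a jar if the class was actually loaded from one; run from Eclipse, the classes come straight from the bin/ directory, so there is no jar to upload. A small standalone check along these lines shows where a class was loaded from:

```java
// Standalone check: print where this class was loaded from.
// Run from an IDE or from plain class files, the code source is a
// directory, not a jar -- which would explain why there is no job
// jar for Hadoop to upload to the cluster.
public class CodeSourceCheck {
    public static void main(String[] args) {
        String location = CodeSourceCheck.class
                .getProtectionDomain()
                .getCodeSource()
                .getLocation()
                .getPath();
        System.out.println("loaded from: " + location);
        System.out.println(location.endsWith(".jar")
                ? "a jar -- something Hadoop could ship"
                : "a directory -- nothing to ship");
    }
}
```

Running this inside Eclipse prints the bin/ directory for me, which fits the ClassNotFoundException above.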

So how would I set this up and use it correctly? Sorry, I did not find any tutorial or examples anywhere.

Martin


On 22.09.2010 18:29, Tom White wrote:
Note that JobClient, along with the rest of the "old" API in
org.apache.hadoop.mapred, has been undeprecated in Hadoop 0.21.0 so
you can continue to use it without warnings.

Tom

On Wed, Sep 22, 2010 at 2:43 AM, Amareshwari Sri Ramadasu
<amar...@yahoo-inc.com>  wrote:
In 0.21, JobClient methods are available in org.apache.hadoop.mapreduce.Job
and org.apache.hadoop.mapreduce.Cluster classes.

On 9/22/10 3:07 PM, "Martin Becker"<_martinbec...@web.de>  wrote:

  Hello,

I am using the Hadoop MapReduce version 0.20.2 and soon 0.21.
I wanted to use the JobClient class to circumvent the use of the command
line interface.
I noticed that JobClient still uses the deprecated JobConf class for
job submissions.
Are there any alternatives to JobClient not using the deprecated JobConf
class?

Thanks in advance,
Martin



