Re: Using MultipleTextOutputFormat for map-only jobs

2011-05-02 Thread Geoffry Roberts
All, I read this thread and noticed that the example code cited in it is based on what I believe is the older, and at one time deprecated, org.apache.hadoop.mapred.lib.* package. I am attempting to output to multiple files, but I am using the org.apache.hadoop.mapreduce.lib.output.* package. I am not

Re: Using MultipleTextOutputFormat for map-only jobs

2011-04-14 Thread Hari Sreekumar
I changed jobConf.setMapOutputKeyClass(Text.class); to jobConf.setMapOutputKeyClass(NullWritable.class); Still no luck. I also get this error in many mappers: java.io.IOException: Failed to delete earlier output of task: attempt_201104041514_0069_m_03_0 at org.apache.hadoop.mapred.

Re: Using MultipleTextOutputFormat for map-only jobs

2011-04-14 Thread Hari Sreekumar
Here's what I tried: static class MapperClass extends MapReduceBase implements Mapper { @Override public void map(LongWritable key, Text value, OutputCollector output, Reporter reporter) throws IOException { output.collect( NullWritab

Re: Using MultipleTextOutputFormat for map-only jobs

2011-04-14 Thread Hari Sreekumar
That is exactly what I do when I have a reduce phase, and it works. But in the case of map-only jobs, it doesn't work. I'll try overriding the getOutputfileFromInputFile() method. On Thu, Apr 14, 2011 at 5:19 PM, Harsh J wrote: > Hello again Hari, > > On Thu, Apr 14, 2011 at 5:10 PM, Hari Sreekumar

Re: Using MultipleTextOutputFormat for map-only jobs

2011-04-14 Thread Harsh J
Hello again Hari, On Thu, Apr 14, 2011 at 5:10 PM, Hari Sreekumar wrote: > Here is a part of the code I am using: >     jobConf.setOutputFormat(MultipleTextOutputFormat.class); You need to subclass the OF and use it properly, else the abstract class takes over with the default name always used (
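
Editor's note: the advice above is to subclass the output format so that the default name is not the only one ever used. In the old org.apache.hadoop.mapred.lib API, the hook for this is generateFileNameForKeyValue(key, value, name) on MultipleOutputFormat, where name is the default per-task leaf name. The sketch below is not Hadoop code: it is a plain-Java stand-in that shows only the kind of naming logic such an override might contain, so it can be run without a cluster. The class name KeyBasedFileNames, the method fileNameFor, and the sanitizing rule are all hypothetical and do not come from the thread.

```java
import java.util.regex.Pattern;

/** Plain-Java stand-in for the naming hook; no Hadoop classes are used. */
final class KeyBasedFileNames {
    // Assumption: characters outside this set are awkward in file names,
    // so they are replaced with '_'.
    private static final Pattern UNSAFE = Pattern.compile("[^A-Za-z0-9._-]");

    /**
     * Mirrors the shape of generateFileNameForKeyValue(key, value, name).
     * 'leafName' is the default per-task name, such as "part-00000".
     * The value is unused here; a real override could consult it too.
     */
    static String fileNameFor(String key, String value, String leafName) {
        if (key == null || key.isEmpty()) {
            return leafName; // fall back to the default name
        }
        String dir = UNSAFE.matcher(key).replaceAll("_");
        // Keep the leaf name so each task still writes its own file.
        return dir + "/" + leafName;
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor("2011-04-14", "some record", "part-00000"));
        System.out.println(fileNameFor("", "some record", "part-00000"));
    }
}
```

Keeping the leaf name as the last path component is a deliberate choice in this sketch: if every task wrote to a path derived from the key alone, tasks that see the same key would target the same file.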

Re: Using MultipleTextOutputFormat for map-only jobs

2011-04-14 Thread Hari Sreekumar
Here is a part of the code I am using: static class mapperClass extends MapReduceBase implements Mapper { @Override public void map(LongWritable key, Text value, OutputCollector output, Reporter reporter) throws IOException { output.collect(

Re: Using MultipleTextOutputFormat for map-only jobs

2011-04-14 Thread Harsh J
Hello Hari, On Thu, Apr 14, 2011 at 11:09 AM, Hari Sreekumar wrote: > Hi, > I have a map-only mapreduce job where I want to deduce the output filename > from the output key/value. I figured MultipleTextOutputFormat is the best > fit for my purpose. But I am unable to use it in map-only jobs. I wa

Using MultipleTextOutputFormat for map-only jobs

2011-04-13 Thread Hari Sreekumar
Hi, I have a map-only mapreduce job where I want to deduce the output filename from the output key/value. I figured MultipleTextOutputFormat is the best fit for my purpose. But I am unab