Hi again! I've been trying to apply the first of the solutions Jason
proposed two mails ago, but I have some questions about it:
1. In Hadoop 0.20.1, "RecordWriter" is an abstract class, so it can't be
instantiated directly.
2. Are the "configure" and "close" methods of my task that you refer to the
"configure" and "close" methods used in the Reducer?
It would help a lot if you could post some code (only the parts
relevant to this issue) so I can get a clearer picture.
Finally, I'd like to show the cleanup method (my equivalent of close) in my
reducer class, because I use a different approach and I can't see why it fails:
@Override
protected void cleanup(Context cont) throws IOException {
    // write output to a file
    Configuration conf = new Configuration();
    JobContext jCont = new JobContext(conf, null);
    FileSystem fs = FileSystem.get(jCont.getConfiguration());
    Path outDir = new Path("/user/hadoop-user/output", "output");
    Path outFile = new Path(outDir, "reduce-out");
    SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf,
            outFile, LongWritable.class, LongWritable.class,
            CompressionType.NONE);
    writer.append(new Text(keyword), new IntWritable(fitnessValue));
    writer.close();
}
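Could the problem be the key/value classes? I create the writer with
LongWritable.class for both, but then append a Text and an IntWritable.
Assuming the declared classes must match the appended objects, a variant with
matching types would look like the sketch below (ReduceOutWriter and writePair
are names I made up for this example; it also takes the Configuration as a
parameter instead of building a new one):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.io.Text;

// Hypothetical helper, not part of Hadoop: the names are only for illustration.
public class ReduceOutWriter {

    /**
     * Writes a single (keyword, fitnessValue) record to outFile.
     * The key and value classes passed to createWriter are the same
     * classes as the objects handed to append.
     */
    public static void writePair(Configuration conf, Path outFile,
                                 String keyword, int fitnessValue) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf, outFile,
                Text.class, IntWritable.class, CompressionType.NONE);
        try {
            writer.append(new Text(keyword), new IntWritable(fitnessValue));
        } finally {
            // Close the writer even if append throws, so the file is released.
            writer.close();
        }
    }
}
```

From cleanup I would then call
ReduceOutWriter.writePair(cont.getConfiguration(), outFile, keyword, fitnessValue).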
Thanks a lot in advance!
2009/10/2 Jason Venner <[email protected]>
> I see these ways to go here.
>
> 1. The one I know works is to create a RecordWriter in the configure
> method of your task, in the per-task work/output directory, and then
> rename it to your chosen name in the close method. Your task calls write
> on the RecordWriter directly instead of output.collect.
> 2. Use the multi output format.
> 3. In the close method of the task, rename the part-xxx file to your name.
> I am not certain that this is safe in the close method of the task.
> 4. Define a custom OutputCommitter class which renames the file to your
> chosen name.
>
>
>
>
> On Thu, Oct 1, 2009 at 1:00 PM, Alberto Luengo Cabanillas <[email protected]> wrote:
>
> > Hi everyone! I have a newbie question: I'm currently using Hadoop 0.20.1
> > and I'd like to know how I can change the name of the resulting file to
> > the one I want (i.e. from "part-r-00000" to "myoutput"). I've found
> > something related in JIRA
> > (https://issues.apache.org/jira/browse/MAPREDUCE-370), but I don't know
> > for sure if that is my problem too. If it is, do I just apply the patch
> > to the affected file and I'm ready to go, or do I need to do something
> > more afterwards?
> > Thanks a lot!
> >
> > --
> > Alberto
> >
>
>
>
>
--
Alberto