Re: reading data from multiple output files into a single Map method.

2009-02-02 Thread jason hadoop
Do you really want to have a single task process all of the reduce outputs? If you want all of your output processed by a set of map tasks, you can set the output directory of your previous job to be the input directory of your next job, ensuring that the framework knows how to read the key value
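A minimal sketch of that chaining approach, using the old org.apache.hadoop.mapred API (IdentityMapper/IdentityReducer stand in for the real map and reduce classes, and the path names are placeholders). Job 1 writes SequenceFiles so that job 2 can pick up the same key/value types directly from job 1's output directory:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;
    import org.apache.hadoop.mapred.lib.IdentityMapper;
    import org.apache.hadoop.mapred.lib.IdentityReducer;

    public class ChainedJobs {
      public static void main(String[] args) throws Exception {
        Path firstOut = new Path("job1-output");   // placeholder path

        // Job 1: write SequenceFiles so the key/value types are preserved.
        JobConf job1 = new JobConf(ChainedJobs.class);
        job1.setJobName("job1");
        job1.setMapperClass(IdentityMapper.class);    // stand-in mapper
        job1.setReducerClass(IdentityReducer.class);  // stand-in reducer
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(IntWritable.class);
        job1.setOutputFormat(SequenceFileOutputFormat.class);
        FileInputFormat.setInputPaths(job1, new Path("input"));
        FileOutputFormat.setOutputPath(job1, firstOut);
        JobClient.runJob(job1);

        // Job 2: its input is simply job 1's output directory; the framework
        // reads every part-NNNNN file in it, however many reducers job 1 used.
        JobConf job2 = new JobConf(ChainedJobs.class);
        job2.setJobName("job2");
        job2.setMapperClass(IdentityMapper.class);    // stand-in mapper
        job2.setInputFormat(SequenceFileInputFormat.class);
        job2.setOutputKeyClass(Text.class);
        job2.setOutputValueClass(IntWritable.class);
        FileInputFormat.setInputPaths(job2, firstOut);
        FileOutputFormat.setOutputPath(job2, new Path("job2-output"));
        JobClient.runJob(job2);
      }
    }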

reading data from multiple output files into a single Map method.

2009-02-02 Thread some speed
Hi, I am implementing a chain M-R job in Java. If I am using multiple reducers, then the output seems to be dispersed among several files on the DFS. How can I now read these files into the Map method of the next job? Another doubt I have is ... is it possible to keep appending to the same output

Re: Newbie: multiple output files

2008-11-23 Thread tim robertson
coderplay.javaeye.com/blog/191188 for example. > On Sun, Nov 23, 2008 at 9:12 PM, tim robertson <[EMAIL PROTECTED]> wrote: >> Hi, >> Can someone please point me at the best way to create multiple output >> files based on the Key outputted from the Map? So I

Re: Newbie: multiple output files

2008-11-23 Thread Jeremy Chow
+ "_" + value.toString(); 4. } you can also check out http://coderplay.javaeye.com/blog/191188 for example. On Sun, Nov 23, 2008 at 9:12 PM, tim robertson <[EMAIL PROTECTED]>wrote: > Hi, > > Can someone please point me at the best way to create multiple output > file

Newbie: multiple output files

2008-11-23 Thread tim robertson
Hi, Can someone please point me at the best way to create multiple output files based on the Key outputted from the Map? So I end up with no reduction, but a file per Key outputted in the Mapping phase, ideally with the Key as the file name. Many thanks, Tim

Re: Multiple output files

2008-09-06 Thread Owen O'Malley
On Sep 6, 2008, at 9:35 AM, Ryan LeCompte wrote: I have a question regarding multiple output files that get produced as a result of using multiple reduce tasks for a job (as opposed to only one). If I'm using a custom writable and thus writing to a sequence output, am I guaranteed that a
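The preview cuts off before the reply, but the guarantee being asked about follows from partitioning: every record with a given key is routed to exactly one reduce task, so all of that key's values end up in that task's single part-NNNNN file. A small illustration mirroring what the default HashPartitioner does (an aside, with a made-up key and task count):

    import org.apache.hadoop.io.Text;

    public class PartitionDemo {
      // Mirrors the default HashPartitioner: the same key always maps to the
      // same reduce task, so all values for that key land in one part file.
      static int partitionFor(Text key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
      }

      public static void main(String[] args) {
        System.out.println(partitionFor(new Text("some-key"), 4)); // same result every run
      }
    }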

Re: Multiple output files

2008-09-06 Thread Ryan LeCompte
This clears up my concerns. Thanks! Ryan On Sep 6, 2008, at 2:17 PM, Owen O'Malley <[EMAIL PROTECTED]> wrote: On Sep 6, 2008, at 9:35 AM, Ryan LeCompte wrote: I have a question regarding multiple output files that get produced as a result of using multiple reduce tasks fo

Multiple output files

2008-09-06 Thread Ryan LeCompte
Hello, I have a question regarding multiple output files that get produced as a result of using multiple reduce tasks for a job (as opposed to only one). If I'm using a custom writable and thus writing to a sequence output, am I guaranteed that all of the data for a particular key will appear

Re: Multiple output files by reducers?

2008-08-26 Thread Khanh Nguyen
As far as I know, if you check out hadoop 0.19 trunk, there is MultipleOutputCollector.java that will do what you need. There are guidelines in the source code as well. I don't know if the recent hadoop 0.18 has this feature. -k On Tue, Aug 26, 2008 at 7:57 PM, Tarandeep Singh <[EMAIL PROTECTED]>
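Assuming the feature referred to is org.apache.hadoop.mapred.lib.MultipleOutputs, the multiple-outputs helper added in the 0.19 line, a rough sketch of driving it from a reducer (the "summary" named output and the class names are placeholders):

    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;
    import org.apache.hadoop.mapred.lib.MultipleOutputs;

    // In the driver, register the named output first, e.g.:
    //   MultipleOutputs.addNamedOutput(conf, "summary",
    //       TextOutputFormat.class, Text.class, IntWritable.class);
    public class MultiOutReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

      private MultipleOutputs mos;

      @Override
      public void configure(JobConf job) {
        mos = new MultipleOutputs(job);
      }

      public void reduce(Text key, Iterator<IntWritable> values,
          OutputCollector<Text, IntWritable> output, Reporter reporter)
          throws IOException {
        int sum = 0;
        while (values.hasNext()) {
          sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum));                              // regular output
        mos.getCollector("summary", reporter).collect(key, new IntWritable(sum)); // named output
      }

      @Override
      public void close() throws IOException {
        mos.close();  // flush the named outputs
      }
    }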

Multiple output files by reducers?

2008-08-26 Thread Tarandeep Singh
Hi, Is it correct that the output of a Map-Reduce job can result in multiple files in the output directory? If yes, then how can I read the output in the order generated by the MR job? Can I use FileStatus.getModificationTime() and pick the files in the increasing order of their modification time
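One common approach, sketched here as an aside: sort the part-NNNNN files by name rather than by modification time, since the reducer number is encoded in the file name while modification times only reflect when each reduce task happened to finish. The output path argument is a placeholder:

    import java.util.Arrays;
    import java.util.Comparator;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListJobOutput {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus[] files = fs.listStatus(new Path(args[0])); // the job's output dir

        // Reduce outputs are named part-00000, part-00001, ... so sorting by
        // name recovers the per-partition order.
        Arrays.sort(files, new Comparator<FileStatus>() {
          public int compare(FileStatus a, FileStatus b) {
            return a.getPath().getName().compareTo(b.getPath().getName());
          }
        });
        for (FileStatus f : files) {
          if (f.getPath().getName().startsWith("part-")) {
            System.out.println(f.getPath());
          }
        }
      }
    }

Note that this only recovers the order the partitioner assigned; keys are sorted within each file, but a global sort across files needs a partitioner that preserves key ranges.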