Re: Multiple outputs and getmerge?

2009-04-21 Thread Todd Lipcon
On Mon, Apr 20, 2009 at 1:14 PM, Stuart White stuart.whi...@gmail.com wrote:
> Is this the best/only way to deal with this? It would be better if Hadoop offered the option of writing different outputs to different output directories, or if getmerge offered the ability to specify a file prefix

RE: Multiple outputs and getmerge?

2009-04-21 Thread Koji Noguchi
Stuart,
I once used MultipleOutputFormat and created
  (mapred.work.output.dir)/type1/part-_
  (mapred.work.output.dir)/type2/part-_
  ...
And JobTracker took care of the renaming to
  (mapred.output.dir)/type{1,2}/part-__
Would that work for you?
Koji

Re: Multiple outputs and getmerge?

2009-04-21 Thread Stuart White
On Tue, Apr 21, 2009 at 12:06 PM, Todd Lipcon t...@cloudera.com wrote:
> Would dfs -cat do what you need? e.g.:
> ./bin/hdfs dfs -cat /path/to/output/ExceptionDocuments-m-\* > /tmp/exceptions-merged
Yes, that would work. Thanks for the suggestion.
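The prefix-based merge suggested above can be tried on a local filesystem first. The following is only a sketch of the idea, not the HDFS command itself: the temporary directory, the file names other than the ExceptionDocuments-m- prefix, and the file contents are all made up for illustration. Against HDFS the glob is escaped (\*) so that the local shell passes it through and Hadoop expands it; here the local shell expands it directly.

```shell
#!/bin/sh
# Local-filesystem sketch of merging only the part files that share a prefix.
# All directories, extra file names, and contents below are hypothetical.
set -eu

workdir=$(mktemp -d)
mkdir "$workdir/output"

# Pretend a job wrote two kinds of output files into one directory.
printf 'exception 1\n' > "$workdir/output/ExceptionDocuments-m-00000"
printf 'exception 2\n' > "$workdir/output/ExceptionDocuments-m-00001"
printf 'normal 1\n'    > "$workdir/output/part-00000"

# The glob selects only the ExceptionDocuments files. The shell expands it in
# sorted order, so the parts are concatenated in part-number order.
cat "$workdir"/output/ExceptionDocuments-m-* > "$workdir/exceptions-merged"

# Show the merged result; the part-00000 content is not included.
cat "$workdir/exceptions-merged"
```

The same pattern covers any other prefix in the output directory: change the glob and the destination file name.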

Re: Multiple outputs and getmerge?

2009-04-21 Thread Stuart White
On Tue, Apr 21, 2009 at 1:00 PM, Koji Noguchi knogu...@yahoo-inc.com wrote:
> I once used MultipleOutputFormat and created
>   (mapred.work.output.dir)/type1/part-_
>   (mapred.work.output.dir)/type2/part-_
>   ...
> And JobTracker took care of the renaming to

RE: Multiple outputs and getmerge?

2009-04-21 Thread Koji Noguchi
On Tue, Apr 21, 2009 at 1:00 PM, Koji Noguchi knogu...@yahoo-inc.com wrote:
> I once used MultipleOutputFormat and created
>   (mapred.work.output.dir)/type1/part-_
>   (mapred.work.output.dir)/type2/part-_
>   ...
> And JobTracker took