Re: Handling bad records

2012-02-27 Thread Harsh J
Mohit, use the MultipleOutputs API: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html to write bad records to a named output. The linked page includes a usage example. On Tue, Feb 28, 2012 at 3:48 AM, Mohit Anchlia wrote: > What's the best w
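
The idea of sending bad records to their own named output can be sketched without Hadoop on the classpath. The following is a plain-Java stand-in, not the MultipleOutputs API itself: the class name BadRecordRouter, the output names "good" and "bad", and the validity rule (two comma-separated integer fields) are all hypothetical and chosen only for illustration. In a real job, the "bad" branch is roughly where a named-output collector would be used instead of an in-memory list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; no Hadoop classes are used here.
class BadRecordRouter {
    static final String GOOD = "good"; // hypothetical name for the main output
    static final String BAD = "bad";   // hypothetical name for the bad-record output

    // Routes each input line to the "good" or "bad" list, preserving input order.
    static Map<String, List<String>> route(List<String> lines) {
        Map<String, List<String>> outputs = new LinkedHashMap<>();
        outputs.put(GOOD, new ArrayList<>());
        outputs.put(BAD, new ArrayList<>());
        for (String line : lines) {
            outputs.get(isValid(line) ? GOOD : BAD).add(line);
        }
        return outputs;
    }

    // Assumed validity rule: exactly two comma-separated integer fields.
    static boolean isValid(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length != 2) {
            return false;
        }
        try {
            Integer.parseInt(fields[0].trim());
            Integer.parseInt(fields[1].trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> out =
            route(Arrays.asList("1,2", "oops", "3,x", "4,5"));
        System.out.println("good: " + out.get(GOOD));
        System.out.println("bad:  " + out.get(BAD));
    }
}
```

Keeping the validity check in one small method means the mapper or reducer body only has to decide which output a record goes to.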

Re: Handling bad records

2012-02-27 Thread Mohit Anchlia
Thanks, that's helpful. In that example, what do "A" and "B" refer to? Are they the output file names? mos.getCollector("seq", "A", reporter).collect(key, new Text("Bye")); mos.getCollector("seq", "B", reporter).collect(key, new Text("Chau")); On Mon, Feb 27, 2012 at 9:53 PM, Harsh J wrote: >

Re: Handling bad records

2012-02-28 Thread madhu phatak
Hi Mohit, A and B refer to two different output files (multipart name). The file names will be seq-A* and seq-B*. It's similar to the "r" in part-r-0. On Tue, Feb 28, 2012 at 11:37 AM, Mohit Anchlia wrote: > Thanks that's helpful. In that example what is "A" and "B" referring to? Is > that the
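
A rough sketch of how such a file name might be put together is below. The exact separator and padding are assumptions here: as far as I recall, the old org.apache.hadoop.mapred API joins the named output and the multi name with an underscore and then appends the task type and a five-digit partition number, so it is worth checking the actual job output directory for your Hadoop version. The class and method names are hypothetical and no Hadoop classes are used.

```java
import java.util.Locale;

// Plain-Java illustration of an assumed naming scheme; not Hadoop code.
class NamedOutputFileName {
    // namedOutput: e.g. "seq"; multiName: e.g. "A", or null for a simple named output;
    // taskType: 'm' for map output, 'r' for reduce output; partition: task partition number.
    static String fileName(String namedOutput, String multiName, char taskType, int partition) {
        // Assumption: named output and multi name are joined with an underscore.
        String base = (multiName == null) ? namedOutput : namedOutput + "_" + multiName;
        // Assumption: task type and zero-padded partition follow, as in part-r-00000.
        return String.format(Locale.ROOT, "%s-%c-%05d", base, taskType, partition);
    }

    public static void main(String[] args) {
        System.out.println(fileName("seq", "A", 'r', 0));
        System.out.println(fileName("seq", "B", 'r', 0));
    }
}
```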

Re: Handling bad records

2012-02-28 Thread Subir S
Can MultipleOutputs be used with Hadoop Streaming? On Tue, Feb 28, 2012 at 2:07 PM, madhu phatak wrote: > Hi Mohit , > A and B refers to two different output files (multipart name). The file > names will be seq-A* and seq-B*. Its similar to "r" in part-r-0 > > On Tue, Feb 28, 2012 at 11:37

Re: Handling bad records

2012-02-28 Thread Harsh J
Subir, No, not unless you use a specialized streaming library (pydoop, dumbo, etc. for python, for example). On Tue, Feb 28, 2012 at 2:19 PM, Subir S wrote: > Can multiple output be used with Hadoop Streaming? > > On Tue, Feb 28, 2012 at 2:07 PM, madhu phatak wrote: > >> Hi Mohit , >>  A and B