Outputting to different paths from the same input file

schnitzi Thu, 10 Jul 2008 20:16:42 -0700

Okay, I've found some similar discussions in the archive, but I'm still not
clear on this.  I'm new to Hadoop, so 'scuse my ignorance...


I'm writing a Hadoop tool to read in an event log, and I want to produce two
separate outputs as a result -- one for statistics, and one for budgeting.
Because the event log can be massive, I would like to process it only
once.  But each output will be read by further MapReduce jobs, and the two
will be significantly different from each other.

I've looked at MultipleOutputFormat, but it seems to just want to partition
records that look basically the same across several files, rather than
write two differently structured outputs.

What's the proper way to do this?  Ideally, whatever solution I implement
should be atomic, in that if either of the writes fails, neither output
will be produced.


AdTHANKSvance,
Mark
-- 
View this message in context: 
http://www.nabble.com/Outputting-to-different-paths-from-the-same-input-file-tp18395861p18395861.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
