Okay, I've found some similar discussions in the archive, but I'm still not clear on this. I'm new to Hadoop, so 'scuse my ignorance...
I'm writing a Hadoop tool to read in an event log, and I want to produce two separate outputs as a result -- one for statistics, and one for budgeting. Because the event log I'm reading in can be massive, I would like to only process it once. But the outputs will each be read by further M/R processes, and will be significantly different from each other.

I've looked at MultipleOutputFormat, but it seems to just want to partition data that looks basically the same into this file or that. What's the proper way to do this?

Ideally, whatever solution I implement should be atomic, in that if any one of the writes fails, neither output will be produced.

AdTHANKSvance,
Mark

--
View this message in context: http://www.nabble.com/Outputting-to-different-paths-from-the-same-input-file-tp18395861p18395861.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.