Hi all, I have run into a tough problem. My job uses MultipleOutputFormat to write results into different folders, and I have to use a combiner to improve performance. In this setup the reduce phase does not work: the reducer does not receive any data. I searched for this issue and found a related thread, http://lucene.472066.n3.nabble.com/Combiner-and-MultipleOutputs-in-Mapreduce-td1640503.html , but it is not clear to me what the actual solution is. Is this a limitation of the Hadoop framework?
I found an interesting phenomenon: when I limit the map input to a small number of records (such as 10,000), the reduce works, receiving data and producing the correct result. But when the input is over a million records, the reducer receives nothing. My guess is that the combiner is called only once when the data is small, but multiple times when the data is large. To summarize, how can I make a combiner work while using MultipleOutputFormat? Any solution or suggestion is welcome. Thanks.
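P.S. To make my guess concrete, here is a minimal sketch in plain Java with no Hadoop classes. It only simulates the idea that a combiner may be applied zero, one, or several times to the map output, which is how I understand the framework to behave. The class and method names (`CombinerSketch`, `sumByKey`, `run`) are hypothetical and are not taken from my real job. It shows that a combiner whose output has the same shape as its input, and which forwards everything it combines, gives the reducer the same result regardless of how often it runs.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a word-count style job; not real Hadoop code.
class CombinerSketch {

    // Sums the values per key. Input and output are both (key, count) pairs,
    // so the output of one pass can be fed into another pass or into reduce.
    static Map<String, Long> sumByKey(List<Map.Entry<String, Long>> pairs) {
        Map<String, Long> out = new TreeMap<>();
        for (Map.Entry<String, Long> p : pairs) {
            out.merge(p.getKey(), p.getValue(), Long::sum);
        }
        return out;
    }

    // Simulates: map output -> combiner applied `combinerPasses` times -> reduce.
    static Map<String, Long> run(List<String> words, int combinerPasses) {
        // "Map" step: emit (word, 1) for every input word.
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        for (String w : words) {
            pairs.add(new AbstractMap.SimpleEntry<>(w, 1L));
        }
        // "Combine" step: may run zero, one, or several times.
        for (int i = 0; i < combinerPasses; i++) {
            pairs = new ArrayList<>(sumByKey(pairs).entrySet());
        }
        // "Reduce" step: same summing logic applied to whatever arrives.
        return sumByKey(pairs);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "a");
        // All three lines should print the same counts.
        System.out.println(run(words, 0));
        System.out.println(run(words, 1));
        System.out.println(run(words, 3));
    }
}
```

If my real combiner breaks this property, for example by writing records somewhere other than its normal output, that might explain why the reducer sees nothing once the combiner runs more than once. This is only my assumption, though.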