If I set my reducer output format to MapFileOutputFormat and the job has, say, 100 reducers, will the output contain 100 different index files (one for each reducer) or a single index file shared by all the reducers (basically one index file per job)?

If it is one index file per reducer, can I rely on HDFS append to change the index write behavior and build a single index file from all the reducers, basically by having all the parallel reducers append to one index file? The data files do not matter to me.
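
In other words, something along the lines of the sketch below from inside each reducer, assuming the cluster has append support enabled (the shared path and the index entry format are hypothetical; I am not claiming concurrent appends from many reducers actually work, that is exactly my question):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class IndexAppend {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical shared index file that every reducer would target.
        Path sharedIndex = new Path("/output/shared-index");

        // Reopen the existing file for append. HDFS enforces a single
        // writer per file via leases, so this call fails if another
        // client currently holds the lease on the same path.
        FSDataOutputStream out = fs.append(sharedIndex);
        try {
          out.writeBytes("key\toffset\n"); // illustrative index entry
        } finally {
          out.close();
        }
      }
    }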
