I came across a job that was spending a long time in
UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading
data from the spill files (8500 spills) and then writing the final
compressed merge file. Why do we need spill files for
UnorderedPartitionedKVWriter? Why not just buffer and keep writing directly
to the final file, which would save a lot of time?

Regards,
Rohini
