Hi,
I want to merge multiple files in one HDFS directory into a single file. I am
planning to write a map-only job using an input format that creates only one
InputSplit per directory.
This way my job doesn't need to do any shuffle/sort (it only reads the files
and writes them back to disk).
Is there any such input format already implemented?
Or is there a better solution to the problem?
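
To make the intent concrete: what the single map task would do is essentially
a sequential concatenation of the files in the directory. Below is a minimal
sketch of that logic only, written against the local filesystem with plain
java.nio as a stand-in for HDFS. The class name DirMerger and the method
mergeDir are hypothetical, the name-order sort is my assumption, and a real
job would go through Hadoop's FileSystem API instead.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: local-filesystem stand-in for the HDFS merge.
class DirMerger {
    // Concatenates every regular file directly under srcDir into dst,
    // in file-name order, and returns the number of files merged.
    // Assumption: dst lives outside srcDir, so it is never read as input.
    public static int mergeDir(Path srcDir, Path dst) throws IOException {
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(srcDir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    parts.add(p); // skip subdirectories
                }
            }
        }
        Collections.sort(parts); // deterministic order, e.g. part-00000, part-00001

        // One sequential pass: read each file and append it to the output.
        try (OutputStream out = Files.newOutputStream(dst)) {
            for (Path p : parts) {
                Files.copy(p, out);
            }
        }
        return parts.size();
    }
}
```

So there is no key grouping and no reducer involved, just one reader and one
writer per directory.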

Thanks.
