The sort implementation is pluggable (see MAPREDUCE-2454,
MAPREDUCE-4049), so please feel free to fork and improve it. Selecting
a sort implementation based on job configuration (e.g.,
BinaryComparable keys) would allow for much more efficient and
specialized implementations. -C
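
To make the idea of configuration-driven selection concrete, here is a minimal, self-contained sketch. It does not use any Hadoop API: `KeySorter`, `ComparatorSorter`, `RawBytesSorter`, and `select` are hypothetical names invented for illustration, and the boolean flag stands in for whatever the job configuration would say about the key type. The specialized path only shows where a faster implementation could plug in; it is not tuned for performance.

```java
import java.util.Arrays;
import java.util.Comparator;

public class SortSelectionSketch {
    /** Hypothetical stand-in for a pluggable sorter; not a Hadoop interface. */
    public interface KeySorter {
        void sort(byte[][] keys);
    }

    /** General-purpose path: every comparison goes through a user-supplied comparator. */
    public static final class ComparatorSorter implements KeySorter {
        private final Comparator<byte[]> cmp;

        public ComparatorSorter(Comparator<byte[]> cmp) {
            this.cmp = cmp;
        }

        @Override
        public void sort(byte[][] keys) {
            Arrays.sort(keys, cmp);
        }
    }

    /** Specialized path: keys are ordered by their raw unsigned bytes. */
    public static final class RawBytesSorter implements KeySorter {
        @Override
        public void sort(byte[][] keys) {
            Arrays.sort(keys, SortSelectionSketch::compareUnsigned);
        }
    }

    /** Lexicographic comparison of two byte arrays, treating bytes as unsigned. */
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        // On a shared prefix, the shorter key sorts first.
        return a.length - b.length;
    }

    /**
     * Picks a sorter from a (hypothetical) configuration flag. The fallback
     * comparator is only used when the keys are not binary comparable.
     */
    public static KeySorter select(boolean keysAreBinaryComparable,
                                   Comparator<byte[]> fallback) {
        return keysAreBinaryComparable
                ? new RawBytesSorter()
                : new ComparatorSorter(fallback);
    }
}
```

The point of the split is that the specialized sorter never has to call back into user code, which is what would let a real implementation swap in something like a radix sort for such keys.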
On Thu, Mar 20, 20
I'd like to discuss some improvements to the sort stage in current
Hadoop. There are no formal documents available right now, so I will just
hope someone is interested. If so, I can give more detailed
information.
To start with, I have conducted a series of experiments on version