In this case, the code in question is the non-distributed code rather
than the Hadoop code. But yes, I agree it would probably make a bigger
difference on Hadoop. All of the Hadoop code uses integer keys.

On Fri, Mar 9, 2012 at 2:10 AM, Paritosh Ranjan <pran...@xebia.com> wrote:
> Are these identifiers used as keys for mappers somewhere?
> If so, the sort phase of MapReduce will be much faster with long keys:
> comparing two longs takes less time than comparing two Strings (there
> are fewer bytes to compare), and more records can be kept in memory
> while sorting (because each key is smaller).
> I was once processing 1 billion records, and just changing the keys from
> String to Long improved performance by 20%.
>
> Ignore if this is not the case.
>
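
To make the size argument above concrete, here is a rough sketch in plain
Java. It only counts the raw bytes of one identifier held as a long versus
as UTF-8 text; it does not use Hadoop's own key classes or measure sort
time. The class name, method names and sample id are made up for
illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class KeySizeSketch {

    // Raw bytes for a long key: fixed width, whatever the value is.
    static int longKeyBytes(long id) {
        return ByteBuffer.allocate(Long.BYTES).putLong(id).array().length;
    }

    // Raw bytes for the same identifier stored as UTF-8 text.
    // For a decimal id this is one byte per digit.
    static int stringKeyBytes(String id) {
        return id.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        long id = 1234567890123456L;      // hypothetical identifier
        String text = Long.toString(id);  // same identifier as a String key

        System.out.println("long key bytes:   " + longKeyBytes(id));
        System.out.println("String key bytes: " + stringKeyBytes(text));
    }
}
```

The long key has the same width for every value, while the String key
grows with the number of characters, so the gap widens for longer
identifiers. Whether that translates into a gain like the 20% mentioned
above will depend on the job.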
