Randomize input file?

John Clarke Thu, 21 May 2009 07:19:28 -0700

Hi,

I need to randomize my input file before processing. I understand I can
chain Hadoop jobs together, so the first job could take the input file and
randomize it, and the second could take the randomized file and do the
processing.


The input file has one entry per line and I want to mix up the lines before
the main processing.

Is there a built-in feature I have missed, or will I have to write a
Hadoop program to shuffle my input file?
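
To make the idea concrete, one possible approach is to tag each line with a
random key in the map step, let the sort between map and reduce order the
records by that key, and drop the key again in the reduce step. Below is a
minimal local sketch in Python with no Hadoop dependency. The function names
are hypothetical, and the in-memory sort only stands in for the framework's
sort phase; it is an illustration under those assumptions, not a tested
Hadoop job.

```python
import random


def map_random_key(lines, rng):
    """Map step: pair every input line with a random sort key."""
    for line in lines:
        yield rng.random(), line


def reduce_drop_key(pairs):
    """Reduce step: discard the random key and emit only the line."""
    for _key, line in pairs:
        yield line


def shuffle_lines(lines, seed=None):
    """Shuffle lines by sorting them on randomly assigned keys."""
    rng = random.Random(seed)  # seed is optional, for repeatable runs
    keyed = list(map_random_key(lines, rng))
    # Stand-in for the sort-by-key that happens between map and reduce.
    keyed.sort(key=lambda pair: pair[0])
    return list(reduce_drop_key(keyed))


if __name__ == "__main__":
    for entry in shuffle_lines(["entry1", "entry2", "entry3", "entry4"]):
        print(entry)
```

In a chained setup, these two steps would correspond to the mapper and
reducer of the first job, and the second job would read the first job's
output as its input.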

Cheers,
John
