Hi Ramprasad,
On Sun, 7 Aug 2011 20:58:14 +0530
Ramprasad Prasad wrote:
> I have a file that contains records of customer interaction
> The first column of the file is the batch number(INT) , and other columns
> are date time , close time etc etc
>
> I have to sort the entire file in order of the first column .. but the
> problem is that the file is extremely huge.
>
> For the largest customer it contains 1100 million records and the file is
> 44GB !
> how can I sort this big a file
>
I suggest splitting the file into bins. Each bin will contain the records whose
batch numbers fall in a certain range (say 0-999,999; 1,000,000-1,999,999;
etc.). Choose the ranges so that the records are spread more or less evenly
and each bin is small enough to sort in memory. Then sort each bin separately
and append the sorted bins in order of their ranges.
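
To make the idea concrete, here is a minimal sketch of the binning approach
in Perl. It is only an illustration under several assumptions: the name
bin_sort_file, its parameters and the "bin.N" temporary file names are all
hypothetical; every line is assumed to start with a non-negative integer
batch number; the number of bins is assumed to stay below the limit on open
file handles; and each bin is assumed to fit in memory.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: sort $in_path by its leading integer column and
# write the result to $out_path. $bin_width is the size of each
# batch-number range; $tmp_dir holds the temporary bin files.
sub bin_sort_file {
    my ($in_path, $out_path, $bin_width, $tmp_dir) = @_;

    # Pass 1: route every record to the bin file for its range.
    # Assumption: few enough bins that all of them can stay open at once.
    my %bin_fh;
    open my $in, '<', $in_path or die "Cannot open $in_path: $!";
    while (my $line = <$in>) {
        my ($batch) = $line =~ /\A\s*(\d+)/
            or die sprintf("No leading batch number on input line %d\n", $.);
        # Make sure a final line without a newline does not get glued
        # to the record that follows it after sorting.
        $line .= "\n" unless $line =~ /\n\z/;
        my $bin = int($batch / $bin_width);
        if (!$bin_fh{$bin}) {
            open $bin_fh{$bin}, '>', "$tmp_dir/bin.$bin"
                or die "Cannot create bin $bin: $!";
        }
        print { $bin_fh{$bin} } $line;
    }
    close $in;
    close $_ or die "Cannot close bin file: $!" for values %bin_fh;

    # Pass 2: sort each bin in memory, then append the bins in order.
    open my $out, '>', $out_path or die "Cannot open $out_path: $!";
    for my $bin (sort { $a <=> $b } keys %bin_fh) {
        my $bin_path = "$tmp_dir/bin.$bin";
        open my $fh, '<', $bin_path or die "Cannot open $bin_path: $!";
        # Extract each key once, sort numerically, keep the full lines.
        my @records = map  { $_->[1] }
                      sort { $a->[0] <=> $b->[0] }
                      map  { [ (/\A\s*(\d+)/)[0], $_ ] } <$fh>;
        close $fh;
        print {$out} @records;
        unlink $bin_path;
    }
    close $out or die "Cannot close $out_path: $!";
    return;
}

# Run as: perl bin_sort.pl INPUT OUTPUT BIN_WIDTH TMP_DIR
bin_sort_file(@ARGV) if @ARGV == 4;
```

Your figures (44GB for 1100 million records) work out to about 40 bytes per
record, so the right bin width depends on how many records share each range
of batch numbers and on how much memory you can spare. Perl's in-memory
structures take noticeably more room than the raw text, so err on the side
of smaller bins.
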
Let me know if there's anything you don't understand. If you're interested,
I can be commissioned to write it for you (but it shouldn't be too hard).
Regards,
Shlomi Fish
--
-----------------------------------------------------------------
Shlomi Fish http://www.shlomifish.org/
Why I Love Perl - http://shlom.in/joy-of-perl
Chuck Norris refactors 10 million lines of Perl code before lunch.
Please reply to list if it's a mailing list post - http://shlom.in/reply .