On Thu, Dec 16, 2010 at 7:32 AM, Assaf Gordon wrote:
>
Hi Assaf,
Sorry for replying so long after this email; I guess I'm a bit late to
the party. But it's too interesting a subject to pass up :)
>
> GNU sort have many more capabilities than perl's sort, including:
> multithreaded sort (introduced, although a bit buggy, in version 8.6),
I'm really not familiar with multithreading support in Perl's sort. It
seems like a great question (and perhaps a prompt for action!) to send
to p5p. I guess it's up to you to say if this is a gating issue for
using Perl's sort.
> sorting huge files (bigger than available RAM),
As long as you're not trying to slurp the file into memory inside your
Perl program, why is this a problem for Perl?
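
    use strict;
    use warnings;
    use Tie::File;
    # Tie::File reads records lazily, but sort still builds the full list in memory
    tie my @array, 'Tie::File', 'large_unsorted_file' or die "Cannot tie file: $!";
    print "$_\n" for sort @array;
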
I'm sure there are other more clever solutions, see for example:
http://search.cpan.org/dist/Sort-External/
But the above example code should work, or am I missing something obvious?
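
For files that really do not fit in RAM, here is a minimal sketch of the
usual external merge sort idea, using only core modules. It is not how
Sort::External or GNU sort are implemented; the function name
external_sort and the $max_lines parameter are made up for this
illustration. It sorts chunks of at most $max_lines lines, spills each
sorted run to a temp file, and then merges the runs:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: sort $in_path into $out_path while holding at most
# $max_lines lines in memory at any time (plain string comparison).
sub external_sort {
    my ($in_path, $out_path, $max_lines) = @_;
    open my $in, '<', $in_path or die "Cannot read $in_path: $!";

    my (@runs, @chunk);
    my $flush = sub {
        return unless @chunk;
        my $tmp = tempfile();            # anonymous read/write temp file
        print {$tmp} sort @chunk;        # write one sorted run
        seek $tmp, 0, 0 or die "seek failed: $!";
        push @runs, $tmp;
        @chunk = ();
    };

    while (my $line = <$in>) {
        $line .= "\n" unless $line =~ /\n\z/;   # last line may lack a newline
        push @chunk, $line;
        $flush->() if @chunk >= $max_lines;
    }
    $flush->();
    close $in;

    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    my @heads = map { scalar readline($_) } @runs;   # next line of each run
    while (1) {
        my $min;
        for my $i (0 .. $#heads) {
            next unless defined $heads[$i];          # this run is exhausted
            $min = $i if !defined $min || $heads[$i] lt $heads[$min];
        }
        last unless defined $min;                    # all runs exhausted
        print {$out} $heads[$min];
        $heads[$min] = readline($runs[$min]);
    }
    close $out or die "Cannot close $out_path: $!";
}
```

Calling external_sort('large_unsorted_file', 'sorted_file', 1_000_000)
would keep about a million lines in memory at a time. The merge step
scans every run for each output line, which is fine for a handful of
runs; a heap would be the better choice for many runs.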
> many built-in sort options (version sort, human-numeric sort), etc.
http://search.cpan.org/dist/Sort-Maker/
http://search.cpan.org/dist/Sort-DataTypes/
(... and many other sort modules on cpan... )
There's probably already a solution for most sort-related options
somewhere on CPAN. If there isn't, Perl's sort subs give you the
ability to cover, I think, pretty much any situation. By the way, the
Sort::External distribution includes a cookbook with some very nice
advice that you might find helpful:
http://search.cpan.org/dist/Sort-External/lib/Sort/External/Cookbook.pod
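
To illustrate the sort-sub point, here is a rough sketch of a
"human-numeric" comparison in plain Perl. It is only in the spirit of
GNU sort's -h option, not a reimplementation of it: the helper names are
invented for this example, and I am assuming K/M/G/T suffixes that mean
powers of 1024.

```perl
use strict;
use warnings;

# Assumed multipliers: powers of 1024
my %SUFFIX = (K => 1024, M => 1024**2, G => 1024**3, T => 1024**4);

# Turn "10K", "1.5M" or "512" into a plain number; unparseable input counts as 0
sub human_to_number {
    my ($s) = @_;
    return 0 unless $s =~ /^\s*([0-9]+(?:\.[0-9]+)?)([KMGT]?)/i;
    return $1 * ($2 ? $SUFFIX{ uc $2 } : 1);
}

# A named sort sub: Perl sets $a and $b to the two items being compared
sub by_human_size { human_to_number($a) <=> human_to_number($b) }

my @sorted = sort by_human_size qw(2G 10K 512 1.5M 900K);
print "@sorted\n";    # prints: 512 10K 900K 1.5M 2G
```

For big inputs you would want to compute the numeric key once per line
rather than on every comparison.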
>
> Actually, it's the real thing. It will be a wrapper script that accepts the
> same arguments as GNU sort,
> but will support sorting a file that has a header line as the first line.
> Very useful for our needs at the lab.
>
Not Perl related, but the question "How do I run a multithreaded sort
on a huge text file while skipping the first few header lines?" might
get you some interesting answers over at Stack Overflow. Who knows,
maybe you'll end up with a Java based solution ;)
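
For what it's worth, the header-skipping part on its own is easy to
sketch in Perl. This is only an illustration, not the wrapper described
above: the function name and its arguments are made up, it sorts the
body in memory, and it does not accept GNU sort's options.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the first $n_header lines from $in to $out
# untouched, then print the remaining lines in sorted order.
# $cmp is an optional comparison callback taking two lines.
sub sort_keeping_header {
    my ($in, $out, $n_header, $cmp) = @_;
    $cmp ||= sub { $_[0] cmp $_[1] };    # default: plain string order

    for (1 .. $n_header) {
        my $line = <$in>;
        last unless defined $line;       # fewer lines than headers
        print {$out} $line;
    }
    print {$out} sort { $cmp->($a, $b) } <$in>;
}
```

Something like sort_keeping_header(\*STDIN, \*STDOUT, 1) would then
behave as a filter that keeps one header line in place.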
Regards,
Offer Kaye