I use Perl for heavy-duty text processing. A question on Perl Monks
about Perl 5's handling of a large input file got me wondering how the
two Perls compare at the moment.
I wrote a couple of simple programs, in both languages, to write and
read a 10 GB text file filled with identical 100-character lines. The
reading programs counted total lines and characters of the input file.
The results on my fastest host show that much optimization is still
needed for Perl 6.
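
For readers who want to try something similar, here is a minimal Perl 5
sketch of such a write/read pair. It is not the original
read-file-test.pl; the sub names write_test_file and count_lines_chars
are hypothetical, and it assumes each 100-character line is 99 filler
characters plus a newline, which matches the counts in the output
(10000000 lines, 1000000000 chars).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: write $n_lines identical lines of 100 characters
# each (99 filler characters plus the newline).
sub write_test_file {
    my ($path, $n_lines) = @_;
    my $line = ('a' x 99) . "\n";
    open my $out, '>', $path or die "Cannot write '$path': $!";
    print {$out} $line for 1 .. $n_lines;
    close $out or die "Cannot close '$path': $!";
    return;
}

# Hypothetical helper: read the file line by line and return the number
# of lines and the number of characters (newlines included).
sub count_lines_chars {
    my ($path) = @_;
    my ($nlines, $nchars) = (0, 0);
    open my $in, '<', $path or die "Cannot read '$path': $!";
    while (my $line = <$in>) {
        $nlines++;
        $nchars += length $line;
    }
    close $in or die "Cannot close '$path': $!";
    return ($nlines, $nchars);
}

# Count only when a file name is given on the command line.
if (@ARGV) {
    my $file = shift @ARGV;
    my ($nlines, $nchars) = count_lines_chars($file);
    print "For input file '$file':\n";
    print "  Number lines: $nlines\n";
    print "  Number chars: $nchars\n";
}
```

Calling write_test_file('large-1-gb-file.txt', 10_000_000) would produce
a file of the 1 GB size used in the timings; the sketch reads the file as
bytes, so "chars" here means bytes.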
I compared read times for file sizes from 1 to 10 GB in one-gigabyte
increments and, in general, Perl 6 takes roughly 30 times longer than
Perl 5.14 to read the same file. So far I see no significant
improvement in Rakudo 2016.01 over 2015.12, but the tests haven't
quite finished yet.
When I use the stats incantation shown by Liz, I get:
$ time perl6 --stagestats read-file-test.p6 large-1-gb-file.txt
Stage start : 0.000
Stage parse : 0.160
Stage syntaxcheck: 0.000
Stage ast : 0.000
Stage optimize : 0.005
Stage mast : 0.021
Stage mbc : 0.000
Stage moar : 0.000
File 'large-1-gb-file.txt' size: 1000000000 bytes
Normal end.
For input file 'large-1-gb-file.txt':
Number lines: 10000000
Number chars: 1000000000
real 2m8.585s
user 2m5.408s
sys 0m0.968s
It looks to me as though no single compile stage is a hotspot: the
stages together take under 0.2 s, so nearly all of the time is spent
at run time, where general optimization is still needed.
Without the stats I get for Perl 5 (5.14):
--------------------------------------------------------
$ time perl read-file-test.pl large-1-gb-file.txt
File 'large-1-gb-file.txt' size: 1000000000 bytes
Normal end.
For input file 'large-1-gb-file.txt':
Number lines: 10000000
Number chars: 1000000000
real 0m6.216s
user 0m4.784s
sys 0m0.328s
And for Perl 6 (2016.01.1) I get:
---------------------------------------------
$ time perl6 read-file-test.p6 large-1-gb-file.txt
File 'large-1-gb-file.txt' size: 1000000000 bytes
Normal end.
For input file 'large-1-gb-file.txt':
Number lines: 10000000
Number chars: 1000000000
real 2m6.687s
user 2m4.216s
sys 0m0.588s
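
The wall-clock ratio for this single 1 GB case can be checked with a few
lines of Perl. This is only a sketch: the helper to_seconds is a
hypothetical name, and the two "real" times are copied from the runs
above.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a `time`-style duration such as
# "2m6.687s" to seconds.
sub to_seconds {
    my ($t) = @_;
    my ($min, $sec) = $t =~ /^(\d+)m([\d.]+)s$/
        or die "Unrecognized time format: '$t'";
    return $min * 60 + $sec;
}

my $perl5 = to_seconds('0m6.216s');   # "real" time, Perl 5.14
my $perl6 = to_seconds('2m6.687s');   # "real" time, Rakudo 2016.01.1
printf "Perl 6 / Perl 5 wall-clock ratio: %.1f\n", $perl6 / $perl5;
```

For these two runs the ratio comes out at about 20; it covers only the
1 GB file, not the whole 1 to 10 GB series.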
I tried the suggestion from Bart Wiegmans to compile the program:
$ perl6 --target=mbc --output=read-file-test.moarvm read-file-test.p6
$ time perl6 read-file-test.moarvm large-1-gb-file.txt
Error while reading from file: Malformed UTF-8
So I guess precompilation is not yet ready for public testing. That
will be a nice feature, IMHO!
Cheers!
-Tom