I just ran a little experiment.  I patched Parrot::HLLCompiler to transcode 
the source code it reads to UCS-2 before parsing and compiling it, then I 
profiled building perl6.pbc.

Without this hack, the build takes around 20 seconds, mostly running NQP over 
languages/perl6/src/parser/actions.pm.

With the hack, the build takes around 12 seconds.

The tests don't all pass with the hack.  I think that's because Perl 6 intends 
to store its identifiers as UTF-8, and comparing a UTF-8 string against a 
UCS-2 string is not exact, but I won't guarantee that explanation.  You also 
need ICU installed and configured so that you can link Parrot against it for 
this to have a hope of working.

Caveats aside, it does seem like there's a point at which converting a string 
to a fixed-width encoding before performing indexed access may improve 
performance notably.

(Callgrind suggests that about 45% of the running time of the NQP part of the 
build comes from utf8_set_position and utf8_skip_forward.)
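
To illustrate why indexed access costs so much more in a variable-width 
encoding, here is a minimal sketch in C.  It is not Parrot's code: the 
function names (utf8_offset_of, utf8_char_at, ucs2_char_at) are made up for 
this example, and it assumes well-formed input with no bounds checking. 
Finding character N in UTF-8 means walking every character before it, while 
in UCS-2 it is a single array lookup.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Parrot's API: byte offset of the character at
 * char_index in a well-formed UTF-8 buffer.  Every preceding character must
 * be inspected, so the cost grows with char_index. */
static size_t utf8_offset_of(const unsigned char *s, size_t char_index)
{
    size_t off = 0;

    while (char_index-- > 0) {
        unsigned char lead = s[off];

        if (lead < 0x80)
            off += 1;                    /* 0xxxxxxx: one byte    */
        else if ((lead & 0xE0) == 0xC0)
            off += 2;                    /* 110xxxxx: two bytes   */
        else if ((lead & 0xF0) == 0xE0)
            off += 3;                    /* 1110xxxx: three bytes */
        else
            off += 4;                    /* 11110xxx: four bytes  */
    }

    return off;
}

/* Indexed access into UTF-8: scan to the character, then decode it. */
static uint32_t utf8_char_at(const unsigned char *s, size_t char_index)
{
    const unsigned char *p = s + utf8_offset_of(s, char_index);

    if (p[0] < 0x80)
        return p[0];
    if ((p[0] & 0xE0) == 0xC0)
        return ((uint32_t)(p[0] & 0x1F) << 6) | (p[1] & 0x3F);
    if ((p[0] & 0xF0) == 0xE0)
        return ((uint32_t)(p[0] & 0x0F) << 12)
             | ((uint32_t)(p[1] & 0x3F) << 6)
             | (p[2] & 0x3F);
    return ((uint32_t)(p[0] & 0x07) << 18)
         | ((uint32_t)(p[1] & 0x3F) << 12)
         | ((uint32_t)(p[2] & 0x3F) << 6)
         | (p[3] & 0x3F);
}

/* Indexed access into UCS-2: every character is exactly one 16-bit unit,
 * so the lookup takes constant time. */
static uint16_t ucs2_char_at(const uint16_t *s, size_t char_index)
{
    return s[char_index];
}
```

A parser that repeatedly asks for the character at some position pays for 
the scan in utf8_char_at on every call unless it caches its position.  The 
trade-off is the one-time transcoding cost, plus the fact that UCS-2 cannot 
represent characters outside the Basic Multilingual Plane.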

-- c
