I just ran a little experiment. I patched Parrot::HLLCompiler to transcode the source code it reads to UCS-2 before parsing and compiling it, then I profiled building perl6.pbc.
Without this hack, the build takes around 20 seconds, most of it spent running NQP over languages/perl6/src/parser/actions.pm. With the hack, the build takes around 12 seconds. The tests don't all pass with the hack, though. I think that's because Perl 6 intends to store its identifiers as UTF-8, and comparing a UTF-8 string against a UCS-2 string is not exact, but I won't guarantee that diagnosis. You also need to have ICU installed and configured so that you can link Parrot against it for this to have a hope of working. Caveats aside, it does seem like there's a point at which converting a string to a fixed-width encoding before performing indexed access improves performance notably. (Callgrind suggests that about 45% of the running time of the NQP part of the build comes from utf8_set_position and utf8_skip_forward.) -- c
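To illustrate why indexed access is cheaper in a fixed-width encoding, here is a minimal Python sketch. It is not Parrot's code: the function names are made up for the example, and it only mimics the general idea of walking UTF-8 lead bytes to reach a character position versus computing a byte offset directly. It also assumes the text stays within the Basic Multilingual Plane, which is the only range UCS-2 can represent.

```python
def utf8_seq_len(lead: int) -> int:
    """Length in bytes of the UTF-8 sequence that starts with this lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def utf8_char_at(buf: bytes, index: int) -> str:
    """Hypothetical helper: find character `index` by skipping forward
    one sequence at a time from the start, so the cost grows with `index`."""
    pos = 0
    for _ in range(index):
        pos += utf8_seq_len(buf[pos])
    return buf[pos:pos + utf8_seq_len(buf[pos])].decode("utf-8")


def ucs2_char_at(buf: bytes, index: int) -> str:
    """Hypothetical helper: every character is exactly two bytes, so the
    offset is a multiplication. Assumes little-endian, BMP-only text."""
    offset = 2 * index
    return buf[offset:offset + 2].decode("utf-16-le")


text = "naïve café 日本語"
utf8_buf = text.encode("utf-8")
ucs2_buf = text.encode("utf-16-le")  # same bytes as UCS-2 for BMP-only text

# Both lookups agree, but only the UTF-8 one has to scan.
for i, ch in enumerate(text):
    assert utf8_char_at(utf8_buf, i) == ch
    assert ucs2_char_at(ucs2_buf, i) == ch
```

A parser that repeatedly asks for the character at some position pays the scanning cost on every call with the variable-width buffer, which is consistent with the kind of time the profile attributes to position-setting and skip-forward functions.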