On Thu, 09 Sep 2010, Michael Ludwig wrote:
>
> What does not work, however, is to have a variable $käse under utf8
> and then try to refer to it from inside a "no utf8" block, using either
> encoding. Without the utf8 pragma, identifiers are not allowed to have
> funny characters. (Yes, it was a stupid exercise.)

The Perl parser is internally not UTF-8 clean, so I would recommend not
using non-ASCII characters in variable names for now, even if it looks
like it mostly works under "use utf8".

From perltodo.pod:

| =head2 Properly Unicode safe tokeniser and pads.
|
| The tokeniser isn't actually very UTF-8 clean. C<use utf8;> is a hack -
| variable names are stored in stashes as raw bytes, without the utf-8 flag
| set. The pad API only takes a C<char *> pointer, so that's all bytes too. The
| tokeniser ignores the UTF-8-ness of C<PL_rsfp>, or any SVs returned from
| source filters. All this could be fixed.

Cheers,
-Jan