Re: utf8 pragma, lexical scope

'Michael Ludwig' Thu, 09 Sep 2010 23:21:43 -0700

Jan Dubois wrote on 09.09.2010 at 13:13 (-0700):
> > Without the utf8 pragma, identifiers are not allowed to have
> > funny characters. (Yes, it was a stupid exercise.)
> 
> The Perl parser is internally not UTF8-clean, so I would recommend
> not to use non-ASCII characters in variable names for now, even if
> it looks like it mostly works under "utf8".


Okay. I can certainly get by without non-ASCII variable names.

> From perltodo.pod:
> 
> | =head2 Properly Unicode safe tokeniser and pads.
> |
> | The tokeniser isn't actually very UTF-8 clean. C<use utf8;> is a
> | hack - variable names are stored in stashes as raw bytes, without
> | the utf-8 flag set. The pad API only takes a C<char *> pointer,
> | so that's all bytes too. The tokeniser ignores the UTF-8-ness of
> | C<PL_rsfp>, or any SVs returned from source filters.  All this
> | could be fixed.

Thanks - I didn't know this doc.
-- 
Michael Ludwig
