Re: Matching Greek letters in UTF-8 file

Brian Fraser Thu, 29 Sep 2011 13:30:09 -0700

On Thu, Sep 29, 2011 at 4:03 PM, John Delacour <johndelac...@gmail.com> wrote:


> Nitpick:  Why the upper-case charset name?

Uppercase "UTF-8" is utf-8-strict, while lowercase "utf8" is the lax version
that perl uses internally. Unless you are passing data from one perl program
to another and you are using illegal-UTF8-but-legal-UTFX (say, you define
your own characters beyond U+10FFFF, which Perl allows but strict UTF-8
shouldn't), there's basically no good reason to use the lax version.
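
If you want to see the difference for yourself, here is a minimal sketch that
asks Encode which encoding object each spelling resolves to. The variable
names are only for illustration.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Look up the canonical name behind each spelling of the charset.
my $strict = find_encoding('UTF-8')->name;   # the strict encoding
my $lax    = find_encoding('utf8')->name;    # perl's lax internal flavour

print "UTF-8 resolves to: $strict\n";        # utf-8-strict
print "utf8  resolves to: $lax\n";           # utf8
```

The same names are what you end up with inside an :encoding(...) layer, so
:encoding(UTF-8) should give you the strict checks on input.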


>
> Interesting to hear that encoding is broken.  I came across a problem the
> other day which I couldn't work out at all.  But if you include 'qw< :std
> :encoding(utf-8) >', is that not also using encoding?
>
>
Sorta, but not quite. use encoding ...; does a couple of different things,
and most of them not that well. Foremost, it sets the source encoding to
whatever encoding you name, but it also sets the default encodings for IO
streams -- whereas use open ...; only sets the default encodings, with :std
also applying them to STD(IN|ERR|OUT). The :encoding() part actually refers
to a PerlIO layer provided by a module (PerlIO::encoding), not to the
encoding pragma.
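
As a rough sketch of that split (not the only way to do it), you can declare
the source encoding with use utf8 and leave the IO defaults to use open. The
sample string and variable names here are made up for illustration.

```perl
use strict;
use warnings;
use utf8;                              # source encoding: this file is UTF-8
use open qw< :std :encoding(UTF-8) >;  # default IO layers, plus STD(IN|OUT|ERR)

# Made-up sample text containing a few Greek letters.
my $text  = "alpha α, beta β, gamma γ";

# Collect every character whose script is Greek.
my @greek = $text =~ /(\p{Greek})/g;

print "Found: @greek\n";
```

Neither line involves the encoding pragma: use utf8 only describes the source
file, and use open only sets the default layers.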

Yeah, it's a mess. :)
