On Fri, Jan 31, 2014 at 3:58 AM, Will Crawford
<billcrawford1...@gmail.com> wrote:

>
> If the string has been decoded *from* UTF-8 to Perl's internal
> representation, it's *not* going to be marked as UTF8 internally; it
> *shouldn't* be. It's no longer a "UTF8" string but a "Unicode" string,
> complete with wide characters. If anything, the internal "UTF8" flag
> means "this string needs decoding" rather than "has been decoded".
>


$ perl -le 'use Encode;  my $chars = decode_utf8( "bytes" ); print
Encode::is_utf8( $chars ) ? "Is flagged utf8\n" : "not flagged\n"; use
Devel::Peek; Dump($chars)'
Is flagged utf8

SV = PV(0x7fb8c10023f0) at 0x7fb8c102b6a8
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK,UTF8)
  PV = 0x7fb8c0e01170 "bytes"\0 [UTF8 "bytes"]
  CUR = 5
  LEN = 16

Everything is encoded.  The flag tells Perl that the string's internal
representation is encoded as UTF-8, so it knows to work with it as UTF-8
characters (e.g. length() counts characters, matching works on characters,
etc.)

$ perl -le 'use Encode;  my $chars = decode( "latin1", "bytes" ); print
Encode::is_utf8( $chars ) ? "Is flagged utf8\n" : "not flagged\n"; use
Devel::Peek; Dump($chars)'
Is flagged utf8



-- 
Bill Moseley
mose...@hank.org
