On Fri, Jan 31, 2014 at 3:58 AM, Will Crawford <billcrawford1...@gmail.com> wrote:
>
> If the string has been decoded *from* UTF-8 to Perl's internal
> representation, it's *not* going to be marked as UTF8 internally; it
> *shouldn't* be. It's no longer a "UTF8" string but a "Unicode" string,
> complete with wide characters. If anything, the internal "UTF8" flag
> means "this string needs decoding" rather than "has been decoded".
>

$ perl -le 'use Encode; my $chars = decode_utf8( "bytes" ); print Encode::is_utf8( $chars ) ? "Is flagged utf8\n" : "not flagged\n"; use Devel::Peek; Dump($chars)'
Is flagged utf8
SV = PV(0x7fb8c10023f0) at 0x7fb8c102b6a8
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK,UTF8)
  PV = 0x7fb8c0e01170 "bytes"\0 [UTF8 "bytes"]
  CUR = 5
  LEN = 16

Everything is encoded. The flag tells Perl that the string's internal
representation is encoded as utf8, so it knows to work with it as utf8
characters (e.g. length() is the length in chars, matching works on chars,
etc.)

$ perl -le 'use Encode; my $chars = decode( "latin1", "bytes" ); print Encode::is_utf8( $chars ) ? "Is flagged utf8\n" : "not flagged\n"; use Devel::Peek; Dump($chars)'
Is flagged utf8

-- 
Bill Moseley
mose...@hank.org
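To make the length() point concrete, here is a minimal sketch using a string with one non-ASCII character. The sample string ("café" as raw UTF-8 bytes) and the describe() helper are my own illustration, not something from the thread; the only library calls are Encode's decode() and Encode::is_utf8(), as used in the one-liners above.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes: "caf" followed by the two-byte sequence 0xC3 0xA9.
# (Illustrative sample data, not from the original message.)
my $bytes = "caf\xC3\xA9";

# Decode the bytes into a Perl character string.
my $chars = decode('UTF-8', $bytes);

# Hypothetical helper: summarise how Perl sees a string.
sub describe {
    my ($label, $str) = @_;
    return sprintf '%s: length=%d, flagged=%s',
        $label, length($str), (Encode::is_utf8($str) ? 'yes' : 'no');
}

print describe('bytes', $bytes), "\n";   # 5 elements: one per byte
print describe('chars', $chars), "\n";   # 4 elements: one per character
```

The undecoded string is treated as five bytes, while the decoded one carries the flag and is treated as four characters, which is the difference the flag exists to record.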
_______________________________________________
List: Catalyst@lists.scsys.co.uk
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
Dev site: http://dev.catalyst.perl.org/