On Jan 5, 2008, at 11:28 PM, Andrew Rodland wrote:
On Saturday 05 January 2008 04:54:59 pm Daniel McBrearty wrote:
well I'm damned, I thought I had this stuff working squeaky clean. But
I was wrong. I actually had two bugs cancelling each other out -
usually.
[snip]
[debug] abçöeü
[debug] $VAR1 = "ab\x{c3}\x{a7}\x{c3}\x{b6}e\x{c3}\x{bc}";
[debug] it's UTF8!
Looks like the problem is here... the utf8 flag is on, indicating that
$edit is a string of characters, rather than bytes -- but the dumper
output seems to show that these "characters" correspond to UTF-8
encoded bytes, instead of the actual characters of the data -- meaning
that the bytes actually stored in the string are along the lines of
"ab\x{c3}\x{83}\x{c2}\x{a7}"... not good. Somewhere, your data got the
utf8 flag set "by assumption" instead of by decoding.
$edit = decode("UTF-8", $edit) should clear it up, although finding the
original problem is probably a better idea. :)
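
For anyone following along, here is a minimal sketch of that situation,
not the actual application code. It assumes $edit started life as the
raw UTF-8 bytes of "abçöeü", and it uses utf8::upgrade as a stand-in for
whatever set the flag "by assumption" (the real culprit isn't known from
the thread). The variable names other than $edit are made up for
illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed input: the UTF-8 encoded bytes of "abçöeü", not yet decoded.
my $bytes = "ab\xc3\xa7\xc3\xb6e\xc3\xbc";

# Stand-in for the bug: mark the string as characters without decoding.
# The code points stay 0xC3, 0xA7, ... but the utf8 flag is now on.
my $edit = $bytes;
utf8::upgrade($edit);

my $flag_on    = utf8::is_utf8($edit) ? 1 : 0;
my $len_before = length($edit);

# What goes out if this string is encoded again: every original byte is
# treated as a character and encoded a second time (double encoding).
my $double = encode('UTF-8', $edit);

# The suggested fix: decode the UTF-8 sequences into real characters.
my $fixed = decode('UTF-8', $edit);

printf "utf8 flag on: %d\n", $flag_on;
printf "length before decode: %d, after: %d\n", $len_before, length($fixed);
printf "double-encoded bytes: %s\n",
    join ' ', map { sprintf '%02x', ord } split //, $double;
printf "fixed code points: %s\n",
    join ' ', map { sprintf '%02x', ord } split //, $fixed;
```

The double-encoded bytes start with 61 62 c3 83 c2 a7, matching the
"ab\x{c3}\x{83}\x{c2}\x{a7}" pattern above, while the decoded string has
one code point per character (e7 for ç, and so on).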
Andrew
ISTR that last time I looked at C::P::Unicode, it did things in a
manner that I didn't like. I can't remember if this is because I
thought it was wrong or if it just didn't work right for me, but maybe
some more eyes on C::P::Unicode might be a good idea.
-ash