OK, I think I've got something that I'm happy with.

There were a couple of problems with how my SOAP::Lite handler was processing UTF-8.

First, SOAP::Lite encodes strings containing 8-bit (non-ASCII) characters as base64. What a pain. I found this workaround <http://london.pm.org/pipermail/london.pm/Week-of-Mon-20061023/005171.html>.

Second, I found this <http://www.perlmonks.org/?node_id=589132>, which suggested running the string through utf8::decode. That seemed to solve my problems.

So my solution ends up looking something like this:
        # make sure Perl marks this as utf8
        utf8::decode($s);
        # set the type to string ourselves to prevent SOAP::Lite from encoding it as base64
        push @results, SOAP::Data->type( string => $s );
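
For anyone following along, here is a minimal sketch of what the utf8::decode step does on its own, assuming $s arrives as raw UTF-8 bytes. The helper name decode_utf8_bytes and the sample value are made up for illustration; only the built-in utf8:: functions are used, so it runs without SOAP::Lite.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a copy of the bytes and return (string, ok).
sub decode_utf8_bytes {
    my ($s) = @_;
    # utf8::decode works in place (here, on our copy) and returns
    # false if the bytes are not valid UTF-8.
    my $ok = utf8::decode($s);
    return ($s, $ok);
}

# Made-up sample: the UTF-8 bytes for "caf" followed by e-acute.
my $bytes = "caf\xc3\xa9";
my ($chars, $ok) = decode_utf8_bytes($bytes);

# Only numbers are printed, so no output encoding layer is needed.
printf "%d bytes -> %d chars (decoded=%d, flagged=%d)\n",
    length($bytes), length($chars), $ok ? 1 : 0,
    utf8::is_utf8($chars) ? 1 : 0;
```

This should print "5 bytes -> 4 chars (decoded=1, flagged=1)": the two bytes of the accented character collapse into one character, and Perl now treats the scalar as a character string rather than as bytes.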

It works for me.

Drew


On Mar 16, 2007, at 2:23 PM, Robert Landrum wrote:

Drew Wilson wrote:
But I cannot figure out WHERE this conversion is being done.

I've picked through SOAP::Lite enough to know that Unicode conversions are probably more than it knows how to handle.

However, SOAP::Data::encode_data uses a regex to munge data. Perhaps there's a conversion happening in the regex engine that breaks the UTF8.

Rob
