Apologies Arnon, I got your original message with the problem description after I had sent mine...
> To explain where there is some magic at play:
>
> Apache::ASP::Response does a "use bytes" which is to deal with the
> output stream correctly I believe this is around content length
> calculations.  I think this is fine here, and turning this off makes
> things worse for these examples.
>
> Apache::ASP::Response is more importantly tied as a file handle when
> this code is run:
>
>     tie *RESPONSE, 'Apache::ASP::Response', $self->{Response};
>     select(RESPONSE);
>
> This is to allow for print to go to $Response->PRINT which aliases to
> $Response->Write.  Fundamentally all output is going through
> $Response->Write at the end of the day including the script static
> content itself.
>
> What I have found is that this will output the correct bytes in this
> Apache::ASP script:
>
>     <% print STDOUT Encode::decode('ISO-8859-1',"\xE2"); %>
>
> as it bypasses the tied file handle layer to $Response, so we know perl
> is working at this point!
>
> but doing this is where we have a problem:
>
>     <% print Encode::decode('ISO-8859-1',"\xE2"); %>
>
> and immediately in the Apache::ASP::Response::Write() method the data
> has already been converted incorrectly without any processing
> occurring.  Its as if by merely going through the tied interface that
> data goes through some conversion process.  I have played with various
> IO settings as in "open ..." and various "use" pragmas to no avail but
> really shooting blind here on what could not be working.
>
> So the way I see it..

That rang a bell for me: Read the section ``The UTF8 flag'' in Encode to
see the problem. ${$Response->{out}} contains a copy of the stuff you're
sending to $Response->Write(), AKA $Response->WriteRef(), but without
copying the utf-8 flag.

You can make the example work by simply turning the utf8 flag
unconditionally on via

    Encode::_utf8_on(${$Response->{out}});

after the print statements in Latin-1.rasp.
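To see what the flag does in isolation, here is a minimal standalone sketch. It is not Apache::ASP code: the variable names are made up for illustration, and stripping the flag with Encode::_utf8_off only simulates a copy that arrives in the output buffer without it.

```perl
use strict;
use warnings;
use Encode ();

# Decoding Latin-1 octets yields a character string with the UTF8 flag on.
my $char = Encode::decode('ISO-8859-1', "\xE2");
my $flag_after_decode = utf8::is_utf8($char) ? 1 : 0;

# Simulate a copy that keeps the internal bytes but loses the flag
# (assumption: this mimics what ends up in the output buffer).
my $copy = $char;
Encode::_utf8_off($copy);
my $len_without_flag = length $copy;   # the two internal bytes C3 A2 are now counted separately

# Turning the flag back on restores the original single character.
Encode::_utf8_on($copy);
my $restored = ($copy eq $char) ? 1 : 0;

# A plain chr(0xE2) is stored as one byte and never had the flag.
my $plain = chr(0x00E2);
my $plain_flag = utf8::is_utf8($plain) ? 1 : 0;

printf "decoded flag=%d, length without flag=%d, restored=%d, chr() flag=%d\n",
    $flag_after_decode, $len_without_flag, $restored, $plain_flag;
```

The last value is why forcing the flag on only helps for data that carried it in the first place.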
Of course, your data should either ALL have the utf8 flag on (e.g. via
Encode::decode) or ALL have it off, because ${$Response->{out}} can
either have it on or off but obviously not both.

> Encode and perltie seem to have some conflicting bits here.
>
> If there were some workaround here I would be glad to hear it but I seem
> to have exhausted my ability to troubleshoot this.

I'm not sure there is a generic solution, except perhaps to check
``is_utf8($$dataref)'' before appending the data to $Response->{out} and
make sure that the same kind of data (flag either ON or OFF) is always
appended to $Response->{out}. See below for why this is a problem.

>> # Latin-1.rasp: #############
>>
>> <%
>> #use open ( ":utf8", ":std" );
>> #binmode ( STDOUT, ":encoding(ISO-8859-1)" );
>>
>> $::Response->{Charset} = "ISO-8859-1";
>>
>> use Encode;
>>
>> print Encode::decode('ISO-8859-1',"\xE2"),
>> Encode::decode('UTF-8',Encode::encode('UTF-8',"\xE2")),
#these will now work if
#Encode::_utf8_on(${$Response->{out}});
#is set because they have the flag themselves
>> "\x{00E2}",
>> chr(0x00E2);
#these, on the other hand will not
#
#the opposite holds true for
#Encode::_utf8_off(${$Response->{out}});
#of course
>> %>

I'm sure we can design a ``proper'' solution but not without some
user-configurable settings and a bit of ugly code.

Best Regards,
Thanos Chatziathanassiou