On 6/5/12 2:02 AM, Arnon Weinberg wrote:

How can I set the output character encoding of Apache::ASP output?
...

Hi Arnon, All,

I have gone over the thread and been stumped on this for a while. Bottom line: it looks like Apache::ASP does not play well with Encode, and this seems to me to come down to the PerlIO interactions, with something not quite connecting right on a tied file handle. But I do not know the answer to solve this. :(

To explain where there is some magic at play:

Apache::ASP::Response does a "use bytes", which is there to deal with the output stream correctly; I believe this is about content length calculations. I think this is fine here, and turning it off makes things worse for these examples.
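To illustrate what "use bytes" changes for a length calculation, here is a minimal standalone sketch. It is not taken from Apache::ASP::Response; the sub names char_length and byte_length are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only, not part of Apache::ASP.
sub char_length {
    my ($s) = @_;
    return length $s;       # counts characters
}

sub byte_length {
    my ($s) = @_;
    use bytes;              # lexically scoped: length() now counts internal bytes
    return length $s;
}

# One character (U+263A) that Perl stores internally as three UTF-8 bytes.
my $smiley = "\x{263A}";

printf "characters: %d, bytes: %d\n",
    char_length($smiley), byte_length($smiley);
```

A Content-Length header has to count bytes, not characters, which would explain why the pragma is there.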

More importantly, Apache::ASP::Response is tied as a file handle when this code is run:

        tie *RESPONSE, 'Apache::ASP::Response', $self->{Response};
        select(RESPONSE);

This is to allow print to go to $Response->PRINT, which aliases to $Response->Write. Fundamentally, all output goes through $Response->Write at the end of the day, including the script's static content itself.
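The tie-and-select arrangement can be sketched outside of Apache::ASP as follows. My::Response is a hypothetical stand-in class, and its Write simply appends to a buffer; the real Apache::ASP::Response does considerably more.

```perl
use strict;
use warnings;

package My::Response;    # hypothetical stand-in for Apache::ASP::Response

sub new { my ($class) = @_; return bless { buffer => '' }, $class; }

# tie() calls TIEHANDLE; returning the existing object makes it the tie object.
sub TIEHANDLE { my ($class, $obj) = @_; return $obj; }

# Everything printed ends up here.
sub Write {
    my ($self, @data) = @_;
    $self->{buffer} .= join '', @data;
    return 1;
}

# print on the tied handle is dispatched to PRINT, which forwards to Write.
sub PRINT { my $self = shift; return $self->Write(@_); }

package main;

my $response = My::Response->new;
tie *RESPONSE, 'My::Response', $response;

my $previous = select(RESPONSE);    # a bare print now targets the tied handle
print "hello ", "world";
select($previous);                  # restore the previous default handle

print "captured: $response->{buffer}\n";
```

An explicit print STDOUT never reaches PRINT, which matches the observation that writing to STDOUT directly bypasses $Response.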

What I have found is that this will output the correct bytes in this Apache::ASP script:

<% print STDOUT Encode::decode('ISO-8859-1',"\xE2"); %>

as it bypasses the tied file handle layer to $Response, so we know perl is working at this point!

but doing this is where we have a problem:

<% print Encode::decode('ISO-8859-1',"\xE2"); %>

and immediately in the Apache::ASP::Response::Write() method the data has already been converted incorrectly, without any processing having occurred. It's as if merely going through the tied interface puts the data through some conversion process. I have played with various IO settings, as in "open ...", and various "use" pragmas to no avail, but I am really shooting blind here on what could be going wrong.

So the way I see it..

Encoding Magic
File handle tie Magic  <--- data conversion
Data to $Response->Write

Encode and perltie seem to have some conflicting bits here.
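One way to probe this from plain Perl is to look at how the four test expressions from the Latin-1.rasp script quoted below are stored internally. The sketch is offered as an assumption about the cause rather than a confirmed diagnosis, and it has not been run inside Apache::ASP. It relies only on utf8::is_utf8, bytes::length and Encode::encode.

```perl
use strict;
use warnings;
use bytes ();     # loads bytes::length without enabling byte semantics
use Encode ();

# The four expressions printed by the Latin-1.rasp test case.
my @tests = (
    Encode::decode('ISO-8859-1', "\xE2"),
    Encode::decode('UTF-8', Encode::encode('UTF-8', "\xE2")),
    "\x{00E2}",
    chr(0x00E2),
);

my $total = 0;
for my $i (0 .. $#tests) {
    my $s        = $tests[$i];
    my $flag     = utf8::is_utf8($s) ? 'on' : 'off';    # internal UTF8 flag
    my $internal = bytes::length($s);                   # what byte semantics sees
    # Explicitly encoding to octets yields an unflagged byte string.
    my $octets   = Encode::encode('ISO-8859-1', $s);
    printf "test %d: UTF8 flag %-3s internal bytes %d, after encode %d\n",
        $i + 1, $flag, $internal, bytes::length($octets);
    $total += $internal;
}
print "total internal bytes: $total\n";
```

On a typical Perl 5 build the first two strings should come out flagged with two internal bytes each and the last two unflagged with one byte each, six in total, which lines up with the Content-Length: 6 and the hexdump in the quoted test. If code running under "use bytes" is what sees those internal bytes, encoding to octets before print might sidestep the problem, but that is untested here.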

If there were some workaround here, I would be glad to hear it, but I seem to have exhausted my ability to troubleshoot this.

Regards,

Josh



# Latin-1.rasp: #############

<%
#use open ( ":utf8", ":std" );
#binmode ( STDOUT, ":encoding(ISO-8859-1)" );

$::Response->{Charset} = "ISO-8859-1";

use Encode;

print Encode::decode('ISO-8859-1',"\xE2"),
Encode::decode('UTF-8',Encode::encode('UTF-8',"\xE2")),
"\x{00E2}",
chr(0x00E2);
%>

#############################

asp-perl Latin-1.rasp
Content-Type: text/html; charset=ISO-8859-1
Content-Length: 6
Cache-Control: private

ââââ
asp-perl Latin-1.rasp | tail -1 | hexdump
0000000 a2c3 a2c3 e2e2
0000006

For some reason, the first 2 test characters are UTF-8 encoded, and the last 2
are ISO-8859-1 encoded.
How can I get the same results as the CGI script above?


