>>>>> "Randal" == Randal L Schwartz <[email protected]> writes:
Randal> Getting really frustrated with mod_perl2's apparent inability to
Randal> properly read UTF8 input.
Randal> Here's my mod_perl2 setup:
Randal> Apache 2.2.[something]
Randal> mod_perl 2.0.7 (or nearly that)
Randal> ModPerl::Registry
Randal> Perl "script" with CGI.pm
Randal> Very early in my app:
Randal>     ## ensure utf8 CGI params:
Randal>     $CGI::PARAM_UTF8 = 1;
Randal>     binmode STDIN, ":utf8";
Randal>     binmode STDOUT, ":utf8";
Randal>     binmode STDERR, ":utf8";
Randal> This works fine in CGI mode: when I ask for $foo = $cgi->param('foo'),
Randal> DBI::data_string_desc($foo) shows a UTF8 string with the proper
Randal> discrepancy between bytes and chars.
Randal> But when I try to run it under mod_perl, the returned string appears
Randal> to be the raw undecoded bytes, and definitely not utf8. Of course, when I
Randal> store that in the database (using DBD::Pg), the "latin-1" is encoded
Randal> to "utf-8", and I get a bunch of weird chars on the output.
Randal> Has anyone managed to round-trip UTF8 from form to database and back
Randal> using a setup similar to this?
Randal> I suspect part of the problem is this in CGI.pm:
Randal>     'read_from_client' => <<'END_OF_FUNC',
Randal>     # Read data from a file handle
Randal>     sub read_from_client {
Randal>         my($self, $buff, $len, $offset) = @_;
Randal>         local $^W=0;                # prevent a warning
Randal>         return $MOD_PERL
Randal>             ? $self->r->read($$buff, $len, $offset)
Randal>             : read(\*STDIN, $$buff, $len, $offset);
Randal>     }
Randal>     END_OF_FUNC
Randal> Since I binmode STDIN, the non-$MOD_PERL works ok here. What's the
Randal> equivalent of $r->read() that marks the incoming stream as UTF8, so I
Randal> get chars instead of bytes? Or can I just read(\*STDIN) in mod_perl2
Randal> as well? (I know that was supported at one point...)
I realized that I never posted my ultimate solution. I monkey-patch
CGI.pm, wrapping CGI::param so that $CGI::PARAM_UTF8 is set on every call:
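    require CGI;
    {
        my $orig = \&CGI::param;    # keep a reference to the real param()
        no warnings 'redefine';
        *CGI::param = sub {
            $CGI::LIST_CONTEXT_WARN = 0; # workaround for backward compatibility
            $CGI::PARAM_UTF8 = 1;        # force utf8 decoding on every call
            goto &$orig;                 # hand off to the original with the same @_
        };
    }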
And this has been working just fine for both CGI and mod_perl. Just for the
record.
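
For anyone who wants to see the bytes-versus-chars discrepancy I was
talking about, here is a minimal sketch using only the core Encode module.
It does not touch CGI.pm or mod_perl at all; the sample value "café" and
the variable names are made up for illustration, and I am assuming the
form data arrives as UTF-8 bytes.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical form value "café" as raw UTF-8 bytes off the wire.
my $bytes = "caf\xC3\xA9";

# Decoding yields a character string: 5 bytes become 4 characters.
my $chars = decode('UTF-8', $bytes);

printf "bytes: length=%d flagged=%d\n",
    length($bytes), utf8::is_utf8($bytes) ? 1 : 0;
printf "chars: length=%d flagged=%d\n",
    length($chars), utf8::is_utf8($chars) ? 1 : 0;

# The failure mode: the undecoded bytes are treated as Latin-1 characters
# and encoded to UTF-8 a second time, so the two bytes of the "é" turn
# into four bytes (7 in total).
my $double = encode('UTF-8', $bytes);
printf "double-encoded: length=%d\n", length($double);
```

If the string coming back from param() looks like the first case rather
than the second, it was never decoded, and a database layer that encodes
on the way in will likely produce the double-encoded junk.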
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[email protected]> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix consulting, Technical writing, Comedy, etc. etc.
Still trying to think of something clever for the fourth line of this .sig