I need to process and output data delivered from a web browser via the CGI interface.
To handle "real" Unicode data I set both STDIN and STDOUT to UTF-8 with binmode, as recommended at http://www.perldoc.com/perl5.8.0/pod/perluniintro.html. My script would not work otherwise.


While this works perfectly in a standard CGI environment, it does not work under mod_perl: Perl reads the input from the CGI form but does not treat it as Unicode.


I set up a simple script that reads lines from a text area and prints them out sorted (sort order according to the German locale).


As long as you only enter "standard" Western characters like A-Z everything is fine, but as soon as you enter German umlauts, special Spanish characters or the like, the script produces garbage under mod_perl.
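To illustrate the kind of garbage I mean, here is a minimal standalone sketch. It rests on an assumption I have not verified under mod_perl: that the UTF-8 octets from the browser are never decoded on input and are then encoded a second time by the :utf8 layer on STDOUT. The variable names are made up for this example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# UTF-8 octets for a-umlaut (U+00E4), as a browser would send them.
my $octets = "\xc3\xa4";

# Assumption: if the octets are never decoded, Perl treats each one as a
# separate character, and a :utf8 output layer encodes each of them again.
my $double = encode('UTF-8', $octets);

# Hex dump of what would reach the browser in that case.
print join(' ', map { sprintf('%02x', ord($_)) } split(//, $double)), "\n";
# prints "c3 83 c2 a4"
```

One umlaut becomes four octets, which the browser then displays as two wrong characters.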

mod_perl:
http://www.goldfisch.at/mod_perl/unicodetest7.pl

standard-cgi:
http://www.customers.goldfisch.at/cgi-bin/unicodetest7.pl

Perl is 5.8.5, mod_perl is the latest 1.99_16, and Apache is 2.0.51.

If somebody can show me a way to read Unicode without using binmode, I would be very glad too. I didn't manage to get "real" Unicode without it.
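What I have in mind by "without binmode" is something along these lines: decoding each form value explicitly with the core Encode module. This is only a sketch, assuming the browser really sends UTF-8. decode_field is a hypothetical helper of mine, not part of CGI.pm, and I have not confirmed that this behaves correctly under mod_perl.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of CGI.pm): turn the raw octets of a
# form field into a Perl character string, assuming UTF-8 input.
sub decode_field {
    my ($octets) = @_;
    return decode('UTF-8', $octets);
}

# "\xc3\xa4" is the UTF-8 byte sequence for a-umlaut (U+00E4).
my $raw  = "\xc3\xa4";
my $text = decode_field($raw);

printf "octets=%d chars=%d codepoint=%04x\n",
    length($raw), length($text), ord($text);
# prints "octets=2 chars=1 codepoint=00e4"
```

In the script below this would be applied to the value returned by $query->param('unicode') instead of setting a layer on STDIN.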

thnx a lot,
peter

---------------unicodetest7.pl-------------------------------------
#!/usr/local/bin/perl -w
use CGI;
use strict;

use POSIX qw(locale_h);
use locale;
setlocale(LC_COLLATE, "de_AT");

binmode(STDOUT,":utf8");
binmode(STDIN,":utf8");

my $query = new CGI;
my $charset = 'UTF-8';
$CGI::XHTML= 0;
print $query->header(-charset=>$charset),$query->start_html(-title=>'Unicodetest');
print "cgi-version = ",$CGI::VERSION," \x{263a}","<br><br>\n";


if ($query->param('submit'))
{
  print "your input sorted : <br><br>";

  my $si=$query->param('unicode');
  $si=~s/\r//g;
  # --- the following is to fix some unresolved CGI-problem
  my $sin='';
  foreach(0..length($si)-1) {
    $sin.=chr(ord(substr($si,$_,1)))
  };
  $si=$sin;
  #----

  foreach (sort( split(/\n/,$si))) {
    s/\r|\n//g;
    print $_;
    print "&nbsp;&nbsp;(length=",length($_),")";
    print "&nbsp;&nbsp;";
    foreach my $i (0..length($_)-1) {
      print sprintf ("%04x",ord(substr($_,$i,1)))."&nbsp;";
    }
    print "<br>\n";
  }
}

print '<br><br>enter your unicode-testtext here : ',$query->start_multipart_form,
$query->textarea(-name=>'unicode',-rows=>10,-columns=>100),
"\n<br>\n",
$query->submit(-name=>'submit',-value=>'proceed'),"\n",
$query->endform,"\n";
print $query->end_html;
----------------------------






--
mag. peter pilsl
goldfisch.at
IT-management
tel +43 699 1 3574035
fax +43 699 4 3574035
