Re: [cgiapp] Problem displaying French, sometimes
Hi Ron,

Your script should send a response header as follows:

Content-Type: text/html; charset=UTF-8

and in your HTML source you should have a matching meta-tag declaring the same charset.

If you're using CA then the response header can be defined by:

$self->header_add(-charset => 'utf-8');

or if you're using sessions then you need something like

$self->session->header_add(-charset => 'utf-8');

The output from the test script looks good - nice to see the utf8 survived an email round trip!

I'll attach the test_utf8.txt file - it's just to test that utf8 data can be read from a file correctly.

The next part will be more important for you - getting the data from the database correctly. As I mentioned, you will need to edit the script at that point to account for your database setup and table names, etc.

mike

UTF8 Test data read from file
Czech and Slovak characters: š ť ž ľ č ě ď ň ř ů ĺ Š Ť Ž Ľ Č Ě Ď Ň Ř Ů Ĺ
Polish characters: ł ą ż ę ć ń ś ź Ł Ą Ż Ę Ć Ń Ś Ź
Romanian characters: Ă ă Ş ş Ţ ţ
Croatian and Slovenian characters: š č ž ć đ Š Č Ž Ć Đ
Hungarian characters: Ő ő Ű ű
German characters: Ä, ä, Ö, ö, Ü, ü, ß
Russian alphabet: абвгдеёжзийклмнопрстуфхцчшчьыъэюя АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯ
Special Byelorussian and Ukrainian characters: Ў ў Є є Ґ ґ
Special Serbian and Macedonian characters: Ђ Љ Њ Ћ Џ ђ љ њ ћ џ
Arabic: بجدﻫوزحطعفصقرشتثخذضظغ

# CGI::Application community mailing list ####
## To unsubscribe, or change your message delivery options, ##
## visit: http://www.erlbaum.net/mailman/listinfo/cgiapp ##
####
## Web archive: http://www.erlbaum.net/pipermail/cgiapp/ ##
## Wiki: http://cgiapp.erlbaum.net/ ##
####
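A minimal sketch of the two declarations described above, for a plain CGI script using only core Perl. The helper name `build_response` and the page body are assumptions made for illustration; under CGI::Application the header would come from header_add instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                      # string literals in this file are UTF-8
use Encode qw(encode);

# Hypothetical helper: returns the whole HTTP response as UTF-8 bytes.
sub build_response {
    my ($body_text) = @_;

    my $html = <<"HTML";
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body><p>$body_text</p></body>
</html>
HTML

    # The header and the meta tag must name the charset of the bytes actually sent.
    my $header = "Content-Type: text/html; charset=UTF-8\r\n\r\n";

    return $header . encode('UTF-8', $html);
}

binmode STDOUT;                # raw bytes: the response is already encoded
print build_response("Se déconnecter");
```

The point of the sketch is only that header, meta tag and output bytes all agree on UTF-8; if any one of the three disagrees, the browser may guess wrongly.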
Re: [cgiapp] Problem displaying French, sometimes
Hi Mike

Ok. Here's the output.

> Don't know if this will come through correctly via email - I'll mail

[EMAIL PROTECTED]:~$ ./test_utf8.pl
Test phrases - display and concatenation
Benützername [UTF8 on, non-ASCII, 12 characters 13 bytes]
Se déconnecter [UTF8 on, non-ASCII, 14 characters 15 bytes]
Verifié ☺ [UTF8 on, non-ASCII, 9 characters 12 bytes]
Benützername Se déconnecter Verifié ☺ [UTF8 on, non-ASCII, 37 characters 42 bytes]

Additional Test data - various languages [use -p if you want poetry]
Czech and Slovak characters: š ť ž ľ č ě ď ň ř ů ĺ Š Ť Ž Ľ Č Ě Ď Ň Ř Ů Ĺ [UTF8 on, non-ASCII, 77 characters 99 bytes]
Polish characters: ł ą ż ę ć ń ś ź Ł Ą Ż Ę Ć Ń Ś Ź [UTF8 on, non-ASCII, 50 characters 66 bytes]
Romanian characters: Ă ă Ş ş Ţ ţ [UTF8 on, non-ASCII, 32 characters 38 bytes]
Croatian and Slovenian characters: š č ž ć đ Š Č Ž Ć Đ [UTF8 on, non-ASCII, 54 characters 64 bytes]
Hungarian characters: Ő ő Ű ű [UTF8 on, non-ASCII, 29 characters 33 bytes]
German characters: Ä, ä, Ö, ö, Ü, ü, ß [UTF8 on, non-ASCII, 38 characters 45 bytes]
Russian alphabet: абвгдеёжзийклмнопрстуфхцчшчьыъэюя АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯ [UTF8 on, non-ASCII, 85 characters 151 bytes]
Special Byelorussian and Ukrainian characters: Ў ў Є є Ґ ґ [UTF8 on, non-ASCII, 58 characters 64 bytes]
Special Serbian and Macedonian characters: Ђ Љ Њ Ћ Џ ђ љ њ ћ џ [UTF8 on, non-ASCII, 62 characters 72 bytes]
Arabic: بجدﻫوزحطعفصقرشتثخذضظغ [UTF8 on, non-ASCII, 50 characters 114 bytes]

Could not open file! at ./test_utf8.pl line 47.

Line 47 is:

open (DATA, "<:utf8", "test_utf8.txt") or die("Could not open file!");

OK so far?

--
Ron Savage
[EMAIL PROTECTED]
http://savage.net.au/index.html
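The character and byte counts in the output above can be reproduced with core Perl alone. This is a sketch, not the script's own method (which uses DBI's data_string_desc); the helper name `char_and_byte_counts` is an assumption. The last few lines show one way to make the failing open report the operating-system error through `$!`.

```perl
use strict;
use warnings;
use utf8;                              # literals below are UTF-8 characters
use Encode qw(encode_utf8);

# Hypothetical helper: (characters, bytes) for a decoded Perl string.
sub char_and_byte_counts {
    my ($string) = @_;
    return (length($string), length(encode_utf8($string)));
}

binmode STDOUT, ':encoding(UTF-8)';

for my $phrase ("Benützername", "Se déconnecter", "Verifié ☺") {
    my ($chars, $bytes) = char_and_byte_counts($phrase);
    print "$phrase [$chars characters $bytes bytes]\n";
}

# Same open as line 47 of the test script, but reporting why it failed.
my $file = 'test_utf8.txt';
if (open(my $fh, '<:encoding(UTF-8)', $file)) {
    print while <$fh>;
    close $fh;
}
else {
    warn "Could not open $file: $!\n";
}
```

If the file is simply missing from the current directory, the warning should say so, which is more useful than the bare "Could not open file!".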
Re: [cgiapp] Problem displaying French, sometimes
Hi Mike

On Tue, 2008-09-09 at 11:18 +0100, Mike Tonks wrote:
> Hi Ron,
>
> Don't know if this will come through correctly via email - I'll mail

Yep, received. $many x $thanx;

> In summary - I found I did not need encode or decode functions, but
> did need to 'use utf8' and binmode utf8, and when run as cgi or CA
> need to ensure all correct utf8 headers are sent to browser.

Just for the record, what exactly do you send?

> #!/usr/bin/perl -w
>
> use utf8; #Explicitly allow utf8 in our script - this is critical
>
> binmode STDOUT, ":utf8"; # Explicitly output utf8 - this is critical

I have not tried these 2 yet, but I will today.

--
Ron Savage
[EMAIL PROTECTED]
http://savage.net.au/index.html
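To see what the two quoted lines do, here is a small sketch that writes through the same `:utf8` layer into an in-memory file instead of STDOUT, so the resulting bytes can be inspected. The helper name `bytes_written` is an assumption for illustration.

```perl
use strict;
use warnings;
use utf8;    # source literals below are decoded as UTF-8 characters

# Hypothetical helper: write $text through $layer into memory, return the bytes.
sub bytes_written {
    my ($text, $layer) = @_;
    my $buffer = '';
    open(my $fh, ">$layer", \$buffer) or die "Cannot open in-memory file: $!";
    print {$fh} $text;
    close $fh;
    return $buffer;
}

my $word = "Benützername";                   # 12 characters thanks to 'use utf8'
my $out  = bytes_written($word, ':utf8');    # same layer as binmode STDOUT, ":utf8"

printf "%d characters in, %d bytes out\n", length($word), length($out);
```

Without `use utf8` the literal would already be 13 separate bytes, and pushing those through a `:utf8` layer would encode them a second time.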
Re: [cgiapp] Problem displaying French, sometimes
Hi Ron,

Don't know if this will come through correctly via email - I'll mail you direct with the file as an attachment - but I ended up writing a simple perl script to run from the command line to test my utf8 data display and retrieval from the database.

You will need to alter the database calls to match your own setup. You can ignore the last bit that attempts to locate and fix corrupt data in the db.

In summary - I found I did not need encode or decode functions, but did need to 'use utf8' and binmode utf8, and when run as cgi or CA need to ensure all correct utf8 headers are sent to browser.

mike

#!/usr/bin/perl -w

use utf8; #Explicitly allow utf8 in our script - this is critical

binmode STDOUT, ":utf8"; # Explicitly output utf8 - this is critical

use DBI qw(:utils); # enable extra DBI debug functions
use Encode;
use Config::Auto;
use Getopt::Easy;
use Data::Dumper;

my $DELETE_MESSAGES = 0;

my $config = Config::Auto::parse("../../web/cgi-bin/BookBank/config.pm");

#warn "db_connect: " . join "|", @{$config->{db_connect}};

my $dbh = DBI->connect(@{$config->{db_connect}}) or die "Database connection failed";

get_options "e-errors f-fix= p-poetry v-verbose D-debug",
            "usage => usage: prog [-e] [-f] [-p] [-D]";

my $ben = "Benützername";
my $sed = "Se déconnecter";
my $ver = "Verifié ☺";

print "Test phrases - display and concatenation \n";
print "$ben [". data_string_desc($ben) ."]\n";
print "$sed [". data_string_desc($sed) ."]\n";
print "$ver [". data_string_desc($ver) ."]\n";
print "$ben $sed $ver [". data_string_desc("$ben $sed $ver") ."]\n";

print "\n";
print "Additional Test data - various languages [use -p if you want poetry]\n";
foreach my $test (@{&test_data()}) {
    print $test . " [".data_string_desc("$test")."]\n";
}

print "\n";

#Get utf8 data from file
open (DATA, "<:utf8", "test_utf8.txt") or die("Could not open file!");
foreach my $test (<DATA>) {
    print $test;
}

if ($O{poetry}) {
    foreach my $test (@{&poetry()}) {
        print $test . "\n\n";
    }
}

print "\n";
print "Forced Incorrect Encoding \n";

my $double_coded = encode_utf8($sed);
my $triple_coded = encode_utf8($double_coded);
my $reverse = decode_utf8($sed);

print "Double Encoded: $sed - " . $double_coded . " [".data_string_desc($double_coded)."]\n";
print "Triple Encoded: $sed - " . $triple_coded . " [".data_string_desc($triple_coded)."]\n";
print "Reverse Encoded: $sed - " . $reverse . " [".data_string_desc($reverse)."]\n";

print "\n";
print "From Database [table language_test_utf8]\n";

my $test = getLangTest($dbh);
foreach my $row (@$test) {
    print $row->{Text} . " [".data_string_desc($row->{Text})."]\n";
}

print "\n";
print "From Languages Table \n";

$test = getLangTerm($dbh);
foreach my $row (@$test) {
    print $row->{Text} . " [".data_string_desc($row->{Text})."]\n";
}

print "\n";

unless ($O{errors}) {
    print "Done initial tests, use -e [errors] to check for errors and -f yes [fix] to try to fix errors in db \n";
    exit;
}

# à â ç é è ê ë î ï ô û ù ü ÿ
my $char = "é";
my $err = encode_utf8($char);

print "Looking for errors in database [ $char ] [ $err ] \n";

my $dbl_err = encode_utf8($err);
print "Test: [ $dbl_err ] \n";

$test = getLangMatch($dbh, $err);
foreach my $row (@$test) {
    # Perl / mysql seems to handle this, but it's not technically correct
    # Gives 'wide character in print' warning
    print "Raw: $row->{Text} \n";

    my $enc = encode_utf8($row->{Text});

    # This is the correct way to do it
    print "Enc: $enc \n";

    # Adding the correct and incorrect string together causes the encoded string to be mangled
    # Gives 'wide character in print' warning
    print "Mangled: " . $enc . " [ $row->{Text} ] \n";

    my $err2 = $err;
    $err2 =~ s/Ã/ÃÂ/g;

    my $fix = $enc;
    $fix =~ s/$err/$char/g;
    $fix =~ s/$dbl_err/$char/g;
    $fix =~ s/$err2/$char/g;

    # Fixed - works for $err but not $dbl_err ?
    print "Fixed?: $fix \n";
}

# We can do a similar fix via the database, using mysql replace function
foreach $char ( qw/à â ç é è ê ë î ï ô û ù ü ÿ À Â Ä È É Ê Ë Î Ï Ô Œ Ù Û Ü Ÿ/ ) {
    $err = encode_utf8($char);
    my $err2 = $err;
    $err2 =~ s/Ã/ÃÂ/g;
    print "char: $char [ $err ] [ $err2 ] \n";

    if ($O{fix} eq "yes") {
        print "Running fix on database \n";
        doLangMatchFix($dbh, $err, $char);
    }

    my $rows = getLangMatchFix($dbh, $err, $char);
    if (scalar (@$rows) > 0) {
        foreach my $row (@$rows) {
            print "Fixed?: ".encode_utf8($row->{Text})." >> ".encode_utf8($row->{Fixed})." \n";
        }
    }
    else {
        print "No errors found - checking real data \n";
        my $rows = getLangMatch($dbh, $char);
        foreach my $row (@$rows) {
            print "OK? [ $char ]: ".encode_utf8($row->{Tex
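The "Forced Incorrect Encoding" part of the script relies on the fact that encoding already-encoded bytes a second time produces the familiar two-characters-per-accent garbage. A core-Perl sketch of that effect and of undoing one surplus layer follows; the helper name `undo_one_encoding` is an assumption, and this is not the regex-based fix used in the script.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

my $char  = "\x{E9}";               # e-acute as one character
my $once  = encode_utf8($char);     # correct UTF-8: bytes C3 A9
my $twice = encode_utf8($once);     # double encoded: bytes C3 83 C2 A9

# Hypothetical repair helper: remove exactly one surplus layer of encoding.
sub undo_one_encoding {
    my ($bytes) = @_;
    my $chars = decode_utf8($bytes);
    # After one decode every code point is below 256, so this gives plain bytes again.
    utf8::downgrade($chars);
    return $chars;
}

# Hex dump of a byte string, for display.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

my $repaired = undo_one_encoding($twice);

print "once:     ", hex_bytes($once),     "\n";
print "twice:    ", hex_bytes($twice),    "\n";
print "repaired: ", hex_bytes($repaired), "\n";
```

Applying such a repair blindly to data that was encoded correctly would damage it, so it only makes sense after confirming, as the script tries to, that a given row really is double encoded.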
Re: [cgiapp] Problem displaying French, sometimes
Hi Mike

On Mon, 2008-09-08 at 09:23 +0100, Mike Tonks wrote:
> You got me there. I'm using mysql with utf8 and this works fine for
> me. I tend to agree with Peter that utf8 is the way to go.

I've tried to go the 'utf8' way

(1) httpd.conf:
PerlSetEnv PGCLIENTENCODING UTF8

(2) startup.pl:
No change

(3) sites.fcgi
$ENV{'PGCLIENTENCODING'} = 'UTF8';

(4) populate.countries.pl:
This program does:
use Locale::SubCountry;
to load data into Postgres.

$ENV{'PGCLIENTENCODING'} = 'UTF8';

and

# encode destroys its 2nd parameter, so we protect it.

sub my_encode
{
    my($name) = @_;
    $name =~ s/(.+) \(SEE ALSO.+/$1/;

    return encode('UTF-8', $name, Encode::FB_CROAK);

} # End of my_encode;

Note: See the pod for Encode, and in particular this note:
UTF-8 vs. utf8 vs. UTF8

(5) Sites.pm:
This module displays the data:

sub my_decode
{
    my($name) = @_;

    return decode('UTF-8', $name, Encode::FB_CROAK);

} # End of my_decode;

(6) Result: The symptoms have reversed compared to my earlier msg. Agghh

Now, the mod_perl execution path displays the correct data:
CÔTE D'IVOIRE
while the fastcgid execution path displays:
CÃ"TE D'IVOIRE

(7) Buy h-bomb on ebay. kill $self. After all, what's the point :-(.

--
Ron Savage
[EMAIL PROTECTED]
http://savage.net.au/index.html
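A sketch of the encode-on-the-way-out, decode-on-the-way-in pattern from steps (4) and (5), using core Encode only. The function names `to_bytes` and `to_chars` are assumptions. It adds the documented `LEAVE_SRC` flag, which is another way to stop a CHECK value such as `FB_CROAK` from modifying the source string, alongside the copy that my_encode already makes.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Characters -> UTF-8 bytes, on the way out of the program.
sub to_bytes {
    my ($text) = @_;
    return encode('UTF-8', $text, Encode::FB_CROAK() | Encode::LEAVE_SRC());
}

# UTF-8 bytes -> characters, on the way in. Dies on malformed input.
sub to_chars {
    my ($bytes) = @_;
    return decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
}

my $name  = "C\x{D4}TE D'IVOIRE";     # 13 characters
my $bytes = to_bytes($name);          # 14 bytes, O-circumflex takes two
my $back  = to_chars($bytes);

printf "%d characters -> %d bytes -> %d characters\n",
    length($name), length($bytes), length($back);

# Latin-1 bytes are not valid UTF-8, so strict decoding refuses them.
my $accepted = eval { to_chars("C\xD4TE"); 1 };
print "Latin-1 input ", ($accepted ? "accepted" : "rejected"), "\n";
```

A croak from the decode step is a strong hint that the bytes arriving from the database are not UTF-8 at all, for instance because the client encoding seen by that process is still LATIN1.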
Re: [cgiapp] Problem displaying French, sometimes
You got me there. I'm using mysql with utf8 and this works fine for me. I tend to agree with Peter that utf8 is the way to go.

> 475 | info | CGIApp: CÔTE D'IVOIRE. Encoding: UTF8 off, ASCII, 3
> characters 3 bytes | 2008-09-08 09:27:03.887059
>
> 3 chars? WTF?
>
> The code is:
>
> $self -> log($_ -> name() . '. Encoding: ' . DBI -> data_string_desc($_
> -> name() ) );
>
Re: [cgiapp] Problem displaying French, sometimes
Hi Peter

On Sat, 2008-09-06 at 20:49 -0500, Peter Karman wrote:
> Ron Savage wrote on 9/5/08 7:51 PM:
> > Hi Folks
> >
> > Here is the set up (details below):
> > o An fcgid script calls...
> > o A module based on CGI::Application::Dispatch, which calls...
> > o My module, which reads country names from Postgres and displays them
> >
> > This works, so Ivory Coast is displayed as 'CÔte D'ivoire' (ignoring the
> > upper-case O with caret for the moment).
> >
> > But when the first module above is installed as a mod_perl handler,
> > and /that/ calls my module, the output is 'CÃ"te D'ivoire'.
> >
> > I find this scary, and would love an explanation.
> >
>
> Sounds like a typical encoding issue. The 'bad' display above is likely
> because you are sending utf8 encoded strings to the browser but claim
> that the charset is latin1.
>
> IMO, the best route is all utf8, all the time. Store strings encoded as
> utf8 in your db, send utf8 to the browser, and encode/decode at your
> program boundaries. It's a real b*tch to track down the problem spots in
> a multiple-encoding set up. That's why I wrote Search::Tools::UTF8 to
> help me. If I'm having trouble, I usually throw a to_utf8() function
> call at suspect strings and make sure I declare utf8 as my charset in
> all my http headers and output.

Nice to know about Search::Tools::UTF8. Thanx.

Using it, the valid output carps (as expected, since the -1 is documented):

[Mon Sep 08 10:01:29 2008] [warn] mod_fcgid: stderr: byte -1 (R) is not Latin1 (it's 82 dec / 52 hex) at /home/ron/perl.modules/Local-Sites/lib/Local/Sites/Test/Sites.pm line 73

whereas the invalid output carps:

byte 3 (�) is not Latin1 (it's 148 dec / 94 hex) at /home/ron/perl.modules/Local-Sites/lib/Local/Sites/Test/Sites.pm line 73

And in the log (valid, invalid):

CGIApp: ..
CGIApp: http://127.0.0.1/search/sites.fcgi
CGIApp: CÔTE D'IVOIRE. Encoding: UTF8 off, ASCII, 3 characters 3 bytes
CGIApp: is_flagged_utf8:
CGIApp: is_perl_utf8_string: 0
CGIApp: is_sane_utf8: 1
CGIApp: find_bad_latin1_report: -1
CGIApp: ..
CGIApp: http://127.0.0.1/test/sites
CGIApp: CÔTE D'IVOIRE. Encoding: UTF8 off, ASCII, 3 characters 3 bytes
CGIApp: is_flagged_utf8:
CGIApp: is_perl_utf8_string: 1
CGIApp: is_sane_utf8: 0
CGIApp: find_bad_latin1_report: 3

so I'll abandon DBI -> data_string_desc($name).

But I knew there was a problem! Your module nicely demonstrates that.

Since the underlying module is the same in both cases, the question is why does one calling mechanism work and the other mangle the data?

I'll dig into it :-((.

--
Ron Savage
[EMAIL PROTECTED]
http://savage.net.au/index.html
Re: [cgiapp] Problem displaying French, sometimes
Hi Mike

On Sun, 2008-09-07 at 09:33 +0100, Mike Tonks wrote:
> I don't have a full explanation, but the second character looks like a
> wrongly encoded double-byte utf8 issue, i.e. a utf8 character (double
> byte) being displayed as two characters. Do you see 'wide character
> in print' anywhere?

Nope. No such warning appears.

> This can happen when you concatenate two strings together if the utf8
> flags are not set correctly or if corruption has occurred. For
> example I recently had a mysql table with badly encoded utf8 stored in
> it, which caused similar things to appear.
>
> The DBI function data_string_desc may be useful to debug the status of
> your strings.

Opens another can of worms. From the database log (sorry about the wrap):

id  | level | message | timestamp
----+-------+---------+----------
470 | info | CGIApp: -- | 2008-09-08 09:27:01.123344
471 | info | CGIApp: http://127.0.0.1/search/sites.fcgi | 2008-09-08 09:27:01.127116
472 | info | CGIApp: CÔTE D'IVOIRE. Encoding: UTF8 off, ASCII, 3 characters 3 bytes | 2008-09-08 09:27:01.185259
473 | info | CGIApp: -- | 2008-09-08 09:27:03.845652
474 | info | CGIApp: http://127.0.0.1/test/sites | 2008-09-08 09:27:03.852502
475 | info | CGIApp: CÔTE D'IVOIRE. Encoding: UTF8 off, ASCII, 3 characters 3 bytes | 2008-09-08 09:27:03.887059

3 chars? WTF?

The code is:

$self -> log($_ -> name() . '. Encoding: ' . DBI -> data_string_desc($_ -> name() ) );

> What character set is your database using?

My code uses 'PerlSetEnv PGCLIENTENCODING LATIN1' to set the client character set encoding.

--
Ron Savage
[EMAIL PROTECTED]
http://savage.net.au/index.html
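One possible reading of the "3 characters 3 bytes" lines, offered as an assumption rather than something confirmed in this thread: data_string_desc is documented as a plain function, so writing `DBI -> data_string_desc(...)` would hand it the class name 'DBI' (three ASCII characters) as its first argument instead of the country name. The sketch below shows that mechanism with a hypothetical stand-in package `My::Utils` and function `describe`, so it runs without DBI installed.

```perl
use strict;
use warnings;

package My::Utils;    # hypothetical stand-in for a module that offers a plain function

# Describes its first argument only, as a plain (non-method) function would.
sub describe {
    my ($string) = @_;
    return sprintf '%d characters', length $string;
}

package main;

my $name = "C\x{D4}TE D'IVOIRE";

# Function call: the string itself is the first argument.
print "function call: ", My::Utils::describe($name), "\n";

# Method call: the class name 'My::Utils' is the first argument, $name is ignored.
print "method call:   ", My::Utils->describe($name), "\n";
```

If this reading is right, calling the function form, `DBI::data_string_desc($_ -> name() )`, or importing it with `use DBI qw(:utils)` as the test script does, should describe the actual string.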
Re: [cgiapp] Problem displaying French, sometimes
I don't have a full explanation, but the second character looks like a wrongly encoded double-byte utf8 issue, i.e. a utf8 character (double byte) being displayed as two characters. Do you see 'wide character in print' anywhere?

This can happen when you concatenate two strings together if the utf8 flags are not set correctly or if corruption has occurred. For example I recently had a mysql table with badly encoded utf8 stored in it, which caused similar things to appear.

The DBI function data_string_desc may be useful to debug the status of your strings.

What character set is your database using?

cheers,

mike

2008/9/6 Ron Savage <[EMAIL PROTECTED]>:
> Hi Folks
>
> Here is the set up (details below):
> o An fcgid script calls...
> o A module based on CGI::Application::Dispatch, which calls...
> o My module, which reads country names from Postgres and displays them
>
> This works, so Ivory Coast is displayed as 'CÔte D'ivoire' (ignoring the
> upper-case O with caret for the moment).
>
> But when the first module above is installed as a mod_perl handler,
> and /that/ calls my module, the output is 'CÃ"te D'ivoire'.
>
> I find this scary, and would love an explanation.
>
> Note: Commenting out CGI::Simple makes no difference, and commenting out
> 'use locale' makes no difference.
>
> Details:
> (1) OS:
> Debian
>
> (2) Web server:
> Apache/2.2.9 (Unix) mod_ssl/2.2.9 OpenSSL/0.9.8g
> mod_apreq2-20051231/2.6.0 mod_perl/2.0.4 Perl/v5.10.0
>
> (3) httpd.conf:
> # Warning:
> # PerlOptions -GlobalRequest
> # won't work with CGI::Application::Dispatch.
>
> PerlOptions +GlobalRequest
> PerlOptions -SetupEnv
> PerlSetEnv PGCLIENTENCODING LATIN1
> PerlSwitches -I/home/ron/perl.modules/Local-Sites/lib
> PerlSwitches -T
> PerlPostConfigRequire /home/ron/httpd/prefork/conf/startup.pl
>
> #PerlInitHandler Apache2::Reload
> #PerlSetVar ReloadAll Off
> #PerlSetVar ReloadModules Local::*
>
>
>     SetHandler perl-script
>     PerlResponseHandler Local::Sites::Test::Dispatcher
>     Order deny,allow
>     Deny from all
>     Allow from 127.0.0.1
>
>
>
>     SetHandler fcgid-script
>     Options ExecCGI
>     Order deny,allow
>     Deny from all
>     Allow from 127.0.0.1
>
>
> (4) startup.pl:
> # /home/ron/httpd/prefork/conf/startup.pl
>
> #use Apache::DBI;
> #use Apache2::Reload;
>
> #use Apache2::RequestRec ();
> #use Apache2::RequestUtil ();
> #use Apache2::Response();
> use CGI::Application::Dispatch;
> use CGI::Simple;
> use HTML::Template;
> use Local::Sites::Base::DB;
> use Local::Sites::Config;
> use Local::Sites::Rose::Countries::Manager;
> use Local::Sites::Test::Dispatcher;
> use Local::Sites::Test::Sites;
> use Log::Dispatch;
> use Log::Dispatch::DBI;
>
> 1;
>
> (5) sites.fcgi, working when called as:
> http://127.0.0.1/search/sites.fcgi
>
> #!/usr/bin/perl
>
> # Mandatory use lib for FCGID.
>
> use lib '/home/ron/perl.modules/Local-Sites/lib';
> use strict;
> use warnings;
>
> use CGI::Fast;
> use FCGI::ProcManager;
> use Local::Sites::Test::Dispatcher;
>
> # ---
>
> # Mandatory env var for FCGID. The value in httpd.conf is ignored.
>
> $ENV{PGCLIENTENCODING} = 'LATIN1';
> my($proc_manager) = FCGI::ProcManager -> new({processes => 2});
>
> $proc_manager -> pm_manage();
>
> my($cgi);
>
> while ($cgi = CGI::Fast -> new() )
> {
>     $proc_manager -> pm_pre_dispatch();
>     Local::Sites::Test::Dispatcher -> dispatch();
>     $proc_manager -> pm_post_dispatch();
> }
>
> (6) Local::Sites::Test::Dispatcher:
> package Local::Sites::Test::Dispatcher;
>
> use base 'CGI::Application::Dispatch';
> use strict;
> use warnings;
>
> our $VERSION = '1.00';
>
> # ---
>
> sub dispatch_args
> {
>     return
>     {
>         prefix => 'Local::Sites::Test',
>         table =>
>         [
>             '' => {app => 'sites', rm => 'display'},
>             ':app' => {},
>             ':app/:rm' => {},
>         ],
>     };
>
> } # End of dispatch_args.
>
> # ---
>
> 1;
>
> (7) mod_perl activity, failing when called as:
> http://127.0.0.1/test/sites
>
> (8) My module:
> package Local::Sites::Test::Sites;
>
> # Author:
> # Ron Savage <[EMAIL PROTECTED]>
>
> use base 'CGI::Application';
> use locale;
> use strict;
> use warnings FATAL => 'all', NONFATAL => 'redefine';
>
> use CGI::Simple;
> use DBI;
> use Local::Sites::Base::DB;
> use Local::Sites::Config;
> use Local::Sites::Rose::Countries::Manager;
> use Log::Dispatch;
> use Log::Dispatch::DBI;
>
> our $VERSION = '1.00';
>
> # ---
>
> sub cgiapp_get_query
> {
>     my($self) = @_;
>
>     return CGI::Simple -> new();
>
> } # End of cgiapp_get_query.
>
> # ---
> # Convert, for example, AUSTRALIA to Australia.
>
> sub nice_name
> {
>     my($na
Re: [cgiapp] Problem displaying French, sometimes
Ron Savage wrote on 9/5/08 7:51 PM:
> Hi Folks
>
> Here is the set up (details below):
> o An fcgid script calls...
> o A module based on CGI::Application::Dispatch, which calls...
> o My module, which reads country names from Postgres and displays them
>
> This works, so Ivory Coast is displayed as 'CÔte D'ivoire' (ignoring the
> upper-case O with caret for the moment).
>
> But when the first module above is installed as a mod_perl handler,
> and /that/ calls my module, the output is 'CÃ"te D'ivoire'.
>
> I find this scary, and would love an explanation.
>

Sounds like a typical encoding issue. The 'bad' display above is likely because you are sending utf8 encoded strings to the browser but claim that the charset is latin1.

IMO, the best route is all utf8, all the time. Store strings encoded as utf8 in your db, send utf8 to the browser, and encode/decode at your program boundaries. It's a real b*tch to track down the problem spots in a multiple-encoding set up. That's why I wrote Search::Tools::UTF8 to help me. If I'm having trouble, I usually throw a to_utf8() function call at suspect strings and make sure I declare utf8 as my charset in all my http headers and output.

--
Peter Karman . http://peknet.com/ . [EMAIL PROTECTED]
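The mismatch described above (utf8 bytes sent, latin1 declared) can be reproduced with core Encode. This is a sketch under one assumption: that the receiving side interprets a latin1 label as Windows-1252, which browsers commonly do, and which would explain the quote-like second character.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $name       = "C\x{D4}te D'ivoire";        # O-circumflex is U+00D4
my $utf8_bytes = encode('UTF-8', $name);      # it becomes the two bytes C3 94

# What the reader sees if told these bytes are Windows-1252:
# C3 is A-tilde and 94 is a right double quotation mark.
my $misread = decode('cp1252', $utf8_bytes);

binmode STDOUT, ':encoding(UTF-8)';
print "sent as UTF-8, read as UTF-8:  ", decode('UTF-8', $utf8_bytes), "\n";
print "sent as UTF-8, read as latin1: $misread\n";
```

Byte 94 hex is also the 148 dec / 94 hex that Search::Tools::UTF8 complains about in Ron's follow-up, which fits this explanation.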
[cgiapp] Problem displaying French, sometimes
Hi Folks

Here is the set up (details below):
o An fcgid script calls...
o A module based on CGI::Application::Dispatch, which calls...
o My module, which reads country names from Postgres and displays them

This works, so Ivory Coast is displayed as 'CÔte D'ivoire' (ignoring the upper-case O with caret for the moment).

But when the first module above is installed as a mod_perl handler, and /that/ calls my module, the output is 'CÃ"te D'ivoire'.

I find this scary, and would love an explanation.

Note: Commenting out CGI::Simple makes no difference, and commenting out 'use locale' makes no difference.

Details:
(1) OS:
Debian

(2) Web server:
Apache/2.2.9 (Unix) mod_ssl/2.2.9 OpenSSL/0.9.8g mod_apreq2-20051231/2.6.0 mod_perl/2.0.4 Perl/v5.10.0

(3) httpd.conf:
# Warning:
# PerlOptions -GlobalRequest
# won't work with CGI::Application::Dispatch.

PerlOptions +GlobalRequest
PerlOptions -SetupEnv
PerlSetEnv PGCLIENTENCODING LATIN1
PerlSwitches -I/home/ron/perl.modules/Local-Sites/lib
PerlSwitches -T
PerlPostConfigRequire /home/ron/httpd/prefork/conf/startup.pl

#PerlInitHandler Apache2::Reload
#PerlSetVar ReloadAll Off
#PerlSetVar ReloadModules Local::*

    SetHandler perl-script
    PerlResponseHandler Local::Sites::Test::Dispatcher
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1

    SetHandler fcgid-script
    Options ExecCGI
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1

(4) startup.pl:
# /home/ron/httpd/prefork/conf/startup.pl

#use Apache::DBI;
#use Apache2::Reload;

#use Apache2::RequestRec ();
#use Apache2::RequestUtil ();
#use Apache2::Response();
use CGI::Application::Dispatch;
use CGI::Simple;
use HTML::Template;
use Local::Sites::Base::DB;
use Local::Sites::Config;
use Local::Sites::Rose::Countries::Manager;
use Local::Sites::Test::Dispatcher;
use Local::Sites::Test::Sites;
use Log::Dispatch;
use Log::Dispatch::DBI;

1;

(5) sites.fcgi, working when called as:
http://127.0.0.1/search/sites.fcgi

#!/usr/bin/perl

# Mandatory use lib for FCGID.

use lib '/home/ron/perl.modules/Local-Sites/lib';
use strict;
use warnings;

use CGI::Fast;
use FCGI::ProcManager;
use Local::Sites::Test::Dispatcher;

# ---

# Mandatory env var for FCGID. The value in httpd.conf is ignored.

$ENV{PGCLIENTENCODING} = 'LATIN1';
my($proc_manager) = FCGI::ProcManager -> new({processes => 2});

$proc_manager -> pm_manage();

my($cgi);

while ($cgi = CGI::Fast -> new() )
{
    $proc_manager -> pm_pre_dispatch();
    Local::Sites::Test::Dispatcher -> dispatch();
    $proc_manager -> pm_post_dispatch();
}

(6) Local::Sites::Test::Dispatcher:
package Local::Sites::Test::Dispatcher;

use base 'CGI::Application::Dispatch';
use strict;
use warnings;

our $VERSION = '1.00';

# ---

sub dispatch_args
{
    return
    {
        prefix => 'Local::Sites::Test',
        table =>
        [
            '' => {app => 'sites', rm => 'display'},
            ':app' => {},
            ':app/:rm' => {},
        ],
    };

} # End of dispatch_args.

# ---

1;

(7) mod_perl activity, failing when called as:
http://127.0.0.1/test/sites

(8) My module:
package Local::Sites::Test::Sites;

# Author:
# Ron Savage <[EMAIL PROTECTED]>

use base 'CGI::Application';
use locale;
use strict;
use warnings FATAL => 'all', NONFATAL => 'redefine';

use CGI::Simple;
use DBI;
use Local::Sites::Base::DB;
use Local::Sites::Config;
use Local::Sites::Rose::Countries::Manager;
use Log::Dispatch;
use Log::Dispatch::DBI;

our $VERSION = '1.00';

# ---

sub cgiapp_get_query
{
    my($self) = @_;

    return CGI::Simple -> new();

} # End of cgiapp_get_query.

# ---
# Convert, for example, AUSTRALIA to Australia.

sub nice_name
{
    my($name) = @_;
    $name = ucfirst lc $name;
    $name =~ s/([ .(])([a-z])/$1\U$2/g;
    $name =~ s/(.+) \(See Also.+/$1/;
    $name =~ s/Of(\)|$)/of$1/;

    return $name;

} # End of nice_name.

# ---

sub display
{
    my($self) = @_;
    my($country) = Local::Sites::Rose::Countries::Manager -> get_countries();
    my($template) = $self -> load_tmpl('country_state.tmpl');

    $template -> param(country_loop => [map{ {name => nice_name($_ -> name() )} } @$country]);
    $template -> param(encoding => $ENV{'PGCLIENTENCODING'});

    for (@$country)
    {
        $self -> log($_ -> name() ) if ($_ -> name() =~ /IVOIRE/);
    }

    return $template -> output();

} # End of display.

# ---

sub log
{
    my($self, $s) = @_;

    $$self{'_log'} -> log(level => 'info',