On 2004-10-14 16:06:41 -0700, Susan Cassidy wrote:
> I have a cgi application that works fine using DBD::Pg to insert/select data
> from a PostgreSQL using UTF-8 (database created as UNICODE). We have data
> in multiple languages stored, which has been working fine.
>
> I have modified the application to use either Oracle or PostgreSQL,
> depending on a config file. The PostgreSQL part still works fine - web page
> shows up correctly (we specify utf-8 encoding in the header), no problems.
>
> The Oracle way is problematic.

I believe your problem may be with CGI or the STDOUT stream, not with
DBD::Oracle.

When I use the following script to dump a table:

------------------------------------------------------------------------
#!/usr/bin/perl
# Usage: script TABLE DBNAME USER PASSWORD
use DBI;
use Encode;

# Encode character strings as UTF-8 on output.
binmode STDOUT, ":utf8";

$dbh = DBI->connect("dbi:Oracle:${ARGV[1]}", $ARGV[2], $ARGV[3]);
$sth = $dbh->prepare("select * from " . $ARGV[0]);
$rv = $sth->execute;
while (@ary = $sth->fetchrow_array) {
    for my $i (0 .. $#ary) {
        print $sth->{NAME}[$i], ": ";
        # Show whether perl treats the value as characters or as raw bytes.
        print (Encode::is_utf8($ary[$i]) ? "(utf8) " : "(bytes) ");
        # print encode('utf-8', $ary[$i]);
        print $ary[$i];
        print "\n";
    }
    print "\n";
}
------------------------------------------------------------------------

It prints something like:

------------------------------------------------------------------------
ID: (bytes) 1
C: (bytes) test

ID: (bytes) 2
C: (utf8) ä
------------------------------------------------------------------------

on a UTF-8 terminal.

Note that the string containing a non-ASCII character is correctly
marked as "utf8", that is, perl knows that the string contains only one
character (a with umlaut), although it is represented with two bytes.

To print that string, the output stream must use the correct I/O layer.
At least with my version of perl (5.8.3 on Linux), the default is to
convert to Latin-1 if you don't explicitly specify ":utf8" with binmode.

If it "works" with PostgreSQL without explicitly setting the I/O layer,
I would tend to call that a bug in DBD::Pg (because it probably means
that non-ASCII characters are returned as 2 or 3 characters, not as a
single multibyte character).

This is perl, v5.8.3 built for i386-linux-thread-multi
$DBI::VERSION: 1.40
$DBD::Oracle::VERSION: 1.15

        hp

PS: I remember I found somewhere in the docs a way to set the binmode
    for STDIN, STDOUT and STDERR from the locale automatically. I can't
    find it any more. Can anybody point me to the FM I should read?

-- 
   _  | Peter J. Holzer      | Shooting the users in the foot is bad.
|_|_) | Sysadmin WSR / LUGA  | Giving them a gun isn't.
| |   | [EMAIL PROTECTED]    |   -- Gordon Schumacher,
__/   | http://www.hjp.at/   |      mozilla bug #84128