On 2004-10-14 16:06:41 -0700, Susan Cassidy wrote:
> I have a cgi application that works fine using DBD::Pg to insert/select data
> from a PostgreSQL using UTF-8 (database created as UNICODE).  We have data
> in multiple languages stored, which has been working fine.
>
> I have modified the application to use either Oracle or PostgreSQL,
> depending on a config file.  The PostgreSQL part still works fine - web page
> shows up correctly (we specify utf-8 encoding in the header), no problems.
>
> The Oracle way is problematic.

I believe your problem may be with CGI or the STDOUT stream, not with
DBD::Oracle. 

When I use the following script to dump a table:


------------------------------------------------------------------------
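#!/usr/bin/perl
# Arguments: table name, database name, user, password
use strict;
use warnings;
use DBI;
use Encode;

# Encode character strings as UTF-8 on output.
binmode STDOUT, ":utf8";

my $dbh = DBI->connect("dbi:Oracle:$ARGV[1]", $ARGV[2], $ARGV[3],
                       { RaiseError => 1 });

my $sth = $dbh->prepare("select * from " . $ARGV[0]);

$sth->execute;

while (my @ary = $sth->fetchrow_array) {
    for my $i (0 .. $#ary) {
        print $sth->{NAME}[$i], ": ";
        # Show whether perl holds the value as characters or as plain bytes.
        print(Encode::is_utf8($ary[$i]) ? "(utf8) " : "(bytes) ");
        # print encode('utf-8', $ary[$i]);
        print defined $ary[$i] ? $ary[$i] : "";
        print "\n";
    }
    print "\n";
}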
------------------------------------------------------------------------

It prints something like:

------------------------------------------------------------------------
ID: (bytes) 1
C: (bytes) test

ID: (bytes) 2
C: (utf8) ä

------------------------------------------------------------------------

on a UTF-8 terminal. 

Note that the string containing a non-ASCII character is correctly
marked as "utf8": perl knows that the string contains only one
character (a with umlaut), although it is represented by two bytes
internally. To print that string, the output stream must use the
correct I/O layer. At least with my version of perl (5.8.3 on Linux),
the default is to convert to latin-1 if you don't explicitly specify
":utf8" with binmode.
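
The difference between characters and bytes can be seen without a
database. The following is only a small sketch: the two-byte input is
typed in by hand to stand in for a value fetched from a table, and the
variable names are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode;

# Two raw bytes: the UTF-8 encoding of a with umlaut (U+00E4).
# This hand-written value stands in for data read from a database.
my $octets = "\xc3\xa4";

# Decoding turns the bytes into a one-character string with the utf8 flag set.
my $string = decode('utf-8', $octets);

printf "octets: length %d, %s\n", length($octets),
    (Encode::is_utf8($octets) ? "utf8" : "bytes");   # octets: length 2, bytes
printf "string: length %d, %s\n", length($string),
    (Encode::is_utf8($string) ? "utf8" : "bytes");   # string: length 1, utf8

# With the :utf8 layer the single character is written out as two bytes again.
binmode STDOUT, ":utf8";
print $string, "\n";
```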

If it "works" with PostgreSQL without explicitly setting the I/O layer,
I would tend to call that a bug in DBD::Pg (because it probably means
that non-ASCII characters are returned as 2 or 3 single-byte characters,
not as a single multibyte character).
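
A minimal sketch of that failure mode, using a hand-written byte string
rather than real DBD::Pg output (what the driver actually returns is an
assumption here, not something this example verifies):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode;

# Hypothetical driver result: UTF-8 bytes for a with umlaut, never decoded,
# so perl sees two separate one-byte characters.
my $raw = "\xc3\xa4";
printf "raw length: %d\n", length($raw);              # raw length: 2

# Written to a stream without a layer, the two bytes pass through unchanged,
# which is why the page looks right. Encoding them again (which is what a
# ":utf8" layer would do) treats each byte as a latin-1 character instead.
my $double = encode('utf-8', $raw);
my $hex = join ' ', map { sprintf '%02x', ord } split //, $double;
printf "double-encoded: %s\n", $hex;                  # double-encoded: c3 83 c2 a4

# Decoding first gives the single character the application should be seeing.
my $fixed = decode('utf-8', $raw);
printf "decoded length: %d\n", length($fixed);        # decoded length: 1
```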

This is perl, v5.8.3 built for i386-linux-thread-multi
$DBI::VERSION: 1.40
$DBD::Oracle::VERSION: 1.15

        hp

PS: I remember I found somewhere in the docs a way to set the binmode
for STDIN, STDOUT and STDERR from the locale automatically. I can't find
it any more. Can anybody point me to the FM I should read?

-- 
   _  | Peter J. Holzer      | Shooting the users in the foot is bad. 
|_|_) | Sysadmin WSR / LUGA  | Giving them a gun isn't.
| |   | [EMAIL PROTECTED]        |      -- Gordon Schumacher,
__/   | http://www.hjp.at/   |     mozilla bug #84128
