Hi,

sorry, the mail got long, but it took me some time to reproduce the 
problematic behaviour and I think we need this detail level to 
understand what's going wrong here...

We're using Perl 5.8.0, DBI 1.30 and DBD::Oracle 1.12, both client and 
server on Linux. We just extend our application framework to support 
UTF8 seamlessly. Nearly everything works well. Only retrieving CLOB's 
from an Oracle 8.1.7 database, which was created with the character set
"german_germany.utf8", fails.

With this small script I can reproduce the error:

--snip--
  #!/usr/bin/perl

  require 5.008;

  use strict;
  use warnings;

  use DBI;
  use DBD::Oracle qw(:ora_types);
  use Encode;

  $ENV{NLS_LANG} = "german_germany.utf8";

  main: {
      my $dbh = DBI->connect (
              "dbi:Oracle:DB",
              "USER", "PASS",
              {
                      AutoCommit  => 1,
                      RaiseError  => 1,
                      PrintError  => 0,
                      LongTruncOk => 1,
              }
      );

      eval { $dbh->do ("drop table clob_test") };

      $dbh->do (
              "create table clob_test (
                      id  integer,
                      str clob,
                      txt varchar2(255)
              )"
      );

      my $id = 0;

      my $data = "some data ���";

      # INSERT -----------------------------------------------------

      my $sth = $dbh->prepare ("insert into clob_test values (?, ?, ?)");

      utf8::upgrade($data);

      { use bytes; print "put: '$data'\n" }

      $sth->bind_param (1, ++$id );
      $sth->bind_param (2, $data, { ora_type => ORA_CLOB } );
      $sth->bind_param (3, $data );
      $sth->execute;
      $sth->finish;

      # READ -------------------------------------------------------

      $sth = $dbh->prepare (
              "select txt, str
               from   clob_test
               where  id = ?"
      );

      $sth->execute ( $id );

      my ($txt) = $sth->fetchrow;
      my $chunk = $sth->blob_read (1, 0, 32768);

      Encode::_utf8_on ($chunk);

      { use bytes; print "got: '$chunk' / '$txt'\n" }

      print "clob result: ".($chunk eq $data ? "OK" : "NOT OK"),"\n";
      print "char result: ".($txt   eq $data ? "OK" : "NOT OK"),"\n";

      END { $dbh->disconnect if $dbh }
  }

--snip--

The $sth->blob_read line throws the following exception:

  DBD::Oracle::st blob_read failed:  (DBD NEED_DATA: OCILobRead)

But when I omit the $ENV{NLS_LANG} line, $sth->blob_read() works and the
script doesn't abort. Astonishingly even the result check at the end 
reports, that the clob variable is valid utf8-encoded! That means, 
Oracle omitted iso-latin-1/utf8 conversion, which is usually done, if 
the client isn't itself utf8-aware.

But exactly that happens to the varchar2 variable, which is reported as
wrong by the script (sure, because it was utf8-converted twice). This is
the script's output (without setting NLS_LANG, in the other case I get 
the exception quoted above):

  put: 'some data äüö' / 'some data äüö'
  got: 'some data äüö' / 'some data �¤�¼�¶'
  clob result: OK
  char result: NOT OK

I learn the following from this:

  a) blob_read() always throws an exception with
     NLS_LANG=german_germany.utf8

  b) as expected: when not setting NLS_LANG, Oracle converts
     values for character columns as expected from/to utf8

  c) but: when not setting NLS_LANG, Oracle does *not*
     utf8 convert any CLOB values

I think we should classify a) and c) as bugs, and b) as absolutely 
correct ;)

Does anybody has any hint what's going on here? Can DBD::Oracle can do 
something for me? Is it a (probably known) problem with the Oracle 
client libaray?

Also I would greatly apprectiate if future versions of DBI would take 
care of variable's utf8 states themself (converting to utf8 if necessary
and setting the utf8-flag for retrieved values - both only with Perl 
5.8.x and if the database encoding is utf8, or if the developer set an 
appropriate attribute or something like that).

Regards,

Joern

-- 
Joern Reder -- Software Development,  dimedis GmbH, 50672 Koeln
               http://www.dimedis.de/
supporting:    http://www.zyn.de/ http://www.exit1.org/
CPAN:          http://www.perl.com/CPAN/modules/by-module/CIPP/JRED/

Reply via email to