Hi,

I'm wondering about a change in 3.x concerning UTF-8.

When comparing the behaviors of the DBD::Pg version that ships with
Ubuntu 14.04 (libdbd-pg-perl 2.19.3-2) against a self-compiled 3.5.3,
I notice that bind parameters are now interpreted differently,
independently of the value of pg_enable_utf8.

For instance, consider the following code, run against a UTF-8
database:

  $dbh->do("SET client_encoding TO UTF8");
  $dbh->{pg_enable_utf8}=0;
  binmode(STDOUT);

  $p = "\xc3\xa9";  # U+00E9 (é) as a UTF-8 octet sequence
  $sth = $dbh->prepare("SELECT ?,length(?),octet_length(?)",
                       {pg_server_prepare=>0});
  $sth->execute($p,$p,$p);
  @r = $sth->fetchrow_array;
  printf "v%s, sending %s, getting back: %s %s %s\n",
         $DBD::Pg::VERSION, $p, @r;

With DBD::Pg 2.19.3 the client output is:
  v2.19.3, sending é, getting back: é 1 2
and the server log (log_statement=all), seen on a utf-8 terminal:
statement: SELECT 'é',length('é'),octet_length('é')
This is fine and what I expect.

With DBD::Pg 3.5.3, the client output is:
  v3.5.3, sending é, getting back: é 2 4
and the server log (log_statement=all), seen on a utf-8 terminal:
  statement: SELECT 'é',length('é'),octet_length('é')

My expectation was that 3.x would behave like 2.x with
the above code, especially when pg_enable_utf8 is 0.
With the newer version, the parameter appears to be
double-encoded, as shown by the character and octet
lengths reported by the server.
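
To illustrate what I mean by double encoding, here is a minimal
sketch in plain Perl, with no database involved. It assumes (this is
my guess, not something I have verified in the driver source) that
each octet of the unflagged string gets treated as a Latin-1
character and encoded to UTF-8 a second time:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Two octets: the UTF-8 encoding of U+00E9, without the utf-8 flag.
my $p = "\xc3\xa9";

# Assumption: the octets are taken as the characters U+00C3 and U+00A9
# and encoded to UTF-8 again, which doubles the octet count.
my $twice = encode('UTF-8', $p);

printf "octets before: %d, after: %d\n", length($p), length($twice);
printf "after, in hex: %s\n",
       join ' ', map { sprintf '%02x', ord } split //, $twice;
```

This prints 2 and 4 octets, with c3 83 c2 a9 as the resulting
sequence, which would match the length 2 / octet_length 4 that the
server reports with 3.5.3.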

Anyway, is the above output the expected behavior?

And is there a way to make it simply pass through parameters
that don't have the utf-8 flag set?


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
