Sending again after subscribing.
On Wed, Dec 18, 2024 at 11:20 AM Shaomei Liu wrote:
> Hello,
> I have a project that uses DBI to write to a PostgreSQL database.
> After upgrading from RHEL7 to RHEL8, UTF-8 characters are no longer
> displayed properly in the database. The database has the correct UTF-8
> encoding set.
> For example, the left double quotation mark “ is displayed as â\u0080\u009C.
> With support from the DBI community, the issue was solved by calling decode
> from the Encode module before writing to the database.
> I am wondering what change in DBD::Pg caused this issue.
>
> The Perl version is 5.26.3 on EL8 and 5.16.3 on EL7.
> The DBI version is 1.641 on EL8 and 1.627 on EL7.
>
> Here is the program and its execution results.
> Any feedback is greatly appreciated!
> Thank you,
> Shirley
>
> xxx.com> cat testutf_decode.pl
> #!/usr/bin/perl
> use strict;
> use warnings;
> use DBI;
> use Encode 'decode';
> print "DBI version: $DBI::VERSION\n";
>
> my $db = "debugutf";
> my $host = "db";
> my $user = "postgres";
> my $pass = "";
> my $dbh = DBI->connect("DBI:Pg:dbname=$db;host=$host",$user,$pass);
> my $sql = 'INSERT INTO table1 (title) VALUES (?)';
> my $query = $dbh->prepare($sql);
> my $bytes = '“';
> my $chars = decode('UTF-8', $bytes);
> print "$bytes contains ".length($bytes)." characters\n";
> print "after decode $bytes contains ".length($chars)." characters\n";
> #my @values = ($bytes); # without decode: database shows “ on EL7 but â\u0080\u009C on EL8
> my @values = ($chars); # with decode: database shows “ on both EL8 and EL7, decode fixed the issue
> $query->execute(@values);
>
> ############### running on EL8
> xxx.com> ./testutf_decode.pl
> DBI version: 1.641
> “ contains 3 characters
> after decode “ contains 1 characters
>
> [yyy.com]$ psql -Upostgres -hdb debugutf
> psql (16.6)
> debugutf=# select * from table1;
> title
> ---------------
> â\u0080\u009C ==========>NOK without decode
> “ =============>OK with decode, so decode fixed the issue
> (2 rows)
>
> ############### running on EL7
> xxx.com> ./testutf_decode.pl
> DBI version: 1.627
> “ contains 3 characters
> after decode “ contains 1 characters
>
> [yyy.com]$ psql -Upostgres -hdb debugutf
> psql (16.6)
> debugutf=# select * from table1;
> title
> ---------------
> “ =============>OK without decode
> “ =============>OK with decode
> (2 rows)
>
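
The â\u0080\u009C value stored on EL8 matches what you would get if the three UTF-8 bytes of “ were each treated as a Latin-1 character and then encoded to UTF-8 a second time. The sketch below reproduces that byte pattern with Encode alone, without a database. It is only an illustration of that assumed failure mode; this thread does not confirm that the driver on EL8 actually does this. The helper name hex_dump is made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: show each byte (or character) of a string as hex.
sub hex_dump {
    my ($string) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $string;
}

# The UTF-8 encoding of U+201C LEFT DOUBLE QUOTATION MARK, as raw bytes.
# This is what '“' in the script holds when "use utf8" is not in effect.
my $bytes = "\xE2\x80\x9C";

# decode() turns the three bytes into a single character.
my $chars = decode('UTF-8', $bytes);

# Assumption: something treats the undecoded bytes as three Latin-1
# characters (U+00E2, U+0080, U+009C) and encodes them to UTF-8 again.
my $double = encode('UTF-8', $bytes);

# Encoding the decoded character gives back the original three bytes.
my $single = encode('UTF-8', $chars);

print "raw bytes:      ", hex_dump($bytes),  " (length ", length($bytes),  ")\n";
print "decoded:        ", hex_dump($chars),  " (length ", length($chars),  ")\n";
print "double-encoded: ", hex_dump($double), " (length ", length($double), ")\n";
print "encoded once:   ", hex_dump($single), " (length ", length($single), ")\n";
```

Reading the double-encoded bytes back as UTF-8 yields â followed by the control characters U+0080 and U+009C, which is how psql displays the bad row above. Passing the decoded string, as the script now does, avoids the second encoding step under this assumption.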