On Mon, Dec 16, 2024 at 5:13 PM Shaomei Liu <sliu.newjer...@gmail.com>
wrote:

> Hello,
> very happy to find this mailing list as it is my last resort!!
> I have a project which uses DBI to write to postgres DB.
> after upgrading from RHEL7 to RHEL8, the utf-8 character is not displayed
> properly in the DB. DB has correct utf-8 encoding set.
> for example, left double quotation mark   “  is displayed as â\u0080\u009C
> .
> You can use this link to check hex utf-8 bytes
> https://www.cogsci.ed.ac.uk/~richard/utf-8.cgi?input=%E2%80%9C&mode=char
>
> below is the file testutf.pl which writes left double quotation mark  “
> to the database. it also shows the query results from psql for both EL8 and
> EL7.
>
> ==========file testutf.pl==========
> #!/usr/bin/perl
> use strict;
> use warnings;
> use DBI;
> print "DBI version: $DBI::VERSION\n";
>
> my $db = "debugutf";
> my $host = "db";
> my $user = "postgres";
> my $pass = "";
> my $dbh = DBI->connect("DBI:Pg:dbname=$db;host=$host",$user,$pass);
> my $sql = 'INSERT INTO table1 (title) VALUES (?)';
> my $query = $dbh->prepare($sql);
> my @values = ('“');
> $query->execute(@values);
> ===================================
>
> ==============on RHEL8
> #execute testutf.pl which wrote “ to database on RHEL8
> text.tac1.dev.bia-boeing.com> ./testutf.pl
> DBI version: 1.641
>
> #from psql
> debugutf=# select * from table1;
>      title
> ---------------
>  â\u0080\u009C  =========>unexpected
> (1 row)
>
>
> ==============on RHEL7
> #execute testutf.pl which wrote “ to database on RHEL7
> text.tac1.dev.bia-boeing.com> ./testutf.pl
> DBI version: 1.627
>
> #from psql
> debugutf=# select * from table1;
>      title
> ---------------
>  “       ============>expected
> (1 row)
>
> Any feedback is appreciated.
> thank you
> Shirley
>

Hello,

This is most likely due to changes in the version of DBD::Pg you are using
rather than in DBI itself. Without "use utf8;", Perl reads the “ in your
source as three separate bytes, and newer DBD::Pg versions appear to encode
each of those bytes as its own character, which would produce the
â\u0080\u009C you see. Make sure you include the declaration "use utf8;" in
any script that contains non-ASCII literal strings, and ensure the script
itself is saved in UTF-8 encoding (the default of most text editors these
days). If you are getting strings from elsewhere, you will need to ensure
that they are decoded from UTF-8 in whatever way is appropriate - for
example, Mojolicious automatically decodes request parameters from UTF-8,
and using the ':encoding(UTF-8)' layer or read_text from File::Slurper will
decode text read from a UTF-8-encoded file.
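
A minimal sketch of the difference, using only core modules and no
database. The variable names are made up for illustration, and the
"double encoding" step is an assumption about what happens to undecoded
input on the newer setup, not something confirmed in this thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                        # non-ASCII literals below become characters
use Encode qw(decode encode);

# With "use utf8", this literal is a single character, U+201C.
my $chars = '“';

# The same literal as Perl sees it WITHOUT "use utf8": three raw bytes.
my $bytes = "\xE2\x80\x9C";

# Strings that arrive as UTF-8 bytes should be decoded before use.
my $decoded = decode('UTF-8', $bytes);

# Reading through an ':encoding(UTF-8)' layer decodes as well; an
# in-memory filehandle stands in for a real UTF-8-encoded file here.
open my $fh, '<:encoding(UTF-8)', \$bytes or die "open failed: $!";
my $from_file = <$fh>;
close $fh;

# Encoding the undecoded bytes as if they were characters gives six
# bytes: the UTF-8 forms of U+00E2, U+0080 and U+009C (the mojibake).
my $double = encode('UTF-8', $bytes);

printf "chars=%d bytes=%d double=%d\n",
    length $chars, length $bytes, length $double;
# prints: chars=1 bytes=3 double=6
```

In testutf.pl, the corresponding change would be adding "use utf8;" near
the top, so that the '“' in @values is passed to execute() as one
character instead of three bytes.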

-Dan
