On Mon, Dec 16, 2024 at 5:13 PM Shaomei Liu <sliu.newjer...@gmail.com> wrote:
> Hello,
> very happy to find this mailing list as it is my last resort!!
> I have a project which uses DBI to write to postgres DB.
> after upgrading from RHEL7 to RHEL8, the utf-8 character is not displayed
> properly in the DB. DB has correct utf-8 encoding set.
> for example, left double quotation mark “ is displayed as â\u0080\u009C.
> You can use this link to check hex utf-8 bytes
> https://www.cogsci.ed.ac.uk/~richard/utf-8.cgi?input=%E2%80%9C&mode=char
>
> below is the file testutf.pl which writes left double quotation mark “
> to the database. it also shows the query results from psql for both EL8 and
> EL7.
>
> ==========file testutf.pl==========
> #!/usr/bin/perl
> use strict;
> use warnings;
> use DBI;
> print "DBI version: $DBI::VERSION\n";
>
> my $db = "debugutf";
> my $host = "db";
> my $user = "postgres";
> my $pass = "";
> my $dbh = DBI->connect("DBI:Pg:dbname=$db;host=$host",$user,$pass);
> my $sql = 'INSERT INTO table1 (title) VALUES (?)';
> my $query = $dbh->prepare($sql);
> my @values = ('“');
> $query->execute(@values);
> ===================================
>
> ==============on RHEL8
> #execute testutf.pl which wrote “ to database on RHEL8
> text.tac1.dev.bia-boeing.com> ./testutf.pl
> DBI version: 1.641
>
> #from psql
> debugutf=# select * from table1;
>      title
> ---------------
>  â\u0080\u009C =========>unexpected
> (1 row)
>
>
> ==============on RHEL7
> #execute testutf.pl which wrote “ to database on RHEL8
> text.tac1.dev.bia-boeing.com> ./testutf.pl
> DBI version: 1.627
>
> #from psql
> debugutf=# select * from table1;
>      title
> ---------------
>  “ ============>expected
> (1 row)
>
> Any feedback is appreciated.
> thank you
> Shirley
>

Hello,

This is most likely due to changes in the version of DBD::Pg you are using.
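
For what it's worth, the â\u0080\u009C output quoted above is what you would see if the three UTF-8 bytes of “ were each treated as a separate Latin-1 character and then encoded as UTF-8 a second time. The sketch below only reproduces that byte pattern with the core Encode module; it does not touch the database, and the explicit Latin-1 decode step is my assumption about what effectively happens, not something taken from the DBD::Pg source.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# The three UTF-8 bytes of U+201C (left double quotation mark).
my $bytes = "\xE2\x80\x9C";

# Assumed mis-step: each byte is taken as one Latin-1 character
# (U+00E2, U+0080, U+009C) ...
my $as_chars = decode('ISO-8859-1', $bytes);

# ... and those three characters are then encoded as UTF-8 again.
my $double = encode('UTF-8', $as_chars);

# Show the resulting bytes in hex.
my $hex = join ' ', map { sprintf '%02X', ord } split //, $double;
print "$hex\n";   # prints "C3 A2 C2 80 C2 9C"
```

C3 A2 is the UTF-8 encoding of â, and C2 80 / C2 9C are the control characters U+0080 and U+009C, which matches what psql shows on RHEL8.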
Make sure you include the declaration "use utf8;" in any script that contains non-ASCII literal strings in its source code, and ensure the script file itself is saved in UTF-8 encoding (the default of most text editors these days). Without that declaration, Perl treats the literal “ as three separate bytes rather than as one character.

If you are getting strings from elsewhere, you will need to ensure that they are decoded from UTF-8 in whatever way is appropriate for the source. For example, Mojolicious automatically decodes request parameters from UTF-8, and either the ':encoding(UTF-8)' layer or read_text from File::Slurper will decode text read from a UTF-8-encoded file.

-Dan
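
A small illustration of the two points above, using only core modules. It assumes the script file is saved as UTF-8; the variable names are just for the example, and an in-memory filehandle stands in for a real UTF-8-encoded file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                    # decode literals in this UTF-8 encoded source file
use Encode qw(encode);

# With "use utf8" in effect, the literal is a single character (U+201C).
my $literal = '“';

# The same text as raw UTF-8 bytes, as it would sit in a file on disk.
my $raw = encode('UTF-8', $literal);

# Reading those bytes through the :encoding(UTF-8) layer decodes them
# back into characters. (In-memory handle used in place of a real file.)
open my $fh, '<:encoding(UTF-8)', \$raw or die "open failed: $!";
my $decoded = <$fh>;
close $fh;

printf "literal: %d character(s)\n", length $literal;
printf "raw:     %d byte(s)\n",      length $raw;
printf "decoded: %d character(s)\n", length $decoded;
```

If you remove the "use utf8;" line, length $literal reports 3 instead of 1, which is the situation in testutf.pl.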