On Friday, October 11, 2019 at 04:03:31 PM -0600, Jon Jensen wrote:

> Perl's internal storage of string data is a little odd. \xe4 is the
> correct Unicode code point as per:
>
> https://en.wikipedia.org/wiki/Latin-1_Supplement_%28Unicode_block%29
>
> It is not UTF-8 encoded, true, but there's no reason Perl internally needs
> to use UTF-8 specifically, and I believe for Latin-1 it does not by
> default. It's a question of in-memory storage and processing (some kind of
> Unicode) vs. input/output (where you want UTF-8).
>
> If your script is configured to send UTF-8 to STDOUT, then I would expect
> that \xe4 will show up as the UTF-8 \xc3\xa4 instead.

I inserted another row into this table, encoded in UTF-8:

    pos71=# select d02name from d02ben where d02bnr = '08.05.1945' ;
    освобождение

    pos71=# select d02name::bytea from d02ben where d02bnr = '08.05.1945' ;
    \xd0bed181d0b2d0bed0b1d0bed0b6d0b4d0b5d0bdd0b8d0b520202020202020 ...

If I run this through Perl DBD::Pg:

    @row = $sth->fetchrow_array;
    $HexStr = unpack("H*", $row[0]);
    print "HexStr: " . $HexStr . "\n";
    print "$row[0]\n";
    binmode(STDOUT, ':encoding(utf8)');
    print "after binmode: $row[0]\n";

it gives:

    DBI is version 1.642, DBD::Pg is version 3.10.0
    client_encoding=UTF8, server_encoding=UTF8
    HexStr: 3e41323e313e3634353d38352020202020202020 ...
    Wide character in print at ./utf8-01.pl line 66.
    освобождение
    after binmode: освобождение

and if I add a utf8::encode($row[0]) after the fetch, like:

    @row = $sth->fetchrow_array;
    utf8::encode($row[0]);

it gives the correct UTF-8 encoding:

    DBI is version 1.642, DBD::Pg is version 3.10.0
    client_encoding=UTF8, server_encoding=UTF8
    HexStr: d0bed181d0b2d0bed0b1d0bed0b6d0b4d0b5d0bdd0b8d0b520202020202020 ...
    освобождение
    after binmode: оÑвобождение

i.e. the array returned by $sth->fetchrow_array does not contain a UTF-8
encoded string. Why does it have to be passed through utf8::encode($row[0])?

Thanks

	matthias

-- 
Matthias Apitz, ✉ [email protected], http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
October 3! Congratulations! The Berlin TV Tower turns 50. From: https://www.jungewelt.de/2019/10-02/index.php
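
For reference, the two HexStr values above can be reproduced without a database. This is only a minimal sketch. It assumes that the driver hands back a decoded character string (one element per code point) rather than raw octets, and that unpack("H*", ...) keeps just the low byte of any code point above 0xFF. The variable names are made up for illustration and are not taken from utf8-01.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 octets stored in the table, copied from the ::bytea output
# above (trailing padding blanks left out).
my $stored_hex = "d0bed181d0b2d0bed0b1d0bed0b6d0b4d0b5d0bdd0b8d0b5";
my $bytes      = pack("H*", $stored_hex);

# Assumption: this is roughly what fetchrow_array returns, a character
# string in which every element is a full Unicode code point.
my $chars = decode("UTF-8", $bytes);

# Low byte of each code point (U+043E -> 3e, U+0441 -> 41, ...),
# computed explicitly instead of relying on unpack wrapping wide chars.
my $low_byte_hex = join "",
    map { sprintf "%02x", ord($_) & 0xFF } split //, $chars;

# utf8::encode works in place: it turns the character string back into
# UTF-8 octets, so a hex dump shows the stored bytes again.
my $encoded = $chars;
utf8::encode($encoded);
my $encoded_hex = unpack("H*", $encoded);

print "characters: ", length($chars), ", octets: ", length($bytes), "\n";
print "low bytes of code points: $low_byte_hex\n";
print "after utf8::encode:       $encoded_hex\n";
```

Under those assumptions the first hex string starts like the first HexStr above (3e41323e...) and the second one equals the ::bytea value, which suggests that the fetched value is a character string and that a hex dump of it only looks like UTF-8 once it has been encoded explicitly.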
