Thank you, Peter! Just a quick update: I got some replies from the DBD::Pg community.

    perl -MDBD::Pg -e'print $DBD::Pg::VERSION'

    EL8  3.7.4
    EL7  2.19.3

DBD::Pg version 3.0.0 made a major change in UTF-8 handling, which is also
why the version was bumped directly from 2.19.3 to 3.0.0. So that confirms
the issue was caused by a change in DBD::Pg. But since 3.0.0 was released in
2014, it is hard to pin down the exact change.

Again, I appreciate all the support from this group for solving my issue
overnight!! Happy Holidays!

Shirley

On Thu, Dec 19, 2024 at 9:44 AM Peter J. Holzer <h...@hjp.at> wrote:

> On 2024-12-18 09:41:13 -0500, Shaomei Liu wrote:
> > if you happen to have an example to show the new behavior is more
> > "correct",
>
> I haven't been on any of the main Perl mailing-lists or newsgroups for a
> long time, so this may be outdated, but the general idea is that the
> dichotomy between byte strings and character strings was a mistake and
> that two strings which compare equal should behave the same whenever
> possible. The difference is just too subtle and error-prone.
>
> In particular, the string you created in your test script was a byte
> string with three bytes ("\xe2\x80\x9C"). That string has length 3 and
> it will compare equal to the string with the three characters
> U+00E2 LATIN SMALL LETTER A WITH CIRCUMFLEX, U+0080 PADDING CHARACTER,
> U+009C STRING TERMINATOR. So it stands to reason that it should be
> treated the same as that 3 character string, and the varchar stored in
> the database should also be 3 characters long and not just a single
> character, just because it happens to be a byte sequence which happens
> to match that character's UTF-8 encoding.
>
>
> > On Wed, Dec 18, 2024 at 8:53 AM Felipe Gasper <fel...@felipegasper.com> wrote:
> > >
> > > Do we know, in fact, why this changed?
> > >
> > > The new behaviour may be “more correct”, but it’ll still subtly
> > > break a bunch of stuff that worked fine before.
>
> True.
>
> But it should probably also be noted that Redhat 7 was released in 2014 and
> Redhat 8 in 2019. So the "new behaviour" is now between 5 and 10 years
> old.
>
> I'm too lazy to track down the release which introduced the change
> (especially since there seems to be a huge gap in the history on CPAN),
> but I would expect that to be mentioned in the release notes at the
> time.
>
>         hp
>
>
> --
>    _  | Peter J. Holzer    | Story must make more sense than reality.
> |_|_) |                    |
> | |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |       challenge!"
>
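
For readers who want to see the byte-string versus character-string point from the quoted reply in action, here is a minimal sketch in plain Perl. It is an illustration only, not the test script from the thread (which is not shown here), and it does not touch DBD::Pg or a database at all. The variable names are made up for the example. It assumes only core Perl with the Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The three-byte string discussed in the thread (a plain byte string).
my $bytes = "\xe2\x80\x9c";

# Same three code points, but stored in Perl's internal character form.
# utf8::upgrade changes the internal representation, not the string's value.
my $chars = $bytes;
utf8::upgrade($chars);

# Explicitly decoding a copy of the bytes as UTF-8 is a different operation:
# it yields a single character, U+201C LEFT DOUBLE QUOTATION MARK.
my $copy    = $bytes;
my $decoded = decode('UTF-8', $copy);

printf "bytes:   length %d\n", length $bytes;
printf "chars:   length %d, eq bytes: %s\n",
    length $chars, ($bytes eq $chars ? 'yes' : 'no');
printf "decoded: length %d, code point U+%04X\n",
    length $decoded, ord $decoded;
```

The first two strings both have length 3 and compare equal, which is the argument for treating them identically when they are sent to the database. Only the explicitly decoded string has length 1, so code that wants the single character stored would need to decode its input itself rather than rely on the driver guessing.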