Re: pgsql: Adjust string comparison in jsonpath
пн, 12 авг. 2019 г., 3:25 Thomas Munro : > On Mon, Aug 12, 2019 at 10:30 AM Alexander Korotkov > wrote: > > > > This appears to have upset prion when testing on en_US.iso885915. > > > > > > Also lapwing's "InstallCheck-fr_FR" stage crashed on this commit, when > > > running JSON queries, on HEAD and REL_12_STABLE: > > > Thank you for pointing! I hope I can investigate this shortly. > > Hi Alexander, > > I can reproduce this by running LANG="fr_FR.ISO8859-1" initdb, then > running installcheck (on some other OSes that might be called just > "fr_FR"). See this comment in mbutils.c: > > * The functions return a palloc'd, null-terminated string if conversion > * is required. However, if no conversion is performed, the given source > * string pointer is returned as-is. > > You call pfree() on the result of pg_server_to_any() without checking > if it just returned in the input pointer (for example, it does that if > you give it an empty string). That triggers an assertion failure > somewhere inside pfree(). The following fixes that for me, and is > based on code I found elsewhere in the tree. > > --- a/src/backend/utils/adt/jsonpath_exec.c > +++ b/src/backend/utils/adt/jsonpath_exec.c > @@ -2028,8 +2028,10 @@ compareStrings(const char *mbstr1, int mblen1, > cmp = binaryCompareStrings(utf8str1, strlen(utf8str1), > > utf8str2, strlen(utf8str2)); > > - pfree(utf8str1); > - pfree(utf8str2); > + if (mbstr1 != utf8str1) > + pfree(utf8str1); > + if (mbstr2 != utf8str2) > + pfree(utf8str2); > > With that fixed it no longer crashes, but then the regression test > fails due to differences in the output, which look like locale > ordering differences. > Thank you for the diagnostics. Should be fixed by 251c8e39. BTW, test failures appears to be caused not by locale differences, but by using strlen() on non null-terminated original strings. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company >
pgsql: Fix string comparison in jsonpath
Fix string comparison in jsonpath Take into account pg_server_to_any() may return input string "as is". Reported-by: Andrew Dunstan, Thomas Munro Discussion: https://postgr.es/m/0ed83a33-d900-466a-880a-70ef456c721f%402ndQuadrant.com Author: Alexander Korotkov, Thomas Munro Backpatch-through: 12 Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/251c8e39bc6b0a3ff1620d9ac10888a7660e6b88 Modified Files -- src/backend/utils/adt/jsonpath_exec.c | 37 +++ 1 file changed, 29 insertions(+), 8 deletions(-)
pgsql: Fix string comparison in jsonpath
Fix string comparison in jsonpath Take into account pg_server_to_any() may return input string "as is". Reported-by: Andrew Dunstan, Thomas Munro Discussion: https://postgr.es/m/0ed83a33-d900-466a-880a-70ef456c721f%402ndQuadrant.com Author: Alexander Korotkov, Thomas Munro Backpatch-through: 12 Branch -- REL_12_STABLE Details --- https://git.postgresql.org/pg/commitdiff/de0dc0b758ed7452ceee756d9d620c4393b3cb53 Modified Files -- src/backend/utils/adt/jsonpath_exec.c | 37 +++ 1 file changed, 29 insertions(+), 8 deletions(-)
Re: pgsql: Adjust string comparison in jsonpath
On Mon, Aug 12, 2019 at 10:30 AM Alexander Korotkov wrote: > > > This appears to have upset prion when testing on en_US.iso885915. > > > > Also lapwing's "InstallCheck-fr_FR" stage crashed on this commit, when > > running JSON queries, on HEAD and REL_12_STABLE: > Thank you for pointing! I hope I can investigate this shortly. Hi Alexander, I can reproduce this by running LANG="fr_FR.ISO8859-1" initdb, then running installcheck (on some other OSes that might be called just "fr_FR"). See this comment in mbutils.c: * The functions return a palloc'd, null-terminated string if conversion * is required. However, if no conversion is performed, the given source * string pointer is returned as-is. You call pfree() on the result of pg_server_to_any() without checking if it just returned in the input pointer (for example, it does that if you give it an empty string). That triggers an assertion failure somewhere inside pfree(). The following fixes that for me, and is based on code I found elsewhere in the tree. --- a/src/backend/utils/adt/jsonpath_exec.c +++ b/src/backend/utils/adt/jsonpath_exec.c @@ -2028,8 +2028,10 @@ compareStrings(const char *mbstr1, int mblen1, cmp = binaryCompareStrings(utf8str1, strlen(utf8str1), utf8str2, strlen(utf8str2)); - pfree(utf8str1); - pfree(utf8str2); + if (mbstr1 != utf8str1) + pfree(utf8str1); + if (mbstr2 != utf8str2) + pfree(utf8str2); With that fixed it no longer crashes, but then the regression test fails due to differences in the output, which look like locale ordering differences. -- Thomas Munro https://enterprisedb.com
pgsql: Partially revert "Insert temporary debugging output in regressio
Partially revert "Insert temporary debugging output in regression tests." This reverts much of commit f03a9ca4366d064d89b7cf7ed75d4e43f2ed0667, but leaves the relpages/reltuples probe in select_parallel.sql. The pg_stat_all_tables probes are unstable enough to be annoying, and it no longer seems likely that they will teach us anything more about the underlying problem. I'd still like some more confirmation though that the observed plan instability is caused by VACUUM leaving relpages/reltuples as zero for one of these tables. Discussion: https://postgr.es/m/ca+hukg+0cxrkrwrmf5ymn3gm+bechna2b-q1w8onkbep4ha...@mail.gmail.com Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/b43f7c117e667fb51df36ca62e6c86054b0f8d03 Modified Files -- src/test/regress/expected/select_parallel.out | 13 - src/test/regress/expected/stats.out | 27 --- src/test/regress/sql/select_parallel.sql | 3 --- src/test/regress/sql/stats.sql| 8 4 files changed, 51 deletions(-)
Re: pgsql: Adjust string comparison in jsonpath
On Mon, Aug 12, 2019 at 1:25 AM Thomas Munro wrote: > On Mon, Aug 12, 2019 at 9:04 AM Andrew Dunstan > wrote: > > On 8/11/19 4:10 PM, Alexander Korotkov wrote: > > > Adjust string comparison in jsonpath > > > > > > We have implemented jsonpath string comparison using default database > > > locale. > > > However, standard requires us to compare Unicode codepoints. This commit > > > implements that, but for performance reasons we still use per-byte > > > comparison > > > for "==" operator. Thus, for consistency other comparison operators do > > > per-byte > > > comparison if Unicode codepoints appear to be equal. > > > > > > In some edge cases, when same Unicode codepoints have different binary > > > representations in database encoding, we diverge standard to achieve > > > better > > > performance of "==" operator. In future to implement strict standard > > > conformance, we can do normalization of input JSON strings. > > > > > > > > > This appears to have upset prion when testing on en_US.iso885915. > > Also lapwing's "InstallCheck-fr_FR" stage crashed on this commit, when > running JSON queries, on HEAD and REL_12_STABLE: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lapwing&dt=2019-08-11%2021%3A02%3A32 > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lapwing&dt=2019-08-11%2020%3A40%3A07 Thank you for pointing! I hope I can investigate this shortly. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: pgsql: Adjust string comparison in jsonpath
On Mon, Aug 12, 2019 at 9:04 AM Andrew Dunstan wrote: > On 8/11/19 4:10 PM, Alexander Korotkov wrote: > > Adjust string comparison in jsonpath > > > > We have implemented jsonpath string comparison using default database > > locale. > > However, standard requires us to compare Unicode codepoints. This commit > > implements that, but for performance reasons we still use per-byte > > comparison > > for "==" operator. Thus, for consistency other comparison operators do > > per-byte > > comparison if Unicode codepoints appear to be equal. > > > > In some edge cases, when same Unicode codepoints have different binary > > representations in database encoding, we diverge standard to achieve better > > performance of "==" operator. In future to implement strict standard > > conformance, we can do normalization of input JSON strings. > > > > > This appears to have upset prion when testing on en_US.iso885915. Also lapwing's "InstallCheck-fr_FR" stage crashed on this commit, when running JSON queries, on HEAD and REL_12_STABLE: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lapwing&dt=2019-08-11%2021%3A02%3A32 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lapwing&dt=2019-08-11%2020%3A40%3A07 -- Thomas Munro https://enterprisedb.com
Re: pgsql: Adjust string comparison in jsonpath
On 8/11/19 4:10 PM, Alexander Korotkov wrote: > Adjust string comparison in jsonpath > > We have implemented jsonpath string comparison using default database locale. > However, standard requires us to compare Unicode codepoints. This commit > implements that, but for performance reasons we still use per-byte comparison > for "==" operator. Thus, for consistency other comparison operators do > per-byte > comparison if Unicode codepoints appear to be equal. > > In some edge cases, when same Unicode codepoints have different binary > representations in database encoding, we diverge standard to achieve better > performance of "==" operator. In future to implement strict standard > conformance, we can do normalization of input JSON strings. > This appears to have upset prion when testing on en_US.iso885915. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql: Adjust string comparison in jsonpath
Adjust string comparison in jsonpath We have implemented jsonpath string comparison using default database locale. However, standard requires us to compare Unicode codepoints. This commit implements that, but for performance reasons we still use per-byte comparison for "==" operator. Thus, for consistency other comparison operators do per-byte comparison if Unicode codepoints appear to be equal. In some edge cases, when same Unicode codepoints have different binary representations in database encoding, we diverge standard to achieve better performance of "==" operator. In future to implement strict standard conformance, we can do normalization of input JSON strings. Original patch was written by Nikita Glukhov, rewritten by me. Reported-by: Markus Winand Discussion: https://postgr.es/m/8B7FA3B4-328D-43D7-95A8-37B8891B8C78%40winand.at Author: Nikita Glukhov, Alexander Korotkov Backpatch-through: 12 Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/d54ceb9e176152f930e60709e07c636e8e5414f5 Modified Files -- src/backend/utils/adt/jsonpath_exec.c| 72 +++- src/test/regress/expected/jsonb_jsonpath.out | 163 +++ src/test/regress/sql/jsonb_jsonpath.sql | 16 +++ 3 files changed, 248 insertions(+), 3 deletions(-)
pgsql: Adjust string comparison in jsonpath
Adjust string comparison in jsonpath We have implemented jsonpath string comparison using default database locale. However, standard requires us to compare Unicode codepoints. This commit implements that, but for performance reasons we still use per-byte comparison for "==" operator. Thus, for consistency other comparison operators do per-byte comparison if Unicode codepoints appear to be equal. In some edge cases, when same Unicode codepoints have different binary representations in database encoding, we diverge standard to achieve better performance of "==" operator. In future to implement strict standard conformance, we can do normalization of input JSON strings. Original patch was written by Nikita Glukhov, rewritten by me. Reported-by: Markus Winand Discussion: https://postgr.es/m/8B7FA3B4-328D-43D7-95A8-37B8891B8C78%40winand.at Author: Nikita Glukhov, Alexander Korotkov Backpatch-through: 12 Branch -- REL_12_STABLE Details --- https://git.postgresql.org/pg/commitdiff/3218ff5c6aea5841ab547ecca26927716419fe4b Modified Files -- src/backend/utils/adt/jsonpath_exec.c| 72 +++- src/test/regress/expected/jsonb_jsonpath.out | 163 +++ src/test/regress/sql/jsonb_jsonpath.sql | 16 +++ 3 files changed, 248 insertions(+), 3 deletions(-)