Re: Why is citext/regress failing on hamerkop?

Thomas Munro Sat, 11 May 2024 16:32:07 -0700

On Sat, May 11, 2024 at 1:14 PM Thomas Munro <[email protected]> wrote:
> Either way, it seems like we'll need to skip that test on Windows if
> we want hamerkop to be green.  That can probably be cribbed from
> collate.windows.win1252.sql into contrib/citext/sql/citext_utf8.sql's
> prelude... I just don't know how to explain it in the comment 'cause I
> don't know why.


Here's a minimal patch like that.

I don't think it's worth back-patching.  Only 15 and 16 could possibly
be affected, I think, because the test wasn't enabled before that.  I
think this is all just a late-appearing blow-up predicted by the
commit that enabled it:

commit c2e8bd27519f47ff56987b30eb34a01969b9a9e8
Author: Tom Lane <[email protected]>
Date:   Wed Jan 5 13:30:07 2022 -0500

    Enable routine running of citext's UTF8-specific test cases.

    These test cases have been commented out since citext was invented,
    because at the time we had no nice way to deal with tests that
    have restrictions such as requiring UTF8 encoding.  But now we do
    have a convention for that, ie put them into a separate test file
    with an early-exit path.  So let's enable these tests to run when
    their prerequisites are satisfied.

    (We may have to tighten the prerequisites beyond the "encoding = UTF8
    and locale != C" checks made here.  But let's put it on the buildfarm
    and see what blows up.)

Hamerkop is already green on the 15 and 16 branches, apparently
because it's using the pre-meson test stuff that I guess just didn't
run the relevant test.  In other words, nobody would notice the
difference anyway, and a master-only fix would be enough to end this
44-day red streak.

From 1f0e1dc21d4055a0e5109bac39999b290508e2d8 Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Sun, 12 May 2024 10:20:06 +1200
Subject: [PATCH] Skip the citext_utf8 test on Windows.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On other Windows build farm animals it is already skipped because they
don't use UTF-8 encoding.  On "hamerkop", UTF-8 is used, and then the
test fails.

It is not clear to me (a non-Windows person looking only at buildfarm
evidence) whether Windows is less sophisticated than other OSes and
doesn't know how to downcase Turkish İ with the standard Unicode
database, or if it is more sophisticated than other systems and uses
locale-specific behavior like ICU does.

Whichever the reason, the result is the same: we need to skip the test
on Windows, just as we already do for ICU, at least until a
Windows-savvy developer comes up with a better idea.  The technique for
detecting the OS is borrowed from collate.windows.win1252.sql.

Discussion: https://postgr.es/m/CA%2BhUKGJ1LeC3aE2qQYTK95rFVON3ZVoTQpTKJqxkHdtEyawH4A%40mail.gmail.com
---
 contrib/citext/expected/citext_utf8.out   | 3 +++
 contrib/citext/expected/citext_utf8_1.out | 3 +++
 contrib/citext/sql/citext_utf8.sql        | 3 +++
 3 files changed, 9 insertions(+)

diff --git a/contrib/citext/expected/citext_utf8.out b/contrib/citext/expected/citext_utf8.out
index 5d988dcd485..19538db674e 100644
--- a/contrib/citext/expected/citext_utf8.out
+++ b/contrib/citext/expected/citext_utf8.out
@@ -6,8 +6,11 @@
  * Turkish dotted I is not correct for many ICU locales. citext always
  * uses the default collation, so it's not easy to restrict the test
  * to the "tr-TR-x-icu" collation where it will succeed.
+ *
+ * Also disable for Windows.  It fails similarly, at least in some locales.
  */
 SELECT getdatabaseencoding() <> 'UTF8' OR
+       version() ~ '(Visual C\+\+|mingw32|windows)' OR
        (SELECT (datlocprovider = 'c' AND datctype = 'C') OR datlocprovider = 'i'
         FROM pg_database
         WHERE datname=current_database())
diff --git a/contrib/citext/expected/citext_utf8_1.out b/contrib/citext/expected/citext_utf8_1.out
index 7065a5da190..874ec8519e1 100644
--- a/contrib/citext/expected/citext_utf8_1.out
+++ b/contrib/citext/expected/citext_utf8_1.out
@@ -6,8 +6,11 @@
  * Turkish dotted I is not correct for many ICU locales. citext always
  * uses the default collation, so it's not easy to restrict the test
  * to the "tr-TR-x-icu" collation where it will succeed.
+ *
+ * Also disable for Windows.  It fails similarly, at least in some locales.
  */
 SELECT getdatabaseencoding() <> 'UTF8' OR
+       version() ~ '(Visual C\+\+|mingw32|windows)' OR
        (SELECT (datlocprovider = 'c' AND datctype = 'C') OR datlocprovider = 'i'
         FROM pg_database
         WHERE datname=current_database())
diff --git a/contrib/citext/sql/citext_utf8.sql b/contrib/citext/sql/citext_utf8.sql
index 34b232d64e2..ba283320797 100644
--- a/contrib/citext/sql/citext_utf8.sql
+++ b/contrib/citext/sql/citext_utf8.sql
@@ -6,9 +6,12 @@
  * Turkish dotted I is not correct for many ICU locales. citext always
  * uses the default collation, so it's not easy to restrict the test
  * to the "tr-TR-x-icu" collation where it will succeed.
+ *
+ * Also disable for Windows.  It fails similarly, at least in some locales.
  */
 
 SELECT getdatabaseencoding() <> 'UTF8' OR
+       version() ~ '(Visual C\+\+|mingw32|windows)' OR
        (SELECT (datlocprovider = 'c' AND datctype = 'C') OR datlocprovider = 'i'
         FROM pg_database
         WHERE datname=current_database())
-- 
2.44.0

Re: Why is citext/regress failing on hamerkop?

Reply via email to