Moved from discussion on -committers:

https://postgr.es/m/0ef325fa06e7a1605c4e119c4ecb637c67e5fb4e.ca...@j-davis.com

Summary:

Do not use perl empty patterns like // or qr// or s//.../, the behavior
is too surprising for perl non-experts. There are a few such uses in
our tests; patch attached. Unfortunately, there is no obvious way to
automatically detect them so I am just relying on grep. I'm sure there
are others here who know more about perl than I do, so
suggestions/corrections are welcome.

Long version:

Some may know this already, but we just discovered the dangers of using
empty patterns in perl:

"If the PATTERN evaluates to the empty string, the last successfully
matched regular expression is used instead... If no match has
previously succeeded, this will (silently) act instead as a genuine
empty pattern (which will always match)."

https://perldoc.perl.org/perlop#The-empty-pattern-//

In other words, if you have code like:

   if ('xyz' =~ //)
   {
       print "'xyz' matches //\n";
   }

The match will succeed and print, because there's no previous pattern,
so // is a "genuine" empty pattern, which is treated like /.*/ (I
think?). Then, if you add some other code before it:

   if ('abc' =~ /abc/)
   {
       print "'abc' matches /abc/\n";
   }

   if ('xyz' =~ //)
   {
       print "'xyz' matches //\n";
   }

The first match will succeed, but the second match will fail, because
// is treated like /abc/.

On reflection, that does seem very perl-like. But it can cause
surprising action-at-a-distance if not used carefully, especially for
those who aren't experts in perl. It's much safer to just not use the
empty pattern.

If you use qr// instead:

https://perldoc.perl.org/perlop#qr/STRING/msixpodualn

like:

   if ('abc' =~ qr/abc/)
   {
       print "'abc' matches /abc/\n";
   }

   if ('xyz' =~ qr//)
   {
       print "'xyz' matches //\n";
   }

Then the second match may succeed or may fail, and it's not clear from
the documentation what precise circumstances matter. It seems to fail
on older versions of perl (like 5.16.3) and succeed on newer versions
(5.38.2). However, it may also depend on when the qr// is [re]compiled,
or regex flags, or locale, or may just be undefined.

Regards,
        Jeff Davis


From be5aa677e37180a8c1b0faebcceab5506b1c8130 Mon Sep 17 00:00:00 2001
From: Jeff Davis <j...@j-davis.com>
Date: Mon, 11 Mar 2024 16:44:56 -0700
Subject: [PATCH v1] perl: avoid empty regex patterns

Empty patterns have special behavior that uses the last successful
pattern match. This behavior can be surprising, so remove empty
patterns and instead match against exactly what is intended
(e.g. /^$/ or /.*/).

Unfortunately there's not an easy way to check for this in an
automated way, so it's likely that some cases have been missed and
will be missed in the future. This commit just cleans up known
instances.

Discussion: https://postgr.es/m/1548559.1710188...@sss.pgh.pa.us
---
 src/bin/pg_upgrade/t/003_logical_slots.pl   | 4 ++--
 src/bin/pg_upgrade/t/004_subscription.pl    | 2 +-
 src/bin/psql/t/001_basic.pl                 | 8 ++++----
 src/bin/psql/t/010_tab_completion.pl        | 2 +-
 src/test/recovery/t/037_invalid_database.pl | 2 +-
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/bin/pg_upgrade/t/003_logical_slots.pl b/src/bin/pg_upgrade/t/003_logical_slots.pl
index 83d71c3084..256dfd53b1 100644
--- a/src/bin/pg_upgrade/t/003_logical_slots.pl
+++ b/src/bin/pg_upgrade/t/003_logical_slots.pl
@@ -78,7 +78,7 @@ command_checks_all(
 	[
 		qr/max_replication_slots \(1\) must be greater than or equal to the number of logical replication slots \(2\) on the old cluster/
 	],
-	[qr//],
+	[qr/^$/],
 	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
 );
 ok( -d $newpub->data_dir . "/pg_upgrade_output.d",
@@ -118,7 +118,7 @@ command_checks_all(
 	[
 		qr/Your installation contains logical replication slots that can't be upgraded./
 	],
-	[qr//],
+	[qr/^$/],
 	'run of pg_upgrade of old cluster with slots having unconsumed WAL records'
 );
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
index df5d6dffbc..c8ee2390d1 100644
--- a/src/bin/pg_upgrade/t/004_subscription.pl
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -68,7 +68,7 @@ command_checks_all(
 	[
 		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
 	],
-	[qr//],
+	[qr/^$/],
 	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
 );
 
diff --git a/src/bin/psql/t/001_basic.pl b/src/bin/psql/t/001_basic.pl
index 9f0b6cf8ca..ce875ce316 100644
--- a/src/bin/psql/t/001_basic.pl
+++ b/src/bin/psql/t/001_basic.pl
@@ -412,23 +412,23 @@ my $perlbin = $^X;
 $perlbin =~ s!\\!/!g if $PostgreSQL::Test::Utils::windows_os;
 my $pipe_cmd = "$perlbin -pe '' >$g_file";
 
-psql_like($node, "SELECT 'one' \\g | $pipe_cmd", qr//, "one command \\g");
+psql_like($node, "SELECT 'one' \\g | $pipe_cmd", qr/.*/, "one command \\g");
 my $c1 = slurp_file($g_file);
 like($c1, qr/one/);
 
-psql_like($node, "SELECT 'two' \\; SELECT 'three' \\g | $pipe_cmd", qr//, "two commands \\g");
+psql_like($node, "SELECT 'two' \\; SELECT 'three' \\g | $pipe_cmd", qr/.*/, "two commands \\g");
 my $c2 = slurp_file($g_file);
 like($c2, qr/two.*three/s);
 
 
-psql_like($node, "\\set SHOW_ALL_RESULTS 0\nSELECT 'four' \\; SELECT 'five' \\g | $pipe_cmd", qr//,
+psql_like($node, "\\set SHOW_ALL_RESULTS 0\nSELECT 'four' \\; SELECT 'five' \\g | $pipe_cmd", qr/.*/,
   "two commands \\g with only last result");
 my $c3 = slurp_file($g_file);
 like($c3, qr/five/);
 unlike($c3, qr/four/);
 
 psql_like($node, "copy (values ('foo'),('bar')) to stdout \\g | $pipe_cmd",
-		  qr//,
+		  qr/.*/,
 		  "copy output passed to \\g pipe");
 my $c4 = slurp_file($g_file);
 like($c4, qr/foo.*bar/s);
diff --git a/src/bin/psql/t/010_tab_completion.pl b/src/bin/psql/t/010_tab_completion.pl
index b6575b075e..d3a95ac066 100644
--- a/src/bin/psql/t/010_tab_completion.pl
+++ b/src/bin/psql/t/010_tab_completion.pl
@@ -414,7 +414,7 @@ check_completion(
 clear_query();
 
 # check no-completions code path
-check_completion("blarg \t\t", qr//, "check completion failure path");
+check_completion("blarg \t\t", qr/.*/, "check completion failure path");
 
 clear_query();
 
diff --git a/src/test/recovery/t/037_invalid_database.pl b/src/test/recovery/t/037_invalid_database.pl
index 32b7d8af57..684cdca928 100644
--- a/src/test/recovery/t/037_invalid_database.pl
+++ b/src/test/recovery/t/037_invalid_database.pl
@@ -98,7 +98,7 @@ ok( $bgpsql->query_safe(
 	"blocked DROP DATABASE completion");
 
 # Try to drop. This will wait due to the still held lock.
-$bgpsql->query_until(qr//, "DROP DATABASE regression_invalid_interrupt;\n");
+$bgpsql->query_until(qr/.*/, "DROP DATABASE regression_invalid_interrupt;\n");
 
 # Ensure we're waiting for the lock
 $node->poll_query_until('postgres',
-- 
2.34.1

Reply via email to