Moved from discussion on -committers: https://postgr.es/m/0ef325fa06e7a1605c4e119c4ecb637c67e5fb4e.ca...@j-davis.com
Summary: Do not use perl empty patterns like // or qr// or s//.../, the behavior is too surprising for perl non-experts. There are a few such uses in our tests; patch attached. Unfortunately, there is no obvious way to automatically detect them so I am just relying on grep. I'm sure there are others here who know more about perl than I do, so suggestions/corrections are welcome. Long version: Some may know this already, but we just discovered the dangers of using empty patterns in perl: "If the PATTERN evaluates to the empty string, the last successfully matched regular expression is used instead... If no match has previously succeeded, this will (silently) act instead as a genuine empty pattern (which will always match)." https://perldoc.perl.org/perlop#The-empty-pattern-// In other words, if you have code like: if ('xyz' =~ //) { print "'xyz' matches //\n"; } The match will succeed and print, because there's no previous pattern, so // is a "genuine" empty pattern, which is treated like /.*/ (I think?). Then, if you add some other code before it: if ('abc' =~ /abc/) { print "'abc' matches /abc/\n"; } if ('xyz' =~ //) { print "'xyz' matches //\n"; } The first match will succeed, but the second match will fail, because // is treated like /abc/. On reflection, that does seem very perl-like. But it can cause surprising action-at-a-distance if not used carefully, especially for those who aren't experts in perl. It's much safer to just not use the empty pattern. If you use qr// instead: https://perldoc.perl.org/perlop#qr/STRING/msixpodualn like: if ('abc' =~ qr/abc/) { print "'abc' matches /abc/\n"; } if ('xyz' =~ qr//) { print "'xyz' matches //\n"; } Then the second match may succeed or may fail, and it's not clear from the documentation what precise circumstances matter. It seems to fail on older versions of perl (like 5.16.3) and succeed on newer versions (5.38.2). However, it may also depend on when the qr// is [re]compiled, or regex flags, or locale, or may just be undefined. Regards, Jeff Davis
From be5aa677e37180a8c1b0faebcceab5506b1c8130 Mon Sep 17 00:00:00 2001 From: Jeff Davis <j...@j-davis.com> Date: Mon, 11 Mar 2024 16:44:56 -0700 Subject: [PATCH v1] perl: avoid empty regex patterns Empty patterns have special behavior that uses the last successful pattern match. This behavior can be surprising, so remove empty patterns and instead match against exactly what is intended (e.g. /^$/ or /.*/). Unfortunately there's not an easy way to check for this in an automated way, so it's likely that some cases have been missed and will be missed in the future. This commit just cleans up known instances. Discussion: https://postgr.es/m/1548559.1710188...@sss.pgh.pa.us --- src/bin/pg_upgrade/t/003_logical_slots.pl | 4 ++-- src/bin/pg_upgrade/t/004_subscription.pl | 2 +- src/bin/psql/t/001_basic.pl | 8 ++++---- src/bin/psql/t/010_tab_completion.pl | 2 +- src/test/recovery/t/037_invalid_database.pl | 2 +- 5 files changed, 9 insertions(+), 9 deletions(-) diff --git a/src/bin/pg_upgrade/t/003_logical_slots.pl b/src/bin/pg_upgrade/t/003_logical_slots.pl index 83d71c3084..256dfd53b1 100644 --- a/src/bin/pg_upgrade/t/003_logical_slots.pl +++ b/src/bin/pg_upgrade/t/003_logical_slots.pl @@ -78,7 +78,7 @@ command_checks_all( [ qr/max_replication_slots \(1\) must be greater than or equal to the number of logical replication slots \(2\) on the old cluster/ ], - [qr//], + [qr/^$/], 'run of pg_upgrade where the new cluster has insufficient max_replication_slots' ); ok( -d $newpub->data_dir . "/pg_upgrade_output.d", @@ -118,7 +118,7 @@ command_checks_all( [ qr/Your installation contains logical replication slots that can't be upgraded./ ], - [qr//], + [qr/^$/], 'run of pg_upgrade of old cluster with slots having unconsumed WAL records' ); diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl index df5d6dffbc..c8ee2390d1 100644 --- a/src/bin/pg_upgrade/t/004_subscription.pl +++ b/src/bin/pg_upgrade/t/004_subscription.pl @@ -68,7 +68,7 @@ command_checks_all( [ qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/ ], - [qr//], + [qr/^$/], 'run of pg_upgrade where the new cluster has insufficient max_replication_slots' ); diff --git a/src/bin/psql/t/001_basic.pl b/src/bin/psql/t/001_basic.pl index 9f0b6cf8ca..ce875ce316 100644 --- a/src/bin/psql/t/001_basic.pl +++ b/src/bin/psql/t/001_basic.pl @@ -412,23 +412,23 @@ my $perlbin = $^X; $perlbin =~ s!\\!/!g if $PostgreSQL::Test::Utils::windows_os; my $pipe_cmd = "$perlbin -pe '' >$g_file"; -psql_like($node, "SELECT 'one' \\g | $pipe_cmd", qr//, "one command \\g"); +psql_like($node, "SELECT 'one' \\g | $pipe_cmd", qr/.*/, "one command \\g"); my $c1 = slurp_file($g_file); like($c1, qr/one/); -psql_like($node, "SELECT 'two' \\; SELECT 'three' \\g | $pipe_cmd", qr//, "two commands \\g"); +psql_like($node, "SELECT 'two' \\; SELECT 'three' \\g | $pipe_cmd", qr/.*/, "two commands \\g"); my $c2 = slurp_file($g_file); like($c2, qr/two.*three/s); -psql_like($node, "\\set SHOW_ALL_RESULTS 0\nSELECT 'four' \\; SELECT 'five' \\g | $pipe_cmd", qr//, +psql_like($node, "\\set SHOW_ALL_RESULTS 0\nSELECT 'four' \\; SELECT 'five' \\g | $pipe_cmd", qr/.*/, "two commands \\g with only last result"); my $c3 = slurp_file($g_file); like($c3, qr/five/); unlike($c3, qr/four/); psql_like($node, "copy (values ('foo'),('bar')) to stdout \\g | $pipe_cmd", - qr//, + qr/.*/, "copy output passed to \\g pipe"); my $c4 = slurp_file($g_file); like($c4, qr/foo.*bar/s); diff --git a/src/bin/psql/t/010_tab_completion.pl b/src/bin/psql/t/010_tab_completion.pl index b6575b075e..d3a95ac066 100644 --- a/src/bin/psql/t/010_tab_completion.pl +++ b/src/bin/psql/t/010_tab_completion.pl @@ -414,7 +414,7 @@ check_completion( clear_query(); # check no-completions code path -check_completion("blarg \t\t", qr//, "check completion failure path"); +check_completion("blarg \t\t", qr/.*/, "check completion failure path"); clear_query(); diff --git a/src/test/recovery/t/037_invalid_database.pl b/src/test/recovery/t/037_invalid_database.pl index 32b7d8af57..684cdca928 100644 --- a/src/test/recovery/t/037_invalid_database.pl +++ b/src/test/recovery/t/037_invalid_database.pl @@ -98,7 +98,7 @@ ok( $bgpsql->query_safe( "blocked DROP DATABASE completion"); # Try to drop. This will wait due to the still held lock. -$bgpsql->query_until(qr//, "DROP DATABASE regression_invalid_interrupt;\n"); +$bgpsql->query_until(qr/.*/, "DROP DATABASE regression_invalid_interrupt;\n"); # Ensure we're waiting for the lock $node->poll_query_until('postgres', -- 2.34.1