Hi,

On 2022-05-03 14:23:23 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> >> So it's almost surely a timing issue, and your theory here seems plausible.
> 
> > Unfortunately I don't think my theory holds, because I actually had added a
> > defense against this into the test that I forgot about momentarily...
> 
> Oh, hm.  I can try harder to repro it.

I've now reproduced it a couple times here running under rr, so it's probably
not worth putting too much effort into that.

Attached is a fix for the test that I think should avoid the problem. Couldn't
repro it with it applied, under both rr and valgrind.


My current problem is that I'm running into some IPC::Run issues (bug?). I get
"ack Broken pipe:" iff I add "SELECT pg_sleep(1);" after
"-- wait for lock held by prepared transaction"

It doesn't happen without that debugging thing, but it makes me worried that
it's something that'll come up in random situations.

It looks to me like it's a bug in IPC::Run - with a few changes I get the
failure to happen inside pump_nb(), which seems like it shouldn't error out
just because the child process exited...


I *think* it might not happen without the sleep. But I'm not at all confident.

In general I'm kinda worried on how much effectively unmaintained perl stuff
we're depending :(

Greetings,

Andres Freund
>From 8f37300a61176967ca9071a27417c698ce4301aa Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 3 May 2022 12:03:33 -0700
Subject: [PATCH] Fix timing issue in deadlock recovery conflict test.

Author:
Reviewed-By:
Discussion: https://postgr.es/m/
Backpatch:
---
 src/test/recovery/t/031_recovery_conflict.pl | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/test/recovery/t/031_recovery_conflict.pl b/src/test/recovery/t/031_recovery_conflict.pl
index 192d2d5a31d..8dcb3da0de9 100644
--- a/src/test/recovery/t/031_recovery_conflict.pl
+++ b/src/test/recovery/t/031_recovery_conflict.pl
@@ -226,6 +226,14 @@ check_conflict_stat("tablespace");
 $sect = "startup deadlock";
 $expected_conflicts++;
 
+# Want to test recovery deadlock conflicts, not buffer pin conflicts. Without
+# changing max_standby_streaming_delay it'd be timing dependent what we hit
+# first
+$node_standby->adjust_conf('postgresql.conf', 'max_standby_streaming_delay',
+						   "${PostgreSQL::Test::Utils::timeout_default}s");
+$node_standby->restart();
+reconnect_and_clear();
+
 # Generate a few dead rows, to later be cleaned up by vacuum. Then acquire a
 # lock on another relation in a prepared xact, so it's held continuously by
 # the startup process. The standby psql will block acquiring that lock while
@@ -281,6 +289,9 @@ check_conflict_stat("deadlock");
 
 # clean up for next tests
 $node_primary->safe_psql($test_db, qq[ROLLBACK PREPARED 'lock';]);
+$node_standby->adjust_conf('postgresql.conf', 'max_standby_streaming_delay', '50ms');
+$node_standby->restart();
+reconnect_and_clear();
 
 
 # Check that expected number of conflicts show in pg_stat_database. Needs to
-- 
2.35.1.677.gabf474a5dd

Reply via email to