On Thu, 13 Nov 2025 22:55:53 +0900
Fujii Masao <[email protected]> wrote:

> On Thu, Nov 13, 2025 at 4:09 PM Yugo Nagata <[email protected]> wrote:
> > Thank you for your review!
> > I've attached an updated patch reflecting your suggestion.
> 
> Thanks for updating the patch! LGTM.
> 
> You mentioned that the assertion failure could occur when using \syncpipeline,
> but it seems that multiple PGRES_PIPELINE_SYNC results can also appear
> even without it, which can still trigger the same issue. For example,
> I was able to reproduce the assertion failure in v16 (which doesn't support
> \syncpipeline) with the following setup:
> 
> --------------------------------
> $ cat deadlock.sql
>  \startpipeline
>  select * from a order by i for update;
>  select 1;
>  \endpipeline
> 
> $ cat deadlock2.sql
>  \startpipeline
>  select * from a order by i desc for update;
>  select 1;
>  \endpipeline
> 
> $ psql -c "create table a (i int primary key); insert into a
> values(generate_series(1,1000));"
> 
> $ pgbench -n -j 4 -c 4 -T 5 -M extended -f deadlock.sql -f deadlock2.sql
> ...
> Assertion failed: (res == ((void *)0)), function discardUntilSync,
> file pgbench.c, line 3479.
> --------------------------------
>
> So I've updated the commit message to clarify that while using \syncpipeline
> makes the failure more likely, it can still occur without it. Since the issue
> can also happen in v15 and v16 (which both lack \syncpipeline), I plan to
> backpatch the fix to v15. The failure doesn't occur in v14 because it doesn't
> support retriable error retries.

I could not reproduce it with the latest REL_16_STABLE branch.
Perhaps, the assertion failure you mentioned above was the one
fixed by 1d3ded521?
Or, I am missing something...

> I've also made a few cosmetic tweaks to the patch. Attached is the updated
> version, which I plan to push.

Thank you for updating the patch.

By the way, your prevous email has not been archived [1].
I guess it was not received by the server due to some issue.
Therefore, I've attached patches you've sent.

[1] https://www.postgresql.org/list/pgsql-hackers/since/202511130000/


-- 
Yugo Nagata <[email protected]>
>From f62f3acb82ebea71cb322b5a4b4effb3de557261 Mon Sep 17 00:00:00 2001
From: Fujii Masao <[email protected]>
Date: Thu, 13 Nov 2025 18:43:19 +0900
Subject: [PATCH v4] pgbench: Fix assertion failure when discarding results
 after retriable errors.

Previously, when pgbench ran a custom script that triggered retriable errors
(e.g., deadlocks) in pipeline mode, the following assertion failure could occur:

    Assertion failed: (res == ((void*)0)), function discardUntilSync, file pgbench.c, line 3594.

This typically happened when multiple \syncpipeline commands followed
a statement that caused a retriable error. However, even in v15 and v16
where \syncpipeline is not supported, scripts without it could still trigger
this failure.

The issue was that discardUntilSync() assumed a pipeline sync result
(PGRES_PIPELINE_SYNC) would always be followed by either another sync result
or NULL. This assumption was incorrect: when multiple sync requests were sent,
a sync result could instead be followed by another result type. In such cases,
discardUntilSync() mishandled the results, leading to the assertion failure.

This commit fixes the issue by making discardUntilSync() correctly handle cases
where a pipeline sync result is followed by other result types. It now continues
discarding results until another pipeline sync followed by NULL is reached.

Backpatched to v15, where support for retrying retriable errors in pgbench
was introduced.

Author: Yugo Nagata <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 15
---
 src/bin/pgbench/pgbench.c | 38 ++++++++++++++++++++++++++++----------
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d8764ba6fe0..8caf7b8bdaf 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3563,14 +3563,18 @@ doRetry(CState *st, pg_time_usec_t *now)
 }
 
 /*
- * Read results and discard it until a sync point.
+ * Read and discard results until the last sync point.
  */
 static int
 discardUntilSync(CState *st)
 {
 	bool		received_sync = false;
 
-	/* send a sync */
+	/*
+	 * Send a Sync message to ensure at least one PGRES_PIPELINE_SYNC is
+	 * received and to avoid an infinite loop, since all earlier ones may have
+	 * already been received.
+	 */
 	if (!PQpipelineSync(st->con))
 	{
 		pg_log_error("client %d aborted: failed to send a pipeline sync",
@@ -3578,21 +3582,26 @@ discardUntilSync(CState *st)
 		return 0;
 	}
 
-	/* receive PGRES_PIPELINE_SYNC and null following it */
+	/*
+	 * Continue reading results until the last sync point, i.e., until
+	 * reaching null just after PGRES_PIPELINE_SYNC.
+	 */
 	for (;;)
 	{
 		PGresult   *res = PQgetResult(st->con);
 
+		if (PQstatus(st->con) == CONNECTION_BAD)
+		{
+			pg_log_error("client %d aborted while rolling back the transaction after an error; perhaps the backend died while processing",
+						 st->id);
+			PQclear(res);
+			return 0;
+		}
+
 		if (PQresultStatus(res) == PGRES_PIPELINE_SYNC)
 			received_sync = true;
-		else if (received_sync)
+		else if (received_sync && res == NULL)
 		{
-			/*
-			 * PGRES_PIPELINE_SYNC must be followed by another
-			 * PGRES_PIPELINE_SYNC or NULL; otherwise, assert failure.
-			 */
-			Assert(res == NULL);
-
 			/*
 			 * Reset ongoing sync count to 0 since all PGRES_PIPELINE_SYNC
 			 * results have been discarded.
@@ -3601,6 +3610,15 @@ discardUntilSync(CState *st)
 			PQclear(res);
 			break;
 		}
+		else
+		{
+			/*
+			 * If a PGRES_PIPELINE_SYNC is followed by something other than
+			 * PGRES_PIPELINE_SYNC or NULL, another PGRES_PIPELINE_SYNC will
+			 * appear later. Reset received_sync to false to wait for it.
+			 */
+			received_sync = false;
+		}
 		PQclear(res);
 	}
 
-- 
2.51.2

>From 1865eabfd65232feff106e7c01c8c6c9161571c8 Mon Sep 17 00:00:00 2001
From: Fujii Masao <[email protected]>
Date: Thu, 13 Nov 2025 18:43:19 +0900
Subject: [PATCH v4] pgbench: Fix assertion failure when discarding results
 after retriable errors.

Previously, when pgbench ran a custom script that triggered retriable errors
(e.g., deadlocks) in pipeline mode, the following assertion failure could occur:

    Assertion failed: (res == ((void*)0)), function discardUntilSync, file 
pgbench.c, line 3594.

This typically happened when multiple \syncpipeline commands followed
a statement that caused a retriable error. However, even in v15 and v16
where \syncpipeline is not supported, scripts without it could still trigger
this failure.

The issue was that discardUntilSync() assumed a pipeline sync result
(PGRES_PIPELINE_SYNC) would always be followed by either another sync result
or NULL. This assumption was incorrect: when multiple sync requests were sent,
a sync result could instead be followed by another result type. In such cases,
discardUntilSync() mishandled the results, leading to the assertion failure.

This commit fixes the issue by making discardUntilSync() correctly handle cases
where a pipeline sync result is followed by other result types. It now continues
discarding results until another pipeline sync followed by NULL is reached.

Backpatched to v15, where support for retrying retriable errors in pgbench
was introduced.

Author: Yugo Nagata <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: 
https://postgr.es/m/[email protected]
Backpatch-through: 15
---
 src/bin/pgbench/pgbench.c | 38 ++++++++++++++++++++++++++++----------
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index adf6e45953b..4bdd507582a 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3475,14 +3475,18 @@ doRetry(CState *st, pg_time_usec_t *now)
 }
 
 /*
- * Read results and discard it until a sync point.
+ * Read and discard results until the last sync point.
  */
 static int
 discardUntilSync(CState *st)
 {
        bool            received_sync = false;
 
-       /* send a sync */
+       /*
+        * Send a Sync message to ensure at least one PGRES_PIPELINE_SYNC is
+        * received and to avoid an infinite loop, since all earlier ones may 
have
+        * already been received.
+        */
        if (!PQpipelineSync(st->con))
        {
                pg_log_error("client %d aborted: failed to send a pipeline 
sync",
@@ -3490,24 +3494,38 @@ discardUntilSync(CState *st)
                return 0;
        }
 
-       /* receive PGRES_PIPELINE_SYNC and null following it */
+       /*
+        * Continue reading results until the last sync point, i.e., until
+        * reaching null just after PGRES_PIPELINE_SYNC.
+        */
        for (;;)
        {
                PGresult   *res = PQgetResult(st->con);
 
+               if (PQstatus(st->con) == CONNECTION_BAD)
+               {
+                       pg_log_error("client %d aborted while rolling back the 
transaction after an error; perhaps the backend died while processing",
+                                                st->id);
+                       PQclear(res);
+                       return 0;
+               }
+
                if (PQresultStatus(res) == PGRES_PIPELINE_SYNC)
                        received_sync = true;
-               else if (received_sync)
+               else if (received_sync && res == NULL)
                {
-                       /*
-                        * PGRES_PIPELINE_SYNC must be followed by another
-                        * PGRES_PIPELINE_SYNC or NULL; otherwise, assert 
failure.
-                        */
-                       Assert(res == NULL);
-
                        PQclear(res);
                        break;
                }
+               else
+               {
+                       /*
+                        * If a PGRES_PIPELINE_SYNC is followed by something 
other than
+                        * PGRES_PIPELINE_SYNC or NULL, another 
PGRES_PIPELINE_SYNC will
+                        * appear later. Reset received_sync to false to wait 
for it.
+                        */
+                       received_sync = false;
+               }
                PQclear(res);
        }
 
-- 
2.51.2

Reply via email to