Hello Fabien,

On Mon, 14 Jun 2021 11:30:14 +0200 (CEST)
Fabien COELHO <coe...@cri.ensmp.fr> wrote:

> >>> Hmmm. Possibly. Another option could be not to report anything after some
> >>> errors. I'm not sure, because it would depend on the use case. I guess the
> >>> command returned an error status as well.
> >>
> >> I did not know any use cases and decisions , but I vote to report nothing 
> >> when error occurs.
> >
> > I would prefer to abort the thread whose connection got an error and report
> > results for other threads, as handled when doConnect fails in 
> > CSTATE_START_TX
> > state.
> 
> It is unclear to me whether it makes much sense to report performance when 
> things go wrong. At least when a one connection per client bench is run 
> ISTM that it should not proceed, because the bench could not even start 
> as prescribe. 

I agreed that when an initial connections fails we cannot start a bench
in the condition that the user wants and that we should stop early to let
the user know it and check the conf. 

I attached a patch, which is a fusion of my previous patch that changes the
state to CSTATE_ABORT when the socket get failure during the bench, and a
part of your patch attached in [1] that exits for initial failures.

[1] 
https://www.postgresql.org/message-id/alpine.DEB.2.22.394.2106141011100.1338009%40pseudo

> When connection break while the bench has already started, 
> maybe it makes more sense to proceed, 

The result would be incomplete also in this case. However, the reason why
it is worth to proceed is that  such information is still useful for users,
or we don't want to waste the bench that has already started?

> although I guess that maybe 
> reattempting connections would make also sense in such case.

This might become possible after pgbench gets the feature to retry in deadlock
or serialization errors. I am working on rebase of the patch  [2] and I will
submit this in a few days.

[2] 
https://www.postgresql.org/message-id/20210524112910.444fbfdfbff747bd3b972...@sraoss.co.jp

-- 
Yugo NAGATA <nag...@sraoss.co.jp>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d7479925cb..2d9080c444 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3171,6 +3171,7 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 
 					if ((st->con = doConnect()) == NULL)
 					{
+						/* as the bench is already running, we do not abort */
 						pg_log_error("client %d aborted while establishing connection", st->id);
 						st->state = CSTATE_ABORTED;
 						break;
@@ -4405,7 +4406,10 @@ runInitSteps(const char *initialize_steps)
 	initPQExpBuffer(&stats);
 
 	if ((con = doConnect()) == NULL)
+	{
+		pg_log_fatal("connection for initialization failed");
 		exit(1);
+	}
 
 	setup_cancel_handler(NULL);
 	SetCancelConn(con);
@@ -6332,7 +6336,10 @@ main(int argc, char **argv)
 	/* opening connection... */
 	con = doConnect();
 	if (con == NULL)
+	{
+		pg_log_fatal("setup connection failed");
 		exit(1);
+	}
 
 	pg_log_debug("pghost: %s pgport: %s nclients: %d %s: %d dbName: %s",
 				 PQhost(con), PQport(con), nclients,
@@ -6437,7 +6444,10 @@ main(int argc, char **argv)
 
 	errno = THREAD_BARRIER_INIT(&barrier, nthreads);
 	if (errno != 0)
+	{
 		pg_log_fatal("could not initialize barrier: %m");
+		exit(1);
+	}
 
 #ifdef ENABLE_THREAD_SAFETY
 	/* start all threads but thread 0 which is executed directly later */
@@ -6548,7 +6558,7 @@ threadRun(void *arg)
 		if (thread->logfile == NULL)
 		{
 			pg_log_fatal("could not open logfile \"%s\": %m", logpath);
-			goto done;
+			exit(1);
 		}
 	}
 
@@ -6572,16 +6582,10 @@ threadRun(void *arg)
 		{
 			if ((state[i].con = doConnect()) == NULL)
 			{
-				/*
-				 * On connection failure, we meet the barrier here in place of
-				 * GO before proceeding to the "done" path which will cleanup,
-				 * so as to avoid locking the process.
-				 *
-				 * It is unclear whether it is worth doing anything rather
-				 * than coldly exiting with an error message.
-				 */
-				THREAD_BARRIER_WAIT(&barrier);
-				goto done;
+				/* coldly abort on connection failure */
+				pg_log_fatal("cannot create connection for thread %d client %d",
+							 thread->tid, i);
+				exit(1);
 			}
 		}
 
@@ -6647,6 +6651,7 @@ threadRun(void *arg)
 				if (sock < 0)
 				{
 					pg_log_error("invalid socket: %s", PQerrorMessage(st->con));
+					st->state = CSTATE_ABORTED;
 					goto done;
 				}
 
@@ -6707,7 +6712,8 @@ threadRun(void *arg)
 					continue;
 				}
 				/* must be something wrong */
-				pg_log_fatal("%s() failed: %m", SOCKET_WAIT_METHOD);
+				pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
+				state[0].state = CSTATE_ABORTED;
 				goto done;
 			}
 		}
@@ -6733,6 +6739,7 @@ threadRun(void *arg)
 				if (sock < 0)
 				{
 					pg_log_error("invalid socket: %s", PQerrorMessage(st->con));
+					st->state = CSTATE_ABORTED;
 					goto done;
 				}
 

Reply via email to