Hi,

On 2023-01-20 17:12:37 -0800, Andres Freund wrote:
> We have code like this in libpqrcv_connect():
> 
>       conn = palloc0(sizeof(WalReceiverConn));
>       conn->streamConn = PQconnectStartParams(keys, vals,
>                                                                               
>          /* expand_dbname = */ true);
>       if (PQstatus(conn->streamConn) == CONNECTION_BAD)
>       {
>               *err = pchomp(PQerrorMessage(conn->streamConn));
>               return NULL;
>       }
> 
>         [try to establish connection]
> 
>       if (PQstatus(conn->streamConn) != CONNECTION_OK)
>       {
>               *err = pchomp(PQerrorMessage(conn->streamConn));
>               return NULL;
>       }
> 
> 
> Am I missing something, or are we leaking the libpq connection in case of
> errors?
> 
> It doesn't matter really for walreceiver, since it will exit anyway, but we
> also use libpqwalreceiver for logical replication, where it might?
> 
> 
> Seems pretty clear that we should do a PQfinish() before returning NULL? I
> lean towards thinking that this isn't worth backpatching given the current
> uses of libpq, but I could easily be convinced otherwise.
> 

It's bit worse than I earlier thought: We use walrv_connect() during CREATE
SUBSCRIPTION. One can easily exhaust file descriptors right now.  So I think
we need to fix this.

I also noticed the following in libpqrcv_connect, added in 11da97024abb:

        if (logical)
        {
                PGresult   *res;

                res = libpqrcv_PQexec(conn->streamConn,
                                                          
ALWAYS_SECURE_SEARCH_PATH_SQL);
                if (PQresultStatus(res) != PGRES_TUPLES_OK)
                {
                        PQclear(res);
                        ereport(ERROR,
                                        (errmsg("could not clear search path: 
%s",
                                                        
pchomp(PQerrorMessage(conn->streamConn)))));
                }
                PQclear(res);
        }


Which doesn't seem quite right? The comment for the function says:

 * Returns NULL on error and fills the err with palloc'ed error message.

which this doesn't do. Of course we don't expect this to fail, but network
issues etc could still lead us to hit this case.  In this case we'll actually
have an open libpq connection around that we'll leak.


The attached patch fixes both issues.



I seems we don't have any tests for creating a subscription that fails during
connection establishment? That doesn't seem optimal - I guess there may have
been concern around portability of the error messages? I think we can control
for that in a tap test, by failing to connect due to a non-existant database,
then the error is under our control. Whereas e.g. an invalid hostname would
contain an error from gai_strerror().


Greetings,

Andres Freund
>From da9345151423180732d2928fe7fe6ff0b6a11d4d Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Fri, 20 Jan 2023 18:27:05 -0800
Subject: [PATCH v3 1/4] Fix error handling in libpqrcv_connect()

Author:
Reviewed-by:
Discussion: https://postgr.es/m/20230121011237.q52apbvlarfv6...@awork3.anarazel.de
Backpatch:
---
 .../libpqwalreceiver/libpqwalreceiver.c       | 26 +++++++++++--------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index c40c6220db8..fefc8660259 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -175,10 +175,7 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
 	conn->streamConn = PQconnectStartParams(keys, vals,
 											 /* expand_dbname = */ true);
 	if (PQstatus(conn->streamConn) == CONNECTION_BAD)
-	{
-		*err = pchomp(PQerrorMessage(conn->streamConn));
-		return NULL;
-	}
+		goto bad_connection_errmsg;
 
 	/*
 	 * Poll connection until we have OK or FAILED status.
@@ -220,10 +217,7 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
 	} while (status != PGRES_POLLING_OK && status != PGRES_POLLING_FAILED);
 
 	if (PQstatus(conn->streamConn) != CONNECTION_OK)
-	{
-		*err = pchomp(PQerrorMessage(conn->streamConn));
-		return NULL;
-	}
+		goto bad_connection_errmsg;
 
 	if (logical)
 	{
@@ -234,9 +228,9 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
 		if (PQresultStatus(res) != PGRES_TUPLES_OK)
 		{
 			PQclear(res);
-			ereport(ERROR,
-					(errmsg("could not clear search path: %s",
-							pchomp(PQerrorMessage(conn->streamConn)))));
+			*err = psprintf(_("could not clear search path: %s"),
+							pchomp(PQerrorMessage(conn->streamConn)));
+			goto bad_connection;
 		}
 		PQclear(res);
 	}
@@ -244,6 +238,16 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
 	conn->logical = logical;
 
 	return conn;
+
+	/* error path, using libpq's error message */
+bad_connection_errmsg:
+	*err = pchomp(PQerrorMessage(conn->streamConn));
+
+	/* error path, error already set */
+bad_connection:
+	PQfinish(conn->streamConn);
+	pfree(conn);
+	return NULL;
 }
 
 /*
-- 
2.38.0

Reply via email to