I updated the patch in accordance with your feedback.

- The variable name has been changed to try_all_addrs. I think this
makes sense: it is shorter and matches the existing variables
try_next_addr and try_next_host, even though it doesn't perfectly
convey its relationship to target_session_attrs.
- I fixed the typos and the inaccurate assertion messages in the 008_ test file.
- I refactored how the try_all_addrs variable flows through the code to
match how load_balance_type is handled. try_all_addrs contains
the unparsed value while try_all_addrs_type contains the parsed value
of type enum PGTryAddrType. This may be a bit overkill, but it
addresses a couple of your bullet points; apologies if I judged
incorrectly here.
- I changed some of the wording in the docs that was simply wrong, and
changed all references in the docs to say "try" instead of
"check".

With regard to the correct load balancing behavior, I can think of 2 options:
1. Randomly choose a host and then randomly choose an address within
that host. If the current address's session attributes do not match
target_session_attrs, move on to the next address if any remain in
that host, and move on to the next random host if no addresses remain.
2. Resolve all addresses in all hosts and randomly select an address
from that list.
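
To make the difference between the two options concrete, here is a rough
standalone sketch of the order in which addresses would be tried under
each one. It is not libpq code: AddrRef, order_host_then_addr and
order_flat are names made up for illustration, rand() stands in for
whatever PRNG libpq would actually use, and naddrs[h] is assumed to be
the number of addresses host h resolved to.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical (host, address) pair; not a libpq type. */
typedef struct
{
    int host;   /* index into the host list */
    int addr;   /* index into that host's resolved addresses */
} AddrRef;

/* Fisher-Yates shuffle of items[0..n-1]; rand() is only a stand-in PRNG. */
static void
shuffle(AddrRef *items, size_t n)
{
    for (size_t i = n; i > 1; i--)
    {
        size_t  j = (size_t) rand() % i;
        AddrRef tmp = items[i - 1];

        items[i - 1] = items[j];
        items[j] = tmp;
    }
}

/*
 * Option 1: random host order, then random address order within each host.
 * "out" must have room for the sum of naddrs[]. Returns the number of
 * entries written (0 if memory could not be allocated).
 */
size_t
order_host_then_addr(const int *naddrs, int nhosts, AddrRef *out)
{
    size_t   n = 0;
    AddrRef *hosts = malloc(sizeof(AddrRef) * (size_t) nhosts);

    if (hosts == NULL)
        return 0;
    for (int h = 0; h < nhosts; h++)
    {
        hosts[h].host = h;
        hosts[h].addr = 0;
    }
    shuffle(hosts, (size_t) nhosts);

    for (int i = 0; i < nhosts; i++)
    {
        int    h = hosts[i].host;
        size_t start = n;

        for (int a = 0; a < naddrs[h]; a++)
        {
            out[n].host = h;
            out[n].addr = a;
            n++;
        }
        /* Shuffle only this host's slice, so hosts stay grouped. */
        shuffle(out + start, n - start);
    }
    free(hosts);
    return n;
}

/* Option 2: flatten every address of every host, then shuffle the list. */
size_t
order_flat(const int *naddrs, int nhosts, AddrRef *out)
{
    size_t n = 0;

    for (int h = 0; h < nhosts; h++)
    {
        for (int a = 0; a < naddrs[h]; a++)
        {
            out[n].host = h;
            out[n].addr = a;
            n++;
        }
    }
    shuffle(out, n);
    return n;
}
```

With option 1 every host is equally likely to be tried first regardless
of how many addresses it resolves to; with option 2 every address is
equally likely, so a host with more addresses receives proportionally
more first attempts.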

I don't believe that any test cases exist that validate the
combination of multiple hosts, multiple addresses
within each host, and target session attributes. Happy to add test
coverage for this if it helps get this patch moving.

I was also wondering whether the try_all_addrs input should be 0/1, a
more human-readable disable/enable, or both.
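
If both spellings were accepted, the parsing step could look roughly like
the standalone sketch below. The enum mirrors the one in the attached
patch; parse_try_all_addrs is a hypothetical helper name, and treating a
NULL value as "disabled" is an assumption that matches the current
default.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the enum added to libpq-int.h by the patch. */
typedef enum
{
    TRY_ALL_ADDRS_DISABLE = 0,
    TRY_ALL_ADDRS_ENABLE
} PGTryAddrType;

/*
 * Hypothetical helper: accept "0"/"1" as well as "disable"/"enable".
 * Returns false for any other value and leaves *result untouched.
 */
bool
parse_try_all_addrs(const char *value, PGTryAddrType *result)
{
    if (value == NULL ||
        strcmp(value, "0") == 0 ||
        strcmp(value, "disable") == 0)
    {
        *result = TRY_ALL_ADDRS_DISABLE;
        return true;
    }
    if (strcmp(value, "1") == 0 ||
        strcmp(value, "enable") == 0)
    {
        *result = TRY_ALL_ADDRS_ENABLE;
        return true;
    }
    return false;
}
```

If the longer spellings were allowed, the display size of 1 in the
PQconninfoOptions entry would presumably need to grow to fit them.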

On Fri, Mar 6, 2026 at 12:44 AM Andrey Borodin <[email protected]> wrote:
>
>
>
> > On 16 Aug 2025, at 04:43, Andrew Jackson <[email protected]> wrote:
> >
> > Attached is the rebased patch.
>
> I've took a look into the patch again.
>
> The behavior and integration with the connection state machine look correct,
> and the tests + docs are in good shape. Some notes:
> 1. Use a dedicated default "0" for check_all_addrs (not DefaultLoadBalanceHosts,
>    this one is used for load balancing, need more "0").
> 2. Guard the two strcmp(conn->check_all_addrs, "1") uses so they are safe when
>    conn->check_all_addrs is NULL.
> 3. Fix the test typos in 008 (standby_expeect_traffic and the three “on node1”
>    messages).
> 4. Parse check_all_addrs once into a bool (like load_balance_type) and use that
>    in the connection path for consistency and clarity.
>
> Now about important part: is the name "check_all_addrs" good?
> I've asked LLM after explaining it what the feature does. PFA attached output.
>
> Personally, I like "try_all_addrs".
>
> It's a bit unclear to me how randomization (load balancing) on different
> addresses should work.
>
>
> Best regards, Andrey Borodin.
>
From 33f5d0c7d62abb1ffa8a9bd8fc659f16e58ef91e Mon Sep 17 00:00:00 2001
From: CommanderKeynes <[email protected]>
Date: Sat, 17 May 2025 08:29:01 -0500
Subject: [PATCH] Add option to try all addrs for target_session.

The current behaviour of libpq with regard to searching
for a matching target_session_attrs in a list of addrs is
that after successfully connecting to a server, if the server's
session attributes do not match the requested target_session_attrs,
no further address is considered in that host. This behaviour
is extremely inconvenient in environments where the user is
attempting to implement a high availability setup without having
to modify DNS records after a topology change or maintain a
proxy server layer.

This patch adds a client-side option called try_all_addrs.
When set to 1, this option tells libpq to continue trying
any remaining addresses even if there was a target_session_attrs
mismatch on one of them.

Author: Andrew Jackson
Reviewed-by: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/flat/CAKK5BkESSc69sp2TiTWHvvOHCUey0rDWXSrR9pinyRqyfamUYg%40mail.gmail.com
---
 doc/src/sgml/libpq.sgml                       |  33 +++++
 src/interfaces/libpq/fe-connect.c             |  42 ++++--
 src/interfaces/libpq/libpq-int.h              |  12 ++
 .../libpq/t/007_target_session_attr_dns.pl    | 129 ++++++++++++++++++
 .../t/008_load_balance_dns_try_all_addrs.pl   | 128 +++++++++++++++++
 5 files changed, 334 insertions(+), 10 deletions(-)
 create mode 100644 src/interfaces/libpq/t/007_target_session_attr_dns.pl
 create mode 100644 src/interfaces/libpq/t/008_load_balance_dns_try_all_addrs.pl

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 5bf59a19855..2e2f00daca4 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2568,6 +2568,39 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
       </listitem>
      </varlistentry>
 
+     <varlistentry id="libpq-connect-try-all-addrs" xreflabel="try_all_addrs">
+      <term><literal>try_all_addrs</literal></term>
+      <listitem>
+       <para>
+        Controls whether or not all addresses within a hostname are tried when attempting to
+        make a connection with a matching <xref linkend="libpq-connect-target-session-attrs"/>.
+
+        There are two modes:
+        <variablelist>
+         <varlistentry>
+          <term><literal>0</literal> (default)</term>
+          <listitem>
+           <para>
+            If a successful connection is made and that connection is found to have a
+            mismatching <xref linkend="libpq-connect-target-session-attrs"/>, do not try
+            any additional addresses and move on to the next host if one was provided.
+           </para>
+          </listitem>
+         </varlistentry>
+         <varlistentry>
+          <term><literal>1</literal></term>
+          <listitem>
+           <para>
+            If a successful connection is made and that connection is found to have a
+            mismatching <xref linkend="libpq-connect-target-session-attrs"/>, proceed
+            to try any additional addresses.
+           </para>
+          </listitem>
+         </varlistentry>
+        </variablelist>
+       </para>
+      </listitem>
+     </varlistentry>
     </variablelist>
    </para>
   </sect2>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index a3d12931fff..aaac4565934 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -125,6 +125,7 @@ static int	ldapServiceLookup(const char *purl, PQconninfoOption *options,
 #endif
 #define DefaultTargetSessionAttrs	"any"
 #define DefaultLoadBalanceHosts	"disable"
+#define DefaultTryAllAddrs	"0"
 #ifdef USE_SSL
 #define DefaultSSLMode "prefer"
 #define DefaultSSLCertMode "allow"
@@ -394,6 +395,11 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
 	{"scram_server_key", NULL, NULL, NULL, "SCRAM-Server-Key", "D", SCRAM_MAX_KEY_LEN * 2,
 	offsetof(struct pg_conn, scram_server_key)},
 
+	{"try_all_addrs", "PGTRYALLADDRS",
+		DefaultTryAllAddrs, NULL,
+		"Try-All-Addrs", "", 1,
+	offsetof(struct pg_conn, try_all_addrs)},
+
 	/* OAuth v2 */
 	{"oauth_issuer", NULL, NULL, NULL,
 		"OAuth-Issuer", "", 40,
@@ -2018,6 +2024,21 @@ pqConnectOptions2(PGconn *conn)
 	else
 		conn->target_server_type = SERVER_TYPE_ANY;
 
+	if (conn->try_all_addrs){
+		if (strcmp(conn->try_all_addrs, "0") == 0)
+			conn->try_all_addrs_type = TRY_ALL_ADDRS_DISABLE;
+		else if (strcmp(conn->try_all_addrs, "1") == 0)
+			conn->try_all_addrs_type = TRY_ALL_ADDRS_ENABLE;
+		else {
+			conn->status = CONNECTION_BAD;
+			libpq_append_conn_error(conn, "invalid %s value: \"%s\"",
+									"try_all_addrs",
+									conn->try_all_addrs);
+			return false;
+		}
+	} else
+		conn->try_all_addrs_type = TRY_ALL_ADDRS_DISABLE;
+
 	if (conn->scram_client_key)
 	{
 		int			len;
@@ -4434,11 +4455,11 @@ keep_going:						/* We will come back to here until there is
 						conn->status = CONNECTION_OK;
 						sendTerminateConn(conn);
 
-						/*
-						 * Try next host if any, but we don't want to consider
-						 * additional addresses for this host.
-						 */
-						conn->try_next_host = true;
+						if (conn->try_all_addrs_type == TRY_ALL_ADDRS_ENABLE)
+							conn->try_next_addr = true;
+						else
+							conn->try_next_host = true;
+
 						goto keep_going;
 					}
 				}
@@ -4489,11 +4510,11 @@ keep_going:						/* We will come back to here until there is
 						conn->status = CONNECTION_OK;
 						sendTerminateConn(conn);
 
-						/*
-						 * Try next host if any, but we don't want to consider
-						 * additional addresses for this host.
-						 */
-						conn->try_next_host = true;
+						if (conn->try_all_addrs_type == TRY_ALL_ADDRS_ENABLE)
+							conn->try_next_addr = true;
+						else
+							conn->try_next_host = true;
+
 						goto keep_going;
 					}
 				}
@@ -5127,6 +5148,7 @@ freePGconn(PGconn *conn)
 	free(conn->inBuffer);
 	free(conn->outBuffer);
 	free(conn->rowBuf);
+	free(conn->try_all_addrs);
 	termPQExpBuffer(&conn->errorMessage);
 	termPQExpBuffer(&conn->workBuffer);
 
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index a701c25038a..b1406c9dfc8 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -250,6 +250,15 @@ typedef enum
 	LOAD_BALANCE_RANDOM,		/* Randomly shuffle the hosts */
 } PGLoadBalanceType;
 
+/* Try address type (decoded value of try_all_addrs) */
+typedef enum
+{
+	TRY_ALL_ADDRS_DISABLE = 0,	/* Do not try subsequent addresses in host
+								 * after target_session_attrs mismatch (default) */
+	TRY_ALL_ADDRS_ENABLE,		/* Try remaining addresses in host even after
+								 * target_session_attrs mismatch */
+} PGTryAddrType;
+
 /* Boolean value plus a not-known state, for GUCs we might have to fetch */
 typedef enum
 {
@@ -430,6 +439,7 @@ struct pg_conn
 	char	   *scram_client_key;	/* base64-encoded SCRAM client key */
 	char	   *scram_server_key;	/* base64-encoded SCRAM server key */
 	char	   *sslkeylogfile;	/* where should the client write ssl keylogs */
+	char       *try_all_addrs;  /* whether to try all ips within a host */
 
 	bool		cancelRequest;	/* true if this connection is used to send a
 								 * cancel request, instead of being a normal
@@ -534,6 +544,8 @@ struct pg_conn
 	PGTargetServerType target_server_type;	/* desired session properties */
 	PGLoadBalanceType load_balance_type;	/* desired load balancing
 											 * algorithm */
+	PGTryAddrType try_all_addrs_type;     /* parsed representation of try_all_addrs */
+
 	bool		try_next_addr;	/* time to advance to next address/host? */
 	bool		try_next_host;	/* time to advance to next connhost[]? */
 	int			naddr;			/* number of addresses returned by getaddrinfo */
diff --git a/src/interfaces/libpq/t/007_target_session_attr_dns.pl b/src/interfaces/libpq/t/007_target_session_attr_dns.pl
new file mode 100644
index 00000000000..75701bcd30b
--- /dev/null
+++ b/src/interfaces/libpq/t/007_target_session_attr_dns.pl
@@ -0,0 +1,129 @@
+
+# Copyright (c) 2023-2025, PostgreSQL Global Development Group
+use strict;
+use warnings FATAL => 'all';
+use Config;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bload_balance\b/)
+{
+	plan skip_all =>
+	  'Potentially unsafe test load_balance not enabled in PG_TEST_EXTRA';
+}
+
+# Cluster setup which is shared for testing both load balancing methods
+my $can_bind_to_127_0_0_2 =
+  $Config{osname} eq 'linux' || $PostgreSQL::Test::Utils::windows_os;
+
+# Checks for the requirements for testing load balancing method 2
+if (!$can_bind_to_127_0_0_2)
+{
+	plan skip_all => 'load_balance test only supported on Linux and Windows';
+}
+
+my $hosts_path;
+if ($windows_os)
+{
+	$hosts_path = 'c:\Windows\System32\Drivers\etc\hosts';
+}
+else
+{
+	$hosts_path = '/etc/hosts';
+}
+
+my $hosts_content = PostgreSQL::Test::Utils::slurp_file($hosts_path);
+
+my $hosts_count = () =
+  $hosts_content =~ /127\.0\.0\.[1-3] pg-loadbalancetest/g;
+if ($hosts_count != 3)
+{
+	# Host file is not prepared for this test
+	plan skip_all => "hosts file was not prepared for DNS load balance test";
+}
+
+$PostgreSQL::Test::Cluster::use_tcp = 1;
+$PostgreSQL::Test::Cluster::test_pghost = '127.0.0.1';
+my $port = PostgreSQL::Test::Cluster::get_free_port();
+
+my $node_primary1 = PostgreSQL::Test::Cluster->new('primary1', port => $port);
+$node_primary1->init(has_archiving => 1, allows_streaming => 1);
+
+# Start it
+$node_primary1->start;
+
+# Take backup from which all operations will be run
+$node_primary1->backup('my_backup');
+
+my $node_standby = PostgreSQL::Test::Cluster->new('standby', port => $port, own_host => 1);
+$node_standby->init_from_backup($node_primary1, 'my_backup',
+	has_restoring => 1);
+$node_standby->start();
+
+my $node_primary2 = PostgreSQL::Test::Cluster->new('node1', port => $port, own_host => 1);
+$node_primary2->init();
+$node_primary2->start();
+
+# target_session_attrs=primary should always choose the first one.
+$node_primary1->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=primary try_all_addrs=1",
+	"target_session_attrs=primary connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_primary1->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-write try_all_addrs=1",
+	"target_session_attrs=read-write connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_primary1->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=any try_all_addrs=1",
+	"target_session_attrs=any connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=standby try_all_addrs=1",
+	"target_session_attrs=standby connects to the third node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-only try_all_addrs=1",
+	"target_session_attrs=read-only connects to the third node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+
+
+$node_primary1->stop();
+
+# With primary1 stopped, target_session_attrs=primary should choose primary2.
+$node_primary2->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=primary try_all_addrs=1",
+	"target_session_attrs=primary connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_primary2->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-write try_all_addrs=1",
+	"target_session_attrs=read-write connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=any try_all_addrs=1",
+	"target_session_attrs=any connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=standby try_all_addrs=1",
+	"target_session_attrs=standby connects to the third node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-only try_all_addrs=1",
+	"target_session_attrs=read-only connects to the third node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+
+$node_primary2->stop();
+$node_standby->stop();
+
+
+done_testing();
diff --git a/src/interfaces/libpq/t/008_load_balance_dns_try_all_addrs.pl b/src/interfaces/libpq/t/008_load_balance_dns_try_all_addrs.pl
new file mode 100644
index 00000000000..9218ab785a7
--- /dev/null
+++ b/src/interfaces/libpq/t/008_load_balance_dns_try_all_addrs.pl
@@ -0,0 +1,128 @@
+# Copyright (c) 2023-2025, PostgreSQL Global Development Group
+use strict;
+use warnings FATAL => 'all';
+use Config;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bload_balance\b/)
+{
+	plan skip_all =>
+	  'Potentially unsafe test load_balance not enabled in PG_TEST_EXTRA';
+}
+
+my $can_bind_to_127_0_0_2 =
+  $Config{osname} eq 'linux' || $PostgreSQL::Test::Utils::windows_os;
+
+# Checks for the requirements for testing load balancing method 2
+if (!$can_bind_to_127_0_0_2)
+{
+	plan skip_all => 'load_balance test only supported on Linux and Windows';
+}
+
+my $hosts_path;
+if ($windows_os)
+{
+	$hosts_path = 'c:\Windows\System32\Drivers\etc\hosts';
+}
+else
+{
+	$hosts_path = '/etc/hosts';
+}
+
+my $hosts_content = PostgreSQL::Test::Utils::slurp_file($hosts_path);
+
+my $hosts_count = () =
+  $hosts_content =~ /127\.0\.0\.[1-3] pg-loadbalancetest/g;
+if ($hosts_count != 3)
+{
+	# Host file is not prepared for this test
+	plan skip_all => "hosts file was not prepared for DNS load balance test";
+}
+
+$PostgreSQL::Test::Cluster::use_tcp = 1;
+$PostgreSQL::Test::Cluster::test_pghost = '127.0.0.1';
+
+my $port = PostgreSQL::Test::Cluster::get_free_port();
+local $Test::Builder::Level = $Test::Builder::Level + 1;
+my $node_primary1 = PostgreSQL::Test::Cluster->new("primary1", port => $port);
+$node_primary1->init(has_archiving => 1, allows_streaming => 1);
+
+# Start it
+$node_primary1->start();
+
+# Take backup from which all operations will be run
+$node_primary1->backup("my_backup");
+
+my $node_standby = PostgreSQL::Test::Cluster->new("standby", port => $port, own_host => 1);
+$node_standby->init_from_backup($node_primary1, "my_backup",
+	has_restoring => 1);
+$node_standby->start();
+
+my $node_primary2 = PostgreSQL::Test::Cluster->new("node1", port => $port, own_host => 1);
+$node_primary2->init();
+$node_primary2->start();
+sub test_target_session_attr {
+	my $target_session_attrs = shift;
+	my $test_num = shift;
+	my $primary1_expect_traffic = shift;
+	my $standby_expect_traffic = shift;
+	my $primary2_expect_traffic = shift;
+	# Statistically the following loop with load_balance_hosts=random will almost
+	# certainly connect at least once to each of the nodes. The chance of that not
+	# happening is so small that it's negligible: (2/3)^50 = 1.56832855e-9
+	foreach my $i (1 .. 50)
+	{
+		$node_primary1->connect_ok(
+			"host=pg-loadbalancetest port=$port load_balance_hosts=random target_session_attrs=${target_session_attrs} try_all_addrs=1",
+			"repeated connections with random load balancing",
+			sql => "SELECT 'connect${test_num}'");
+	}
+	my $node_primary1_occurrences = () =
+	  $node_primary1->log_content() =~ /statement: SELECT 'connect${test_num}'/g;
+	my $node_standby_occurrences = () =
+	  $node_standby->log_content() =~ /statement: SELECT 'connect${test_num}'/g;
+	my $node_primary2_occurrences = () =
+	  $node_primary2->log_content() =~ /statement: SELECT 'connect${test_num}'/g;
+
+	my $total_occurrences =
+	  $node_primary1_occurrences + $node_standby_occurrences + $node_primary2_occurrences;
+
+	if ($primary1_expect_traffic == 1) {
+		ok($node_primary1_occurrences > 0, "received at least one connection on node primary1");
+	}else{
+		ok($node_primary1_occurrences == 0, "received no connections on node primary1");
+	}
+	if ($standby_expect_traffic == 1) {
+		ok($node_standby_occurrences > 0, "received at least one connection on node standby");
+	}else{
+		ok($node_standby_occurrences == 0, "received no connections on node standby");
+	}
+
+	if ($primary2_expect_traffic == 1) {
+		ok($node_primary2_occurrences > 0, "received at least one connection on node primary2");
+	}else{
+		ok($node_primary2_occurrences == 0, "received no connections on primary2");
+	}
+
+	ok($total_occurrences == 50, "received 50 connections across all nodes");
+}
+
+test_target_session_attr('any',
+	1, 1, 1, 1);
+test_target_session_attr('read-only',
+	2, 0, 1, 0);
+test_target_session_attr('read-write',
+	3, 1, 0, 1);
+test_target_session_attr('primary',
+	4, 1, 0, 1);
+test_target_session_attr('standby',
+	5, 0, 1, 0);
+
+
+$node_primary1->stop();
+$node_primary2->stop();
+$node_standby->stop();
+
+done_testing();
-- 
2.49.0
