Let me start with this: I agree with you that both HOLD and GoAway
would work well as protocol extensions. And if that's what is needed
to get stuff to continue moving in the protocol space, then fine
that's what I'll do... But I have some reasons to prefer a protocol
version bump, at least for GoAway.

The primary reason I have is that protocol extensions are currently
enormously underspecified, especially with regard to what their values
can be and with regard to their own versioning and compatibility.
E.g. if we add _pq_.goaway=true and later want to add a field to
it, how do we do that? I can see a few options:
1. _pq_.goaway=v2,on (first acceptable value is used, this would need
immediate and constant grease)
2. _pq_.goaway=true _pq_.goaway_v2=true (have clients specify both a
new and old one and specify how the server should behave if it gets
both of these)
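
To make option 1 concrete, here is a minimal sketch of "first
acceptable value is used". It is purely illustrative: the function
name, the set of supported values and the comma-separated syntax are
assumptions, nothing like this is specified or implemented today.

```python
# Hypothetical sketch of option 1; none of these names exist in PostgreSQL.
SERVER_SUPPORTED = frozenset({"on", "true"})  # values an imaginary old server knows


def pick_value(raw, supported=SERVER_SUPPORTED):
    """Return the first comma-separated candidate the server supports, else None."""
    for candidate in raw.split(","):
        candidate = candidate.strip()
        if candidate in supported:
            return candidate
    return None  # nothing acceptable: the server would decline the extension
```

An old server that only knows "on" skips the unknown "v2" and settles
on "on", which is why clients would have to send unknown values
(grease) from day one to keep servers honest about skipping them.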

I feel like this requires significant discussion, design and
implementation work. I tried to do that in patch 7 and 8 here[2], but
those parts of the patchset got very little review and/or feedback. I
think a big reason was that the proposal became too complex by
trying to handle too many of the possible use cases for protocol
extensions. As long as we limit the discussion to protocol extensions
that don't need to be changed on an active connection using something
like SetProtocolParameter, and that only have an "on/off" style value,
I think we can make some incremental progress here. This would apply to both
GoAway (middleware should just not forward GoAway messages to clients)
and Hold (server does not send anything different for this feature).

Still this seems like quite a bit more work than "simply" including
"no-op implementation" features in a protocol bump. Especially because
I think the benefit of protocol parameters for these features is
negligible, or even negative because of the secondary reason:

The secondary reason is that I'd really like clients to actually
support the longer cancel token feature in 3.2. It's not that hard to
implement for client authors, but I don't think many users care about
it (because the primary beneficiaries are server implementers, but those
only benefit if there are enough clients that implement it, so chicken
and egg). By giving people some extra goodies in 3.3 my hope is that
clients will actually implement it. So basically I agree that protocol
versions do require some additional work on client author side, but I
(selfishly) think that would be a good thing in this case, because it
resolves this chicken and egg problem. To take advice from RFC 8170,
I'd like to align incentives better, by having protocol 3.3 contain
features that are beneficial for both client and server authors.

[2]: 
https://www.postgresql.org/message-id/CAGECzQRbAGqJnnJJxTdKewTsNOovUt4bsx3NFfofz3m2j-t7tA%40mail.gmail.com

-- detailed response below (things I did not respond to I agree with) --

On Thu, 11 Dec 2025 at 20:21, Jacob Champion
<[email protected]> wrote:
> NegotiateProtocolVersion is the only in-band tool we have to ratchet
> the protocol forward. Why go through all this pain of getting NPV
> packets working, only to immediately limit its power to the most
> trivial cases?

I think it's a fairly easy test to uphold. To be clear, I'm not saying
we should indefinitely limit that power. Eventually we'd probably want
to add things that are more difficult to implement for clients
(possibly after evaluating them as a protocol extension), but that
discussion can be punted to when we get there imo.

> So it
> seems strange to optimize for combinatorics out of the gate, by
> burning through a client-mandatory minor version every year.

To me 2 protocol extensions a year adds strictly more complexity
than 1 minor version a year. I.e. if the changes are "no-op
implementable", why not group them together under a single identifier?

> You still have N*M. Implementers have to test each feature of their
> 3.10 client against server versions 3.0-9, rather than testing against
> a single server that turns individual extension support on and off.

I don't understand this argument. If you can have a single server
version that turns protocol extensions on and off, then why couldn't
you have a single server version that can turn different protocol
versions on and off?
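
As a rough sketch of what I mean, assume a test server with a
hypothetical knob for the highest minor version it accepts. The model
below is simplified (the server answers with the lower of the requested
and supported versions within the same major version) and the function
names are made up for illustration.

```python
# Simplified model; server_max stands in for a hypothetical test-only setting.
def negotiate(requested, server_max):
    """Both arguments are (major, minor) tuples; return the version to speak."""
    if requested[0] != server_max[0]:
        raise ValueError("unsupported major version")
    # Tuple comparison picks the lower minor version within the same major.
    return min(requested, server_max)


def version_matrix(client_version, server_maxima):
    """Run one client version against one server configured several ways."""
    return {s: negotiate(client_version, s) for s in server_maxima}
```

A client author could then loop over version_matrix((3, 3), [(3, 0),
(3, 2), (3, 3)]) against a single server build, the same way they would
loop over extension on/off combinations.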

> An example of an established network protocol that follows this same
> strategy would be helpful. How do their clients deal with the
> minor-version treadmill?

I agree that it would be helpful, but I'm not sure there's such a
network protocol. All protocols I know have infrequent version bumps,
which then often results in ossification. So frequent version bumps
seem like a good way to prevent that from happening. Using protocol
extensions for everything might mean we ossify the protocol version
(again).

> > Finally, because we don't have any protocol extensions yet. All
> > clients still need to build infrastructure for them, including libpq.
>
> For clients still on 3.0 (the vast majority of them), they'd have to
> add infrastructure for sliding minor version ranges, too.

Yes, but adding infrastructure for both protocol versions (which we
already have now) and protocol extensions is even more work. libpq
still has no support for protocol parameters.

> Yes. Or even just "deployed". GitHub shows zero hits outside of the
> Postgres fork graph.

Yeah, that's sad, but unsurprising. Almost no-one cares about security
and that's the only end-user feature of 3.2.

> Google's results show that an organization called "cardo" tried
> max_protocol_version=latest. They had to revert it. :( Time for
> grease.

While I totally agree that we need grease, this case actually involved
people who had not updated PgBouncer to a version new enough to
support NegotiateProtocolVersion. [3]

[3]: https://www.cardogis.com/AenderungenIwan7#oktober-2025

> > Why exactly does that
> > matter for 3.3? Anything that stands default deployment in the way for
> > 3.2, will continue to stand default deployment in the way for 3.3.
>
> Exactly. Don't you want to make sure that clients in the ecosystem are
> able to use this _before_ we rev the version again, and again? We
> don't ever get these numbers back.

To me not every protocol version needs to be implemented by every
client. If 3.2 is never used by anyone in the wild, then half of the
world immediately switches to 3.3, and then the other half implements
3.4, then I'll be extremely happy.

> I'd rather we get a bunch of nice
> features without any flipping at all, if that's possible. It looks
> possible to me.

Me too, but I don't understand how that would work. Sending protocol
extensions is just as much of a breaking change for this ungreased
middleware as a protocol version bump. So having libpq request
_pq_.bindhold=true by default would also need some flip.
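
To illustrate why both are equally visible to middleware, here is a
small sketch that builds a startup packet by hand. The layout (length,
version word, NUL-terminated key/value pairs, final NUL) follows the
documented StartupMessage format; the helper name is made up, and
_pq_.bindhold is only the proposed parameter from this thread, not
something any server understands today.

```python
import struct


def startup_packet(major, minor, params):
    """Build a StartupMessage: int32 length, int32 version, key/value pairs."""
    body = struct.pack("!I", (major << 16) | minor)
    for key, value in params.items():
        body += key.encode() + b"\x00" + value.encode() + b"\x00"
    body += b"\x00"  # terminator after the last pair
    # The length field counts itself, hence the + 4.
    return struct.pack("!I", len(body) + 4) + body


# Hypothetical: request 3.2 plus the proposed extension in one packet.
pkt = startup_packet(3, 2, {"user": "postgres", "_pq_.bindhold": "true"})
```

The version word and the _pq_ parameter travel in the same packet, so
middleware that rejects anything it doesn't recognize can trip over
either one.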

> I think proposals should attempt to answer those questions as a
> prerequisite to commit, personally. Or at least, we should be moving
> in that direction, if that's too harsh on the first authors who are
> trying to get things moving inside the protocol.

Agreed, but I don't think that has to come from the author
necessarily. I'm happy to provide that input on proposals and explain
whether and why something would be hard to implement for PgBouncer or
other servers.

> More generally, it bothers me that we still don't have a clear mental
> model of middlebox extensibility. We're just retreading the
> discussions from [1] instead of starting from where we stopped, and
> that's exhausting for me.

I'm still of the opinion that the requirements for [1] are good enough
for middleboxes to handle extensibility. I think those requirements
could be extended to allow GoAway too, by adding possibility 3 with
"The new message sent by the server can be dropped completely by the
middleware to imitate the lower protocol version". Remembering and
re-reading the thread and this email, it's unclear to me what your
thoughts on this are.
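
A minimal sketch of what "dropped completely by the middleware" could
look like, assuming the usual backend framing of one type byte followed
by an int32 length that includes itself. The type byte for GoAway is a
placeholder I picked for illustration, no such byte has been assigned,
and a real proxy would of course work on a socket rather than a bytes
object.

```python
import struct

GOAWAY_TYPE = b"g"  # hypothetical placeholder, not an assigned message type


def drop_messages(stream, drop_types=(GOAWAY_TYPE,)):
    """Split a backend byte stream into (bytes to forward, unparsed leftover)."""
    out = bytearray()
    pos = 0
    while pos + 5 <= len(stream):
        msg_type = stream[pos:pos + 1]
        (length,) = struct.unpack("!I", stream[pos + 1:pos + 5])
        end = pos + 1 + length  # length covers itself but not the type byte
        if end > len(stream):
            break  # incomplete message: keep it buffered for the next read
        if msg_type not in drop_types:
            out += stream[pos:end]
        pos = end
    return bytes(out), stream[pos:]
```

Everything else passes through untouched, so a client behind such a
proxy sees exactly what it would see from a server on the lower
protocol version.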

> (As a reminder: 3.2 broke my testing rig, which relied on implicit
> assumptions around minor-version extensibility for middleboxes. I
> didn't speak up until very late, because it was just a testing rig,
> and I could change it. I should have spoken up immediately, because
> IIRC, pgpool then broke as well.)

I'm not sure what exactly you're talking about here. Do you mean libpq
complaining about not receiving a BackendKeyData? If so, I agree that
wasn't a great situation. But I think it was related more to the new
feature than to the current protocol being underspecified.

> A server is always free to decide at the _application_ layer that it
> will error out for a particular packet that it can parse at the
> _network_ layer. But it seems a lot more user-friendly to just decline
> the protocol bit, if it's directly tied to an application-level
> feature that isn't implemented. I think we should encourage that when
> possible; otherwise we've traded protocol fragmentation for
> application fragmentation.

I agree in principle, but does it really matter in practice in the
case of Hold? If the network layer does not support it,
then really all that the user's application can do is throw an error.
Whether that error is thrown by the database/middleware or by the
client doesn't matter much in the end, I think. The main case where
it would matter is if the client could fall back to something else,
but in the case of HOLD that something else would probably be to send
HOLD via SQL. And any server that would throw an error for protocol
based HOLD probably (should) also throw one for application level
HOLD.

> [1] 
> https://postgr.es/m/CAGECzQR5PMud4q8Atyz0gOoJ1xNH33g7g-MLXFML1_Vrhbzs6Q%40mail.gmail.com
