On Mon, Jun 30, 2025 at 11:44 AM Jelte Fennema-Nio <postg...@jeltef.nl> wrote:
> > It looks like Heikki has an open item for this, so I'll defer to him
>
> Oh... Sorry for the confusion. I added that open item to the list (so
> it would not be missed), and I added Heikki as the owner because he
> committed the original patch. I checked now, and since Heikki hasn't
> sent an email to hackers since June 10th, I'm not sure that was the
> correct decision.

I don't know that it's necessarily forbidden to add an open item on
someone else's behalf, but I think they should definitely be aware of
it, if you do. :D

> So feel free to take ownership of the open item. I
> changed it to "Unknown" for now (I did the same for some of the other
> protocol docs related patches).

I'll avoid taking it over for now, mostly so that I'm not imposing a
different set of requirements on the follow-up as compared to the
originally committed patch. But I can pick it up if Heikki's not able
to.

> I understand your intent, but I don't like that the protocol docs
> would be saying anything about libpq specific behaviour. I'd like to
> be explicit about protocol behaviour, without considering the specific
> client. That we weren't before is exactly how we got into the current
> mess (maybe those IETF are on to something with their MUST/SHOULD/etc
> stuff).

Underspecification is how you get into this sort of mess, and that's
already done. So hiding the specific change behind a veil of "can't
say libpq" inside the Postgres docs doesn't make sense to me,
personally -- when other people say "Postgres wire-compatible",
they're talking about us.

Plus, modern IETF specs are _very_ good at mentioning when problems in
widely-deployed implementations have led to a protocol decision.
(Calling them out by name, in the permanent RFC, would probably be
unwise in many situations. But we don't have any reason to avoid
calling ourselves out in our own docs.)

> But I
> do understand your concern, so how about this wording:
>
> Since protocol version 3.2, if the server sent no BackendKeyData, then
> that means that the backend does not support canceling queries using
> the CancelRequest messages. In protocol versions before 3.2 the
> behaviour is undefined if the client receives no BackendKeyData.
>
> That way we define the behavior as "sensible" in 3.2, while still
> allowing for the rare case that someone somewhere relied on the "all
> zeros" cancel message being sent in libpq for PG17. And if it's a big
> enough problem, we could then still change libpq to continue sending
> "all zeros" if the server used protocol 3.0.

1. I find this less useful to implementers. Implementers, I think,
want to know what libpq is going to do in reality.

2. Without a way to enforce or test this behavior, I'm not excited
about tying the 3.2 protocol definition to a change that we still
might revert for 3.0. Maybe that is the least bad way forward, but I
would want more committers than just me to buy into that agreement
first.

> > Also, what should we do if the server sends a zero-length key?
>
> Given that that only applies to protocol 3.2,

It applies to 3.0 too. (There is no longer any code in the client that
locks the length of the key to four bytes.) This applies to PG18 and
onwards.

> I'd like to define that
> strictly. How about simply not allowing that and throwing an error in
> libpq in that case? e.g. by defining the secret so that it should be
> at minimum 4 and at most 256 bytes?

No objection here.

--Jacob
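
For illustration only, here is a rough sketch of the client-side rule
being discussed: reject a BackendKeyData secret outside 4..256 bytes,
and treat a missing BackendKeyData as "cancellation not supported".
This is Python, not libpq code; the function names are made up for the
example, and the length bounds are the proposal from this thread
rather than settled behavior.

```python
import struct
from typing import Optional, Tuple

CANCEL_REQUEST_CODE = 80877102  # (1234 << 16) | 5678, per the protocol docs
MIN_KEY_LEN = 4    # assumption: lower bound proposed in this thread
MAX_KEY_LEN = 256  # assumption: upper bound proposed in this thread


def parse_backend_key_data(payload: bytes) -> Tuple[int, bytes]:
    """Parse a BackendKeyData body (the bytes after the type and length).

    Hypothetical helper: returns (process ID, secret key) and raises
    ValueError if the key length falls outside the proposed bounds.
    """
    if len(payload) < 4:
        raise ValueError("BackendKeyData too short to hold a process ID")
    (pid,) = struct.unpack("!i", payload[:4])
    key = payload[4:]
    if not MIN_KEY_LEN <= len(key) <= MAX_KEY_LEN:
        raise ValueError(f"invalid cancel key length: {len(key)}")
    return pid, key


def build_cancel_request(key_data: Optional[Tuple[int, bytes]]) -> Optional[bytes]:
    """Build a CancelRequest packet, or return None if the server never
    sent BackendKeyData (i.e. cancellation is treated as unsupported)."""
    if key_data is None:
        return None
    pid, key = key_data
    # Length word counts itself, the request code, the PID, and the key.
    length = 4 + 4 + 4 + len(key)
    return struct.pack("!iii", length, CANCEL_REQUEST_CODE, pid) + key
```

With a four-byte key this produces the familiar 16-byte CancelRequest;
a zero-length key is rejected at parse time instead of being sent on
the wire.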