On Monday, October 30, 2023 10:29 AM Kyotaro Horiguchi <horikyota....@gmail.com> wrote: > > At Fri, 27 Oct 2023 14:57:10 +0530, Amit Kapila <amit.kapil...@gmail.com> > wrote in > > On Fri, Oct 27, 2023 at 2:02 PM Alvaro Herrera <alvhe...@alvh.no-ip.org> > wrote: > > > > > > On 2023-Oct-27, Kyotaro Horiguchi wrote: > > > > > > > @@ -1433,8 +1433,8 @@ > InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause, > > > > { > > > > ereport(ERROR, > > > > > errcode(ERRCODE_INVALID_PARAMETER_VALUE), > > > > - errmsg("replication slots must not > be invalidated during the upgrade"), > > > > - > errhint("\"max_slot_wal_keep_size\" must be set to -1 during the upgrade")); > > > > > > Hmm, if I read this code right, this error is going to be thrown by > > > the checkpointer while finishing a checkpoint. Fortunately, the > > > checkpoint record has already been written, but I don't know what > > > would happen if this is thrown while trying to write the shutdown > > > checkpoint. Probably nothing terribly good. > > > > > > I don't think this is useful. If the setting is invalid during > > > binary upgrade, let's prevent it from being set at all right from > > > the start of the upgrade process. > > > > We are already forcing the required setting > > "max_slot_wal_keep_size=-1" during the upgrade similar to some of the > > other settings like "full_page_writes". However, the user can provide > > an option for "max_slot_wal_keep_size" in which case it will be > > overridden. Now, I think (a) we can ensure that our setting always > > takes precedence in this case. The other idea is (b) to parse the > > user-provided options and check if "max_slot_wal_keep_size" has a > > value different than expected and raise an error accordingly. Or we > > can simply (c) document the usage of max_slot_wal_keep_size in the > > upgrade. I am not sure whether it is worth complicating the code for > > this as the user shouldn't be using such an option during the upgrade. > > So, I think doing (a) and (c) could be simpler. > > > > > > In InvalidatePossiblyObsoleteSlot() we could have just an Assert() > > > or elog(PANIC). > > > > > > > Yeah, we can change to either of those. > > This discussion seems like a bit off from my point. I suggested adding a check > for that setting when IsBinaryUpgraded is true at the GUC level as shown in > the > attached patch. I believe Álvaro made a similar suggestion. While the error > message is somewhat succinct, I think it is sufficient given the low > possilibility > of the scenario and the fact that it cannot occur inadvertently.
Thanks for the diff and I think the approach basically works. One notable behavior of this approach it will reject the GUC setting even if there are no slots on old cluster or user set the value to a big enough value which doesn't cause invalidation. The behavior doesn't look bad to me but just mention it for reference. Best Regards, Hou zj