On Thu, Sep 14, 2023 at 10:37 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
>
> On Thu, Sep 14, 2023 at 10:00 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Thu, Sep 14, 2023 at 9:21 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
> > > > > -----------
> > > >
> > > > 3) Introduce a new pg_upgrade option(e.g. skip_slot_check), and suggest
> > > > if user already did the upgrade check for stopped server, they can use
> > > > this option when trying to upgrade later.
> > > >
> > > > Pros: Can save some efforts for user to advance each slot's lsn.
> > > >
> > > > Cons: I didn't see similar options in pg_upgrade, might need some
> > > > agreement.
> > >
> > > Yeah right, in fact during the --check command we can give that
> > > suggestion as well.
> > >
> >
> > Hmm, we can't mandate users to skip checking slots because that is the
> > whole point of --check slots.
>
> I mean not to mandate skipping in the --check command. But once the
> check command has already checked the slot then we can issue a
> suggestion to the user that the slots are already checked so that
> during the actual upgrade we can --skip checking the slots. So for
> user who has already run the check command and is now following with
> an upgrade can skip slot checking if we can provide such an option.
>

oh, okay, we can document and request the user to follow as you
suggest but I guess it will be more work for the user and also is less
intuitive.

> > > I feel option 2 looks best to me unless there is some design issue to
> > > that, as of now I do not see any issue with that though. Let's see
> > > what others think.
> > >
> >
> > By the way, did you consider the previous approach this patch was
> > using? Basically, instead of getting the last checkpoint location from
> > the control file, we will read the WAL file starting from the
> > confirmed_flush location of a slot and if we find any WAL other than
> > expected WALs like shutdown checkpoint, running_xacts, etc. then we
> > will error out.
>
> So basically, while scanning from confirmed_flush we must ensure that
> we find a first record as SHUTDOWN CHECKPOINT record at the same LSN,
> and after that, we should not get any other WAL other than like you
> said shutdown checkpoint, running_xacts. That way we will ensure both
> aspect that the confirmed flush LSN is at the shutdown checkpoint and
> after that there is no real activity in the system.
>

Right.

> I think to me,
> this seems like the best available option so far.
>

Yeah, let's see if someone else has a different opinion or has a better idea.

--
With Regards,
Amit Kapila.
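
For illustration only, the WAL-scanning check discussed above could be
sketched roughly as follows. This is not code from the patch: the record
model, the function name slot_is_caught_up, and the exact contents of
EXPECTED_KINDS are assumptions made for the example (the thread only names
shutdown checkpoint and running_xacts, followed by "etc.").

```python
from typing import Iterable, NamedTuple


class WalRecord(NamedTuple):
    """Hypothetical, simplified stand-in for a decoded WAL record."""
    lsn: int
    kind: str


# Assumption: record kinds considered harmless after a clean shutdown.
# The thread mentions these two and "etc."; the full set is not given there.
EXPECTED_KINDS = frozenset({"SHUTDOWN_CHECKPOINT", "RUNNING_XACTS"})


def slot_is_caught_up(records: Iterable[WalRecord], confirmed_flush: int) -> bool:
    """Return True if the slot has nothing left to consume.

    Scans the records at or after confirmed_flush and requires that:
      1. the first one is a shutdown checkpoint located exactly at
         confirmed_flush, and
      2. every later record is one of the expected kinds.
    """
    tail = sorted(
        (r for r in records if r.lsn >= confirmed_flush),
        key=lambda r: r.lsn,
    )
    if not tail:
        # No record at confirmed_flush at all, so condition 1 cannot hold.
        return False

    first = tail[0]
    if first.lsn != confirmed_flush or first.kind != "SHUTDOWN_CHECKPOINT":
        return False

    # Any other record kind means real activity happened after the slot's position.
    return all(r.kind in EXPECTED_KINDS for r in tail[1:])
```

In this sketch a slot whose confirmed_flush points before the shutdown
checkpoint, or that is followed by any unexpected record, is reported as not
caught up, which corresponds to the "error out" case described above.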