On Wed, Sep 6, 2023 at 8:47 AM Zhijie Hou (Fujitsu) <houzj.f...@fujitsu.com> wrote: > > On Tuesday, September 5, 2023 3:35 PM Kuroda, Hayato/黒田 隼人 > <kuroda.hay...@fujitsu.com> wrote: > > > > Dear Hou-san, > > > > > Based on this, it’s possible that the slots we get each time when > > > checking wal_status are different, because they may get changed in between > > these checks. > > > This may not cause serious problems for now, because we will either > > > copy all the slots including ones invalidated when upgrading or we > > > report ERROR. But I feel it's better to get consistent result each > > > time we check the slots to close the possibility for problems in the > > > future. So, I feel we could centralize the check for wal_status and > > > slots fetch, so that even if some slots status changed after that, it > > > won't have > > a risk to affect our check. What do you think ? > > > > Thank you for giving the suggestion! I agreed that to centralize checks, > > and I > > had already started to modify. Here is the updated patch. > > > > In this patch all slot infos are extracted in the > > get_old_cluster_logical_slot_infos(), > > upcoming functions uses them. Based on the change, two attributes > > confirmed_flush and wal_status were added in LogicalSlotInfo. > > > > IIUC we cannot use strcut List in the client codes, so structures and > > related > > functions are added in the function.c. These are used for extracting unique > > plugins, but it may be overkill because check_loadable_libraries() handle > > duplicated entries. If we can ignore duplicated entries, these functions > > can be > > removed. > > > > Also, for simplifying codes, only a first-met invalidated slot is output in > > the > > check_old_cluster_for_valid_slots(). Warning messages int the function were > > removed. I think it may be enough because check_new_cluster_is_empty() do > > similar thing. Please tell me if it should be reverted... > > Thank for updating the patch ! here are few comments. > > 1. > > + res = executeQueryOrDie(conn, "SHOW wal_level;"); > + wal_level = PQgetvalue(res, 0, 0); > > + res = executeQueryOrDie(conn, "SHOW wal_level;"); > + wal_level = PQgetvalue(res, 0, 0); > > I think it would be better to do a sanity check using PQntuples() before > calling PQgetvalue() in above places. > > 2. > > +/* > + * Helper function for get_old_cluster_logical_slot_infos() > + */ > +static WALAvailability > +GetWALAvailabilityByString(const char *str) > +{ > + WALAvailability status = WALAVAIL_INVALID_LSN; > + > + if (strcmp(str, "reserved") == 0) > + status = WALAVAIL_RESERVED; > > Not a comment, but I am wondering if we could use conflicting field to do this > check, so that we could avoid the new conversion function and structure > movement. What do you think ? >
I also think referring to the conflicting field would be better not only for the purpose of avoiding extra code but also to give accurate information about invalidated slots for which we want to give an error. Additionally, I think we should try to avoid writing a new function strtoLSN as that adds a maintainability burden. We can probably send the value fetched from pg_controldata in the query for comparison with confirmed_flush LSN. -- With Regards, Amit Kapila.