Dear Peter, Thank you for reviewing! PSA new version.
> src/backend/replication/slot.c > > 3. InvalidatePossiblyObsoleteSlot > > + /* > + * Raise an ERROR if the logical replication slot is invalidating. It > + * would not happen because max_slot_wal_keep_size is set to -1 during > + * the upgrade, but it stays safe. > + */ > + if (*invalidated && SlotIsLogical(s) && IsBinaryUpgrade) > + elog(ERROR, "Replication slots must not be invalidated during the > upgrade."); > > 3a. > That comment didn't seem good. I think you mean like in the suggestion below. > > SUGGESTION > It should not be possible for logical replication slots to be > invalidated because max_slot_wal_keep_size is set to -1 during the > upgrade. The following is just for sanity-checking. This part was updated in v35. Please tell me if current version is still bad... > 3b. > I wasn't sure if 'max_slot_wal_keep_size' GUC is accessible in this > scope, but if it is available then maybe > Assert(max_slot_wal_keep_size_mb == -1); should also be included in > this sanity check. IIUC, guc parameters are visible from all the postgres processes. Added. > src/bin/pg_upgrade/check.c > > 4. check_new_cluster_logical_replication_slots > > + conn = connectToServer(&new_cluster, "template1"); > + > + prep_status("Checking for logical replication slots"); > > There is some inconsistency with all the subsequent pg_fatals within > this function -- some of them mention "New cluster" but most of them > do not. > > Meanwhile, Kuroda-san showed me sample output like: > > Checking for presence of required libraries ok > Checking database user is the install user ok > Checking for prepared transactions ok > Checking for new cluster tablespace directories ok > Checking for logical replication slots > New cluster must not have logical replication slots but found 1 slot. > Failure, exiting > > So, I felt the log message title ("Checking...") should be changed to > include the words "new cluster" just like the log preceding it: > > "Checking for logical replication slots" ==> "Checking for new cluster > logical replication slots" > > Now all the subsequent pg_fatals clearly are for "new cluster" Changed. > 5. check_new_cluster_logical_replication_slots > > + if (nslots_on_new) > + pg_fatal(ngettext("New cluster must not have logical replication > slots but found %d slot.", > + "New cluster must not have logical replication slots but found %d slots.", > + nslots_on_new), > + nslots_on_new); > > 5a. > TBH, I didn't see why you go to unnecessary trouble to have a plural > message here. The message could just be like: > "New cluster must have 0 logical replication slots but found %d." > > ~ > > 5b. > However, now (from the previous review comment #4) if "New cluster" is > already explicit in the log, the pg_fatal message can become just: > "New cluster must have ..." ==> "Expected 0 logical replication slots > but found %d." Basically it's better. But the initial character should be lower case and period is not needed. Modified like that. > 9. get_old_cluster_logical_slot_infos > > + i_slotname = PQfnumber(res, "slot_name"); > + i_plugin = PQfnumber(res, "plugin"); > + i_twophase = PQfnumber(res, "two_phase"); > + i_caught_up = PQfnumber(res, "caught_up"); > + i_invalid = PQfnumber(res, "conflicting"); > > IMO SQL should be using an alias for this column, so you can say: > i_invalid = PQfnumber(res, "invalid") > > which seems better than switching the wording in code. Modified. The argument of PQfnumber() must be same as the column name, so the word "as invalid" was added to SQL. > src/bin/pg_upgrade/pg_upgrade.h > > 10. LogicalSlotInfo > > +typedef struct > +{ > + char *slotname; /* slot name */ > + char *plugin; /* plugin */ > + bool two_phase; /* can the slot decode 2PC? */ > + bool caught_up; /* Is confirmed_flush_lsn the same as latest > + * checkpoint LSN? */ > + bool invalid; /* Is the slot usable? */ > +} LogicalSlotInfo; > > ~ > > + bool invalid; /* Is the slot usable? */ > This field name and comment have opposite meanings. Invalid means NOT usable. > > SUGGESTION > /* If true, the slot is unusable. */ Fixed. > src/bin/pg_upgrade/server.c > > 11. start_postmaster > > * we only modify the new cluster, so only use it there. If there is a > * crash, the new cluster has to be recreated anyway. fsync=off is a big > * win on ext4. > + * > + * Also, the max_slot_wal_keep_size is set to -1 to prevent the WAL removal > + * required by logical slots. The setting could avoid the invalidation of > + * slots during the upgrade. > */ > ~ > > IMO this comment "to prevent the WAL removal required by logical > slots" is ambiguous about how it could be interpreted. Needs > rearranging for clarity. The description was changed. How do you think? > 12. start_postmaster > > (cluster == &new_cluster) ? > - " -c synchronous_commit=off -c fsync=off -c full_page_writes=off" : "", > + " -c synchronous_commit=off -c fsync=off -c full_page_writes=off -c > max_slot_wal_keep_size=-1 " : > + " -c max_slot_wal_keep_size=-1", > > Instead of putting the same option on both sides of the ternary, I was > wondering if it might be better to hardwire the max_slot_wal_keep_size > just 1 time in the format string? Fixed. > .../pg_upgrade/t/003_logical_replication_slots.pl > > 13. > # Remove the remained slot > > /remained/remaining/ Fixed. Best Regards, Hayato Kuroda FUJITSU LIMITED
v36-0001-Flush-logical-slots-to-disk-during-a-shutdown-ch.patch
Description: v36-0001-Flush-logical-slots-to-disk-during-a-shutdown-ch.patch
v36-0002-pg_upgrade-Allow-to-replicate-logical-replicatio.patch
Description: v36-0002-pg_upgrade-Allow-to-replicate-logical-replicatio.patch