Dear Amit, Thank you for reviewing! PSA new version.
> > Internally, I added a new decoding mode - DECODING_MODE_SILENT - and > used it. > > If the decoding context is in the mode, the output plugin is not loaded, but > > any WALs are decoded without skipping. > > > > I think it may be okay not to load the output plugin as we are not > going to process any record in this case but is that the only reason > or you have something else in mind as well? My main concern was for skipping to set output plugin options. Even if the pgoutput plugin, some options like protocol_version, publications, etc are required while loading a plugin. We cannot predict requirements for external plugins. Based on that I thought output plugins should not be loaded during the decode. > > Also, a new flag "did_process" is also > > added. This flag is set if wrappers for output plugin callbacks are called > > during > > the silent mode. > > Isn't it sufficient to add a test for silent mode in > begin/stream_start/begin_prepare kind of APIs and set > ctx->did_process? In all other APIs, we can assert that did_process > shouldn't be set and we never reach there when decoding mode is > silent. > > > + /* Check whether the meaningful change was found */ > + found = (ctx->reorder->by_txn_last_xid != InvalidTransactionId || > + ctx->did_process); > > Are you talking about this check in the patch? If so, can you please > explain when does the first check help? I changed around here so I describe once again. A flag (output_skipped) is set when the transaction is decoded till the end in silent mode. It is done in DecodeTXNNeedSkip() because the function is the common path for both committed/aborted transactions. Also, DecodeTXNNeedSkip() returns true when the decoding context is in the silent mode. Therefore, any cb_wrapper functions would not be called anymore. DecodingContextHasdecodedItems() just returns output_skipped. This approach needs to read WALs till end of transactions before returning the upgrading function, but codes look simpler than the previous version. > > > > Based on that, I added another binary function > binary_upgrade_create_logical_replication_slot(). > > This function is similar to pg_create_logical_replication_slot(), but the > > restart_lsn and confirmed_flush are set to *next* WAL segment. The pointed > > filename is returned and it is passed to pg_resetwal command. > > > > I am not sure if it is a good idea that a > binary_upgrade_create_logical_replication_slot() API does the logfile > name calculation. > > > One consideration is that pg_log_standby_snapshot() must be executed before > > slots consuming changes. New cluster does not have RUNNING_XACTS records > so that > > decoding context on new cluster cannot be create a consistent snapshot > > as-is. > > This may lead to discard changes during the upcoming consuming event. To > > prevent it the function is called after the final pg_resetwal. > > > > How do you think? > > > > + /* > + * Also, we mu execute pg_log_standby_snapshot() when logical replication > + * slots are migrated. Because RUNNING_XACTS record is required to create > + * a consistent snapshot. > + */ > + if (count_old_cluster_logical_slots()) > + create_consistent_snapshot(); > > We shouldn't do this separately. Instead > binary_upgrade_create_logical_replication_slot() should ensure that > corresponding WAL is reserved similar to what we do in > ReplicationSlotReserveWal() and then similarly invoke > LogStandbySnapshot() to ensure that we have enough information to > start. I did not handle these parts because they needed more analysis. Let's discuss in later versions. > > Few minor comments: > ================== > 1. The commit message and other comments like atop > get_old_cluster_logical_slot_infos() needs to be adjusted as per > recent changes. I revisited comments and updated. > 2. > @@ -1268,7 +1346,11 @@ stream_start_cb_wrapper(ReorderBuffer *cache, > ReorderBufferTXN *txn, > LogicalErrorCallbackState state; > ErrorContextCallback errcallback; > > - Assert(!ctx->fast_forward); > + /* > + * In silent mode all the two-phase callbacks are not set so that the > + * wrapper should not be called. > + */ > + Assert(ctx->decoding_mode == DECODING_MODE_NORMAL); > > This and other similar comments doesn't seems to be consistent as the > function name and comments are not matching. Fixed. Best Regards, Hayato Kuroda FUJITSU LIMITED
v47-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch
Description: v47-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch