On Thu, Jun 13, 2024 at 1:09 PM Masahiko Sawada <sawada.m...@gmail.com> wrote: > > On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > > > Yeah, starting with a single worker sounds good for now. Do you think > > we should sync all the sequences in a single transaction or have some > > threshold value above which a different transaction would be required > > or maybe a different sequence sync worker altogether? Now, having > > multiple sequence-sync workers requires some synchronization so that > > only a single worker is allocated for one sequence. > > > > The simplest thing is to use a single sequence sync worker that syncs > > all sequences in one transaction but with a large number of sequences, > > it could be inefficient. OTOH, I am not sure if it would be a problem > > in reality. > > I think that we can start with using a single worker and one > transaction, and measure the performance with a large number of > sequences. >
Fair enough. However, this raises the question Dilip and Vignesh are discussing whether we need a new relfilenode for sequence update even during initial sync? As per my understanding, the idea is that similar to tables, the CREATE SUBSCRIPTION command (with copy_data = true) will create the new sequence entries in pg_subscription_rel with the state as 'i'. Then the sequence-sync worker would start a transaction and one-by-one copy the latest sequence values for each sequence (that has state as 'i' in pg_subscription_rel) and mark its state as ready 'r' and commit the transaction. Now if there is an error during this operation it will restart the entire operation. The idea of creating a new relfilenode is to handle the error so that if there is a rollback, the sequence state will be rolled back to 'i' and the sequence value will also be rolled back. The other option could be that we update the sequence value without a new relfilenode and if the transaction rolled back then only the sequence's state will be rolled back to 'i'. This would work with a minor inconsistency that sequence values will be up-to-date even when the sequence state is 'i' in pg_subscription_rel. I am not sure if that matters because anyway, they can quickly be out-of-sync with the publisher again. Now, say we don't want to maintain the state of sequences for initial sync at all then after the error how will we detect if there are any pending sequences to be synced? One possibility is that we maintain a subscription level flag 'subsequencesync' in 'pg_subscription' to indicate whether sequences need sync. This flag would indicate whether to sync all the sequences in pg_susbcription_rel. This would mean that if there is an error while syncing the sequences we will resync all the sequences again. This could be acceptable considering the chances of error during sequence sync are low. The benefit is that both the REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same idea and sync all sequences without needing a new relfilenode. Users can always refer 'subsequencesync' flag in 'pg_subscription' to see if all the sequences are synced after executing the command. > > > Or yet another idea I came up with is that a tablesync worker will > > > synchronize both the table and sequences owned by the table. That is, > > > after the tablesync worker caught up with the apply worker, the > > > tablesync worker synchronizes sequences associated with the target > > > table as well. One benefit would be that at the time of initial table > > > sync being completed, the table and its sequence data are consistent. > > Correction; it's not guaranteed that the sequence data and table data > are consistent even in this case since the tablesync worker could get > on-disk sequence data that might have already been updated. > The benefit of this approach is not clear to me. Our aim is to sync all sequences before the upgrade, so not sure if this helps because anyway both table values and corresponding sequences can again be out-of-sync very quickly. > > > > > > > > > > 3) Refreshing the sequence can be achieved through the existing > > > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change > > > > here). > > > > The subscriber identifies stale sequences, meaning sequences present > > > > in pg_subscription_rel but absent from the publication, and removes > > > > them from the pg_subscription_rel system table. The subscriber also > > > > checks for newly added sequences in the publisher and synchronizes > > > > their values from the publisher using the steps outlined in the > > > > subscription creation process. It's worth noting that previously > > > > synchronized sequences won't be synchronized again; the sequence sync > > > > will occur solely for the newly added sequences. > > > > > > > > 4) Introducing a new command for refreshing all sequences: ALTER > > > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES. > > > > The subscriber will remove stale sequences and add newly added > > > > sequences from the publisher. Following this, it will re-synchronize > > > > the sequence values for all sequences in the updated list from the > > > > publisher, following the steps outlined in the subscription creation > > > > process. > > > > > > The difference between 3) and 4) is whether or not to re-synchronize > > > the previously synchronized sequences. Do we really want to introduce > > > a new command for 4)? I felt that we can invent an option say > > > copy_all_sequence for the REFRESH PUBLICATION command to cover the 4) > > > case. > > > > > > > Yeah, that is also an option but it could confuse along with copy_data > > option. Say the user has selected copy_data = false but > > copy_all_sequences = true then the first option indicates to *not* > > copy the data of table and sequences and the second option indicates > > to copy the sequences data which sounds contradictory. The other idea > > is to have an option copy_existing_sequences (which indicates to copy > > existing sequence values) but that also has somewhat the same drawback > > as copy_all_sequences but to a lesser degree. > > Good point. And I understood that the REFRESH PUBLICATION SEQUENCES > command would be helpful when users want to synchronize sequences > between two nodes before upgrading. > Right. > > > > > > > > > > 5) Incorporate the pg_sequence_state function to fetch the sequence > > > > value from the publisher, along with the page LSN. Incorporate > > > > SetSequence function, which will procure a new relfilenode for the > > > > sequence and set the new relfilenode with the specified value. This > > > > will facilitate rollback in case of any failures. > > > > > > Does it mean that we create a new relfilenode for every update of the > > > value? > > > > > > > We need it for initial sync so that if there is an error both the > > sequence state in pg_subscription_rel and sequence values can be > > rolled back together. > > Agreed. > > > However, it is unclear whether we need to create > > a new relfilenode while copying existing sequences (say during ALTER > > SUBSCRIPTION .. REFRESH PUBLICATION SEQUENCES, or whatever command we > > decide)? Probably the answer lies in how we want to implement this > > command. If we want to copy all sequence values during the command > > itself then it is probably okay but if we want to handover this task > > to the sequence-sync worker then we need some state management and a > > new relfilenode so that on error both state and sequence values are > > rolled back. > > What state transition of pg_subscription_rel entries for sequences do > we need while copying sequences values? For example, we insert an > entry with 'init' state at CREATE SUBSCRIPTION and then the > sequence-sync worker updates to 'ready' and copies the sequence data. > And at REFRESH PUBLICATION SEQUENCES, we update the state back to > 'init' again so that the sequence-sync worker can process it? Given > REFRESH PUBLICATION SEQUENCES won't be executed very frequently, it > might be acceptable to transactionally update sequence values. > Do you mean that sync the sequences during the REFRESH PUBLICATION SEQUENCES command itself? If so, there is an argument that we can do the same during CREATE SUBSCRIPTION. It would be beneficial to keep the method to sync the sequences same for both the CREATE and REFRESH commands. I have speculated on one idea above and would be happy to see your thoughts. -- With Regards, Amit Kapila.