On Wed, Aug 17, 2022 at 5:55 PM Robert Haas <robertmh...@gmail.com> wrote: > > Regarding the question of whether we need a cleanup lock on the new > bucket I am not really seeing the advantage of going down that path. > Simply fixing this code to take a cleanup lock instead of hoping that > it always gets one by accident is low risk and should fix the observed > problem. Getting rid of the cleanup lock will be more invasive and I'd > like to see some evidence that it's a necessary step before we take > the risk of breaking things. >
The patch proposed by you is sufficient to fix the observed issue. BTW, we are able to reproduce the issue and your patch fixed it. The idea is to ensure that checkpointer only tries to sync the buffer for the new bucket, otherwise, it will block while acquiring the lock on the old bucket buffer in SyncOneBuffer because the replay process would already have it and we won't be able to hit required condition. To simulate it, we need to stop the replay before we acquire the lock for the old bucket. Then, let checkpointer advance the buf_id beyond the buffer which we will get for the old bucket (in the place where it loops over all buffers, and mark the ones that need to be written with BM_CHECKPOINT_NEEDED.). After that let the replay process proceed till the point where it checks for the clean-up lock on the new bucket. Next, let the checkpointer advance to sync the buffer corresponding to the new bucket buffer. This will reproduce the required condition. We have tried many other combinations but couldn't able to hit it. For example, we were not able to generate it via bgwriter because it expects the buffer to have zero usage and ref count which is not possible during the replay in hash_xlog_split_allocate_page() as we already have increased the usage count for the new bucket buffer before checking the cleanup lock on it. I agree with you that getting rid of the clean-up lock on the new bucket is a more invasive patch and should be done separately if required. Yesterday, I have done a brief analysis and I think that is possible but it doesn't seem to be a good idea to backpatch it. -- With Regards, Amit Kapila.