Hi, On 2022-10-14 10:40:11 +0530, Amit Kapila wrote: > On Fri, Oct 14, 2022 at 2:25 AM Andres Freund <and...@anarazel.de> wrote: > > > > > > > > Attached are two patches. The first patch is what Robert has proposed > > > with some changes in comments to emphasize the fact that cleanup lock > > > on the new bucket is just to be consistent with the old bucket page > > > locking as we are initializing it just before checking for cleanup > > > lock. In the second patch, I removed the acquisition of cleanup lock > > > on the new bucket page and changed the comments/README accordingly. > > > > > > I think we can backpatch the first patch and the second patch can be > > > just a HEAD-only patch. Does that sound reasonable to you? > > > > Not particularly, no. I don't understand how "overwrite a page and then get > > a > > cleanup lock" can sensibly be described by this comment: > > > > > +++ b/src/backend/access/hash/hashpage.c > > > @@ -807,7 +807,8 @@ restart_expand: > > > * before changing the metapage's mapping info, in case we can't > > > get the > > > * disk space. Ideally, we don't need to check for cleanup lock on > > > new > > > * bucket as no other backend could find this bucket unless meta > > > page is > > > - * updated. However, it is good to be consistent with old bucket > > > locking. > > > + * updated and we initialize the page just before it. However, it > > > is just > > > + * to be consistent with old bucket locking. > > > */ > > > buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM); > > > if (!IsBufferCleanupOK(buf_nblkno)) > > > > This is basically saying "I am breaking basic rules of locking just to be > > consistent", no? > > > > Fair point. How about something like: "XXX Do we really need to check > for cleanup lock on the new bucket? Here, we initialize the page, so > ideally we don't need to perform any operation that requires such a > check."?.
This still seems to omit that the code is quite broken. > Feel free to suggest something better. How about something like: XXX: This code is wrong, we're overwriting the buffer before "acquiring" the cleanup lock. Currently this is not known to have bad consequences because XYZ and the fix seems a bit too risky for the backbranches. Greetings, Andres Freund