On Tue, Mar 7, 2023 at 4:49 PM Ryo Yamaji (Fujitsu) <yamaji....@fujitsu.com> wrote:
>
> From: Tom Lane <t...@sss.pgh.pa.us>
> > I don't see a bug here, or at least I'm not willing to move the
> > goalposts to where you want them to be.
> > I believe that we do guarantee arrival-order locking of individual
> > tuple versions. However, in the example you show, a single row is
> > being updated over and over. So, initially we have a single "winner"
> > transaction that got the tuple lock first and updated the row. When it
> > commits, each other transaction serially comes off the wait queue for
> > that tuple lock and discovers that it now needs a lock on a different
> > tuple version than it has got.
> > So it tries to get lock on whichever is the latest tuple version.
> > That might still appear serial as far as the original 100 sessions go,
> > because they were all queued on the same tuple lock to start with.
> > But when the new sessions come in, they effectively line-jump because
> > they will initially try to lock whichever tuple version is committed
> > live at that instant, and thus they get ahead of whichever remain of
> > the original 100 sessions for the lock on that tuple version (since
> > those are all still blocked on some older tuple version, whose lock is
> > held by whichever session is performing the next-to-commit update).
> >
> > I don't see any way to make that more stable that doesn't involve
> > requiring sessions to take locks on already-dead-to-them tuples; which
> > sure seems like a nonstarter, not least because we don't even have a
> > way to find such tuples. The update chains only link forward not back.
>
> Thank you for your reply.
> When I was doing this test, I confirmed the following two actions.
> (1) The first 100 sessions are overtaken by the last 10.
> (2) the order of the preceding 100 sessions changes
>
> (1) I was concerned from the user's point of view that the lock order
> for the same tuple was not preserved.
> However, as you pointed out, in many cases the order of arrival is
> guaranteed from the perspective of the tuple.
> You understand the PostgreSQL architecture and understand that you need
> to use it.
>
> (2) This behavior is rare. Typically, the first session gets
> AccessExclusiveLock to the tuple and ShareLock to the transaction ID.
> Subsequent sessions will wait for AccessExclusiveLock to the tuple.
> However, we ignored AccessExclusiveLock in the tuple from the log and
> observed multiple sessions waiting for ShareLock to the transaction ID.
> The log shows that the order of the original 100 sessions has been
> changed due to the above movement.
>

I think for (2), the test is hitting the case of walking the update
chain via heap_lock_updated_tuple(), where we don't acquire the lock on
the tuple. See the comments atop heap_lock_updated_tuple(). You can
verify whether that is the case by adding some DEBUG logs in that
function.

> At first, I thought both (1) and (2) were obstacles. However, I
> understood from your indication that (1) is not a bug.
> I would be grateful if you could also give me your opinion on (2).
>

If my observation above is correct, then it is not a bug, as it is
behaving as per the current design.

--
With Regards,
Amit Kapila.
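
For readers following the thread, the line-jumping effect in (1) that Tom
describes can be illustrated with a toy model. This is only a sketch and
not PostgreSQL code: the function simulate(), the session names, and the
"commit" event marker are hypothetical. It assumes one FIFO wait queue per
tuple version, that a new session queues on whichever version is committed
live when it arrives, and (a simplifying assumption chosen to reproduce the
described effect) that after each commit the next updater comes from the
newest non-empty queue, because sessions on older versions are still
blocked behind them.

```python
from collections import deque


def simulate(events):
    """Toy model: return the order in which sessions commit their update.

    `events` is a list of session names (arrivals) and the string "commit".
    """
    live = 0          # index of the latest committed tuple version
    updater = None    # session currently updating the live version
    queues = {}       # version index -> FIFO deque of waiting sessions
    commit_order = []

    for event in events:
        if event == "commit":
            if updater is None:
                continue  # nothing is in progress, so nothing to commit
            commit_order.append(updater)
            live += 1     # the committed update creates a new live version
            updater = None
            # Assumption of this toy: waiters on the newest version go
            # first; each individual queue is still served in FIFO order.
            for version in sorted(queues, reverse=True):
                if queues[version]:
                    updater = queues[version].popleft()
                    break
        elif updater is None:
            updater = event  # row is free: lock the live version directly
        else:
            # Queue on the version that is committed live at arrival time.
            queues.setdefault(live, deque()).append(event)
    return commit_order


# Five original sessions arrive, then two late sessions arrive after the
# first commit while S2 is updating.
events = ["S1", "S2", "S3", "S4", "S5", "commit", "N1", "N2"] + ["commit"] * 6
print(simulate(events))
# prints ['S1', 'S2', 'N1', 'N2', 'S3', 'S4', 'S5']
```

In this run, arrival order is preserved within each version's queue, yet
the late sessions N1 and N2 commit ahead of S3 to S5 because they queued on
a newer tuple version. The model does not cover case (2), which depends on
the update-chain walk in heap_lock_updated_tuple().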