On Sun, Dec 29, 2013 at 9:09 AM, Heikki Linnakangas
<hlinnakan...@vmware.com> wrote:
>>> While mulling this over further, I had an idea about this: suppose we
>>> marked the tuple in some fashion that indicates that it's a promise
>>> tuple. I imagine an infomask bit, although the concept makes me wince
>>> a bit since we don't exactly have bit space coming out of our ears
>>> there. Leaving that aside for the moment, whenever somebody looks at
>>> the tuple with a mind to calling XactLockTableWait(), they can see
>>> that it's a promise tuple and decide to wait on some other heavyweight
>>> lock instead. The simplest thing might be for us to acquire a
>>> heavyweight lock on the promise tuple before making index entries for
>>> it, and then have callers wait on that instead always instead of
>>> transitioning from the tuple lock to the xact lock.
>
> Yeah, that seems like it should work. You might not even need an infomask
> bit for that; just take the "other heavyweight lock" always before calling
> XactLockTableWait(), whether it's a promise tuple or not. If it's not,
> acquiring the extra lock is a waste of time but if you're going to sleep
> anyway, the overhead of one extra lock acquisition hardly matters.

Are you suggesting that I lock the tuple only (say, through a special
LockPromiseTuple() call), or lock the tuple *and* call
XactLockTableWait() afterwards? You and Robert don't seem to be in
agreement about which here. From here on I assume Robert's idea (only
get the special promise lock where appropriate), because that makes
more sense to me.

I've taken a look at this idea, but got frustrated. You're definitely
going to need an infomask bit for this. Otherwise, how do you
differentiate between a "pending" promise tuple and a "fulfilled"
promise tuple (or a tuple that never had anything to do with promises
in the first place)? On the one hand, you'll want to wake up as soon as
it becomes clear that the former is not going to become the latter. On
the other hand, you really will want to wait until xact end on the
pending promise tuple when it becomes a fulfilled promise, or on an
already-fulfilled promise tuple, or a plain old tuple.

It's either locking the promise tuple, or locking the xid; never both,
because the combination makes no sense in any case (unless you're
talking about the case where you lock the promise tuple and then later
*somehow* decide that you need to lock the xid, as the upserter
releases promise tuple locks directly within ExecInsert() upon
successful insertion). The fact that your LockPromiseTuple() call
didn't find someone else holding the lock does not mean that no one
ever promised the tuple (assuming no infomask bit carries the relevant
info).

Obviously you can't just have upserters hold on to the promise tuple
locks until xact end if the promiser's insertion succeeds, for the same
reason we don't with regular in-memory tuple locks: they're totally
unbounded. So not only are you going to need an infomask promise bit,
you're going to need to go and unset the bit in the event of a
*successful* insertion, so that waiters know to wait on your xact now
when you finally UnlockPromiseTuple() within ExecInsert() to finish off
successful insertion.
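
To make the bookkeeping concrete, here is a minimal sketch in C of the
decisions involved, assuming an infomask bit exists. Everything in it is
hypothetical: SKETCH_HEAP_PROMISE_PENDING, the enums, and the three
functions are made-up names for illustration, not existing PostgreSQL
code, and no real locking is performed. It only models which lock a
waiter would choose and what it would have to re-check on wakeup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical infomask bit: set while the promise is still pending. */
#define SKETCH_HEAP_PROMISE_PENDING 0x0001

typedef enum WaitTarget
{
    TARGET_PROMISE_LOCK,    /* wake as soon as the promise is resolved */
    TARGET_XACT_LOCK        /* wait until the inserting xact ends */
} WaitTarget;

typedef enum WakeAction
{
    ACTION_PROCEED,         /* promise was abandoned; tuple is dead */
    ACTION_WAIT_ON_XACT,    /* promise was fulfilled; wait on the xid */
    ACTION_RETRY            /* still pending; go back and wait again */
} WakeAction;

/* Waiter side: pick exactly one lock to sleep on, never both. */
static WaitTarget
choose_wait_target(uint16_t infomask)
{
    if (infomask & SKETCH_HEAP_PROMISE_PENDING)
        return TARGET_PROMISE_LOCK;
    return TARGET_XACT_LOCK;
}

/*
 * Inserter side: on *successful* insertion, clear the bit before the
 * promise lock is released, so waiters that wake up know they must now
 * wait on the xact instead.
 */
static uint16_t
fulfill_promise(uint16_t infomask)
{
    assert(infomask & SKETCH_HEAP_PROMISE_PENDING);
    return (uint16_t) (infomask & ~SKETCH_HEAP_PROMISE_PENDING);
}

/* Waiter side: re-check the tuple after the promise lock wait returns. */
static WakeAction
after_promise_wait(bool tuple_killed, uint16_t infomask)
{
    if (tuple_killed)
        return ACTION_PROCEED;
    if (infomask & SKETCH_HEAP_PROMISE_PENDING)
        return ACTION_RETRY;
    return ACTION_WAIT_ON_XACT;
}
```

Note that a waiter that finds the bit clear cannot tell a fulfilled
promise from a plain old tuple, and in this sketch it does not need to:
both cases end up waiting on the xact.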

*And*, all XactLockTableWait() promise waiters need to go back and
check that just-in-case.

This problem illustrates what I mean about conflating row locking with
value locking.

>> I think the interlocking with buffer locks and heavyweight locks to
>> make that work could be complex.
>
> Hmm. Can you elaborate?

What I meant is that you should be wary of what you go on to describe
below.

> The inserter has to acquire the heavyweight lock before releasing the buffer
> lock, because otherwise another inserter (or deleter or updater) might see
> the tuple, acquire the heavyweight lock, and fall to sleep on
> XactLockTableWait(), before the inserter has grabbed the heavyweight lock.
> If that race condition happens, you have the original problem again, ie. the
> updater unnecessarily waits for the inserting transaction to finish, even
> though it already killed the tuple it inserted.

Right. Can you suggest a workaround to the above problems?

-- 
Peter Geoghegan