proc.c has the following code --- unchanged since Postgres95 --- in HandleDeadlock(): /* --------------------- * Check to see if we've been awoken by anyone in the interim. * * If we have we can return and resume our transaction -- happy day. * Before we are awoken the process releasing the lock grants it to * us so we know that we don't have to wait anymore. * * Damn these names are LONG! -mer * --------------------- */ if (IpcSemaphoreGetCount(MyProc->sem.semId, MyProc->sem.semNum) == IpcSemaphoreDefaultStartValue) { UnlockLockTable(); return; } /* * you would think this would be unnecessary, but... * * this also means we've been removed already. in some ports (e.g., * sparc and aix) the semop(2) implementation is such that we can * actually end up in this handler after someone has removed us from * the queue and bopped the semaphore *but the test above fails to * detect the semaphore update* (presumably something weird having to * do with the order in which the semaphore wakeup signal and SIGALRM * get handled). */ if (MyProc->links.prev == INVALID_OFFSET || MyProc->links.next == INVALID_OFFSET) { UnlockLockTable(); return; } Well, the reason control can get to the "apparently unnecessary" part is not some weird portability glitch; it is that the first test is WRONG. IpcSemaphoreGetCount() calls semctl(GETNCNT), which returns not the current value of the semaphore as this code expects, but the number of waiters on the semaphore. Since the process doing this test is the only one that'll ever wait on that semaphore, it is impossible for IpcSemaphoreGetCount() to return anything but zero, and so the first if() has never ever succeeded in the entire history of Postgres. IpcSemaphoreGetCount is used nowhere else. I see no particularly good reason to have it at all, and certainly not to have it with a name so easily mistaken for IpcSemaphoreGetValue. It's about to be toast. Next question is whether to recode the first test "correctly" with IpcSemaphoreGetValue(), or just remove it. Since we clearly have done just fine with testing our internal link pointers to detect removal from the queue, I'm inclined to remove the first test and thus save an unnecessary kernel call. Comments? regards, tom lane