Re: [HACKERS] lwlocks and starvation
Neil Conway wrote:
> On Thu, 2004-12-02 at 09:59 -0500, Bruce Momjian wrote:
> > OK, either you have to own the issue or I have to add something to the
> > TODO list.
>
> Can you add it to the TODO list and assign it to me?

Done:

	* Fix priority ordering of read and write light-weight locks (Neil)

-- 
  Bruce Momjian                      |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                  |  (610) 359-1001
  + If your life is a hard drive,    |  13 Roberts Road
  + Christ can be your backup.       |  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] lwlocks and starvation
On Thu, 2004-12-02 at 09:59 -0500, Bruce Momjian wrote:
> OK, either you have to own the issue or I have to add something to the
> TODO list.

Can you add it to the TODO list and assign it to me?

-Neil
Re: Re: [HACKERS] lwlocks and starvation
Neil Conway <[EMAIL PROTECTED]> wrote on 02.12.2004, 05:55:43:
> On Wed, 2004-12-01 at 21:51 -0500, Bruce Momjian wrote:
> > Neil, where are we on this?  Should we add comments?  Add a TODO?  A patch?
>
> I'm not sure what the right resolution is. As I said, I don't think it's
> wise to apply a patch that could have a significant impact on
> performance without (a) testing its performance effect and/or (b) having
> any evidence that the problem it addresses actually affects anyone in
> the real world.

I agree; it would be unwise to apply the patch. In contrast to my comments on the nodeAgg perf tweak, changing the way lwlocks are allocated could have a catastrophic effect on the whole system. We've got no evidence that it represents a beneficial performance change, but the risk of it causing some unforeseen problem in at least one corner of the code seems very high.

I'm pleased we found this, but changing it should be deferred till at least 8.1. Perhaps a TODO item could read "investigate lwlock scheduling algorithms"? We might yet find that there is no single right answer and we need different algorithms for different locks...

Best Regards, Simon Riggs
Re: [HACKERS] lwlocks and starvation
OK, either you have to own the issue or I have to add something to the TODO list.

---

Neil Conway wrote:
> On Wed, 2004-12-01 at 21:51 -0500, Bruce Momjian wrote:
> > Neil, where are we on this?  Should we add comments?  Add a TODO?  A patch?
>
> I'm not sure what the right resolution is. As I said, I don't think it's
> wise to apply a patch that could have a significant impact on
> performance without (a) testing its performance effect and/or (b) having
> any evidence that the problem it addresses actually affects anyone in
> the real world. I'll try to run some benchmarks when I get a chance.
>
> I wrote up most of a patch to implement the "wake up all shared waiters
> on LWLockRelease()" behavior to see how that would change performance,
> but the patch has a subtle bug in it that I can't seem to find (I've
> attached it -- comments welcome).
>
> Certainly if we decide to leave things as they are I think we ought to
> document why the behavior is intentional, but I don't think we have
> enough data to make that decision yet.
>
> -Neil
>
> [ Attachment, skipping... ]
Re: [HACKERS] lwlocks and starvation
On Wed, 2004-12-01 at 21:51 -0500, Bruce Momjian wrote:
> Neil, where are we on this?  Should we add comments?  Add a TODO?  A patch?

I'm not sure what the right resolution is. As I said, I don't think it's wise to apply a patch that could have a significant impact on performance without (a) testing its performance effect and/or (b) having any evidence that the problem it addresses actually affects anyone in the real world. I'll try to run some benchmarks when I get a chance.

I wrote up most of a patch to implement the "wake up all shared waiters on LWLockRelease()" behavior to see how that would change performance, but the patch has a subtle bug in it that I can't seem to find (I've attached it -- comments welcome).

Certainly if we decide to leave things as they are I think we ought to document why the behavior is intentional, but I don't think we have enough data to make that decision yet.

-Neil

--- src/backend/storage/lmgr/lwlock.c
+++ src/backend/storage/lmgr/lwlock.c
@@ -411,7 +411,7 @@
 LWLockRelease(LWLockId lockid)
 {
 	volatile LWLock *lock = LWLockArray + lockid;
-	PGPROC	   *head;
+	PGPROC	   *release = NULL;
 	PGPROC	   *proc;
 	int			i;
@@ -450,34 +450,67 @@
 	 * any waiters if someone has already awakened waiters that haven't
 	 * yet acquired the lock.
 	 */
-	head = lock->head;
-	if (head != NULL)
+	if (lock->head != NULL)
 	{
 		if (lock->exclusive == 0 && lock->shared == 0 && lock->releaseOK)
 		{
 			/*
-			 * Remove the to-be-awakened PGPROCs from the queue. If the
-			 * front waiter wants exclusive lock, awaken him only.
-			 * Otherwise awaken as many waiters as want shared access.
+			 * Remove the to-be-awakened PGPROCs from the queue. If
+			 * the front waiter wants an exclusive lock, awaken him
+			 * only. Otherwise, awaken all the shared waiters in the
+			 * queue.
 			 */
-			proc = head;
-			if (!proc->lwExclusive)
+			proc = lock->head;
+			if (proc->lwExclusive)
 			{
-				while (proc->lwWaitLink != NULL &&
-					   !proc->lwWaitLink->lwExclusive)
+				lock->head = proc->lwWaitLink;
+				proc->lwWaitLink = NULL;
+				release = proc;
+			}
+			else
+			{
+				/*
+				 * Walk through the wait queue. We form two linked
+				 * lists: exclusive waiters, who will be the lock's
+				 * new wait queue, and shared waiters, all of whom
+				 * will be awakened.
+				 */
+				PGPROC *exclusive_head = NULL;
+				PGPROC *exclusive_tail = NULL;
+				PGPROC *shared_head = proc;
+				PGPROC *shared_tail = proc;
+				proc = proc->lwWaitLink;
+				while (proc != NULL)
+				{
+					if (proc->lwExclusive)
+					{
+						if (!exclusive_head)
+							exclusive_head = exclusive_tail = proc;
+						else
+						{
+							exclusive_tail->lwWaitLink = proc;
+							exclusive_tail = proc;
+						}
+					}
+					else
+					{
+						shared_tail->lwWaitLink = proc;
+						shared_tail = proc;
+					}
 					proc = proc->lwWaitLink;
+				}
+
+				/* NULL terminate both lists */
+				shared_tail->lwWaitLink = NULL;
+				if (exclusive_tail)
+					exclusive_tail->lwWaitLink = NULL;
+				/* The exclusive list is the new wait queue */
+				lock->head = exclusive_head;
+				release = shared_head;
 			}
-			/* proc is now the last PGPROC to be released */
-			lock->head = proc->lwWaitLink;
-			proc->lwWaitLink = NULL;
 			/* prevent additional wakeups until retryer gets to run */
 			lock->releaseOK = false;
 		}
-		else
-		{
-			/* lock is still held, can't awaken anything */
-			head = NULL;
-		}
 	}

 	/* We are done updating shared state of the lock itself. */
@@ -486,11 +519,11 @@
 	/*
 	 * Awaken any waiters I removed from the queue.
 	 */
-	while (head != NULL)
+	while (release != NULL)
 	{
 		LOG_LWDEBUG("LWLockRelease", lockid, "release waiter");
-		proc = head;
-		head = proc->lwWaitLink;
+		proc = release;
+		release = proc->lwWaitLink;
 		proc->lwWaitLink = NULL;
 		proc->lwWaiting = false;
 		PGSemaphoreUnlock(&proc->sem);
Re: [HACKERS] lwlocks and starvation
Neil, where are we on this?  Should we add comments?  Add a TODO?  A patch?

---

Neil Conway wrote:
> On Wed, 2004-11-24 at 23:30 -0500, Tom Lane wrote:
> > It is not a 100% solution because it does not
> > cover the case where a waiting exclusive locker is released, then a new
> > shared locker arrives at the lock before the exclusive locker is given
> > any cycles to acquire the lock.  However I don't see any cure for the
> > latter problem that's not worse than the disease
>
> Yeah, I don't think this is a problem -- eventually the exclusive waiter
> will win the coin flip anyway.
>
> > On the other hand we might consider that this isn't a big problem and
> > just leave things as they are.  We haven't seen any indication that
> > starvation is a real problem in practice, and so it might be better to
> > avoid extra trips through the kernel scheduler.
>
> Yes, I'm a little concerned about applying a patch to address what is,
> so far, an entirely academic concern -- especially if it might hurt
> performance.
>
> -Neil
Re: [HACKERS] lwlocks and starvation
On Thu, 2004-11-25 at 11:28, Neil Conway wrote:
> Simon Riggs wrote:
> > I'd been thinking about lock release order also, thinking that this
> > could be related to the CS storms observed earlier and the apparent
> > lock-step behaviour commented upon previously.
>
> As much as I would love to believe this (because it would mean we could
> probably solve the problem pretty easily), I don't think lock release
> order is the (primary) culprit behind the CS storm issue.

Wish we had a way to tell... though I think you are right.

> As I understand it, the heavily contended lock in those situations is
> the BufMgrLock, and that is _always_ acquired in exclusive mode.

So you're saying this doesn't affect BufMgrLock contention? Oh.

> > ISTM that waking shared waiters in xid order would bring the most
> > benefit and minimise any data issues. Readers waiting behind an
> > exclusive waiter, where the reader has a lower xid, might reasonably
> > be woken without a problem since they will never see the changes made
> > by the exclusive waiter anyway.
>
> I'm not sure I understand. What "data issues" are you referring to?
> LWLocks are used to protect non-transactional resources (such as shared
> memory data structures), so the relative xids of two waiter processes
> don't affect whether they can see each other's modifications (i.e.
> they always can).

Yes, I understand the difference. But when we check for tuple visibility, the changes would be ignored, hence it is OK to re-sort the list from FIFO to xid order, but...

> In any case, the idea of considering information such as the xid when
> deciding which waiters to release is interesting. It's not immediately
> apparent to me quite *how* to make use of that info, but it's definitely
> something to consider...

Agreed. It's a wacky idea, and I'm not embarrassed to have them. The day I stop having wacky ideas will be the day I stop having good ideas too.

But please don't let this distract from your own good idea: well done, Neil.

-- 
Best Regards, Simon Riggs
Re: [HACKERS] lwlocks and starvation
Simon Riggs wrote:
> I'd been thinking about lock release order also, thinking that this
> could be related to the CS storms observed earlier and the apparent
> lock-step behaviour commented upon previously.

As much as I would love to believe this (because it would mean we could probably solve the problem pretty easily), I don't think lock release order is the (primary) culprit behind the CS storm issue. As I understand it, the heavily contended lock in those situations is the BufMgrLock, and that is _always_ acquired in exclusive mode.

> ISTM that waking shared waiters in xid order would bring the most
> benefit and minimise any data issues. Readers waiting behind an
> exclusive waiter, where the reader has a lower xid might reasonably be
> woken without a problem since they will never see the changes made by
> the exclusive waiter anyway.

I'm not sure I understand. What "data issues" are you referring to? LWLocks are used to protect non-transactional resources (such as shared memory data structures), so the relative xids of two waiter processes don't affect whether they can see each other's modifications (i.e. they always can).

In any case, the idea of considering information such as the xid when deciding which waiters to release is interesting. It's not immediately apparent to me quite *how* to make use of that info, but it's definitely something to consider...

-Neil
Re: [HACKERS] lwlocks and starvation
On Wed, 2004-11-24 at 12:52, Neil Conway wrote:
> Bruce Momjian wrote:
> > I thought the new readers will sit after the writer in the FIFO queue
> > so the writer will not starve.
>
> AFAICS, that is not the case. See lwlock.c, circa line 264: in LW_SHARED
> mode, we check if "exclusive" is zero; if so, we acquire the lock
> (increment the shared lock count and do not block). And "exclusive" is
> set non-zero only when we _acquire_ a lock in exclusive mode, not when
> we add an exclusive waiter to the wait queue.

Wow... well spotted. That could explain many recent performance results.

On Wed, 2004-11-24 at 08:23, Neil Conway wrote:
> LWLockRelease() currently does something like (simplifying a lot):
>
>	acquire lwlock spinlock
>	decrement lock count
>	if lock is free
>		if first waiter in queue is waiting for exclusive lock,
>			awaken him; else, walk through the queue and awaken
>			all the shared waiters until we reach an exclusive waiter
>	end if
>	release lwlock spinlock
>
> This has the nice property that locks are granted in FIFO order. Is it
> essential that we maintain that property? If not, we could instead walk
> through the wait queue and awaken *all* the shared waiters, and get a
> small improvement in throughput.

I'd been thinking about lock release order also, thinking that this could be related to the CS storms observed earlier and the apparent lock-step behaviour commented upon previously. FIFO is the most easily theoretically predictable, but others are possible.

ISTM that waking shared waiters in xid order would bring the most benefit and minimise any data issues. Readers waiting behind an exclusive waiter, where the reader has a lower xid, might reasonably be woken without a problem since they will never see the changes made by the exclusive waiter anyway. That probably needs to happen within a limited window of inspection beyond the exclusive waiter to limit the complexity, say 4-8 places beyond the exclusive waiter.

Exactly what we do from here is going to dramatically affect performance in various situations, so I think trying a few different algorithms should help the understanding.

IMHO a concern remains that oprofile is not good enough instrumentation to spot this kind of issue. Instrumentation at the lwlock level *would* have spotted this and other issues too, and will also help us determine the differences between the various ways forward for (possibly) changing the current behaviour.

-- 
Best Regards, Simon Riggs
Re: [HACKERS] lwlocks and starvation
On Wed, 2004-11-24 at 23:30 -0500, Tom Lane wrote:
> It is not a 100% solution because it does not
> cover the case where a waiting exclusive locker is released, then a new
> shared locker arrives at the lock before the exclusive locker is given
> any cycles to acquire the lock.  However I don't see any cure for the
> latter problem that's not worse than the disease

Yeah, I don't think this is a problem -- eventually the exclusive waiter will win the coin flip anyway.

> On the other hand we might consider that this isn't a big problem and
> just leave things as they are.  We haven't seen any indication that
> starvation is a real problem in practice, and so it might be better to
> avoid extra trips through the kernel scheduler.

Yes, I'm a little concerned about applying a patch to address what is, so far, an entirely academic concern -- especially if it might hurt performance.

-Neil
Re: [HACKERS] lwlocks and starvation
I wrote:
> Neil Conway <[EMAIL PROTECTED]> writes:
> > AFAICS, that is not the case. See lwlock.c, circa line 264: in LW_SHARED
> > mode, we check if "exclusive" is zero; if so, we acquire the lock
> > (increment the shared lock count and do not block). And "exclusive" is
> > set non-zero only when we _acquire_ a lock in exclusive mode, not when
> > we add an exclusive waiter to the wait queue.
>
> Ooh, you are right.  I think that may qualify as a bug ...

I am thinking of addressing this issue by applying the attached patch. The patch prevents a would-be shared locker from cutting in front of a waiting exclusive locker. It is not a 100% solution because it does not cover the case where a waiting exclusive locker is released, then a new shared locker arrives at the lock before the exclusive locker is given any cycles to acquire the lock. However I don't see any cure for the latter problem that's not worse than the disease (granting the lock to the released waiter immediately is definitely not better).

On the other hand we might consider that this isn't a big problem and just leave things as they are. We haven't seen any indication that starvation is a real problem in practice, and so it might be better to avoid extra trips through the kernel scheduler. In particular, I think the existing behavior may fit in better with the idea that a shared locker should be able to acquire and release the lock multiple times during his timeslice, regardless of whether there is contention.

And on the third hand, I think the heavily-contended LWLocks are only taken in exclusive mode anyway, so this may be mostly a moot point.

Comments?

			regards, tom lane

*** src/backend/storage/lmgr/lwlock.c.orig	Sun Aug 29 01:06:48 2004
--- src/backend/storage/lmgr/lwlock.c	Wed Nov 24 23:13:19 2004
***************
*** 261,267 ****
  	}
  	else
  	{
! 		if (lock->exclusive == 0)
  		{
  			lock->shared++;
  			mustwait = false;
--- 261,273 ----
  	}
  	else
  	{
! 		/*
! 		 * If there is anyone waiting for the lock, we don't take it.
! 		 * This could only happen if an exclusive locker is blocked
! 		 * by existing shared lockers; we want to queue up behind him
! 		 * rather than risk starving him indefinitely.
! 		 */
! 		if (lock->exclusive == 0 && lock->head == NULL)
  		{
  			lock->shared++;
  			mustwait = false;
***************
*** 376,382 ****
  	}
  	else
  	{
! 		if (lock->exclusive == 0)
  		{
  			lock->shared++;
  			mustwait = false;
--- 382,394 ----
  	}
  	else
  	{
! 		/*
! 		 * If there is anyone waiting for the lock, we don't take it.
! 		 * This could only happen if an exclusive locker is blocked
! 		 * by existing shared lockers; we want to queue up behind him
! 		 * rather than risk starving him indefinitely.
! 		 */
! 		if (lock->exclusive == 0 && lock->head == NULL)
  		{
  			lock->shared++;
  			mustwait = false;
Re: [HACKERS] lwlocks and starvation
Neil Conway <[EMAIL PROTECTED]> writes:
> (Speaking of which, the "exclusive" field is declared as a "char"; I
> wonder if it wouldn't be more clear to declare it as "bool", and treat
> it as a boolean field.

I coded it that way because I was thinking of it as a count (0 or 1), for symmetry with the count of shared holders. You could argue it either way, I suppose.

			regards, tom lane
Re: [HACKERS] lwlocks and starvation
Neil Conway <[EMAIL PROTECTED]> writes:
> AFAICS, that is not the case. See lwlock.c, circa line 264: in LW_SHARED
> mode, we check if "exclusive" is zero; if so, we acquire the lock
> (increment the shared lock count and do not block). And "exclusive" is
> set non-zero only when we _acquire_ a lock in exclusive mode, not when
> we add an exclusive waiter to the wait queue.

Ooh, you are right.  I think that may qualify as a bug ...

			regards, tom lane
Re: [HACKERS] lwlocks and starvation
Neil Conway <[EMAIL PROTECTED]> writes:
> LWLockRelease() currently does something like (simplifying a lot):
> ...
> This has the nice property that locks are granted in FIFO order. Is it
> essential that we maintain that property?

Not really, although avoiding indefinite starvation is important.

> If not, we could instead walk
> through the wait queue and awaken *all* the shared waiters, and get a
> small improvement in throughput.

I think this would increase the odds of starvation for exclusive waiters significantly. It would also complicate the list manipulation in LWLockRelease, and the net loss of cycles to that might outweigh any possible speedup anyway.

> I can see that this might starve exclusive waiters; however, we allow
> the following:
>
>	Proc A => LWLockAcquire(lock, LW_SHARED);    -- succeeds
>	Proc B => LWLockAcquire(lock, LW_EXCLUSIVE); -- blocks
>	Proc C => LWLockAcquire(lock, LW_SHARED);    -- succeeds
>
> i.e. we don't *really* follow strict FIFO order anyway.

Uh, no; proc C will queue up behind proc B. See LWLockAcquire. Were this not so, again we'd be risking starving proc B, since there could easily be an indefinitely long series of people acquiring and releasing the lock in shared mode.

			regards, tom lane
Re: [HACKERS] lwlocks and starvation
Bruce Momjian wrote:
> I thought the new readers will sit after the writer in the FIFO queue
> so the writer will not starve.

AFAICS, that is not the case. See lwlock.c, circa line 264: in LW_SHARED mode, we check if "exclusive" is zero; if so, we acquire the lock (increment the shared lock count and do not block). And "exclusive" is set non-zero only when we _acquire_ a lock in exclusive mode, not when we add an exclusive waiter to the wait queue.

(Speaking of which, the "exclusive" field is declared as a "char"; I wonder if it wouldn't be clearer to declare it as "bool" and treat it as a boolean field. The storage/alignment requirements should be the same (bool is a typedef for char, at least with a C compiler), but IMHO it would be more logical.)

-Neil
Re: [HACKERS] lwlocks and starvation
Neil Conway wrote:
> Bruce Momjian wrote:
> > My guess is the existing behavior was designed to allow waking of
> > multiple waiters _sometimes_ without starving exclusive waiters.
>
> Well, I think the current algorithm *does* allow starvation, at least in
> some situations. Consider a workload in which a new shared reader
> arrives every 50 ms, and holds the lock for, say, 500 ms. If an
> exclusive waiter arrives, it will starve under the current algorithm.

I thought the new readers will sit after the writer in the FIFO queue so the writer will not starve.
Re: [HACKERS] lwlocks and starvation
Bruce Momjian wrote:
> My guess is the existing behavior was designed to allow waking of
> multiple waiters _sometimes_ without starving exclusive waiters.

Well, I think the current algorithm *does* allow starvation, at least in some situations. Consider a workload in which a new shared reader arrives every 50 ms and holds the lock for, say, 500 ms. If an exclusive waiter arrives, it will starve under the current algorithm.

> There should be a comment in the code explaining this usage and I bet
> it was intentional.

Oh, I bet it was intentional as well :) I'm mostly curious to see exactly what the reasoning was, and whether it is necessary that we preserve the FIFO behavior while considering optimizations.

-Neil
Re: [HACKERS] lwlocks and starvation
Neil Conway wrote:
> LWLockRelease() currently does something like (simplifying a lot):
>
>	acquire lwlock spinlock
>	decrement lock count
>	if lock is free
>		if first waiter in queue is waiting for exclusive lock,
>			awaken him; else, walk through the queue and awaken
>			all the shared waiters until we reach an exclusive waiter
>	end if
>	release lwlock spinlock
>
> This has the nice property that locks are granted in FIFO order. Is it
> essential that we maintain that property? If not, we could instead walk
> through the wait queue and awaken *all* the shared waiters, and get a
> small improvement in throughput.
>
> I can see that this might starve exclusive waiters; however, we allow
> the following:
>
>	Proc A => LWLockAcquire(lock, LW_SHARED);    -- succeeds
>	Proc B => LWLockAcquire(lock, LW_EXCLUSIVE); -- blocks
>	Proc C => LWLockAcquire(lock, LW_SHARED);    -- succeeds
>
> i.e. we don't *really* follow strict FIFO order anyway.

My guess is the existing behavior was designed to allow waking of multiple waiters _sometimes_ without starving exclusive waiters. There should be a comment in the code explaining this usage, and I bet it was intentional.
[HACKERS] lwlocks and starvation
LWLockRelease() currently does something like (simplifying a lot):

	acquire lwlock spinlock
	decrement lock count
	if lock is free
		if first waiter in queue is waiting for exclusive lock,
			awaken him; else, walk through the queue and awaken
			all the shared waiters until we reach an exclusive waiter
	end if
	release lwlock spinlock

This has the nice property that locks are granted in FIFO order. Is it essential that we maintain that property? If not, we could instead walk through the wait queue and awaken *all* the shared waiters, and get a small improvement in throughput.

I can see that this might starve exclusive waiters; however, we allow the following:

	Proc A => LWLockAcquire(lock, LW_SHARED);    -- succeeds
	Proc B => LWLockAcquire(lock, LW_EXCLUSIVE); -- blocks
	Proc C => LWLockAcquire(lock, LW_SHARED);    -- succeeds

i.e. we don't *really* follow strict FIFO order anyway.

-Neil