________________________________________
From: pgsql-performance-ow...@postgresql.org 
[pgsql-performance-ow...@postgresql.org] On Behalf Of Simon Riggs 
[si...@2ndquadrant.com]
Sent: Wednesday, March 18, 2009 12:53 AM
To: Matthew Wakeling
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Proposal of tunable fix for scalability of 8.4

> On Mon, 2009-03-16 at 16:26 +0000, Matthew Wakeling wrote:
> > One possibility would be for the locks to alternate between exclusive
> > and
> > shared - that is:
> >
> > 1. Take a snapshot of all shared waits, and grant them all -
> > thundering
> >      herd style.
> > 2. Wait until ALL of them have finished, granting no more.
> > 3. Take a snapshot of all exclusive waits, and grant them all, one by
> > one.
> > 4. Wait until all of them have been finished, granting no more.
> > 5. Back to (1)
>
> I agree with that, apart from the "granting no more" bit.
>
> Currently we queue up exclusive locks, but there is no need to since for
> ProcArrayLock commits are all changing different data.
>
> The most useful behaviour is just to have two modes:
> * exclusive-lock held - all other x locks welcome, s locks queue
> * shared-lock held - all other s locks welcome, x locks queue
>
> This *only* works for ProcArrayLock.
>
> --
>  Simon Riggs           www.2ndQuadrant.com
>  PostgreSQL Training, Services and Support
> 

I want to comment on an important distinction between these two variants.  The 
"granting no more" bit WILL decrease performance under high contention.  Here 
is my reasoning.

We have two "two lists" proposals.  

Type A: allow line cutting (Simon, above):
* exclusive-lock held and all exclusives processing - all other NEW x locks 
welcome, s locks queue
* shared-lock held and all shareds processing - all other NEW s locks welcome, 
x locks queue

Type B: forbid line cutting (Matthew, above, modified to allow multiple 
concurrent exclusives for ProcArrayLock; for other lock types exclusives 
would still go one at a time):
* exclusive-lock held and all exclusives processing - all NEW lock requests queue
* shared-lock held and all shareds processing - all NEW lock requests queue

A big benefit of the "wake all" proposal is that many lock acquisitions never 
have to context switch out and back in.  On a quick assessment, type A above 
would block and context switch even less than wake-all (since exclusives no 
longer go one at a time) but would otherwise be similar.  This won't matter 
much if the workload is shared-lock dominated, though.

I would LOVE to have seen context switch rates alongside the results so far, 
but many basic unix tools don't show them by default (sar -w reports them, as 
does rstat).  The average number of context switches per transaction is an 
excellent measure of lock contention and lock efficiency.
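
As a rough, hypothetical illustration (not something measured in this thread), 
context switches per transaction can also be derived in C from getrusage(2); 
run_transactions() below is an invented stand-in for the real benchmark loop:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/resource.h>

    /* Invented stand-in for the benchmark's transaction loop; a real
       measurement would wrap the pgbench client or the backend itself. */
    static void
    run_transactions(long n)
    {
        for (long i = 0; i < n; i++)
            usleep(10);         /* placeholder work that yields the CPU */
    }

    int
    main(void)
    {
        struct rusage before, after;
        long ntx = 1000;

        getrusage(RUSAGE_SELF, &before);
        run_transactions(ntx);
        getrusage(RUSAGE_SELF, &after);

        /* ru_nvcsw: voluntary switches (blocked, e.g. waiting on a lock);
           ru_nivcsw: involuntary switches (preempted by the scheduler). */
        printf("per txn: %.3f voluntary, %.3f involuntary context switches\n",
               (double) (after.ru_nvcsw - before.ru_nvcsw) / ntx,
               (double) (after.ru_nivcsw - before.ru_nivcsw) / ntx);
        return 0;
    }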

In type A above, the ratio of requests that require a context switch is 
Q / (M + Q), where Q is the average queue size when the shared-exclusive swap 
occurs and M is the average number of "line cutters".
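(Illustrative numbers only: if each swap finds Q = 10 waiters queued while 
M = 40 requests cut in line during the hold, just 10 / (40 + 10) = 20% of 
requests pay a context switch.)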

In type B, the ratio of requests that must context switch is always == 1.  
Every request must queue and wait!  This may perform worse than the current 
lock!

One way to guarantee some fairness is to compromise between the two. 

Let's call this proposal C.  Unfortunately, it is less elegant than the other 
two, since it has logic for both.  It could be made tunable to cover the 
complete spectrum between them, though.
* exclusive-lock held and all exclusives processing - first N new X requests 
welcome, N+1 and later X requests and all shared locks queue.
* shared-lock held and all shareds processing - first N new S requests 
welcome, N+1 and later S requests and all X locks queue.

So, if exclusive holders are operating and shared requests are queuing, a 
newly arriving exclusive request can cut in line only if it is among the 
first N to do so; after that it must queue and wait, giving the shared locks 
their turn.
This counting condition can be implemented with an atomically incremented 
integer using compare-and-swap operations and no locks.  Under heavy 
contention it reduces the ratio of context switches per operation to 
Q / (N + Q), where N is the number of 'line cutters' actually admitted and Q 
is the average queue size when the queued items are woken.  Note how this is 
the same as the 'unbounded' equation with M above, except that N can never be 
greater than M (the 'natural' line-cut count).
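
As a minimal sketch of that counting condition, here is a hypothetical C11 
fragment (illustrative only; PostgreSQL itself uses its own platform-specific 
atomics, and the names MAX_CUTTERS, line_cutters, and try_cut_in_line are 
invented).  The counter would be reset each time the lock swaps between 
shared and exclusive phases:

    #include <stdatomic.h>
    #include <stdbool.h>

    #define MAX_CUTTERS 16          /* N: tunable line-cut budget per phase */

    static atomic_int line_cutters; /* cutters admitted in the current phase;
                                       reset to 0 at each shared/exclusive swap */

    /* Returns true if the requester may cut in line (no context switch),
       false if it must queue and wait.  Lock-free: a single CAS loop. */
    static bool
    try_cut_in_line(void)
    {
        int n = atomic_load(&line_cutters);

        while (n < MAX_CUTTERS)
        {
            /* Claim slot n+1; on failure n is reloaded with the current
               value and we retry while budget remains. */
            if (atomic_compare_exchange_weak(&line_cutters, &n, n + 1))
                return true;
        }
        return false;               /* budget exhausted for this phase */
    }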
So for N = Q, half the requests are forced to context switch and half cut in 
line without one.  N can be made tunable, and it can differ for shared and 
exclusive requests to bias toward one or the other if desired.
