On 21.11.2010 15:18, Robert Haas wrote:
> On Sat, Nov 20, 2010 at 4:07 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmh...@gmail.com> writes:
>>> So what DO we need to guard against here?
>>
>> I think the general problem can be stated as "process A changes two or
>> more values in shared memory in a fairly short span of time, and process
>> B, which is concurrently examining the same variables, sees those
>> changes occur in a different order than A thought it made them in".
>>
>> In practice we do not need to worry about changes made with a kernel
>> call in between, as any sort of context swap will cause the kernel to
>> force cache synchronization.
>>
>> Also, the intention is that the locking primitives will take care of
>> this for any shared structures that are protected by a lock.  (There
>> were some comments upthread suggesting maybe our lock code is not
>> bulletproof; but if so that's something to fix in the lock code, not
>> a logic error in code using the locks.)
>>
>> So what this boils down to is being an issue for shared data structures
>> that we access without using locks.  As, for example, the latch
>> structures.
>
> So is the problem case a race involving owning/disowning a latch vs.
> setting that same latch?

No (or maybe that as well, but that's not what we've been concerned about here). As far as I understand it, the problem is that process A does something like this:

/* set a shared variable */
*((volatile bool *) &shmem->variable) = true;
/* Wake up process B to notice that we changed the variable */
SetLatch();

And process B does this:

for (;;)
{
  ResetLatch();
  if (*((volatile bool *) &shmem->variable))
    DoStuff();

  WaitLatch();
}

This is the documented usage pattern of latches. The problem arises if process A runs just before ResetLatch, but the effect of setting the shared variable doesn't become visible until after the if-test in process B. Process B will clear the is_set flag in ResetLatch(), but it will not DoStuff(), so it in effect misses the wakeup from process A and goes back to sleep even though it would have work to do.
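To make the interleaving concrete, here is a small single-threaded toy model of it in C. It is only a sketch under stated assumptions, not PostgreSQL code: ToyShmem, pending_variable and simulate_latch_race are hypothetical names, the "pending" field stands in for a store by process A that process B cannot see yet, and the with_barrier argument models A forcing that store to become visible before it sets the latch.

```c
#include <stdbool.h>

/* Hypothetical toy model of the shared state as process B sees it. */
typedef struct
{
    bool variable;          /* shared flag, as visible to process B */
    bool is_set;            /* latch flag, as visible to process B */
    bool pending_variable;  /* A's store to variable, not yet visible to B */
} ToyShmem;

/*
 * Replays the interleaving described above and returns true if process B
 * ends up going to sleep although there is work it has not done.
 * with_barrier models process A making its store to variable visible
 * before it sets the latch (an assumption of this sketch).
 */
static bool
simulate_latch_race(bool with_barrier)
{
    ToyShmem m = {false, false, false};
    bool did_stuff = false;

    /* Process A: set the shared variable; the store is delayed. */
    m.pending_variable = true;
    if (with_barrier)
    {
        /* The barrier drains the delayed store before the latch is set. */
        m.variable = true;
        m.pending_variable = false;
    }
    /* Process A: SetLatch(), visible to B right away. */
    m.is_set = true;

    /* Process B: ResetLatch(), then the if-test. */
    m.is_set = false;
    if (m.variable)
        did_stuff = true;   /* DoStuff() */

    /* Only now does A's delayed store become visible to B. */
    if (m.pending_variable)
    {
        m.variable = true;
        m.pending_variable = false;
    }

    /* Process B: WaitLatch() sleeps because the latch is no longer set. */
    bool b_sleeps = !m.is_set;

    return b_sleeps && m.variable && !did_stuff;
}
```

Calling simulate_latch_race(false) replays the missed wakeup: B has cleared the latch, skipped DoStuff(), and sleeps with the variable set. With simulate_latch_race(true) the store is visible before the latch is set, so B sees it in the if-test.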

This situation doesn't arise in the current use of latches, because the shared state comparable to shmem->variable in the above example is protected by a spinlock. But it might become an issue in some future use case.
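As a rough illustration of what "protected by a spinlock" means here, the following sketch wraps the shared flag in lock-protected accessors using C11 atomics. It is not the PostgreSQL spinlock implementation: ToyShared, toy_lock, toy_unlock, toy_set_variable and toy_get_variable are hypothetical names, and the sketch makes no claim about which memory orderings are sufficient on every architecture.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state: a flag guarded by a toy spinlock. */
typedef struct
{
    atomic_flag lock;   /* toy spinlock, clear when unlocked */
    bool        variable;
} ToyShared;

/* Spin until the flag was previously clear, i.e. we took the lock. */
static void
toy_lock(ToyShared *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;   /* busy-wait */
}

/* Release the lock; earlier writes are published to the next locker. */
static void
toy_unlock(ToyShared *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Process A's side: change the shared variable under the lock. */
static void
toy_set_variable(ToyShared *s, bool value)
{
    toy_lock(s);
    s->variable = value;
    toy_unlock(s);
}

/* Process B's side: read the shared variable under the lock. */
static bool
toy_get_variable(ToyShared *s)
{
    toy_lock(s);
    bool value = s->variable;
    toy_unlock(s);
    return value;
}
```

In this shape, process A would call toy_set_variable() before SetLatch(), and process B would call toy_get_variable() in its if-test after ResetLatch(), so the reads and writes of the flag go through the lock rather than through bare volatile accesses.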

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
