Re: [HACKERS] Latch implementation that wakes on postmaster death on both win32 and Unix

Florian Pflug Fri, 08 Jul 2011 07:58:38 -0700

On Jul8, 2011, at 14:40 , Heikki Linnakangas wrote:
> On 08.07.2011 13:58, Florian Pflug wrote:
>> On Jul8, 2011, at 11:57 , Peter Geoghegan wrote:
>>> On 7 July 2011 19:15, Robert Haas<robertmh...@gmail.com>  wrote:
>>>>> I'm not concerned about the possibility of spurious extra cycles of
>>>>> auxiliary process event loops - should I be?
>>>> 
>>>> A tight loop would be bad, but an occasional spurious wake-up seems 
>>>> harmless.
>>> 
>>> We should also assert !PostmasterIsAlive() from within the latch code
>>> after waking due to apparent Postmaster death. The reason that I don't
>>> want to follow Florian's suggestion to check it in production is that
>>> I don't know what to do if the postmaster turns out to be alive. Why
>>> is it more reasonable to try again than to just return?
>> 
>> I'd say return, but don't indicate postmaster death in the return value
>> if PostmasterIsAlive() returns true. Or don't call PostmasterIsAlive() in
>> WaitLatch(), and return indicating postmaster death whenever select()
>> says so, and put the burden of re-checking on the callers.
> 
> I put the burden on the callers. Removing the return value from WaitLatch() 
> altogether just makes life unnecessarily difficult for callers that could 
> safely use that information, even if you sometimes get spurious wakeups. In 
> particular, the coding in pgarch.c is nicer if you can simply check the 
> return code for WL_TIMEOUT, rather than call time(NULL) to figure out if the 
> timeout was reached.
> 
> Attached is a new version of this patch. PostmasterIsAlive() now uses read() 
> on the pipe instead of kill().


I did notice a few (very minor) loose ends...

SyncRepWaitForLSN() says
  /*
   * Wait on latch for up to 60 seconds. This allows us to check for
   * postmaster death regularly while waiting. Note that timeout here
   * does not necessarily release from loop.
   */
  WaitLatch(&MyProc->waitLatch, 60000000L);

I guess that 60-second timeout is unnecessary now that we'll wake up
on postmaster death anyway.

Also, none of the callers of WaitLatch() seems to actually inspect
the WL_POSTMASTER_DEATH bit of the result. We might want to make
their !PostmasterIsAlive() check conditional on the WL_POSTMASTER_DEATH
bit being set. At least in the case of SyncRepWaitForLSN(), it seems
that avoiding the extra read() syscall might be beneficial.

Maybe these cleanups would better be done in a separate patch, though...

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Latch implementation that wakes on postmaster death on both win32 and Unix

Reply via email to