Re: [HACKERS] Latch implementation that wakes on postmaster death on both win32 and Unix

Florian Pflug Fri, 08 Jul 2011 09:17:05 -0700

On Jul8, 2011, at 17:56 , Peter Geoghegan wrote:
> On 8 July 2011 15:58, Florian Pflug <[email protected]> wrote:
>> SyncRepWaitForLSN() says
>>  /*
>>   * Wait on latch for up to 60 seconds. This allows us to check for
>>   * postmaster death regularly while waiting. Note that timeout here
>>   * does not necessarily release from loop.
>>   */
>>  WaitLatch(&MyProc->waitLatch, 60000000L);
>> 
>> I guess that 60-second timeout is unnecessary now that we'll wake up
>> on postmaster death anyway.
> 
> We won't wake up on Postmaster death here, because we haven't asked to
> - not yet, anyway. We're just using the new interface here for that
> one function call in v8.


Oh, Right. I still think it'd might be worthwhile to convert it, but
but not in this patch.

>> Also, none of the callers of WaitLatch() seems to actually inspect
>> the WL_POSTMASTER_DEATH bit of the result. We might want to make
>> their !PostmasterIsAlive() check conditional on the WL_POSTMASTER_DEATH
>> bit being set. At least in the case of SyncRepWaitForLSN(), it seems
>> that avoiding the extra read() syscall might be beneficial.
> 
> I don't think so. Postmaster death is an anomaly, so why bother with
> any sort of optimisation for that case? Also, that's exactly the sort
> of thing that we're trying to caution callers against doing with this
> comment:
> 
> "That should be rare in practice, but the caller should not use the
> return value for anything critical, re-checking the situation with
> PostmasterIsAlive() or read() on a socket if necessary."

Uh, I phrased that badly. What I meant was doing

  if ((result & WL_POSTMASTER_DEATH) && (!PostmasterIsAlive())

instead of

  if (!PostmasterIsAlive)

It seems that currently SyncRepWaitForLSN() will execute
PostmasterIsAlive() after every wake up. But actually it only needs
to do that if WaitLatch() sets WL_POSTMASTER_DEATH. Usually we wouldn't
care, but in the case of SyncRepWaitForLSN() I figures that we might.
It's in the code path of COMMIT (in the case of synchronous replication)
after all...

We'd not optimize the case of a dead postmaster, but the case of
an live one. Which I do hope is the common case ;-)

> You might say that the only reason we even bother reporting postmaster
> death in the returned bitfield is because there is an expectation that
> it will report it, given that we use the same masks on wakeEvents to
> inform the function what events we'll actually be waiting on for the
> call.

I kinda guessed that to be the reason after reading the latest patch ;-)

best regards,
Florian Pflug




-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Latch implementation that wakes on postmaster death on both win32 and Unix

Reply via email to