Re: LogwrtResult contended spinlock

Andres Freund Fri, 29 Jan 2021 18:30:38 -0800

Hi,

On 2021-01-29 22:47:56 -0300, Alvaro Herrera wrote:
> So I tried this, but -- perhaps not suprisingly -- I can't get it to
> work properly; the synchronization fails.


What do you mean by "synchronization fails"?


> Strangely, all tests work for me, but the pg_upgrade one in particular
> fails.

It's one of the few tests using fsync=on.


> +static inline void
> +pg_atomic_monotonic_advance_u64(volatile pg_atomic_uint64 *ptr, uint64 
> target_)
> +{
> +#ifndef PG_HAVE_ATOMIC_U64_SIMULATION
> +     AssertPointerAlignment(ptr, 8);
> +#endif
> +     /* FIXME is this algorithm correct if we have u64 simulation? */

I don't see a problem.


> +     while (true)
> +     {
> +             uint64          currval;
> +
> +             currval = pg_atomic_read_u64(ptr);
> +             if (currval > target_)
> +                     break;  /* already done by somebody else */
> +             if (pg_atomic_compare_exchange_u64(ptr, &currval, target_))
> +                     break;  /* successfully done */
> +     }
> +}

I suggest writing this as

    currval = pg_atomic_read_u64(ptr);
    while (currval < target_)
    {
        if (pg_atomic_compare_exchange_u64(ptr, &currval, target_))
            break;
    }

>  /*
>   * Inserting to WAL is protected by a small fixed number of WAL insertion
> @@ -596,8 +599,10 @@ typedef struct XLogCtlData
>  {
>       XLogCtlInsert Insert;
>  
> +     XLogwrtAtomic LogwrtRqst;
> +     XLogwrtAtomic LogwrtResult;
> +
>       /* Protected by info_lck: */
> -     XLogwrtRqst LogwrtRqst;

Not sure putting these into the same cacheline is a good idea.


> @@ -2166,12 +2163,11 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, bool 
> opportunistic)
>                       if (opportunistic)
>                               break;
>  
> -                     /* Before waiting, get info_lck and update LogwrtResult 
> */
> -                     SpinLockAcquire(&XLogCtl->info_lck);
> -                     if (XLogCtl->LogwrtRqst.Write < OldPageRqstPtr)
> -                             XLogCtl->LogwrtRqst.Write = OldPageRqstPtr;
> -                     LogwrtResult = XLogCtl->LogwrtResult;
> -                     SpinLockRelease(&XLogCtl->info_lck);
> +                     /* Before waiting, update LogwrtResult */
> +                     
> pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtRqst.Write, OldPageRqstPtr);
> +
> +                     LogwrtResult.Write = 
> pg_atomic_read_u64(&XLogCtl->LogwrtResult.Write);
> +                     LogwrtResult.Flush = 
> pg_atomic_read_u64(&XLogCtl->LogwrtResult.Flush);

I don't think it's quite as easy as this. Write/Flush now aren't
guaranteed to be coherent with each other - previously they were. And
because it's in a global variable used everywhere, we can't just be more
careful about update protocols in one place...

We also shouldn't re-read a variable that we just did via the
pg_atomic_monotonic_advance_u64().

I think we should stop updating both the Write/Flush position at the
same time. That way we don't have an expectation of them being coherent
with each other. Most places really don't need both anyway.


>       {
> -             SpinLockAcquire(&XLogCtl->info_lck);
> -             XLogCtl->LogwrtResult = LogwrtResult;
> -             if (XLogCtl->LogwrtRqst.Write < LogwrtResult.Write)
> -                     XLogCtl->LogwrtRqst.Write = LogwrtResult.Write;
> -             if (XLogCtl->LogwrtRqst.Flush < LogwrtResult.Flush)
> -                     XLogCtl->LogwrtRqst.Flush = LogwrtResult.Flush;
> -             SpinLockRelease(&XLogCtl->info_lck);
> +             pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtResult.Write, 
> LogwrtResult.Write);
> +             pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtResult.Flush, 
> LogwrtResult.Flush);
> +
> +             pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtRqst.Write, 
> LogwrtResult.Write);
> +             pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtRqst.Flush, 
> LogwrtResult.Flush);

Hm. We went from one cheap atomic operation (SpinLockAcquire) to four
expensive ones in the happy path. That's not free...

I don't think we need to manipulate XLogCtl->LogwrtResult.* using atomic
ops - they can only be updated with WALWriteLock held, right?

XLogCtl->LogwrtResult was updated with plain assignment before, why did
you change it to pg_atomic_monotonic_advance_u64()?


> @@ -1169,6 +1169,8 @@ XLogInsertRecord(XLogRecData *rdata,
>       {
>               /* advance global request to include new block(s) */
>               pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtRqst.Write, 
> EndPos);
> +             pg_memory_barrier();
> +

That's not really useful - the path that actually updates already
implies a barrier. It'd probably be better to add a barrier to a "never
executed cmpxchg" fastpath.



> @@ -2905,6 +2909,7 @@ XLogFlush(XLogRecPtr record)
>               WriteRqstPtr = pg_atomic_read_u64(&XLogCtl->LogwrtRqst.Write);
>               LogwrtResult.Write = 
> pg_atomic_read_u64(&XLogCtl->LogwrtResult.Write);
>               LogwrtResult.Flush = 
> pg_atomic_read_u64(&XLogCtl->LogwrtResult.Flush);
> +             pg_read_barrier();

I'm not sure you really can get away with just a read barrier in these
places. We can't e.g. have later updates to other shared memory
variables be "moved" to before the barrier (which a read barrier
allows).

Greetings,

Andres Freund

Re: LogwrtResult contended spinlock

Reply via email to