I spent a little time reviewing the xlog.c logic, which I hadn't looked
at in a while.  I see I made a mistake earlier: I claimed that only when
a backend wanted to commit or ran out of space in the WAL buffers would
it issue any write().  This is not true: there is code in XLogInsert()
that will try to issue write() if the WAL buffers are more than half
full:

    /*
     * If cache is half filled then try to acquire write lock and do
     * XLogWrite. Ignore any fractional blocks in performing this check.
     */
    LogwrtRqst.Write.xrecoff -= LogwrtRqst.Write.xrecoff % BLCKSZ;
    if (LogwrtRqst.Write.xlogid != LogwrtResult.Write.xlogid ||
        (LogwrtRqst.Write.xrecoff >= LogwrtResult.Write.xrecoff +
         XLogCtl->XLogCacheByte / 2))
    {
        if (LWLockConditionalAcquire(WALWriteLock, LW_EXCLUSIVE))
        {
            LogwrtResult = XLogCtl->Write.LogwrtResult;
            if (XLByteLT(LogwrtResult.Write, LogwrtRqst.Write))
                XLogWrite(LogwrtRqst);
            LWLockRelease(WALWriteLock);
        }
    }

Because of the "conditional acquire" call, this will not block if
someone else is currently doing a WAL write or fsync, but will just
fall through in that case.  However, if the code does acquire the
lock then the backend will issue some writes --- synchronously, if
O_SYNC or O_DSYNC mode is being used.  It would be better to remove
this code and allow a background process to issue writes for filled
WAL pages.
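
To make the fall-through behavior concrete, here is a minimal standalone
sketch of the same pattern.  It is not the xlog.c code: WalSketch,
wal_sketch_init() and maybe_write() are names invented for illustration,
a pthread mutex with pthread_mutex_trylock() stands in for
LWLockConditionalAcquire(), plain byte counters stand in for the
xlogid/xrecoff pairs (so the xlogid comparison is omitted), and the
"write" just advances a counter.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define BLCKSZ 8192             /* assumed block size for this sketch */

/* Hypothetical stand-in for the shared WAL state; not the real XLogCtl. */
typedef struct
{
    pthread_mutex_t write_lock;     /* plays the role of WALWriteLock */
    size_t          cache_bytes;    /* total size of the WAL buffers */
    size_t          written_upto;   /* bytes already written out */
} WalSketch;

static void
wal_sketch_init(WalSketch *w, size_t cache_bytes)
{
    pthread_mutex_init(&w->write_lock, NULL);
    w->cache_bytes = cache_bytes;
    w->written_upto = 0;
}

/*
 * Try to write out filled buffers if the cache is at least half full.
 * Returns true only if this caller performed a write.  Never blocks:
 * if another thread holds the write lock, we simply fall through.
 */
static bool
maybe_write(WalSketch *w, size_t inserted_upto)
{
    /* Ignore any fractional block, as the original check does. */
    size_t  request = inserted_upto - inserted_upto % BLCKSZ;
    bool    wrote = false;

    if (request < w->written_upto + w->cache_bytes / 2)
        return false;           /* less than half full: nothing to do */

    if (pthread_mutex_trylock(&w->write_lock) != 0)
        return false;           /* someone else is writing: fall through */

    /* Recheck under the lock; another writer may have caught up. */
    if (w->written_upto < request)
    {
        w->written_upto = request;  /* stands in for XLogWrite() */
        wrote = true;
    }
    pthread_mutex_unlock(&w->write_lock);
    return wrote;
}
```

The point of the sketch is the second early return: a caller that loses
the race for the lock pays nothing, but a caller that wins it does the
write itself before it can go on to insert its own record.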

Note that this check is done before acquiring WALInsertLock, so it does
not block other would-be inserters of WAL records.

                        regards, tom lane
