Re: Unlogged relations and WAL-logging

Peter Eisentraut Fri, 01 Sep 2023 05:50:21 -0700

Is the patch0003-Remove-unnecessary-smgrimmedsync-when-creating-unlog.patch stillrelevant, or can this commitfest entry be closed?


On 23.08.23 16:40, Heikki Linnakangas wrote:

5. In heapam_relation_set_new_filenode(), we do this:
        /*
* If required, set up an init fork for an unlogged tableso that it can * be correctly reinitialized on restart. An immediatesync is required * even if the page has been logged, because the write didnot go through * shared_buffers and therefore a concurrent checkpoint mayhave moved the * redo pointer past our xlog record. Recovery may as wellremove it * while replaying, for example, XLOG_DBASE_CREATE orXLOG_TBLSPC_CREATE * record. Therefore, logging is necessary even ifwal_level=minimal.
         */
        if (persistence == RELPERSISTENCE_UNLOGGED)
        {
                Assert(rel->rd_rel->relkind == RELKIND_RELATION ||
                           rel->rd_rel->relkind == RELKIND_MATVIEW ||
rel->rd_rel->relkind ==RELKIND_TOASTVALUE);
                smgrcreate(srel, INIT_FORKNUM, false);
                log_smgrcreate(newrnode, INIT_FORKNUM);
                smgrimmedsync(srel, INIT_FORKNUM);
        }
The comment doesn't make much sense, we haven't written nor WAL-logged
any page here, with nor without the buffer cache. It made more sense
before commit fa0f466d53.


Well, it seems to me (and perhaps I am just confused) that complaining
that there's no page written here might be a technicality. The point
is that there's no synchronization between the work we're doing here
-- which is creating a fork, not writing a page -- and any concurrent
checkpoint. So we both need to log it, and also sync it immediately.


I see. I pushed the fix from the other thread that makes smgrcreate()
call register_dirty_segment (commit 4b4798e13). I believe that makes
this smgrimmedsync() unnecessary. If a concurrent checkpoint happens
with a redo pointer greater than this WAL record, it must've received
the fsync request created by smgrcreate(). That depends on the fact that
we write the WAL record *after* smgrcreate(). Subtle..

Hmm, we have a similar smgrimmedsync() call after index build, because
we have written pages directly with smgrextend(skipFsync=true). If no
checkpoints have occurred during the index build, we could call
register_dirty_segment() instead of smgrimmedsync(). That would avoid
the fsync() latency when creating an index on an empty or small index.

This is all very subtle to get right though. That's why I'd like to
invent a new bulk-creation facility that would handle this stuff, and
make the callers less error-prone.

Having a more generic and less error-prone bulk-creation mechanism isstill on my long TODO list..

Re: Unlogged relations and WAL-logging

Reply via email to