On Fri, Aug 25, 2023 at 8:47 AM Heikki Linnakangas <hlinn...@iki.fi> wrote: > 1. Create an unlogged table > 2. ALTER TABLE unlogged_tbl SET TABLESPACE ... -- This calls > RelationCopyStorage > 3. a checkpoint happens while the command is running > 4. After the ALTER TABLE has finished, shut down postgres cleanly. > 5. Lose power > > When you recover, the unlogged table is not reset, because it was a > clean postgres shutdown. But the part of the file that was copied after > the checkpoint at step 3 was never fsync'd, so part of the file is lost. > I was able to reproduce with a VM that I kill to simulate step 5. > > This is the same scenario that the smgrimmedsync() call above protects > from for WAL-logged relations. But we got the condition wrong: instead > of worrying about the init fork, we need to call smgrimmedsync() for all > *other* forks of an unlogged relation. > > Fortunately the fix is trivial, see attached.
The general rule throughout the system is that the init-fork of an unlogged relation is treated the same as a permanent relation: it is WAL-logged and fsyncd. But the other forks of an unlogged relation are neither WAL-logged nor fsync'd ... except in the case of a clean shutdown, when we fsync even that data. In other words, somehow it feels like we ought to be trying to defer the fsync here until a clean shutdown actually occurs, instead of performing it immediately. Admittedly, the bookkeeping seems like a problem, so maybe this is the best we can do, but it's clearly worse than what we do in other cases. -- Robert Haas EDB: http://www.enterprisedb.com