On Thu, Oct 22, 2020 at 9:50 PM Kyotaro Horiguchi
<horikyota....@gmail.com> wrote:
> By the way, heap scan finds the size of target relation using
> smgrnblocks(). I'm not sure why we don't miss recently-extended pages
> on a heap-scan? It seems to be possible that concurrent checkpoint
> fsyncs relation files inbetween the extension and scanning and the
> scanning gets smaller size than it really is.

Yeah. That's a narrow window: fsync() returns an error after the file shrinks and we immediately panic. A version with a wider window: the kernel tries to write in the background, gets an I/O error, shrinks the file, but we don't know this and we continue running until the next checkpoint calls fsync(), sees the error and panics. Seq scans between those two events fail to see recently committed data at the end of the table.
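
To make the wider-window case concrete, here is a rough sketch in Python (not PostgreSQL code). It assumes the scan derives its block count from the file size via lseek(SEEK_END), and it uses ftruncate() purely as a stand-in for the file shrinking behind our back. The names nblocks() and simulate_lost_extension() are made up for illustration, and the 8192-byte block size is just the default BLCKSZ.

```python
import os
import tempfile

BLCKSZ = 8192  # default PostgreSQL block size, assumed here


def nblocks(fd):
    """Block count derived from the current file size (hypothetical helper)."""
    return os.lseek(fd, 0, os.SEEK_END) // BLCKSZ


def simulate_lost_extension():
    """Return (blocks seen before, blocks seen after) a simulated shrink."""
    fd, path = tempfile.mkstemp()
    try:
        # The relation is extended to three blocks.
        os.write(fd, b"\0" * (3 * BLCKSZ))
        before = nblocks(fd)

        # Stand-in for the file losing its last block after a failed
        # background write; nothing reports an error to us at this point.
        os.ftruncate(fd, 2 * BLCKSZ)

        # A scan starting now sizes itself from the shrunken file and
        # never visits the third block.
        after = nblocks(fd)
        return before, after
    finally:
        os.close(fd)
        os.unlink(path)
```

In this toy model nothing complains between the shrink and the next fsync(); a size-based scan just stops one block early.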