Robert Haas <robertmh...@gmail.com> writes:
> On Fri, Aug 7, 2020 at 12:09 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
>> ... it's not very clear what to do in
>> backends that haven't got access to that table, but maybe we could just
>> accept that backends that are forced to flush dirty buffers might do some
>> useless writes in such cases.
> I don't see how that can work. It's not that the writes are useless;
> it's that they will fail outright because the file doesn't exist.

At least in the case of segment zero, the file will still exist. It'll
have been truncated to zero length, and if the filesystem is stupid about
holes in files then maybe a write to a high block number would consume
excessive disk space, but does anyone still care about such filesystems?

I don't remember at the moment how we handle higher segments, but likely
we could make them still exist too, postponing all the unlinks till after
checkpoint. Or we could just have the backends give up on recycling a
particular buffer if they can't write it (which is the response to an I/O
failure already, I hope).

> My viewpoint on this is - I have yet to see anybody really get hosed
> because they drop one relation and that causes a full scan of
> shared_buffers. I mean, it's slightly expensive, but computers are
> fast. Whatever. What hoses people is dropping a lot of relations in
> quick succession, either by spamming DROP TABLE commands or by running
> something like DROP SCHEMA, and then suddenly they're scanning
> shared_buffers over and over again, and their standby is doing the
> same thing, and now it hurts.

Yeah, trying to amortize the cost across multiple drops seems like what
we really want here. I'm starting to think about a "relation dropper"
background process, which would be somewhat like the checkpointer but it
wouldn't have any interest in actually doing buffer I/O. We'd send
relation drop commands to it, and it would scan all of shared buffers
and flush related buffers, and then finally do the file truncates or
unlinks. Amortization would happen by considering multiple target
relations during any one scan over shared buffers. I'm not very clear
on how this would relate to the checkpointer's handling of relation
drops, but it could be worked out; if we were lucky maybe the
checkpointer could stop worrying about that.
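To make the amortization concrete, here is a toy sketch of the one-scan-per-batch idea: a single pass over the buffer pool invalidates buffers for every relation in the pending-drop batch at once, instead of one full scan per DROP. All of the names here (Buffer, DropRelationBuffersBatch, the flat-array pool, OID-only buffer tags) are illustrative stand-ins, not PostgreSQL's actual bufmgr structures or API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified buffer tag: a buffer is identified here only
 * by the OID of the relation whose data it caches. */
typedef unsigned int Oid;

typedef struct Buffer
{
    Oid  relid;   /* relation this buffer belongs to; 0 = unused */
    bool valid;   /* does the buffer currently hold cached data? */
} Buffer;

/*
 * One O(nbuffers) scan serves the whole batch of dropped relations.
 * Buffers of dropped relations need no write-out: their data is going
 * away, so they are simply marked invalid.  Returns the number of
 * buffers invalidated.
 */
static int
DropRelationBuffersBatch(Buffer *pool, size_t nbuffers,
                         const Oid *dropped, size_t ndropped)
{
    int hits = 0;

    for (size_t i = 0; i < nbuffers; i++)
    {
        if (!pool[i].valid)
            continue;
        for (size_t j = 0; j < ndropped; j++)
        {
            if (pool[i].relid == dropped[j])
            {
                pool[i].valid = false;  /* discard, no I/O needed */
                hits++;
                break;
            }
        }
    }
    return hits;
}
```

In a real implementation the inner loop would presumably be a hash-table probe on the batched relation IDs, so a DROP SCHEMA touching hundreds of relations still costs only one pass over shared buffers.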
regards, tom lane