On Thu, Aug 20, 2020 at 5:46 PM Tom Lane <t...@sss.pgh.pa.us> wrote:

> Jason Myers <j.my...@brstrat.com> writes:
> > However we were still seeing orphaned files on crash, and I believe I
> > tracked it down to subsequent CREATE INDEX statements also creating these
> > orphaned files (if they are running during a crash).
> > Is that issue known as well? I don't believe I can use the same trick to
> > sidestep that one...
>
> Yeah, it's entirely intentional that we don't try to clean up orphaned
> disk files after a database crash. There's a long discussion of this and
> related topics in src/backend/access/transam/README. What that says about
> why not is that such files' contents might be useful for forensic analysis
> of the crash, and anyway "Orphan files are harmless --- at worst they
> waste a bit of disk space". A point not made in that text, but true
> anyway, is that it'd also be quite expensive to search a large database
> for orphaned files, so people would likely not want to pay that price
> on the way to getting their database back up.
>
> There might be value in a user-invokable tool that runs in an existing
> non-crashed database and looks for orphan files, but I'm not aware that
> anyone has written one. (Race conditions against concurrent table
> creation would be a problem; but probably that can be finessed somehow,
> maybe by noting the file's creation time.)
>
> In the meantime I've got to say that routinely kill 9'ing database
> processes just doesn't seem like a very good idea. Yeah, we do our best
> to ensure that there won't be data loss, but you're really doubling down
> on a hard assumption that Postgres contains zero bugs when you operate
> that way. I'd suggest reconfiguring things to avoid the OOM kill hazard;
> or if your cloud provider makes that effectively impossible, maybe you
> need another provider. But on most systems I'd think you could use ulimit
> or the like even if you don't have root privileges.
>
> regards, tom lane
>
Understood, thanks for the reply.

-Jason
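
For reference, the core comparison that such an orphan-file checker would need can be sketched roughly as below. This is only an illustration under several assumptions, not an existing tool: the function name find_orphan_candidates is made up, the file names are assumed to come from listing one database's directory under base/, and the known filenodes are assumed to come from something like SELECT pg_relation_filenode(oid) FROM pg_class run in that same database (with NULLs filtered out). It only reports candidates and never deletes anything. It does not address the race against concurrent table creation mentioned above, since files belonging to an uncommitted CREATE TABLE or CREATE INDEX would also show up as candidates, and it ignores tablespaces.

```python
import re

# Files PostgreSQL keeps in a database directory that are not relation files.
NON_RELATION_FILES = {"PG_VERSION", "pg_filenode.map", "pg_internal.init"}

# Relation file names look like <filenode>, optionally followed by a fork
# suffix (_fsm, _vm, _init) and/or a segment number (.1, .2, ...).
RELATION_FILE = re.compile(r"(\d+)(?:_(?:fsm|vm|init))?(?:\.\d+)?")


def find_orphan_candidates(file_names, known_filenodes):
    """Return relation-like file names whose filenode is not in known_filenodes.

    Hypothetical helper: both inputs are plain Python collections, so the
    caller decides how to list the directory and how to query the catalog.
    """
    known = {int(n) for n in known_filenodes}
    candidates = []
    for name in sorted(file_names):
        if name in NON_RELATION_FILES:
            continue
        match = RELATION_FILE.fullmatch(name)
        if match is None:
            # Temporary relations (t<backend>_<filenode>) and anything else
            # unrecognized are deliberately left alone.
            continue
        if int(match.group(1)) not in known:
            candidates.append(name)
    return candidates


# Example with made-up file names and a made-up catalog result.
files = ["PG_VERSION", "16384", "16384_fsm", "16390", "16390.1",
         "t3_16400", "pg_filenode.map"]
print(find_orphan_candidates(files, {16384}))  # ['16390', '16390.1']
```

Anything this reports would still need a manual check (for example against the file's modification time and any in-flight DDL) before being treated as a real orphan.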