On 12/02/2021 4:33 am, Thorsten Schöning wrote:
The behaviour you describe happens exactly when two processes e.g.
concurrently hold HANDLEs on the same file and one of those deletes
the file then. Windows keeps file names until all open HANDLEs are
closed and depending on how those HANDLEs have been opened by the
first app, concurrent deletion is perfectly fine for Windows.
Though, such a deleted file can't be opened easily anymore and looks
like it has only lost its permissions. But that's not the case, it's
deleted already. It might be that Postgres somehow does this to itself
when some other app has an open HANDLE. I don't think that
some other app is deleting that file on purpose; reading it
for some reason seems more likely to me.
Using Process Monitor, Thorsten's explanation above appears to correctly
diagnose what is happening. ProcMon data shows postgres.exe performing
"CreateFile" operations on the affected WAL files, with the result
status "DELETE PENDING", which according to
https://stackoverflow.com/a/29892104 means:
"Windows allows a process to delete a file, even though it is still
opened by another process (e.g. Windows indexing service or
Antivirus). It gets internally marked as "delete pending". The file
does not actually get removed from the file system, it is still
there after the File.Delete call. Anybody that tries to open the
file after that gets an access denied error. The file doesn't
actually get removed until the last handle to the file object gets
closed"
which is the same behaviour Thorsten describes above (great info, thanks
Thorsten).
The mystery now is that the only process logged as touching the affected
WAL files is postgres.exe (of which there are many separate processes).
Could it be that one of the postgres.exe instances is holding the
affected WAL files open after another postgres.exe instance has
flagged them as deleted? Or, to put it the other way, that one
postgres.exe instance is flagging the file as deleted while another
instance still has an open handle to it? If some other process, such as
the indexer (disabled) or AV (excluded from pgdata), is obtaining a
handle on the WAL files, it isn't being logged by ProcMon.
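In case it helps anyone checking their own trace, below is a rough sketch of how a ProcMon CSV export could be filtered to list every event on a path that returned "DELETE PENDING" at least once, which should show who opened the file and who flagged it for deletion. It assumes the export uses the column names "Process Name", "PID", "Operation", "Path" and "Result"; adjust them if your export differs. The sample rows, PIDs, operation names and path are made up for illustration and are not taken from my trace.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows, NOT real trace data. Column names are assumed
# to match the ProcMon CSV export; check the header of your own file.
SAMPLE_CSV = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"04:33:01.0000001","postgres.exe","1111","CreateFile","C:\pgdata\pg_wal\EXAMPLE_SEGMENT_A","SUCCESS",""
"04:33:01.0000002","postgres.exe","2222","SetDispositionInformationFile","C:\pgdata\pg_wal\EXAMPLE_SEGMENT_A","SUCCESS","Delete: True"
"04:33:01.0000003","postgres.exe","3333","CreateFile","C:\pgdata\pg_wal\EXAMPLE_SEGMENT_A","DELETE PENDING",""
"04:33:01.0000004","postgres.exe","1111","CloseFile","C:\pgdata\pg_wal\EXAMPLE_SEGMENT_A","SUCCESS",""
"04:33:01.0000005","postgres.exe","3333","CreateFile","C:\pgdata\pg_wal\EXAMPLE_SEGMENT_B","SUCCESS",""
'''


def events_for_delete_pending_paths(csv_text):
    """Return {path: [(process, pid, operation, result), ...]} in trace order,
    restricted to paths that returned DELETE PENDING at least once."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # First pass: find the paths that ever returned DELETE PENDING.
    pending_paths = {r["Path"] for r in rows if r["Result"] == "DELETE PENDING"}

    # Second pass: collect every event on those paths, whatever the result,
    # so the handle that was still open and the delete request both show up.
    events = defaultdict(list)
    for r in rows:
        if r["Path"] in pending_paths:
            events[r["Path"]].append(
                (r["Process Name"], r["PID"], r["Operation"], r["Result"])
            )
    return dict(events)


if __name__ == "__main__":
    for path, evs in events_for_delete_pending_paths(SAMPLE_CSV).items():
        print(path)
        for process, pid, operation, result in evs:
            print(f"  {process} (PID {pid}): {operation} -> {result}")
```

For a real export, read the file with open(path, newline="", encoding="utf-8-sig").read() and pass the text in; the encoding is also an assumption, chosen to tolerate a byte-order mark. Grouping by PID rather than only by process name matters here, since every backend shows up as postgres.exe.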
Kind regards,
Guy