Dear Alexander,

> I agree with your analysis and would like to propose a PoC fix (see
> attached). With this patch applied, 20 iterations succeeded for me.

Since there are no other reviewers, I will review it again. Let's turn the PoC
into a concrete patch. Note that I focused only on the fix for the random
failure; other parts are out of scope. Basically, the code comments should be
updated to match the changes.

01.

```
    /*
     * This function might be called for a regular file or for a junction
     * point (which we use to emulate symlinks).  The latter must be unlinked
     * with rmdir() on Windows.  Before we worry about any of that, let's see
     * if we can unlink directly, since that's expected to be the most common
     * case.
     */
    snprintf(tmppath, sizeof(tmppath), "%s.tmp", path);
    if (pgrename(path, tmppath) == 0)
    {
        if (unlink(tmppath) == 0)
            return 0;
        curpath = tmppath;
    }
```

Please update the comment above this change, because the change is not
trivial. Below is my draft:

```
     * XXX: we rename the target file to ".tmp" before calling unlink. The
     * removal may fail with STATUS_DELETE_PENDING status on Windows, so
     * creating the same file would fail. This assumes that renaming is a
     * synchronous operation.
```

02.

```
    loops = 0;
    while (lstat(curpath, &st) < 0 && lstat_error_was_status_delete_pending())
    {
        if (++loops > 100)      /* time out after 10 sec */
            return -1;
        pg_usleep(100000);      /* us */
    }
```

A comment can be added above this part. Below is my draft:

```
    /*
     * Wait until the removal is really finished, to avoid ERRORs when
     * creating the same file in other functions.
     */
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED