On Tue, Sep 7, 2021 at 7:00 PM Alexander Lakhin <[email protected]> wrote:
> 07.09.2021 09:05, Michael Paquier wrote:
> > On Tue, Sep 07, 2021 at 09:00:01AM +0300, Alexander Lakhin wrote:
> >> The new approach looks very promising. Knowing that the file is really
> >> in the DELETE_PENDING state simplifies a lot.
> >> I've tested the patch v2_0001_Check... with my demo tests [1] and [2],
> >> and it definitely works.
> > Oho, nice. Just to be sure. You are referring to
> > v2-0001-Check*.patch posted here, right?
> > https://www.postgresql.org/message-id/ca+hukgkj3p+2acibgaccf_cxe0jlcyevwhexvopk6ul1+v-...@mail.gmail.com
> Yes, i've tested that one, on the master branch (my tests needed a minor
> modification due to PostgresNode changes).

Thanks very much!

Time to tidy up some loose ends. There are a couple of judgement calls
involved. Here's what Andres and I came up with in an off-list chat.
Any different suggestions?

1. I abandoned the "direct NtCreateFile()" version for now. I guess
using more and wider unstable interfaces might expose us to greater
risk of silent API/behavior changes or have subtle bugs. If we ever
have a concrete reason to believe that RtlGetLastNtStatus() is not
reliable here, we could reconsider.

2. I dropped the assertion that the signal event has been created
before the first call to the open() wrapper. Instead, I taught
pg_usleep() to fall back to plain old SleepEx() if the signal stuff
isn't up yet. Other solutions are possible of course, but it struck me
as a bad idea to place initialisation ordering constraints on very
basic facilities like open() and stat().

I should point out explicitly that with this patch, stat() benefits
from open()'s tolerance for sharing violations, as a side effect. That
is, it'll retry for a short time in the hope that whoever opened our
file without allowing sharing will soon go away. I don't know how
useful that bandaid loop really is in practice, but I don't see why
we'd want that for open() and not stat(), so this change seems good to
me on consistency grounds at the very least.

3. We fixed the warnings about macro redefinition with #define
UMDF_USING_NTSTATUS and #include <ntstatus.h> in win32_port.h. (Other
ideas considered: (1) Andres reported that it also works to move the
#include to ~12 files that need things from it, ie things that were
suppressed from windows.h by that macro and must now be had from
ntstatus.h, but the files you have to change are probably different in
back branches if we decide to do that, (2) I tried defining that macro
locally in files that need it, *before* including c.h/postgres.h, and
then locally include ntstatus.h afterwards, but that seems to violate
project style and generally seems weird.)

Another thing to point out explicitly is that I added a new file
src/port/win32ntdll.c, which is responsible for fishing out the NT
function pointers. It was useful to be able to do that in the
abandoned NtCreateFile() variant because it needed three of them and I
could reduce boiler-plate noise with a static array of function names
to loop over. In this version the array has just one element, but I'd
still rather centralise this stuff in one place and make it easy to add
any more of these that we eventually find a need for.

BTW, I also plan to help Victor get his "POSIX semantics" patch[1] into
the tree (and extend it to cover more ops). That should make these
problems go away in a more complete way IIUC, but doesn't work
everywhere (not sure if we have any build farm animals where it doesn't
work, if so it might be nice to change that), so it's complementary to
this patch. (My earlier idea that that stuff would magically start
happening for free on all relevant systems some time soon has faded.)

[1] https://www.postgresql.org/message-id/flat/a529b660-da15-5b62-21a0-9936768210fd%40postgrespro.ru
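
As a small illustration of point 2: the new fallback path in pg_usleep() uses the same microsecond-to-millisecond conversion as the interruptible path. The following is only a portable sketch of that arithmetic. The function name sleep_timeout_ms() and the assertion are assumptions made for the example; the patch itself inlines the expression in the SleepEx() and WaitForSingleObject() calls.

```c
#include <assert.h>

/*
 * Sketch only: convert a microsecond delay into the millisecond timeout
 * handed to the Windows wait/sleep call.  Rounds to the nearest
 * millisecond and never returns less than 1, so a short request still
 * sleeps instead of returning immediately.
 *
 * sleep_timeout_ms is a hypothetical name; the assertion is a
 * sketch-only precondition that the patch does not impose.
 */
static unsigned long
sleep_timeout_ms(long microsec)
{
	assert(microsec >= 0);

	return (microsec < 500) ? 1UL : (unsigned long) ((microsec + 500) / 1000);
}
```

With this rounding, a pg_usleep(100000) call such as the one in the retry code that the patch removes turns into a 100 ms timeout, and any request below 500 microseconds is bumped up to 1 ms.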
From 6ab7d5e6b5cc6bf62513a5264641e13fc007ebc7 Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Sun, 5 Sep 2021 23:49:23 +1200
Subject: [PATCH v3] Check for STATUS_DELETE_PENDING on Windows.

1. Update our open() wrapper to check for NT's STATUS_DELETE_PENDING
and translate it to appropriate errors. This is done with
RtlGetLastNtStatus(), which is dynamically loaded from ntdll. A new
file win32ntdll.c centralizes lookup of NT functions, in case we decide
to add more in the future.

2. Remove non-working code that was trying to do something similar for
stat(), and just reuse the open() wrapper code. As a side effect,
stat() also gains resilience against "sharing violation" errors.

3. Since stat() is used very early in process startup, remove the
requirement that the Win32 signal event has been created before
pgwin32_open_handle() is reached. Instead, teach pg_usleep() to fall
back to a non-interruptible sleep if reached before the signal event is
available.

Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Alexander Lakhin <[email protected]>
Discussion: https://postgr.es/m/CA%2BhUKGJz_pZTF9mckn6XgSv69%2BjGwdgLkxZ6b3NWGLBCVjqUZA%40mail.gmail.com
---
 configure                       |   6 ++
 configure.ac                    |   1 +
 src/backend/port/win32/signal.c |  12 ++-
 src/include/port.h              |   1 +
 src/include/port/win32_port.h   |   7 ++
 src/include/port/win32ntdll.h   |  18 ++++
 src/port/open.c                 | 102 ++++++++++----------
 src/port/win32ntdll.c           |  64 +++++++++++++
 src/port/win32stat.c            | 164 +-------------------------------
 src/tools/msvc/Mkvcbuild.pm     |   3 +-
 10 files changed, 169 insertions(+), 209 deletions(-)
 create mode 100644 src/include/port/win32ntdll.h
 create mode 100644 src/port/win32ntdll.c

diff --git a/configure b/configure
index c550cacd5a..adfe03f3f2 100755
--- a/configure
+++ b/configure
@@ -16818,6 +16818,12 @@ esac
  ;;
 esac
 
+  case " $LIBOBJS " in
+  *" win32ntdll.$ac_objext "* ) ;;
+  *) LIBOBJS="$LIBOBJS win32ntdll.$ac_objext"
+ ;;
+esac
+
   case " $LIBOBJS " in
   *" win32security.$ac_objext "* ) ;;
   *) LIBOBJS="$LIBOBJS win32security.$ac_objext"
diff --git a/configure.ac b/configure.ac
index 2ee710102f..2819b91a8c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1922,6 +1922,7 @@ if test "$PORTNAME" = "win32"; then
   AC_LIBOBJ(system)
   AC_LIBOBJ(win32env)
   AC_LIBOBJ(win32error)
+  AC_LIBOBJ(win32ntdll)
   AC_LIBOBJ(win32security)
   AC_LIBOBJ(win32setlocale)
   AC_LIBOBJ(win32stat)
diff --git a/src/backend/port/win32/signal.c b/src/backend/port/win32/signal.c
index 580a517f3f..61f06a29f6 100644
--- a/src/backend/port/win32/signal.c
+++ b/src/backend/port/win32/signal.c
@@ -52,7 +52,17 @@ static BOOL WINAPI pg_console_handler(DWORD dwCtrlType);
 void
 pg_usleep(long microsec)
 {
-	Assert(pgwin32_signal_event != NULL);
+	if (unlikely(pgwin32_signal_event == NULL))
+	{
+		/*
+		 * If we're reached by pgwin32_open_handle() early in startup before
+		 * the signal event is set up, just fall back to a regular
+		 * non-interruptible sleep.
+		 */
+		SleepEx((microsec < 500 ? 1 : (microsec + 500) / 1000), FALSE);
+		return;
+	}
+
 	if (WaitForSingleObject(pgwin32_signal_event,
 							(microsec < 500 ? 1 : (microsec + 500) / 1000))
 		== WAIT_OBJECT_0)
diff --git a/src/include/port.h b/src/include/port.h
index 82f63de325..ec64be429c 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -290,6 +290,7 @@ extern bool rmtree(const char *path, bool rmtopdir);
  * passing of other special options.
  */
 #define		O_DIRECT	0x80000000
+extern HANDLE pgwin32_open_handle(const char *, int, bool);
 extern int	pgwin32_open(const char *, int,...);
 extern FILE *pgwin32_fopen(const char *, const char *);
 #define		open(a,b,c) pgwin32_open(a,b,c)
diff --git a/src/include/port/win32_port.h b/src/include/port/win32_port.h
index 05c5a53442..f11ac5e47a 100644
--- a/src/include/port/win32_port.h
+++ b/src/include/port/win32_port.h
@@ -43,9 +43,16 @@
 #define _WINSOCKAPI_
 #endif
 
+/*
+ * Tell windows.h that we're going to include ntstatus.h, to avoid
+ * double-definition of some macros.
+ */
+#define UMDF_USING_NTSTATUS
+
 #include <winsock2.h>
 #include <ws2tcpip.h>
 #include <windows.h>
+#include <ntstatus.h>
 #undef small
 #include <process.h>
 #include <signal.h>
diff --git a/src/include/port/win32ntdll.h b/src/include/port/win32ntdll.h
new file mode 100644
index 0000000000..c4d6a9c00e
--- /dev/null
+++ b/src/include/port/win32ntdll.h
@@ -0,0 +1,18 @@
+/*-------------------------------------------------------------------------
+ *
+ * win32ntdll.h
+ *	  Dynamically loaded Windows NT functions.
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/port/win32ntdll.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+typedef NTSTATUS (__stdcall *RtlGetLastNtStatus_t)(void);
+
+extern RtlGetLastNtStatus_t pg_RtlGetLastNtStatus;
+
+extern int initialize_ntdll(void);
diff --git a/src/port/open.c b/src/port/open.c
index 14c6debba9..6c7a70c367 100644
--- a/src/port/open.c
+++ b/src/port/open.c
@@ -19,11 +19,12 @@
 #include "postgres_fe.h"
 #endif
 
+#include "port/win32ntdll.h"
+
 #include <fcntl.h>
 #include <assert.h>
 #include <sys/stat.h>
 
-
 static int
 openFlagsToCreateFileFlags(int openFlags)
 {
@@ -56,38 +57,25 @@ openFlagsToCreateFileFlags(int openFlags)
 }
 
 /*
- *	 - file attribute setting, based on fileMode?
+ * Internal function used by pgwin32_open() and _pgstat64().  When
+ * backup_semantics is true, directories may be opened (for limited uses).  On
+ * failure, INVALID_HANDLE_VALUE is returned and errno is set.
  */
-int
-pgwin32_open(const char *fileName, int fileFlags,...)
+HANDLE
+pgwin32_open_handle(const char *fileName, int fileFlags, bool backup_semantics)
 {
-	int			fd;
-	HANDLE		h = INVALID_HANDLE_VALUE;
+	HANDLE		h;
 	SECURITY_ATTRIBUTES sa;
 	int			loops = 0;
 
+	if (initialize_ntdll() < 0)
+		return INVALID_HANDLE_VALUE;
+
 	/* Check that we can handle the request */
 	assert((fileFlags & ((O_RDONLY | O_WRONLY | O_RDWR) | O_APPEND |
 						 (O_RANDOM | O_SEQUENTIAL | O_TEMPORARY) |
 						 _O_SHORT_LIVED | O_DSYNC | O_DIRECT |
 						 (O_CREAT | O_TRUNC | O_EXCL) | (O_TEXT | O_BINARY))) == fileFlags);
-#ifndef FRONTEND
-	Assert(pgwin32_signal_event != NULL);	/* small chance of pg_usleep() */
-#endif
-
-#ifdef FRONTEND
-
-	/*
-	 * Since PostgreSQL 12, those concurrent-safe versions of open() and
-	 * fopen() can be used by frontends, having as side-effect to switch the
-	 * file-translation mode from O_TEXT to O_BINARY if none is specified.
-	 * Caller may want to enforce the binary or text mode, but if nothing is
-	 * defined make sure that the default mode maps with what versions older
-	 * than 12 have been doing.
-	 */
-	if ((fileFlags & O_BINARY) == 0)
-		fileFlags |= O_TEXT;
-#endif
 
 	sa.nLength = sizeof(sa);
 	sa.bInheritHandle = TRUE;
@@ -102,6 +90,7 @@ pgwin32_open(const char *fileName, int fileFlags,...)
 					   &sa,
 					   openFlagsToCreateFileFlags(fileFlags),
 					   FILE_ATTRIBUTE_NORMAL |
+					   (backup_semantics ? FILE_FLAG_BACKUP_SEMANTICS : 0) |
 					   ((fileFlags & O_RANDOM) ? FILE_FLAG_RANDOM_ACCESS : 0) |
 					   ((fileFlags & O_SEQUENTIAL) ? FILE_FLAG_SEQUENTIAL_SCAN : 0) |
 					   ((fileFlags & _O_SHORT_LIVED) ? FILE_ATTRIBUTE_TEMPORARY : 0) |
@@ -140,38 +129,55 @@ pgwin32_open(const char *fileName, int fileFlags,...)
 		/*
 		 * ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
 		 * gone (Windows NT status code is STATUS_DELETE_PENDING).  In that
-		 * case we want to wait a bit and try again, giving up after 1 second
-		 * (since this condition should never persist very long).  However,
-		 * there are other commonly-hit cases that return ERROR_ACCESS_DENIED,
-		 * so care is needed.  In particular that happens if we try to open a
-		 * directory, or of course if there's an actual file-permissions
-		 * problem.  To distinguish these cases, try a stat().  In the
-		 * delete-pending case, it will either also get STATUS_DELETE_PENDING,
-		 * or it will see the file as gone and fail with ENOENT.  In other
-		 * cases it will usually succeed.  The only somewhat-likely case where
-		 * this coding will uselessly wait is if there's a permissions problem
-		 * with a containing directory, which we hope will never happen in any
-		 * performance-critical code paths.
+		 * case, we'd better ask for the NT status too so we can translate it
+		 * to a more Unix-like error.  We hope that nothing clobbers the NT
+		 * status in between the internal NtCreateFile() call and CreateFile()
+		 * returning.
+		 *
+		 * If there's no O_CREAT flag, then we'll pretend the file is
+		 * invisible.  With O_CREAT, we have no choice but to report that
+		 * there's a file in the way (which wouldn't happen on Unix).
 		 */
-		if (err == ERROR_ACCESS_DENIED)
+		if (err == ERROR_ACCESS_DENIED &&
+			pg_RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
 		{
-			if (loops < 10)
-			{
-				struct stat st;
-
-				if (stat(fileName, &st) != 0)
-				{
-					pg_usleep(100000);
-					loops++;
-					continue;
-				}
-			}
+			if (fileFlags & O_CREAT)
+				err = ERROR_FILE_EXISTS;
+			else
+				err = ERROR_FILE_NOT_FOUND;
 		}
 
 		_dosmaperr(err);
-		return -1;
+		return INVALID_HANDLE_VALUE;
 	}
 
+	return h;
+}
+
+int
+pgwin32_open(const char *fileName, int fileFlags,...)
+{
+	HANDLE		h;
+	int			fd;
+
+	h = pgwin32_open_handle(fileName, fileFlags, false);
+	if (h == INVALID_HANDLE_VALUE)
+		return -1;
+
+#ifdef FRONTEND
+
+	/*
+	 * Since PostgreSQL 12, those concurrent-safe versions of open() and
+	 * fopen() can be used by frontends, having as side-effect to switch the
+	 * file-translation mode from O_TEXT to O_BINARY if none is specified.
+	 * Caller may want to enforce the binary or text mode, but if nothing is
+	 * defined make sure that the default mode maps with what versions older
+	 * than 12 have been doing.
+	 */
+	if ((fileFlags & O_BINARY) == 0)
+		fileFlags |= O_TEXT;
+#endif
+
 	/* _open_osfhandle will, on error, set errno accordingly */
 	if ((fd = _open_osfhandle((intptr_t) h, fileFlags & O_APPEND)) < 0)
 		CloseHandle(h);		/* will not affect errno */
diff --git a/src/port/win32ntdll.c b/src/port/win32ntdll.c
new file mode 100644
index 0000000000..b010becadf
--- /dev/null
+++ b/src/port/win32ntdll.c
@@ -0,0 +1,64 @@
+/*-------------------------------------------------------------------------
+ *
+ * win32ntdll.c
+ *	  Dynamically loaded Windows NT functions.
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/port/win32ntdll.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "c.h"
+
+#include "port/win32ntdll.h"
+
+RtlGetLastNtStatus_t pg_RtlGetLastNtStatus;
+
+int
+initialize_ntdll(void)
+{
+	static bool initialized;
+	HMODULE		module;
+
+	static const struct
+	{
+		const char *name;
+		pg_funcptr_t *address;
+	}			routines[] = {
+		{"RtlGetLastNtStatus", (pg_funcptr_t *) &pg_RtlGetLastNtStatus}
+	};
+
+	if (initialized)
+		return 0;
+
+	if (!(module = LoadLibraryEx("ntdll.dll", NULL, 0)))
+	{
+		_dosmaperr(GetLastError());
+		return -1;
+	}
+
+	for (int i = 0; i < lengthof(routines); ++i)
+	{
+		pg_funcptr_t address;
+
+		address = (pg_funcptr_t) GetProcAddress(module, routines[i].name);
+		if (!address)
+		{
+			_dosmaperr(GetLastError());
+			FreeLibrary(module);
+
+			return -1;
+		}
+
+		*(pg_funcptr_t *) routines[i].address = address;
+	}
+
+	initialized = true;
+
+	return 0;
+}
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index 2ad8ee1359..c851400dc8 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -18,53 +18,6 @@
 #include "c.h"
 #include <windows.h>
 
-/*
- * In order to support MinGW and MSVC2013 we use NtQueryInformationFile as an
- * alternative for GetFileInformationByHandleEx. It is loaded from the ntdll
- * library.
- */
-#if _WIN32_WINNT < 0x0600
-#include <winternl.h>
-
-#if !defined(__MINGW32__) && !defined(__MINGW64__)
-/* MinGW includes this in <winternl.h>, but it is missing in MSVC */
-typedef struct _FILE_STANDARD_INFORMATION
-{
-	LARGE_INTEGER AllocationSize;
-	LARGE_INTEGER EndOfFile;
-	ULONG		NumberOfLinks;
-	BOOLEAN		DeletePending;
-	BOOLEAN		Directory;
-} FILE_STANDARD_INFORMATION;
-#define FileStandardInformation 5
-#endif							/* !defined(__MINGW32__) &&
-								 * !defined(__MINGW64__) */
-
-typedef NTSTATUS (NTAPI * PFN_NTQUERYINFORMATIONFILE)
-			(IN HANDLE FileHandle,
-			 OUT PIO_STATUS_BLOCK IoStatusBlock,
-			 OUT PVOID FileInformation,
-			 IN ULONG Length,
-			 IN FILE_INFORMATION_CLASS FileInformationClass);
-
-static PFN_NTQUERYINFORMATIONFILE _NtQueryInformationFile = NULL;
-
-static HMODULE ntdll = NULL;
-
-/*
- * Load DLL file just once regardless of how many functions we load/call in it.
- */
-static void
-LoadNtdll(void)
-{
-	if (ntdll != NULL)
-		return;
-	ntdll = LoadLibraryEx("ntdll.dll", NULL, 0);
-}
-
-#endif							/* _WIN32_WINNT < 0x0600 */
-
-
 /*
  * Convert a FILETIME struct into a 64 bit time_t.
  */
@@ -162,120 +115,18 @@ int
 _pgstat64(const char *name, struct stat *buf)
 {
 	/*
-	 * We must use a handle so lstat() returns the information of the target
-	 * file.  To have a reliable test for ERROR_DELETE_PENDING, we use
-	 * NtQueryInformationFile from Windows 2000 or
-	 * GetFileInformationByHandleEx from Server 2008 / Vista.
+	 * Our open wrapper will report STATUS_DELETE_PENDING as ENOENT.  We
+	 * request FILE_FLAG_BACKUP_SEMANTICS so that we can open directories too,
+	 * for limited purposes.  We use the private handle-based version, so we
+	 * don't risk running out of fds.
 	 */
-	SECURITY_ATTRIBUTES sa;
 	HANDLE		hFile;
 	int			ret;
-#if _WIN32_WINNT < 0x0600
-	IO_STATUS_BLOCK ioStatus;
-	FILE_STANDARD_INFORMATION standardInfo;
-#else
-	FILE_STANDARD_INFO standardInfo;
-#endif
-
-	if (name == NULL || buf == NULL)
-	{
-		errno = EINVAL;
-		return -1;
-	}
 
-	/* fast not-exists check */
-	if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
-	{
-		_dosmaperr(GetLastError());
-		return -1;
-	}
-
-	/* get a file handle as lightweight as we can */
-	sa.nLength = sizeof(SECURITY_ATTRIBUTES);
-	sa.bInheritHandle = TRUE;
-	sa.lpSecurityDescriptor = NULL;
-	hFile = CreateFile(name,
-					   GENERIC_READ,
-					   (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
-					   &sa,
-					   OPEN_EXISTING,
-					   (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
-						FILE_FLAG_OVERLAPPED),
-					   NULL);
+	hFile = pgwin32_open_handle(name, O_RDONLY, true);
 	if (hFile == INVALID_HANDLE_VALUE)
-	{
-		DWORD		err = GetLastError();
-
-		CloseHandle(hFile);
-		_dosmaperr(err);
 		return -1;
-	}
-
-	memset(&standardInfo, 0, sizeof(standardInfo));
-
-#if _WIN32_WINNT < 0x0600
-	if (_NtQueryInformationFile == NULL)
-	{
-		/* First time through: load ntdll.dll and find NtQueryInformationFile */
-		LoadNtdll();
-		if (ntdll == NULL)
-		{
-			DWORD		err = GetLastError();
-
-			CloseHandle(hFile);
-			_dosmaperr(err);
-			return -1;
-		}
-
-		_NtQueryInformationFile = (PFN_NTQUERYINFORMATIONFILE) (pg_funcptr_t)
-			GetProcAddress(ntdll, "NtQueryInformationFile");
-		if (_NtQueryInformationFile == NULL)
-		{
-			DWORD		err = GetLastError();
 
-			CloseHandle(hFile);
-			_dosmaperr(err);
-			return -1;
-		}
-	}
-
-	if (!NT_SUCCESS(_NtQueryInformationFile(hFile, &ioStatus, &standardInfo,
-											sizeof(standardInfo),
-											FileStandardInformation)))
-	{
-		DWORD		err = GetLastError();
-
-		CloseHandle(hFile);
-		_dosmaperr(err);
-		return -1;
-	}
-#else
-	if (!GetFileInformationByHandleEx(hFile, FileStandardInfo, &standardInfo,
-									  sizeof(standardInfo)))
-	{
-		DWORD		err = GetLastError();
-
-		CloseHandle(hFile);
-		_dosmaperr(err);
-		return -1;
-	}
-#endif							/* _WIN32_WINNT < 0x0600 */
-
-	if (standardInfo.DeletePending)
-	{
-		/*
-		 * File has been deleted, but is not gone from the filesystem yet.
-		 * This can happen when some process with FILE_SHARE_DELETE has it
-		 * open, and it will be fully removed once that handle is closed.
-		 * Meanwhile, we can't open it, so indicate that the file just doesn't
-		 * exist.
-		 */
-		CloseHandle(hFile);
-		errno = ENOENT;
-		return -1;
-	}
-
-	/* At last we can invoke fileinfo_to_stat */
 	ret = fileinfo_to_stat(hFile, buf);
 
 	CloseHandle(hFile);
@@ -296,11 +147,6 @@ _pgfstat64(int fileno, struct stat *buf)
 		return -1;
 	}
 
-	/*
-	 * Since we already have a file handle there is no need to check for
-	 * ERROR_DELETE_PENDING.
-	 */
-
 	return fileinfo_to_stat(hFile, buf);
 }
 
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 84f15f7e85..bebb0578dc 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -107,7 +107,8 @@ sub mkvcbuild
 	  pg_strong_random.c pgcheckdir.c pgmkdirp.c pgsleep.c pgstrcasecmp.c
 	  pqsignal.c mkdtemp.c qsort.c qsort_arg.c bsearch_arg.c quotes.c system.c
 	  strerror.c tar.c thread.c
-	  win32env.c win32error.c win32security.c win32setlocale.c win32stat.c);
+	  win32env.c win32error.c win32ntdll.c
+	  win32security.c win32setlocale.c win32stat.c);
 
 	push(@pgportfiles, 'strtof.c') if ($vsVersion < '14.00');
 
-- 
2.30.2
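
To make the src/port/open.c change in the patch easier to review, here is a minimal sketch of the new error translation as a pure function, written so that it compiles without the Windows headers. The SKETCH_ constants are assumed to carry the numeric values the Windows SDK assigns to ERROR_FILE_NOT_FOUND, ERROR_ACCESS_DENIED, ERROR_FILE_EXISTS and STATUS_DELETE_PENDING. The names translate_delete_pending() and sketch_errno_for() are hypothetical, and sketch_errno_for() is only a three-entry stand-in for _dosmaperr(), not the real mapping table.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Assumed to match winerror.h / ntstatus.h; redefined so this builds anywhere. */
#define SKETCH_ERROR_FILE_NOT_FOUND		2UL
#define SKETCH_ERROR_ACCESS_DENIED		5UL
#define SKETCH_ERROR_FILE_EXISTS		80UL
#define SKETCH_STATUS_DELETE_PENDING	0xC0000056UL

/*
 * Hypothetical helper mirroring the patched branch in pgwin32_open_handle():
 * only "access denied" combined with a delete-pending NT status is rewritten.
 * Without O_CREAT the file is reported as missing; with O_CREAT it is
 * reported as being in the way.  Every other error passes through unchanged.
 */
static unsigned long
translate_delete_pending(unsigned long win32_err, unsigned long nt_status,
						 int open_flags)
{
	/* Sketch-only precondition: this is called on a failure path. */
	assert(win32_err != 0);

	if (win32_err == SKETCH_ERROR_ACCESS_DENIED &&
		nt_status == SKETCH_STATUS_DELETE_PENDING)
		return (open_flags & O_CREAT) ? SKETCH_ERROR_FILE_EXISTS
			: SKETCH_ERROR_FILE_NOT_FOUND;

	return win32_err;
}

/* Hypothetical stand-in for _dosmaperr(), covering only the codes used here. */
static int
sketch_errno_for(unsigned long win32_err)
{
	switch (win32_err)
	{
		case SKETCH_ERROR_FILE_NOT_FOUND:
			return ENOENT;
		case SKETCH_ERROR_ACCESS_DENIED:
			return EACCES;
		case SKETCH_ERROR_FILE_EXISTS:
			return EEXIST;
		default:
			return EINVAL;
	}
}
```

Under those assumptions, a plain open() or stat() of a delete-pending file ends up with ENOENT, an open() with O_CREAT ends up with EEXIST, and a real permissions problem still surfaces as EACCES because its NT status is not STATUS_DELETE_PENDING.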
