Eryk Sun added the comment:
> I've found a catch via ProcessHacker: CreateFile() with
> GENERIC_WRITE (or FILE_GENERIC_WRITE) additionally grants
> FILE_READ_ATTRIBUTES for some reason.
CreateFileW always requests at least SYNCHRONIZE and FILE_READ_ATTRIBUTES
access.
The I/O manager requires synchronize access if a file is opened in synchronous
mode. CreateFileW goes a step further. It always requests synchronize access,
even with asynchronous mode (overlapped). The File object gets signaled when an
I/O request completes, but that's not very useful in the context of overlapping
requests.
Requesting read-attributes access supports API functions that query certain
file information. Here are some of the more common queries that require
read-attributes access:
    FileBasicInformation (GetFileInformationByHandleEx, GetFileTime)
    FileAllInformation (GetFileInformationByHandle)
    FileAttributeTagInformation (GetFileInformationByHandleEx)
Thus os.fstat(fd) can succeed even if the file is opened in O_WRONLY mode.
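As a quick check of that last point, the following sketch opens a file
write-only and queries it with os.fstat(). The scratch directory and the
variable names are made up for illustration. On Windows the query relies on
the implicit read-attributes access described above; on POSIX systems fstat()
doesn't depend on the open mode, so the snippet also runs there.

```python
import os
import tempfile

# Create a scratch file, then reopen it write-only and stat it by descriptor.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'spam.txt')
    with open(path, 'wb') as f:
        f.write(b'spam\n')

    fd = os.open(path, os.O_WRONLY)
    try:
        # No read access was requested, yet the metadata query succeeds.
        size = os.fstat(fd).st_size
    finally:
        os.close(fd)
```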
CreateFileW also implicitly requests DELETE access if FILE_FLAG_DELETE_ON_CLOSE
is used, instead of letting the call fail with an invalid-parameter error if
delete access isn't requested. This behavior isn't documented.
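The implicit access rights can be modeled in a few lines. This is only a
sketch of the behavior described above, not CreateFileW's actual
implementation; implicit_access() is a hypothetical helper, and the constants
are the usual Windows SDK header values.

```python
# Access-right and flag values as defined in the Windows SDK headers.
FILE_READ_ATTRIBUTES = 0x00000080
DELETE = 0x00010000
SYNCHRONIZE = 0x00100000
FILE_GENERIC_WRITE = 0x00120116
FILE_FLAG_DELETE_ON_CLOSE = 0x04000000


def implicit_access(desired_access, flags_and_attributes=0):
    """Model the access mask that ends up being requested (hypothetical)."""
    # Synchronize and read-attributes access are always added.
    access = desired_access | SYNCHRONIZE | FILE_READ_ATTRIBUTES
    # Delete access is added implicitly for delete-on-close opens.
    if flags_and_attributes & FILE_FLAG_DELETE_ON_CLOSE:
        access |= DELETE
    return access
```

This is consistent with the ProcessHacker observation quoted at the top: a
FILE_GENERIC_WRITE request shows up with the FILE_READ_ATTRIBUTES bit set.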
> undoing the side effect applies to O_CREAT and O_TRUNC too: we can create
> and/or truncate the file, but then fail.
I think truncation via TRUNCATE_EXISTING (O_TRUNC, with O_WRONLY or O_RDWR) or
overwriting with CREATE_ALWAYS (O_CREAT | O_TRUNC) is at least tolerable
because the caller doesn't care about the existing data. When overwriting, the
caller also wants to remove any alternate data streams and extended attributes
in the file. Nothing important is lost. Also, since both cases retain the
original file's security descriptor, at least failure after truncation or
overwriting isn't a security hole.
Unless we require CREATE_NEW (O_CREAT | O_EXCL) whenever O_TEMPORARY is used
(i.e. as the tempfile module uses it), there is a potential for an existing
file to be deleted if all handles are closed on failure, as discussed
previously. This is unacceptable not only because of potentially unrecoverable
data loss, but also because the security descriptor is lost.
With OPEN_ALWAYS (O_CREAT), CREATE_ALWAYS or CREATE_NEW, there's the chance of
leaving behind a new empty file or alternate data stream on failure, which is a
problem, but at least nothing is lost.
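The flag-to-disposition pairs used in this discussion can be summarized in a
small sketch. The O_* values are the MSVC <fcntl.h> values, hard-coded here so
the snippet doesn't depend on the platform's os module. Both functions are
hypothetical helpers; in particular, how O_EXCL is treated without O_CREAT is
my assumption, and temporary_is_safe() only illustrates the "require
CREATE_NEW with O_TEMPORARY" idea, it isn't an existing check.

```python
# Windows CRT flag values (assumed from the MSVC headers).
O_TEMPORARY = 0x0040
O_CREAT = 0x0100
O_TRUNC = 0x0200
O_EXCL = 0x0400


def creation_disposition(flags):
    """Map open() flags to the CreateFileW creation disposition name."""
    if flags & O_CREAT:
        if flags & O_EXCL:
            return 'CREATE_NEW'
        if flags & O_TRUNC:
            return 'CREATE_ALWAYS'
        return 'OPEN_ALWAYS'
    if flags & O_TRUNC:
        return 'TRUNCATE_EXISTING'
    return 'OPEN_EXISTING'


def temporary_is_safe(flags):
    """Hypothetical policy: O_TEMPORARY may never touch an existing file."""
    if not flags & O_TEMPORARY:
        return True
    return creation_disposition(flags) == 'CREATE_NEW'
```

With such a policy, a failure after the open could at worst delete a file that
the call itself created.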
> _open_osfhandle() can still fail with EMFILE.
The CRT supports 8192 open file descriptors (128 arrays of 64 fds), so failing
with EMFILE should be rare and limited to extreme cases. There's also a remote possibility
of memory corruption that causes __acrt_lowio_set_os_handle() to fail with
EBADF because the fd value is negative, or its handle value isn't the default
INVALID_HANDLE_VALUE, or the CRT _nhandle count is corrupt. These aren't
practical concerns, just as DuplicateHandle() failing isn't a practical
concern, but failure should be handled conservatively.
> the same issue would apply even in case of direct implementation of
> os.open()/open() via CreateFile()
Migrating to CreateFileW() might need to be shelved until Python uses native OS
File handles instead of CRT file descriptors. The remaining reliance on the CRT
low I/O layer ties our hands for now.
> Truncation can simply be deferred until we have the fd and then performed
> manually.
What if it fails after overwriting an existing file? Manually overwriting only
after getting the new fd is complicated. To match CREATE_ALWAYS (O_CREAT |
O_TRUNC), before overwriting it would have to query the existing file
attributes and fail the call if FILE_ATTRIBUTE_HIDDEN or FILE_ATTRIBUTE_SYSTEM
is set. If the file itself has to be overwritten (i.e. the default, anonymous
data stream), as opposed to a named data stream, it would have to delete all
named data streams and extended attributes in the file. Normally that's all
implemented atomically in the filesystem.
In contrast, TRUNCATE_EXISTING (O_TRUNC) is simple to emulate, since
CreateFileW implements it non-atomically with a subsequent NtSetInformationFile:
FileAllocationInformation system call.
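A minimal sketch of deferring the truncation until the fd exists, using only
the portable os API. The helper name open_then_truncate() and the scratch file
are hypothetical. os.ftruncate() stands in for the
FileAllocationInformation call here; it is not the same system call, but the
ordering (open first, truncate second) is the point.

```python
import os
import tempfile


def open_then_truncate(path, flags=os.O_WRONLY):
    """Open without O_TRUNC, then truncate once we own a descriptor."""
    fd = os.open(path, flags & ~os.O_TRUNC)
    try:
        os.ftruncate(fd, 0)
    except OSError:
        # Fail conservatively: don't leak the descriptor.
        os.close(fd)
        raise
    return fd


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'spam.txt')
    with open(path, 'wb') as f:
        f.write(b'spam\n')

    fd = open_then_truncate(path)
    try:
        size_after = os.fstat(fd).st_size
    finally:
        os.close(fd)
```

Nothing is modified until after the descriptor has been allocated, so a
failure to allocate it can no longer leave a truncated file behind.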
> But I still don't know how to deal with O_TEMPORARY, unless there is a
> way to unset FILE_DELETE_ON_CLOSE on a handle.
For now, that's possible with NTFS and the Windows API in all supported
versions of Windows by using a second kernel File with DELETE access, which is
opened before the last handle to the first kernel File is closed. After you
close the first open, use the second one to call SetFileInformationByHandle:
FileDispositionInfo to undelete the file. That said, if NTFS changes the
default for delete-on-close to use a POSIX-style delete (immediate unlink), it
won't be possible to 'undelete' the file.
Windows 10 supports additional flags with FileDispositionInfoEx (21), or NTAPI
FileDispositionInformationEx [1]. This provides a better way to disable or
modify the delete-on-close state per kernel File object, if the filesystem
supports it. If FILE_DISPOSITION_ON_CLOSE (8) is set with
FILE_DISPOSITION_DO_NOT_DELETE (0), the on-close disposition will be disabled.
It is not possible, as far as I know, to enable it again. For example:
>>> import ctypes, msvcrt, os
>>> kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
>>> fd = os.open('spam.txt', os.O_TEMPORARY|os.O_CREAT)
>>> h = msvcrt.get_osfhandle(fd)
>>> info = ctypes.c_ulong(8)
>>> kernel32.SetFileInformationByHandle(h, 21, ctypes.byref(info),
...     ctypes.sizeof(info))
1
>>> os.close(fd)
>>> os.path.exists('spam.txt')
True
If FILE_DISPOSITION_ON_CLOSE is set with FILE_DISPOSITION_DELETE (1) and
FILE_DISPOSITION_POSIX_SEMANTICS (2), the delete-on-close behavior is changed
to use POSIX semantics, which immediately unlinks the file even if there are
existing opens. For example:
>>> fd = os.open('spam.txt', os.O_TEMPORARY|os.O_CREAT)
>>> h = msvcrt.get_osfhandle(fd)
>>> info = ctypes.c_ulong(8|2|1)
>>> kernel32.SetFileInformationByHandle(h, 21, ctypes.byref(info),
...     ctypes.sizeof(info))
1
Add a second open:
>>> fd2 = os.open('spam.txt', os.O_TEMPORARY)
Normally the second open would keep the file linked in the directory after it's
'deleted', but not with POSIX semantics:
>>> os.close(fd)
>>> os.path.exists('spam.txt')
False
>>> 'spam.txt' in os.listdir('.')
False
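The magic numbers in the two sessions above can be given names. The following
is a sketch rather than a tested wrapper: set_disposition_ex() is a
hypothetical helper that only works on Windows 10 or later, and it assumes the
filesystem supports FileDispositionInfoEx. The flag values are the ones quoted
in this message.

```python
import ctypes

# FILE_INFO_BY_HANDLE_CLASS value used in the sessions above.
FileDispositionInfoEx = 21

# Flag values for the FILE_DISPOSITION_INFO_EX structure.
FILE_DISPOSITION_DO_NOT_DELETE = 0
FILE_DISPOSITION_DELETE = 1
FILE_DISPOSITION_POSIX_SEMANTICS = 2
FILE_DISPOSITION_ON_CLOSE = 8

# Disable delete-on-close for this kernel File object.
DISABLE_ON_CLOSE = FILE_DISPOSITION_ON_CLOSE | FILE_DISPOSITION_DO_NOT_DELETE
# Switch delete-on-close to a POSIX-style delete.
POSIX_DELETE_ON_CLOSE = (FILE_DISPOSITION_ON_CLOSE |
                         FILE_DISPOSITION_POSIX_SEMANTICS |
                         FILE_DISPOSITION_DELETE)


def set_disposition_ex(fd, flags):
    """Apply disposition flags to the file behind a CRT fd (Windows only)."""
    import msvcrt  # imported lazily; the module only exists on Windows
    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    kernel32.SetFileInformationByHandle.argtypes = (
        ctypes.c_void_p, ctypes.c_int, ctypes.c_void_p, ctypes.c_uint32)
    kernel32.SetFileInformationByHandle.restype = ctypes.c_int
    info = ctypes.c_uint32(flags)  # the structure is a single ULONG
    handle = msvcrt.get_osfhandle(fd)
    if not kernel32.SetFileInformationByHandle(
            handle, FileDispositionInfoEx,
            ctypes.byref(info), ctypes.sizeof(info)):
        raise ctypes.WinError(ctypes.get_last_error())
```

Calling set_disposition_ex(fd, DISABLE_ON_CLOSE) would correspond to the first
session, and set_disposition_ex(fd, POSIX_DELETE_ON_CLOSE) to the second.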
---
[1] https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/ns-ntddk-_file_disposition_information_ex
----------
_______________________________________
Python tracker
<https://bugs.python.org/issue42606>
_______________________________________