[issue47260] os.closerange() can be no-op in a seccomp sandbox
Change by Alexey Izbyshev : -- keywords: +patch stage: -> patch review ___ Python tracker <https://bugs.python.org/issue47260> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47260] os.closerange() can be no-op in a seccomp sandbox
Alexey Izbyshev added the comment: > It's been years now and that hasn't happened, even with more recent flag > additions. I think it's safe to say it won't, and such a fallback upon error > won't put us back into a bogus pre-close_range situation where we're > needlessly close()ing a bunch of closed fds. Yes, I also find it unlikely that close_range() will start reporting an error if it fails to close some fd, especially if not given some new flag. Such error reporting doesn't seem very useful to userspace because there is no way to associate the error with the fd. But even if such a change happens, it will be just a performance regression, not a potential correctness/security issue. -- keywords: -patch stage: patch review -> ___ Python tracker <https://bugs.python.org/issue47260> ___
[issue47260] os.closerange() can be no-op in a seccomp sandbox
Change by Alexey Izbyshev : -- keywords: +patch pull_requests: +30443 stage: -> patch review pull_request: https://github.com/python/cpython/pull/32418 ___ Python tracker <https://bugs.python.org/issue47260> ___
[issue47260] os.closerange() can be no-op in a seccomp sandbox
New submission from Alexey Izbyshev : After #40422, _Py_closerange() assumes that close_range() closes all file descriptors even if it returns an error (other than ENOSYS):

if (close_range(first, last, 0) == 0 || errno != ENOSYS) {
    /* Any errors encountered while closing file descriptors are ignored;
     * ENOSYS means no kernel support, though,
     * so we'll fallback to the other methods. */
}
else
    /* fallbacks */

This assumption can be wrong on Linux if a seccomp sandbox denies the underlying syscall, pretending that it returns EPERM or EACCES. In this case _Py_closerange() won't close any descriptors at all, which in the worst case can be a security issue. I propose to fix this by falling back to other methods in case of *any* close_range() error. Note that fallbacks will not be triggered on any problems with closing individual file descriptors because close_range() is documented to ignore such errors on both Linux[1] and FreeBSD[2].

[1] https://man7.org/linux/man-pages/man2/close_range.2.html
[2] https://www.freebsd.org/cgi/man.cgi?query=close_range&sektion=2

-- assignee: izbyshev components: Library (Lib) keywords: 3.10regression messages: 416986 nosy: gregory.p.smith, izbyshev, kevans, kevans91 priority: normal severity: normal status: open title: os.closerange() can be no-op in a seccomp sandbox type: behavior versions: Python 3.10, Python 3.11 ___ Python tracker <https://bugs.python.org/issue47260> ___
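The proposed change boils down to treating *any* close_range() failure as a reason to fall back, not just ENOSYS. A rough Python rendering of that logic (illustration only; the real fix is in C inside _Py_closerange(), and `fast_close_range` below is a hypothetical stand-in for the close_range(2) wrapper):

```python
import os

def close_fds(first, last, fast_close_range):
    """Close every fd in [first, last] (sketch of the proposed logic).

    `fast_close_range` stands in for close_range(2): it returns on
    success and raises OSError on any failure (hypothetical helper).
    """
    try:
        fast_close_range(first, last)
        return  # the kernel closed the whole range for us
    except OSError:
        # The old code fell back only on ENOSYS.  A seccomp sandbox can
        # fake EPERM/EACCES *without closing anything*, so the fix is
        # to fall back on any error instead.
        pass
    for fd in range(first, last + 1):
        try:
            os.close(fd)
        except OSError:
            pass  # per-fd errors are ignored, as in os.closerange()
```

The fallback loop is what older Pythons always did; the only behavioral change is when it is taken.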
[issue47245] potential undefined behavior with subprocess using vfork() on Linux?
Alexey Izbyshev added the comment: > 3. We have to fix error-path in order not to change heap state (contents and > allocations), possibly do not touch locks. During vfork() child execution - > the only parent THREAD (not the process) is blocked. For example, it's not > allowed to touch GIL. Child process may die unexpectedly and leave GIL > locked. Is it possible to rewrite children path for vfork() case without any > Py* calls ? As an idea, we can prepare all low-level things (all the pointers > to strings and plain values) before vfork(), so child code will use only that > data. What specifically do you propose to fix? There is no problem with the GIL if the child dies because the GIL is locked and unlocked only by the parent and the child never touches it. Similarly, only Py_* calls known to be safe are used. As for "pointers to strings", it's not clear to me what you mean, but if you mean allocations, they are already done before (v)fork(), since the child code is required to be async-signal-safe even if plain fork() is used. -- ___ Python tracker <https://bugs.python.org/issue47245> ___
[issue35823] Use vfork() in subprocess on Linux
Alexey Izbyshev added the comment: The preceding comment is wrong; see the discussion in #47245 and https://bugzilla.kernel.org/show_bug.cgi?id=215813#c14 for an explanation of why that bug report is irrelevant for CPython. -- ___ Python tracker <https://bugs.python.org/issue35823> ___
[issue47245] potential undefined behavior with subprocess using vfork() on Linux?
Alexey Izbyshev added the comment: > As for glibc specifics, I'm mostly thinking of the calls we do in the child. > According to the "Standard Description (POSIX.1)" calls to anything other > than `_exit()` or `exec*()` are not allowed. But the longer "Linux > Description" in that vfork(2) man page does not say that. Which implies > merely by omission that calls to other things are okay so long as you > understand everything they do to the process heap/stack/state. (I wish it > were *explicit* about that) If we're talking about the kernel side of things, sure, we rely on Linux being "sane" here, though I suppose on *BSDs the situation is similar. > Some of the calls we do from our child_exec() code... many are likely "just" > syscall shims and thus fine - but that is technically up to libc. Yes, but I wouldn't say that "being just syscall shims" is specific to glibc. It's just a "natural" property that just about any libc is likely to possess. (Yeah, I know, those are vague words, but in my experience "glibc-specific" is usually applied to some functionality/bug present in glibc and absent in other libcs, and I don't think we rely on something like that). Of course, there are also LD_PRELOAD things that could be called instead of libc, but the good news here is that we don't create new constraints for them (CPython is not the only software that uses vfork()), and they're on their own otherwise. > A few others are Py functions that go elsewhere in CPython and while they may > be fine for practical reasons today with dangerous bits on conditional > branches that technically should not be possible to hit given the state by > the time we're at this point in _posixsubprocess, pose a future risk - anyone > touching implementations of those is likely unaware of vfork'ed child > limitations that must be met. We already have an async-signal-safety requirement for all such code because of fork(). 
The requirements of vfork() are a bit more strict, but at least the set of functions we have to watch for dangerous changes is the same. And I suspect that most practical violations of vfork()-safety also violate async-signal-safety. > For example if one of the potential code paths that trigger an indirect > Py_FatalError() is hit... that fatal exit code is definitely not > post-vfork-child safe. The pre-exec child dying via that could screw up the > vfork parent process's state. Yeah, and it can break the fork parent too, at least because it uses exit() (not _exit()), so stdio buffers will be flushed twice, in the child and in the parent. -- ___ Python tracker <https://bugs.python.org/issue47245> ___
[issue47245] potential undefined behavior with subprocess using vfork() on Linux
Alexey Izbyshev added the comment: In short: both this bug report and [1] are invalid. The reason why doing syscall(SYS_vfork) is illegal is explained by Florian Weimer in [2]: >The syscall function in glibc does not protect the on-stack return address >against overwriting, so it can't be used to call SYS_vfork on x86. This is off-topic here because CPython calls vfork() via libc, but I'll still expand on Florian's comment. Suppose one wants to write a my_vfork() wrapper over the vfork syscall. When the vfork syscall returns the first time, my_vfork() has to return to its caller. This necessarily involves knowing the return address. On some architectures this return address is stored on the stack by the caller (e.g. x86). The problem is that any data in the my_vfork() stack frame can be overwritten by its caller once it returns the first time. Then, when the vfork syscall returns the second time, my_vfork() could be unable to return to its caller because the data it fetches from its (now invalid) stack frame is garbage. This is precisely what happens when one implements my_vfork() as syscall(SYS_vfork). To avoid this, the most common strategy is to store the return address into a register that's guaranteed to be preserved around the syscall by the OS ABI. For example, the x86-64 musl implementation [3] stores the return address in rdx (which is preserved around the syscall) and then restores it after the syscall (both on the first and the second return of the syscall). Now back to CPython. The real problem with stack sharing between the child and the parent could be due to compiler bugs, e.g. if a variable stored on the stack is modified in the child branch of "if (vfork())", but the compiler assumes that some other variable sharing the stack slot with the first one is *not* modified in the parent branch (i.e. after vfork() returns the second time). 
But all production compilers handle vfork() (and setjmp(), which has a similar issue) in a special way to avoid this, and GCC has __attribute__((returns_twice)), which a programmer could use for custom functions behaving in this way (my_vfork() would have to use this attribute). Regarding a separate stack for the child and clone(CLONE_VM|CLONE_VFORK), I considered this in #35823, but it has its own problems. The most important one is that CPython would be in the business of choosing the correct stack size for the child's stack, but CPython is *not* in control of all the code executing in the child because it calls into libc. In practice, people use various LD_PRELOAD-based software that wraps various libc functions (e.g. Scratchbox 2 build environment for Sailfish OS is an LD_PRELOAD-based sandbox), so even common-sense assumptions like "execve() in libc can't use a lot of stack!" might turn out to be wrong. CPython *could* work around that by using direct syscalls for everything in the child, or by using some "large" size that "should be enough for everybody", but I wouldn't want to see that unless we have a real problem with vfork(). Note that vfork()-based posix_spawn() implementations in C libraries do *not* have this problem because they fully control all code in the child (e.g. they would use a non-interposable execve() symbol or a direct syscall). > Immediate action item: Add a way for people to disable vfork at runtime by > setting a flag in the subprocess module, just in case. I don't think any action is needed at all, and I think this issue should be closed. > Our current assumptions around the use of vfork() are very much glibc > specific. Could you clarify what glibc-specific assumptions you mean? In #35823 I tried to use as few assumptions as possible. 
[1] https://bugzilla.kernel.org/show_bug.cgi?id=215813 [2] https://bugzilla.kernel.org/show_bug.cgi?id=215813#c2 [3] https://git.musl-libc.org/cgit/musl/tree/src/process/x86_64/vfork.s?id=ced75472d7e3d73d5b057e36ccbc7b7fcba95104 -- nosy: +izbyshev ___ Python tracker <https://bugs.python.org/issue47245> ___
[issue42738] subprocess: don't close all file descriptors by default (close_fds=False)
Alexey Izbyshev added the comment: > As a concrete example, we have a (non-Python) build system and task runner > that orchestrates many tasks to run in parallel. Some of those tasks end up > invoking Python scripts that use subprocess.run() to run other programs. Our > task runner intentionally passes an inheritable file descriptor that is > unique to each task as a form of a keep-alive token; if the child processes > continue to pass inheritable file descriptors to their children, then we can > determine whether all of the processes spawned from a task have terminated by > checking whether the last open handle to that file descriptor has been > closed. This is particularly important when a process exits before its > children, sometimes uncleanly due to being force killed by the system or by a > user. I don't see how such a scheme could be usable in a general-purpose build system. Python is just one example of a program that closes file descriptors for child processes, but there might be arbitrarily many more. In general, there is no guarantee that a descriptor would be inherited in a process tree if it contains at least one process running code that you don't fully control. To properly control process trees, I'd suggest considering prctl(PR_SET_CHILD_SUBREAPER) or PID namespaces on Linux and job objects on Windows (I don't know about other OSes). > Regarding security, PEP 446 already makes it so that any files opened from > within a Python program are non-inheritable by default, which I agree is a > good default. One can make the argument that it's not Python's job to enforce > a security policy on file descriptors that a Python process has inherited > from a parent process, since Python cannot distinguish from descriptors that > were accidentally or intentionally inherited. While I agree that such an argument could be made, closing FDs for children also affects FDs opened by C extensions, which are not subject to PEP 446. 
And this is important not only for security: for example, it's easy to create a deadlock if one process is blocked on a read() from a pipe, which it expects to be closed at a proper moment, but the pipe write end is leaked to some unrelated long-running process which keeps it alive indefinitely without being aware of it. And again, even if subprocess didn't close FDs by default (or somehow closed only non-inherited FDs), how could this be used in a wider picture? Any third-party library (even written in Python) could still decide to close e.g. some range of descriptors, so one wouldn't be able to rely on FD inheritance in a general setting, just as one can't rely on inheritance of environment variables (works most of the time, but no promises). Traditional Unix things like inheritance across fork()/exec() and no O_CLOEXEC by default are convenient for quick hacks, but my impression is that such defaults are considered to be design bugs in the modern setting by many. -- ___ Python tracker <https://bugs.python.org/issue42738> ___
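The default behavior under discussion is easy to observe with subprocess directly: descriptors above 2 are closed in the child by default (close_fds=True), and a caller who really wants one passed through must say so with pass_fds, which also makes that fd inheritable in the child. A small POSIX-only demonstration (a sketch, not part of the original report):

```python
import os
import subprocess
import sys

r, w = os.pipe()  # non-inheritable by default under PEP 446

# With the default close_fds=True, the pipe's write end does not
# survive into the child: fstat() on it fails there.
out = subprocess.run(
    [sys.executable, "-c", f"import os; os.fstat({w})"],
    capture_output=True,
)
assert out.returncode != 0  # the fd was closed in the child

# Passing the fd explicitly keeps it open (and inheritable) there.
out = subprocess.run(
    [sys.executable, "-c", f"import os; os.fstat({w}); print('inherited')"],
    pass_fds=(w,),
    capture_output=True,
    text=True,
)
assert out.stdout.strip() == "inherited"

os.close(r)
os.close(w)
```

This is why schemes that rely on silent fd inheritance across arbitrary programs are fragile: any intermediate process has to opt in explicitly.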
[issue44656] Dangerous mismatch between MAXPATHLEN and MAX_PATH on Windows
New submission from Alexey Izbyshev : In PC/getpathp.c CPython uses buffers with length MAXPATHLEN+1, which is 257 on Windows[1]. On Windows 7, where PathCch* functions are not available, CPython <= 3.8 falls back to PathCombineW()/PathCanonicalizeW()[2]. Those functions assume that the destination buffer has room for MAX_PATH (260) characters. This creates a dangerous setup: for example, gotlandmark()[3] can overflow the destination if `filename` is long enough, and `filename` can be user-controlled. I couldn't devise a simple way to trigger a buffer overflow in a default Python installation, though it is possible if one, for example, makes sure that the landmark file ("lib\os.py") can't be found in the default locations and then supplies their own, long enough paths via e.g. the PYTHONPATH environment variable, which eventually end up in gotlandmark(). Even when such a buffer overflow was triggered on my machine, I couldn't notice any change in behavior, probably because 3 bytes is small enough to not overwrite anything important. However, I'm not comfortable with this. Could we just raise MAXPATHLEN from 256 to 260 on Windows to avoid this kind of issue for sure? Please also note that while the issue described above affects only Python <= 3.8 on Windows 7, I think it would make sense to increase MAXPATHLEN in newer versions too to avoid any similar situations in the future (i.e. if two pieces of code interact and one of them uses MAX_PATH while another uses MAXPATHLEN). 
[1] https://github.com/python/cpython/blob/0389426fa4af4dfc8b1d7f3f291932d928392d8b/Include/osdefs.h#L13 [2] https://github.com/python/cpython/blob/0389426fa4af4dfc8b1d7f3f291932d928392d8b/PC/getpathp.c#L278 [3] https://github.com/python/cpython/blob/0389426fa4af4dfc8b1d7f3f291932d928392d8b/PC/getpathp.c#L333 -- components: Windows messages: 397655 nosy: izbyshev, paul.moore, steve.dower, tim.golden, zach.ware priority: normal severity: normal status: open title: Dangerous mismatch between MAXPATHLEN and MAX_PATH on Windows type: security versions: Python 3.6, Python 3.7, Python 3.8 ___ Python tracker <https://bugs.python.org/issue44656> ___
[issue32592] Drop support of Windows Vista and Windows 7
Alexey Izbyshev added the comment: > If we had a dedicated maintainer who was supporting Win7 and making releases > for it, then we (i.e. they) could support it. But then, there's nothing to > stop someone doing that already, and even to stop them charging money for it > if they want (which they wouldn't be able to do under the auspices of > python-dev). So I suspect nobody is really that motivated ;) That's totally fair. > (Also, a "little bit of complaining" here is totally worthwhile, as it helps > us gauge community sentiment. Without it, we're limited to trawling forums > and Twitter - especially without conferences to actually meet and hear > directly from people - and so our inputs are biased. Having polite, informed, > discussion on the tracker is a great way for us to see and capture these > positions.) I agree. In my comment, I only intended to contrast "complaining" with stepping up to do the work. I didn't mean to imply that it's inappropriate per se. -- ___ Python tracker <https://bugs.python.org/issue32592> ___
[issue32592] Drop support of Windows Vista and Windows 7
Alexey Izbyshev added the comment: > If Win8-only calls are not used, then presumably it should still build and > run on Windows 7, presumably with the flag flipped back to Win7. And if there > are Win8-only calls used and the flag is set to Win7+, I assume that the MSVC > compiler will catch any instances of Win8-only calls at build time? Is this > understanding correct? It's the latter. In the patch, calls to functions like PathCchCanonicalizeEx(), which had previously been conditional on runtime availability of those functions, were made direct. Those functions are not from UCRT (so you can't just distribute it to regain Windows 7 support) and are not available on Windows 7. There was also another patch that added a check to the Python 3.9+ installer that prevents installation on Windows 7 (but again, even if you take the Windows "embedded" distro instead of the installer, it won't run due to Win8-only functions). > If the latter is true, it's very likely a lost cause. However, if the former > is the case, then at least devs using Python 3.9+ on Windows 7 will be able > to easily build their own versions with support, though that could change at > any time. For now, it's not an entirely lost cause in the sense that the number of Win8-specific patches and their size is very small (I'm not aware of anything not already mentioned, but I may not be up-to-date). So an interested party can probably revert them locally without much hassle. > For those in the know, have there been a lot of reports/discontent > surrounding the dropping of support in Py3.9, by anyone aware of that? I semi-casually follow the bug tracker and remember only bpo-41412 and bpo-42027. Personally, I do consider tying Windows support in CPython to arbitrary support periods set by Microsoft unfortunate, and it's also true that the Win8-specific changes I'm aware of are not critical and don't remove any significant maintenance burden. 
For now, both relative and absolute usage of Windows 7 is huge and hard to ignore, so developers who use CPython in their own software are faced with an unpleasant choice of sticking to 3.8 or dropping Win 7. (I'm one of them, since software I work on has to run in build environments set up for other projects, and such environments routinely stick to old OS versions due to breakage on updates). But this is what we have according to PEP 11, and a little grumbling here certainly won't change that :) -- ___ Python tracker <https://bugs.python.org/issue32592> ___
[issue32592] Drop support of Windows Vista and Windows 7
Alexey Izbyshev added the comment: As far as I understand, commit [1] explicitly prevents CPython from running on Windows 7, and it's included in 3.9. So it seems to be too late to complain, even though, according to Wikipedia, more than 15% of all Windows PCs are still running Windows 7 [2]. [1] https://github.com/izbyshev/cpython/commit/6a65eba44bfd82ccc8bed4b5c6dd6637549955d5 [2] https://en.wikipedia.org/wiki/Windows_7 -- ___ Python tracker <https://bugs.python.org/issue32592> ___
[issue43308] subprocess.Popen leaks file descriptors opened for DEVNULL or PIPE stdin/stdout/stderr arguments
Change by Alexey Izbyshev : -- nosy: +gregory.p.smith ___ Python tracker <https://bugs.python.org/issue43308> ___
[issue43113] os.posix_spawn errors with wrong information when shebang points to not-existing file
Alexey Izbyshev added the comment: I generally agree, but getting a good, short error message seems to be the hard part here. I previously complained[1] about the following proposal by @hroncok: FileNotFoundError: [Errno 2] No such file or directory: Either './demo' or the interpreter of './demo' not found. But maybe it's just me. Does anybody else feel that mentioning "the interpreter" in this way could be confusing in the prevalent cases when the actual problem is missing './demo' itself? If we can come up with a good message, I can look into turning it into a PR. The error message above also reads to me like there are no other possible reasons for ENOENT. On Linux, binfmt_misc[2] provides a way to run arbitrary code on execve(). This is used, for example, to transparently run binaries for foreign arches via qemu-user, so probably ENOENT would be returned if QEMU itself is missing. QEMU *may* be thought of as a kind of interpreter here, though it's completely unrelated to a hash-bang or an ELF interpreter. But I don't know about other POSIX platforms. As a theoretical example, if the dynamic library loader were implemented in the kernel, the system call could return ENOENT in case of a missing library. Do we need to worry about it? Does anybody know about the situation on macOS, where posix_spawn() is a system call too? [1] https://bugs.python.org/issue43113#msg386210 [2] https://en.wikipedia.org/wiki/Binfmt_misc -- ___ Python tracker <https://bugs.python.org/issue43113> ___
[issue43113] os.posix_spawn errors with wrong information when shebang points to not-existing file
Alexey Izbyshev added the comment: How do you propose to approach documentation of such behavior? The underlying cause is the ambiguity of the ENOENT error code from execve() returned by the kernel, so it applies to all places where Python can call execve(), including os.posix_spawn(), os.execve() and subprocess, so it's not clear to me where such documentation should be placed. And, of course, this behavior is not specific to CPython. The Linux man pages mention various causes of this error[1], though POSIX doesn't[2]. While the ENOENT ambiguity is indeed confusing, one of the top results of my DDG search on "linux no such file or directory but script exists" is this link[3]. [1] https://man7.org/linux/man-pages/man2/execve.2.html [2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/execve.html [3] https://stackoverflow.com/questions/3949161/no-such-file-or-directory-but-it-exists -- ___ Python tracker <https://bugs.python.org/issue43113> ___
[issue43113] os.posix_spawn errors with wrong information when shebang points to not-existing file
Alexey Izbyshev added the comment: > IMO the fix is simple: only create OSError from the errno, never pass a > filename. This will remove a normally helpful piece of the error message in exchange for being marginally less confusing in the rare case of a non-existing interpreter (the user will still be left wondering why the file they tried to run exists, but "No such file or directory" is reported). So the only "win" here would be for CPython developers because users will be less likely to report a bug like this one. > posix_spawn() is really complex function which can fail in many different > ways. This issue is not specific to posix_spawn(): subprocess and os.execve() report the filename too. Any fix would need to change those too for consistency. -- nosy: +gregory.p.smith ___ Python tracker <https://bugs.python.org/issue43113> ___
[issue43113] os.posix_spawn errors with wrong information when shebang points to not-existing file
Alexey Izbyshev added the comment: > FileNotFoundError: [Errno 2] No such file or directory: Either './demo' or > the interpreter of './demo' not found. This doesn't sound good to me because a very probable reason and a very improbable one are combined without any distinction. Another possible issue is that usage of the word "interpreter" in this narrow sense might be non-obvious for users. ISTM that the most minimal way to check for the possibility of an interpreter issue would be to do something like `access(exe_path, X_OK)` in case of ENOENT: if it's successful, then a "bad interpreter" condition is likely. But in the case of os.posix_spawnp(), the search in PATH is performed by libc, so CPython doesn't know exe_path. OTOH, subprocess and os.exec*p do perform their own search in PATH, but in the case of subprocess it happens in the context of the child process, so we'd probably need to devise a separate error code to report to the parent via the error pipe to distinguish this condition. So far, I'm not convinced that the result is worth it, but I do agree that such mysterious "No such file" errors are not user-friendly. -- ___ Python tracker <https://bugs.python.org/issue43113> ___
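For the cases where the caller does know the full path, the access()-based heuristic sketched above could look roughly like this. This is a hypothetical helper with a made-up error message, not the proposed CPython change; integrating it into subprocess itself would still require the error-pipe plumbing mentioned in the comment:

```python
import os
import subprocess

def run_with_hint(path, *args):
    """Run an executable, refining an ambiguous ENOENT (sketch only)."""
    try:
        return subprocess.run([path, *args])
    except FileNotFoundError:
        # ENOENT, yet the file is present and executable: most likely
        # its hash-bang (or ELF) interpreter is what's missing.
        if os.path.exists(path) and os.access(path, os.X_OK):
            raise FileNotFoundError(
                f"{path}: possibly a bad interpreter") from None
        raise
```

On Linux, calling this on a script whose shebang points nowhere raises the refined error, while a genuinely missing file re-raises the original one.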
[issue43113] os.posix_spawn errors with wrong information when shebang points to not-existing file
Alexey Izbyshev added the comment: > Ideally, the error would say: > FileNotFoundError: ./demo: /usr/bin/hugo: bad interpreter: No such file or > directory The kernel simply returns ENOENT on an attempt to execve() a file with a non-existing hash-bang interpreter. The same occurs on an attempt to run a dynamically linked ELF executable with an INTERP program header containing a non-existing path. And, of course, the same error is returned if the executable file itself doesn't exist, so there is no simple way to distinguish such cases. Bash simply assumes[1] that if a file contains a hash-bang and the error from execve() is not recognized otherwise, it's a "bad interpreter". Note that all of the above is completely unrelated to os.posix_spawn(): subprocess or os.execve() would produce the same message. [1] https://git.savannah.gnu.org/cgit/bash.git/tree/execute_cmd.c?h=bash-5.1#n5854 -- nosy: +izbyshev ___ Python tracker <https://bugs.python.org/issue43113> ___
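The ambiguity is easy to reproduce on Linux: a script that exists and is executable still produces a bare ENOENT when its shebang names a missing interpreter (subprocess surfaces the same kernel error that os.posix_spawn() would; the path below is a stand-in for the report's missing /usr/bin/hugo):

```python
import errno
import os
import subprocess
import tempfile

# A real, executable script whose interpreter does not exist.
tmpdir = tempfile.mkdtemp()
demo = os.path.join(tmpdir, "demo")
with open(demo, "w") as f:
    f.write("#!/no/such/interpreter\necho hello\n")
os.chmod(demo, 0o755)

assert os.path.exists(demo)  # the file itself is clearly there

err = None
try:
    subprocess.run([demo])
except FileNotFoundError as e:
    err = e  # the message blames the script, not its interpreter

assert err is not None and err.errno == errno.ENOENT
```

The exception's filename attribute points at the script, which is exactly the misleading part under discussion.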
[issue43069] Python fails to read a script whose path is `/dev/fd/X`
Alexey Izbyshev added the comment: I would suggest starting to dig from the following piece of code in `maybe_pyc_file()` (Python/pythonrun.c):

int ispyc = 0;
if (ftell(fp) == 0) {
    if (fread(buf, 1, 2, fp) == 2 &&
        ((unsigned int)buf[1]<<8 | buf[0]) == halfmagic)
        ispyc = 1;
    rewind(fp);
}

-- nosy: +izbyshev ___ Python tracker <https://bugs.python.org/issue43069> ___
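One likely reason that snippet is a good place to dig: a script handed over as /dev/fd/X is often backed by a pipe, and a pipe has no file position, so the fread()/rewind() pair cannot restore the consumed bytes. The underlying limitation (not maybe_pyc_file() itself) can be seen from Python:

```python
import os

# A pipe end stands in for what /dev/fd/X may refer to: a stream
# with no notion of position.
r, w = os.pipe()
os.write(w, b"print('hi')\n")
os.close(w)

f = os.fdopen(r, "rb")
first2 = f.read(2)        # consume two bytes, as the fread() above does
assert first2 == b"pr"
try:
    f.seek(0)             # the rewind() equivalent...
    rewound = True
except OSError:           # io.UnsupportedOperation is an OSError
    rewound = False
assert not rewound        # ...cannot work: those bytes are gone
f.close()
```

So any magic-number sniffing that assumes it can rewind the stream silently loses the start of the script on such inputs.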
[issue42606] Support POSIX atomicity guarantee of O_APPEND on Windows
Alexey Izbyshev added the comment: > I think truncation via TRUNCATE_EXISTING (O_TRUNC, with O_WRONLY or O_RDWR) > or overwriting with CREATE_ALWAYS (O_CREAT | O_TRUNC) is at least tolerable > because the caller doesn't care about the existing data. Yes, I had a thought that creating or truncating a file when asked is in some sense "less wrong" than deleting an existing file on open() failure, but I'm still not comfortable with it. It would be nice, for example, if, when an open() with O_CREAT | O_EXCL failed, the file were indeed not created in all cases. >> Truncation can simply be deferred until we have the fd and then performed >> manually. > If the file itself has to be overwritten (i.e. the default, anonymous data > stream), as opposed to a named data stream, it would have to delete all named > data streams and extended attributes in the file. Normally that's all > implemented atomically in the filesystem. > In contrast, TRUNCATE_EXISTING (O_TRUNC) is simple to emulate, since > CreateFileW implements it non-atomically with a subsequent > NtSetInformationFile: FileAllocationInformation system call. Oh. So CREATE_ALWAYS for an existing file has very different semantics than TRUNCATE_EXISTING, which means we can't easily use OPEN_ALWAYS with deferred manual truncation, I see. >> But I still don't know how to deal with O_TEMPORARY, unless there is a >> way to unset FILE_DELETE_ON_CLOSE on a handle. > For now, that's possible with NTFS and the Windows API in all supported > versions of Windows by using a second kernel File with DELETE access, which > is opened before the last handle to the first kernel File is closed. After > you close the first open, use the second one to call SetFileInformation: > FileDispositionInfo to undelete the file. So the idea is to delete the file for a brief period, but then undelete it. As I understand it, any attempt to open the file while it's in the deleted state (but still has a directory entry) will fail. 
This is probably not critical since it could happen only on an unlikely failure of _open_osfhandle(), but it is still less than perfect. > Windows 10 supports additional flags with FileDispositionInfoEx (21), or > NTAPI FileDispositionInformationEx [1]. This provides a better way to disable > or modify the delete-on-close state per kernel File object, if the filesystem > supports it. This is nice and would reduce our non-atomicity to just the following: if _wopen() would have failed even before CreateFile() (e.g. due to EMFILE), our reimplementation could still create a handle with FILE_DELETE_ON_CLOSE, so if our process is terminated before we unset it, we'll still lose the file. But it seems like the best we can get in any situation where we need to wrap a handle with an fd. === Regarding using the WriteFile()-with-OVERLAPPED approach in FileIO, I've looked at the edge cases of error remapping in MSVCRT write() again. If the underlying WriteFile() fails, it only remaps ERROR_ACCESS_DENIED to EBADF, presumably to deal with write() on an O_RDONLY fd. But if WriteFile() writes zero bytes and does not fail, it gets more interesting: * If the file type is FILE_TYPE_CHAR and the first byte to write was 26 (Ctrl-Z), it's treated as success. I don't think I understand this: it *seems* like it's handling something like writing to the *input* of a console, but I'm not sure that's even possible in this manner. * Anything else is a failure with ENOSPC. This is mysterious too. I've checked how java.io.FileOutputStream deals with WriteFile() succeeding with zero size (just to compare with another language runtime) and haven't found any special handling[1]. Moreover, it seems like FileOutputStream.write(bytes[]) will enter an infinite loop if WriteFile() always returns 0 for a non-empty byte array[2]. So it's unclear to me what MSVCRT is up to here and whether FileIO should do the same. 
[1] https://github.com/openjdk/jdk/blob/b4ace3e9799dab5970d84094a0fd1b2d64c7f923/src/java.base/windows/native/libjava/io_util_md.c#L522
[2] https://github.com/openjdk/jdk/blob/b4ace3e9799dab5970d84094a0fd1b2d64c7f923/src/java.base/share/native/libjava/io_util.c#L195

-- ___ Python tracker <https://bugs.python.org/issue42606> ___
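[Editor's note] As an aside to the zero-byte-write discussion above: any retry loop around a write call has to treat a zero-byte result specially, or it can spin forever the way FileOutputStream.write() would. A minimal, platform-neutral sketch (the helper name is mine, not from any patch):

```python
import os

def write_all(fd, data):
    """Write all of `data` to `fd`, retrying after short writes.

    A zero-byte result is turned into an error instead of being
    retried, to avoid the infinite-loop hazard discussed above.
    """
    view = memoryview(data)
    while view:
        n = os.write(fd, view)
        if n == 0:
            raise OSError("os.write() reported 0 bytes written")
        view = view[n:]
```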
[issue42606] Support POSIX atomicity guarantee of O_APPEND on Windows
Alexey Izbyshev added the comment:

> FYI, here are the access rights applicable to files

Thanks, I checked that mapping in the headers when I was writing _Py_wopen_noraise() as well. But I've found a catch via ProcessHacker: CreateFile() with GENERIC_WRITE (or FILE_GENERIC_WRITE) additionally grants FILE_READ_ATTRIBUTES for some reason. This is why I add FILE_READ_ATTRIBUTES to FILE_GENERIC_WRITE for DuplicateHandle() -- otherwise, the duplicated handle didn't have the same rights in my testing.

So basically I have to deal with (a) _wopen() not contractually guaranteeing any specific access rights and (b) CreateFile() not respecting the specified access rights exactly. This is where my distrust and my original question about "default" access rights come from.

> I overlooked a case that's a complication. For O_TEMPORARY, the open uses FILE_FLAG_DELETE_ON_CLOSE; adds DELETE to the requested access; and adds FILE_SHARE_DELETE to the share mode.

> _Py_wopen_noraise() can easily keep the required DELETE access. The complication is that you have to be careful not to close the original file descriptor until after you've successfully created the duplicate file descriptor. If it fails, you have to return the original file descriptor from _wopen(). If you close all handles for the kernel File and fail the call, the side effect of deleting the file is unacceptable.

Good catch! But now I realize that the problem with undoing the side effect applies to O_CREAT and O_TRUNC too: we can create and/or truncate the file, but then fail. Even if we could assume that DuplicateHandle() can't fail, _open_osfhandle() can still fail with EMFILE. And since there is no public function in MSVCRT to replace the underlying handle of an existing fd, we can't preallocate an fd to avoid this. There would be no problem if we could just reduce the access rights of an existing handle, but it seems there is no such API.
I don't like the idea of silently dropping the atomic-append guarantee in case of a failure, so I'm not sure how to proceed with the current approach. Moreover, the same issue would apply even in the case of a direct implementation of os.open()/open() via CreateFile(), because we still need to wrap the handle with an fd, and this may fail with EMFILE.

For O_CREAT/O_TRUNC, it seems like it could be reasonably avoided:

* Truncation can simply be deferred until we have the fd and then performed manually.

* To undo the file creation, we could use GetLastError() to learn whether CreateFile() with OPEN_ALWAYS actually created the file, and then delete it on failure (still non-atomic, and the deletion can fail, but probably better than nothing).

But I still don't know how to deal with O_TEMPORARY, unless there is a way to unset FILE_DELETE_ON_CLOSE on a handle. As an aside, it's also very surprising to me that O_TEMPORARY is allowed for existing files at all. If not for that, there would be no issue on failure apart from non-atomicity.

Maybe I should forgo the idea of supporting O_APPEND for os.open(), and instead just support it in FileIO via the WriteFile()-with-OVERLAPPED approach...

-- ___ Python tracker <https://bugs.python.org/issue42606> ___
[issue42606] Support POSIX atomicity guarantee of O_APPEND on Windows
Alexey Izbyshev added the comment:

> I don't know what you mean by default access rights.

I meant the access rights of the handle created by _wopen(). In my PR I basically assume that _wopen() uses GENERIC_READ/GENERIC_WRITE access rights, but _wopen() doesn't have a contractual obligation to do exactly that AFAIU. For example, if it got some extra access rights, then my code would "drop" them while switching FILE_WRITE_DATA off.

> For example, if FILE_WRITE_DATA isn't granted, then open() can't open for appending. A direct CreateFileW() call can remove FILE_WRITE_DATA from the desired access.

Indeed, I hadn't thought about that. Are you aware of a common scenario where a regular file allows appending but not writing? But, at least, this is not a regression: currently open()/os.open() can't open such files for appending either.

> An opener could also work like your PR via os.open(), msvcrt.get_osfhandle(), _winapi.GetFileType(), _winapi.DuplicateHandle(), os.close(), and msvcrt.open_osfhandle().

True, but it still falls into the "etc" category of "ctypes/pywin32/etc" :) Certainly doable, but it seems better to have consistent O_APPEND behavior across platforms out-of-the-box.

-- ___ Python tracker <https://bugs.python.org/issue42606> ___
[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful
Change by Alexey Izbyshev : -- nosy: +izbyshev ___ Python tracker <https://bugs.python.org/issue42969> ___
[issue42888] Not installed “libgcc_s.so.1” causes exit crash.
Alexey Izbyshev added the comment:

Thank you for testing. I've added a NEWS entry to the PR, so it's ready for review by the core devs. Note that PyThread_exit_thread() can still be called by daemon threads if they try to take the GIL after Py_Finalize(), and also via C APIs like PyEval_RestoreThread() (see bpo-42969), so, in general, libgcc_s is still required for CPython.

-- versions: +Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue42888> ___
[issue42606] Support POSIX atomicity guarantee of O_APPEND on Windows
Alexey Izbyshev added the comment:

> It's possible to query the granted access of a kernel handle via NtQueryObject: ObjectBasicInformation

Ah, thanks for the info. But it wouldn't help for option (1) that I had in mind, because open() and os.open() currently set only the msvcrt-level O_APPEND. It probably could be used in my current PR instead of just assuming the default access rights for file handles, but I'm not sure whether it makes sense: can a new handle have non-default access rights? Or can the default change at this point of Windows history?

> Python could implement its own umask value in Windows. os.umask() would set the C umask value as well, but only for the sake of consistency with C extensions and embedding.

Would it be a long shot to ask MS to add the needed functionality to MSVCRT instead? Or at least to declare the bug with _umask_s() that I described a feature. Maybe Steve Dower could help.

> Open attribute flags could also be supported, such as O_ATTR_HIDDEN and O_ATTR_SYSTEM. These are needed because a hidden or system file is required to remain as such when it's overwritten, else CreateFileW fails.

I didn't know that, thanks. Indeed, a simple open(path, 'w') fails on a hidden file with "Permission denied".

> If it's important enough, msvcrt.open() and msvcrt.O_TEXT could be provided.

Yes, I'd be glad to move support for more obscure MSVCRT flags to msvcrt.open() -- the less MSVCRT details we leak in os.open(), the more freedom we have to implement it via proper Win32 APIs.

===

Anyway, even if the blockers for implementing open()/os.open() via CreateFile() are solved, my current PR doesn't seem to conflict with such an implementation (it could just be replaced in the future). Currently, the only way to achieve atomic append in Python on Windows that I know of is to use a custom opener that calls CreateFile() with the right arguments via ctypes/pywin32/etc.
-- ___ Python tracker <https://bugs.python.org/issue42606> ___
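[Editor's note] For context, the umask thread-safety problem discussed above exists on POSIX too: the only portable way to read the umask is to set a dummy value and immediately restore the old one. A minimal sketch (the helper name is mine):

```python
import os

def current_umask():
    # There is no "query umask" call: we must set a dummy value and
    # immediately restore the old one.  Another thread creating a file
    # between the two calls would briefly run with umask 0 -- exactly
    # the thread-unsafety lamented in the message above.
    old = os.umask(0)
    os.umask(old)
    return old
```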
[issue42606] Support POSIX atomicity guarantee of O_APPEND on Windows
Alexey Izbyshev added the comment: Could anybody provide their thoughts on this RFE? Thanks. -- ___ Python tracker <https://bugs.python.org/issue42606> ___
[issue42888] Not installed “libgcc_s.so.1” causes exit crash.
Alexey Izbyshev added the comment: I've made a PR to remove most calls to pthread_exit(). @xxm: could you test it in your environment? -- ___ Python tracker <https://bugs.python.org/issue42888> ___
[issue42888] Not installed “libgcc_s.so.1” causes exit crash.
Change by Alexey Izbyshev : -- keywords: +patch pull_requests: +23063 stage: -> patch review pull_request: https://github.com/python/cpython/pull/24241 ___ Python tracker <https://bugs.python.org/issue42888> ___
[issue42888] Not installed “libgcc_s.so.1” causes parser crash.
Alexey Izbyshev added the comment:

I've encountered this issue too. My use case was a 32-bit Python on a 64-bit CentOS system, and my understanding of the issue was that the 64-bit libgcc_s is somehow counted as a "provider" of libgcc_s for the 32-bit libc by the package manager, so the 32-bit libgcc_s is never installed even when "yum install glibc.i686" is used because everything seems "already OK":

$ yum deplist glibc
...
  dependency: libgcc
   provider: libgcc.x86_64 4.1.2-55.el5
   provider: libgcc.i386 4.1.2-55.el5
...

(The above is for CentOS 5, but at the time I tested 6 and 7 as well, with the same result.)

But I suggest not to dismiss this issue as a packaging bug. Glibc needs libgcc_s for pthread_exit() because it's implemented in terms of pthread_cancel(), and the latter wants to do stack unwinding (probably to clean up resources for each stack frame). But the code for unwinding lives in libgcc_s. The non-intuitive thing here is that glibc tried to optimize this dependency by dlopen'ing libgcc_s when pthread_cancel() is called instead of making it a startup dependency. Many people consider this a terrible design choice since your program can now fail at an arbitrary moment due to dlopen() failure (and missing libgcc_s is not the only possible reason[1]).

Since CPython doesn't use pthread_cancel() directly, I propose a simple solution -- just `return NULL` from thread functions instead of calling pthread_exit(). The last time I looked, pthread_exit() in CPython was only called from the top frames of the threads, so a direct `return` should suffice. If the top-level thread function simply returns, no stack unwinding is needed, so glibc never uses its cancellation machinery.

I've tested that this solution worked at the time (for 3.8, I think), and the test suite passed. If there is interest in going this way, I can test again.
[1] https://www.sourceware.org/bugzilla/show_bug.cgi?id=13119 -- nosy: +izbyshev ___ Python tracker <https://bugs.python.org/issue42888> ___
[issue42780] os.set_inheritable() fails for O_PATH file descriptors on Linux
Change by Alexey Izbyshev : -- components: +Library (Lib) nosy: +vstinner versions: -Python 3.6, Python 3.7 ___ Python tracker <https://bugs.python.org/issue42780> ___
[issue42736] Add support for making Linux prctl(...) calls to subprocess
Change by Alexey Izbyshev : -- nosy: +izbyshev ___ Python tracker <https://bugs.python.org/issue42736> ___
[issue38435] Start the deprecation cycle for subprocess preexec_fn
Change by Alexey Izbyshev : -- nosy: +izbyshev ___ Python tracker <https://bugs.python.org/issue38435> ___
[issue42738] subprocess: don't close all file descriptors by default (close_fds=False)
Alexey Izbyshev added the comment:

> Using close_fds=False, subprocess can use posix_spawn() which is safer and faster than fork+exec. For example, on Linux, the glibc implements it as a function using vfork which is faster than fork if the parent allocated a lot of memory. On macOS, posix_spawn() is even a syscall.

On Linux, unless you care specifically about users with Python 3.10+ on older kernels, implementing support for the close_range() syscall in subprocess would provide a better net benefit. This is because (a) the performance advantage of posix_spawn() is no longer relevant on Linux after bpo-35823, and (b) supporting close_range() would benefit even those users who still need close_fds=True.

-- nosy: +izbyshev ___ Python tracker <https://bugs.python.org/issue42738> ___
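[Editor's note] For reference, the Python-level entry point in question is os.closerange(), which uses close_range() where available. A small POSIX-only sketch of its semantics (assumes /dev/null exists):

```python
import os

# Open a few descriptors so there is something to close.
fds = [os.open("/dev/null", os.O_RDONLY) for _ in range(3)]
low, high = min(fds), max(fds)

# Close every fd in [low, high]; errors while closing individual
# descriptors are ignored, matching the close_range() semantics
# discussed in this issue.
os.closerange(low, high + 1)

for fd in fds:
    try:
        os.fstat(fd)
        print(fd, "still open")
    except OSError:
        print(fd, "closed")
```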
[issue42707] Python uses ANSI CP for stdio on Windows console instead of using console or OEM CP
Alexey Izbyshev added the comment:

> I've been struggling to understand today why a simple file redirection couldn't work properly today (encoding issues)

The core issue is that "working properly" is not defined in general when we're talking about piping/redirection, as opposed to the console. Different programs that consume Python's output (or produce its input) can have different expectations wrt. data encoding, and there is no way for Python to know it in advance.

In your examples, you use programs like "more" and "type" to print Python's output back to the console, so in this case using the OEM code page would produce the result that you expect. But, for example, in case Python's output was to be consumed by a C program that uses simple `fopen()/wscanf()/wprintf()` to work with text files, the ANSI code page would be appropriate because that's what the Microsoft C runtime library defaults to for wide character operations.

Python has traditionally used the ANSI code page as the default IO encoding for non-console cases (note that Python makes no distinction between non-console `sys.std*` and the builtin `open()` wrt. encoding), and this behavior can't be changed. You can use `PYTHONIOENCODING` or enable the UTF-8 mode[1] to change the default encoding. Note that in your example you could simply use `PYTHONIOENCODING=cp850`, which would remove the need to use `chcp`.

[1] https://docs.python.org/3/using/cmdline.html#envvar-PYTHONUTF8

-- nosy: +izbyshev, vstinner versions: +Python 3.10, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue42707> ___
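[Editor's note] The PYTHONIOENCODING suggestion above can be illustrated cross-platform by asking a child interpreter what stdout encoding it ended up with:

```python
import os
import subprocess
import sys

# Run a child interpreter with PYTHONIOENCODING set and ask it which
# encoding its stdout was configured with.
env = dict(os.environ, PYTHONIOENCODING="cp850")
out = subprocess.check_output(
    [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
    env=env,
    text=True,
)
print(out.strip())  # cp850
```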
[issue42655] Fix subprocess extra_groups gid conversion
Alexey Izbyshev added the comment: This bug would have been caught at compile time if `_Py_Gid_Converter()` used `gid_t *` instead of `void *`. I couldn't find any call sites where `void *` would be needed, so probably `_Py_Gid_Converter()` should be fixed too (in a separate PR/issue?). The same applies to `_Py_Uid_Converter()`. -- nosy: +gregory.p.smith, izbyshev ___ Python tracker <https://bugs.python.org/issue42655> ___
[issue42606] Support POSIX atomicity guarantee of O_APPEND on Windows
Change by Alexey Izbyshev : -- keywords: +patch pull_requests: +22575 stage: -> patch review pull_request: https://github.com/python/cpython/pull/23712 ___ Python tracker <https://bugs.python.org/issue42606> ___
[issue42606] Support POSIX atomicity guarantee of O_APPEND on Windows
New submission from Alexey Izbyshev :

On POSIX-conforming systems, the O_APPEND flag for open() must ensure that no intervening file modification occurs between changing the file offset and the write operation[1]. In effect, two processes that independently opened the same file with O_APPEND can't overwrite each other's data. On Windows, however, the Microsoft C runtime implements O_APPEND as an lseek(fd, 0, SEEK_END) followed by write(), which obviously doesn't provide the above guarantee. This affects both os.open() and the builtin open() Python functions, which rely on _wopen() from MSVCRT. A demo is attached.

While POSIX O_APPEND doesn't guarantee the absence of partial writes, the guarantee of non-overlapping writes alone is still useful in cases such as debug logging from multiple processes without file locking or other synchronization. Moreover, for local filesystems, partial writes don't really occur in practice (barring conditions such as ENOSPC or EIO).

Windows offers two ways to achieve non-overlapping appends:

1. WriteFile()[2] with an OVERLAPPED structure with Offset and OffsetHigh set to -1. This is essentially per-write O_APPEND.

2. Using a file handle with the FILE_APPEND_DATA access right but without the FILE_WRITE_DATA access right.

While (1) seems easy to add to FileIO, there are some problems:

* os.write(fd) can't use it without the caller's help because it has no way to know that the fd was opened with O_APPEND (there is no fcntl() in MSVCRT)

* write() from MSVCRT (currently used by FileIO) performs some additional remapping of error codes and checks after it calls WriteFile(), so we'd have to emulate that behavior or risk breaking compatibility.
I considered going for (2) by reimplementing _wopen() via CreateFile(), which would also allow us to solve the long-standing issue of missing FILE_SHARE_DELETE on file handles, but hit several problems:

* the most serious one is rather silly: we need to honor the current umask to possibly create a read-only file, but there is no way to query it without changing it, which is not thread-safe. Well, actually, I did discover a way: _umask_s(), when called with an invalid mask, returns both an EINVAL error and the current umask. But this behavior directly contradicts MSDN, which claims that _umask_s() doesn't modify its second argument on failure[3]. So I'm not willing to rely on this until Microsoft fixes their docs.

* the os module exposes some MSVCRT-specific flags for use with os.open() (like O_TEMPORARY), which a reimplementation would have to support. It seems easy in most cases, but there is O_TEXT, which enables some obscure legacy behavior in MSVCRT such as removal of a trailing byte 26 (Ctrl-Z) when a file is opened with O_RDWR. More generally, it's unclear to me whether os.open() is explicitly intended to be a gateway to MSVCRT and thus support all current and future flags, or is just expected to work similarly to MSVCRT in common cases.

So in the end I decided to let _wopen() create the initial fd as usual, but then fix it up via DuplicateHandle() -- see the PR.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html
[2] https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile
[3] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/umask-s?view=msvc-160

-- components: IO, Windows files: test.py messages: 382784 nosy: eryksun, izbyshev, paul.moore, steve.dower, tim.golden, vstinner, zach.ware priority: normal severity: normal status: open title: Support POSIX atomicity guarantee of O_APPEND on Windows type: enhancement versions: Python 3.10 Added file: https://bugs.python.org/file49661/test.py ___ Python tracker <https://bugs.python.org/issue42606> ___
[issue42569] Callers of _Py_fopen/_Py_wfopen may be broken after addition of audit hooks
Alexey Izbyshev added the comment: Great approach :) -- ___ Python tracker <https://bugs.python.org/issue42569> ___
[issue42602] seekable() returns True on pipe objects in Windows
Alexey Izbyshev added the comment:

Yes, despite the fact that MSVCRT knows the type of the file descriptor (it calls GetFileType() when the descriptor is created), it doesn't check the type in its lseek() implementation and simply calls SetFilePointer(), which spuriously succeeds for pipes. MSDN says the following[1]:

> Calling the SetFilePointer function with a handle to a non-seeking device such as a pipe or a communications device is not supported, even though the SetFilePointer function may not return an error. The behavior of the SetFilePointer function in this case is undefined.

[1] https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-setfilepointer

-- components: +Windows nosy: +eryksun, izbyshev, paul.moore, steve.dower, tim.golden, vstinner, zach.ware versions: +Python 3.10, Python 3.8 ___ Python tracker <https://bugs.python.org/issue42602> ___
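[Editor's note] On POSIX the corresponding probe does work, since lseek() on a pipe fails with ESPIPE. A sketch contrasting with the Windows behavior described above:

```python
import io
import os

r, w = os.pipe()
f = io.FileIO(r, "r", closefd=True)

# FileIO.seekable() probes with a seek: on POSIX the lseek() fails
# with ESPIPE and this prints False; on Windows, per this report,
# SetFilePointer() spuriously succeeds and it prints True.
print(f.seekable())

f.close()
os.close(w)
```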
[issue32381] Python 3.6 cannot reopen .pyc file with non-ASCII path
Change by Alexey Izbyshev : -- resolution: -> fixed stage: patch review -> resolved status: open -> closed versions: -Python 3.7 ___ Python tracker <https://bugs.python.org/issue32381> ___
[issue32381] Python 3.6 cannot reopen .pyc file with non-ASCII path
Alexey Izbyshev added the comment: Thanks for the fix and backports! -- resolution: fixed -> stage: resolved -> patch review status: closed -> open versions: +Python 3.7 ___ Python tracker <https://bugs.python.org/issue32381> ___
[issue42585] Segmentation fault on Linux with multiprocess queue
Change by Alexey Izbyshev : -- components: +Library (Lib) nosy: +davin, pitrou versions: -Python 3.6 ___ Python tracker <https://bugs.python.org/issue42585> ___
[issue42569] Callers of _Py_fopen/_Py_wfopen may be broken after addition of audit hooks
Alexey Izbyshev added the comment:

> So it should be, "if they fail and you're in a context where exceptions are allowed, raise an exception" (which will chain back to the one raised from an audit hook).

What exception should be raised if _Py_fopen() fails (returns NULL)? We can't just check errno anymore because _Py_fopen() might have failed because of the audit hook and thus didn't set errno. So it seems to me that the addition of audit hooks broke the implicit contract of _Py_fopen()/_Py_wfopen(), and the callers were not fixed.

-- ___ Python tracker <https://bugs.python.org/issue42569> ___
[issue42569] Callers of _Py_fopen/_Py_wfopen may be broken after addition of audit hooks
Alexey Izbyshev added the comment:

> To implement PEP 446: create non-inheritable file descriptors.

Yes, I understand that was the original role. But currently there is no easy way to deal with errors from these helpers because of the exception-vs-errno conundrum. Maybe each of them should be split into two functions (one that always reports errors via an exception, and another low-level one)? Or, alternatively, keep only the exception-reporting variants but check all callers so that they don't use errno and call them from the right context (with the GIL held, etc.)?

-- ___ Python tracker <https://bugs.python.org/issue42569> ___
[issue32381] Python 3.6 cannot reopen .pyc file with non-ASCII path
Alexey Izbyshev added the comment:

> It seems like PyErr_ProgramText() is no longer used in Python.

Isn't it part of the public API? I can't find it in the docs, but it seems to be declared in the public header.

-- ___ Python tracker <https://bugs.python.org/issue32381> ___
[issue32381] Python 3.6 cannot reopen .pyc file with non-ASCII path
Alexey Izbyshev added the comment:

Thanks for the patch, Victor, it looks good. Just so it doesn't get lost: the problem with the contract of PyErr_ProgramText() that I mentioned in my duplicate issue 42568 is still there.

-- ___ Python tracker <https://bugs.python.org/issue32381> ___
[issue32381] Python 3.6 cannot reopen .pyc file with non-ASCII path
Alexey Izbyshev added the comment:

Thanks, Eryk, for catching the duplicate, I missed it somehow. @ZackerySpytz: do you plan to proceed with your PR? If not, I can pick it up -- this issue broke the software I develop after an upgrade to 3.8. I filed issue 42569 to hopefully clarify the status of _Py_fopen(), which has become murky to me.

-- nosy: +izbyshev ___ Python tracker <https://bugs.python.org/issue32381> ___
[issue42569] Callers of _Py_fopen/_Py_wfopen may be broken after addition of audit hooks
New submission from Alexey Izbyshev :

Before the addition of audit hooks in 3.8, _Py_fopen() and _Py_wfopen() were simple wrappers around the corresponding C runtime functions. They didn't require the GIL, reported errors via errno and could be safely called during early interpreter initialization. With the addition of PySys_Audit() calls, they can also raise an exception, which makes it unclear how they should be used. At least one caller[1] is confused, so an early-added hook (e.g. from sitecustomize.py) that raises a RuntimeError on an attempt to open the main file causes the following:

$ ./python /home/test/test.py
./python: can't open file '/home/test/test.py': [Errno 22] Invalid argument
Traceback (most recent call last):
  File "/home/test/.local/lib/python3.10/site-packages/sitecustomize.py", line 10, in hook
    raise RuntimeError("XXX")
RuntimeError: XXX

"Invalid argument" is reported by pymain_run_file() due to a bogus errno, and the real problem (the exception from the hook) is noticed only later.

Could somebody share the current intended status/role of these helpers? Understanding that seems to be required to deal with issue 32381.

[1] https://github.com/python/cpython/blob/066394018a84/Modules/main.c#L314

-- components: Interpreter Core messages: 382499 nosy: izbyshev, steve.dower, vstinner priority: normal severity: normal status: open title: Callers of _Py_fopen/_Py_wfopen may be broken after addition of audit hooks type: behavior versions: Python 3.10, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue42569> ___
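[Editor's note] The hook side of the failure mode is easy to reproduce from pure Python; a sketch with a hook that blocks a single arbitrary (hypothetical) path:

```python
import sys

BLOCKED = "/tmp/audit-demo-blocked"  # hypothetical path, never created

def hook(event, args):
    # The "open" audit event carries (path, mode, flags).
    if event == "open" and args[0] == BLOCKED:
        raise RuntimeError("XXX")

sys.addaudithook(hook)

try:
    open(BLOCKED)
except RuntimeError as exc:
    print(exc)  # XXX
```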
[issue42568] Python can't run .pyc files with non-ASCII path on Windows
New submission from Alexey Izbyshev :

> python тест.pyc
python: Can't reopen .pyc file

The issue is caused by _Py_fopen() being used as though it can deal with paths encoded in the FS-default encoding (UTF-8 by default on Windows), but in fact it's just a simple wrapper around fopen() from the C runtime, so it uses the current ANSI code page, breaking if PYTHONLEGACYWINDOWSFSENCODING is not enabled.

I could find only two callers of _Py_fopen() on Windows:

* https://github.com/python/cpython/blob/db68544122f5/Python/pythonrun.c#L380 (which caused this issue)

* https://github.com/python/cpython/blob/db68544122f5/Python/errors.c#L1708

PyErr_ProgramText() doesn't seem to be called in CPython, but https://github.com/python/cpython/blob/db68544122f5/Include/pyerrors.h#L243 claims that the filename is "decoded from the filesystem encoding", which doesn't match the code.

-- components: Interpreter Core, Windows messages: 382490 nosy: izbyshev, paul.moore, steve.dower, tim.golden, vstinner, zach.ware priority: normal severity: normal status: open title: Python can't run .pyc files with non-ASCII path on Windows type: behavior versions: Python 3.10, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue42568> ___
[issue42457] ArgumentParser nested subparsers resolve paths in improper order
Change by Alexey Izbyshev : -- nosy: +rhettinger versions: -Python 3.6, Python 3.7 ___ Python tracker <https://bugs.python.org/issue42457> ___
[issue42388] subprocess.check_output(['echo', 'test'], text=True, input=None) fails
Alexey Izbyshev added the comment:

> (probably can't even limit that to the case when `text` is used, since it was added in 3.7)

Well, actually, we can, since we probably don't need to preserve compatibility with the AttributeError currently caused by `text=True` with `input=None`. However, it seems to me that treating `input=None` differently depending on the other arguments would be confusing.

-- ___ Python tracker <https://bugs.python.org/issue42388> ___
[issue42388] subprocess.check_output(['echo', 'test'], text=True, input=None) fails
Alexey Izbyshev added the comment:

It seems that allowing `input=None` to mean "redirect stdin to a pipe and send an empty string there" in `subprocess.check_output` was an accident(?), and this behavior is inconsistent with `subprocess.run` and `communicate`, where `input=None` has the same effect as not specifying it at all. The docs for `subprocess.check_output` say:

> The arguments shown above are merely some common ones. The full function signature is largely the same as that of run() - most arguments are passed directly through to that interface. However, explicitly passing input=None to inherit the parent’s standard input file handle is not supported.

They don't make the effect of passing `input=None` clear, though. I also couldn't find any tests that would check that effect.

Since we can't just forbid `input=None` for `check_output` at this point (probably can't even limit that to the case when `text` is used, since it was added in 3.7), I think we need to extend the check pointed to by ThiefMaster to cover `text`, clarify the docs and add a test.

-- nosy: +gregory.p.smith, izbyshev versions: +Python 3.10, Python 3.8 ___ Python tracker <https://bugs.python.org/issue42388> ___
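[Editor's note] The special-casing described above can be observed with a child that measures its stdin; a small sketch:

```python
import subprocess
import sys

# With input=None, check_output() still redirects the child's stdin to
# a pipe and sends an empty (byte) string, so the child reads EOF at
# once -- unlike run(), where input=None leaves stdin alone.
out = subprocess.check_output(
    [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"],
    input=None,
)
print(out)
```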
[issue42097] Python 3.7.9 logging/threading/fork hang
Alexey Izbyshev added the comment:

By the way, I don't see a direct relation between `test.py` (which doesn't use `subprocess` directly) and your comment describing `subprocess` usage with threads. So if you think that the bug in `test.py` is unrelated to the problem you face, feel free to reopen.

--
___ Python tracker <https://bugs.python.org/issue42097> ___
[issue42097] Python 3.7.9 logging/threading/fork hang
Alexey Izbyshev added the comment:

(Restored test.py attachment)

The issue happens due to an incorrect usage of `multiprocessing.Pool`:

```
# Set up multiprocessing pool, initialising logging in each subprocess
with multiprocessing.Pool(initializer=process_setup, initargs=(get_queue(),)) as pl:
    # 100 seems to work fine, 500 fails most of the time.
    # If you're having trouble reproducing the error, try bumping this number up to 1000
    pl.map(do_work, range(1))

if _globalListener is not None:
    # Stop the listener and join the thread it runs on.
    # If we don't do this, we may lose log messages when we exit.
    _globalListener.stop()
```

Leaving the `with` block causes `pl.terminate()` to be called [1, 2]. Since multiprocessing simply sends SIGTERM to all workers, a worker might be killed while it holds the cross-process lock guarding `_globalQueue`. In this case, `_globalListener.stop()` blocks forever trying to acquire that lock (to add a sentinel to `_globalQueue` to make a background thread stop monitoring it).

Consider using `Pool.close()` and `Pool.join()` to properly wait for task completion.

[1] https://docs.python.org/3.9/library/multiprocessing.html#multiprocessing.pool.Pool.terminate
[2] https://docs.python.org/3.9/library/multiprocessing.html#programming-guidelines

--
nosy: +izbyshev
resolution: -> not a bug
stage: -> resolved
status: open -> closed
___ Python tracker <https://bugs.python.org/issue42097> ___
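[Editorial note] For reference, a minimal sketch of the `Pool.close()`/`Pool.join()` shutdown pattern suggested above; the worker function and counts are illustrative, not taken from test.py:

```python
import multiprocessing

def square(x):
    return x * x

def main():
    pool = multiprocessing.Pool(processes=2)
    try:
        results = pool.map(square, range(5))
    finally:
        pool.close()  # stop accepting new tasks; workers finish what they have
        pool.join()   # wait for all workers to exit cleanly before continuing
    return results

if __name__ == '__main__':
    print(main())
```

Because `join()` waits for workers to exit on their own, no worker can be killed while holding a cross-process lock, which is what caused the hang in `_globalListener.stop()`.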
[issue42097] Python 3.7.9 logging/threading/fork hang
Change by Alexey Izbyshev : Added file: https://bugs.python.org/file49531/test.py ___ Python tracker <https://bugs.python.org/issue42097> ___
[issue42146] subprocess.Popen() leaks cwd in case of uid/gid overflow
Alexey Izbyshev added the comment: Thanks for merging! I've rebased PR 22970. -- ___ Python tracker <https://bugs.python.org/issue42146> ___
[issue42146] subprocess.Popen() leaks cwd in case of uid/gid overflow
Change by Alexey Izbyshev : -- type: behavior -> resource usage ___ Python tracker <https://bugs.python.org/issue42146> ___
[issue42146] subprocess.Popen() leaks cwd in case of uid/gid overflow
Change by Alexey Izbyshev : -- keywords: +patch stage: -> patch review ___ Python tracker <https://bugs.python.org/issue42146> ___
[issue42146] subprocess.Popen() leaks cwd in case of uid/gid overflow
Alexey Izbyshev added the comment:

I've submitted both PRs. Regarding PR 22970:

* I made it a draft since we'd probably want to fix the leak first, but then it will have to be rebased.
* It fixes a bug with _enable_gc(): if it failed after fork(), we'd raise OSError instead. Additionally, if fork() succeeded, the errno inside OSError would be zero, and we'd leak the child process.

--
keywords: -patch
stage: patch review ->
___ Python tracker <https://bugs.python.org/issue42146> ___
[issue42146] subprocess.Popen() leaks cwd in case of uid/gid overflow
Change by Alexey Izbyshev : -- pull_requests: +21885 pull_request: https://github.com/python/cpython/pull/22970 ___ Python tracker <https://bugs.python.org/issue42146> ___
[issue42146] subprocess.Popen() leaks cwd in case of uid/gid overflow
Change by Alexey Izbyshev : -- keywords: +patch pull_requests: +21882 stage: -> patch review pull_request: https://github.com/python/cpython/pull/22966 ___ Python tracker <https://bugs.python.org/issue42146> ___
[issue42146] subprocess.Popen() leaks cwd in case of uid/gid overflow
New submission from Alexey Izbyshev :

The following test demonstrates the leak:

```
import subprocess

cwd = 'x' * 10**6
for __ in range(100):
    try:
        subprocess.call(['/xxx'], cwd=cwd, user=2**64)
    except OverflowError:
        pass

from resource import *
print(getrusage(RUSAGE_SELF).ru_maxrss)
```

The leak was introduced by bpo-36046. Previously, the `cleanup:` label was not reachable after `cwd_obj2` was initialized at https://github.com/python/cpython/blob/492d513ccbebeec40a8ba85cbd894a027ca5b2b3/Modules/_posixsubprocess.c#L892

I'll submit a PR with a simple fix suitable for backporting to 3.9. Also, I think it might make sense to unify the two almost-identical cleanup paths we have now. I'll follow up with another PR.

--
assignee: izbyshev
components: Extension Modules
keywords: 3.9regression
messages: 379575
nosy: gregory.p.smith, izbyshev, patrick.mclean
priority: normal
severity: normal
status: open
title: subprocess.Popen() leaks cwd in case of uid/gid overflow
type: behavior
versions: Python 3.10, Python 3.9
___ Python tracker <https://bugs.python.org/issue42146> ___
[issue35823] Use vfork() in subprocess on Linux
Alexey Izbyshev added the comment:

@ronaldoussoren
> I'd prefer to not use vfork on macOS. For one I don't particularly trust that vfork would work reliably when using higher level APIs, but more importantly posix_spawn on macOS has some options that are hard to achieve otherwise and might be useful to expose in subprocess.

I can't comment on vfork() usability on macOS myself, but given the number of issues/considerations described here, I expect that significant research would be needed to check that.

Regarding your second point about extra posix_spawn() options, I actually don't see why it would be an argument against vfork(). Even on Linux, subprocess prefers posix_spawn() to vfork()/fork() when possible, so vfork() support doesn't hinder posix_spawn().

--
___ Python tracker <https://bugs.python.org/issue35823> ___
[issue35823] Use vfork() in subprocess on Linux
Alexey Izbyshev added the comment:

> regarding excluding the setsid() case: I was being conservative as I couldn't find a reference of what was and wasn't allowed after vfork.

Yes, there is no list of functions allowed after vfork(), except for the conservative POSIX.1 list consisting only of _exit() and execve(), so we can only take async-signal-safe functions as a first approximation and work from there. Thankfully, on Linux, C libraries don't do anything fancy in most cases. But, for example, it appears that on FreeBSD we wouldn't be able to use sigprocmask()/sigaction()[1]. BTW, commit[2] and the linked review are interesting reading for anybody who would like to support posix_spawn() and/or vfork() in subprocess on FreeBSD.

> Confirming, in glibc it appears to be a shim for the setsid syscall (based on not finding any code implementing anything special for it) and in uclibc (*much* easier to read) it is clearly just a setsid syscall shim.

I also recommend musl[3] when simplicity (and correctness) is required :)

[1] https://svnweb.freebsd.org/base/head/lib/libc/gen/posix_spawn.c?view=markup&pathrev=362111#l126
[2] https://svnweb.freebsd.org/base?view=revision&revision=352712
[3] https://git.musl-libc.org/cgit/musl/tree/src/unistd/setsid.c?id=a5aff1972c9e3981566414b09a28e331ccd2be5d

--
___ Python tracker <https://bugs.python.org/issue35823> ___
[issue35823] Use vfork() in subprocess on Linux
Alexey Izbyshev added the comment:

> Thank you for taking this on! I'm calling it fixed for now as the buildbots are looking happy with it. If issues with it arise we can address them.

Thank you for reviewing and merging!

Using POSIX_CALL for pthread_sigmask() is incorrect, however, since it *returns* the error instead of setting errno. I've opened a PR with a fix and additional handling of the first pthread_sigmask() call in the parent (which can be done easily).

I've also noticed a memory leak while doing the above: cwd_obj2 is not released on the error path. It probably slipped in with the patches adding user/groups support. I'll file a separate issue.

Would you also clarify why you disallowed vfork() in case setsid() is needed? Have you discovered any potential issues with setsid()? Note that setsid() doesn't suffer from thread sync issues like setuid()/setgid()/etc.

--
___ Python tracker <https://bugs.python.org/issue35823> ___
[issue35823] Use vfork() in subprocess on Linux
Change by Alexey Izbyshev : -- pull_requests: +21862 pull_request: https://github.com/python/cpython/pull/22944 ___ Python tracker <https://bugs.python.org/issue35823> ___
[issue36034] Suprise halt caused by -Werror=implicit-function-declaration in ./Modules/posixmodule.c
Change by Alexey Izbyshev : -- resolution: -> not a bug ___ Python tracker <https://bugs.python.org/issue36034> ___
[issue35823] Use vfork() in subprocess on Linux
Alexey Izbyshev added the comment:

I've updated my PR.

* After a discussion with Alexander Monakov (a GCC developer), moved vfork() into a small function to isolate it from both subprocess_fork_exec() and child_exec(). This appears to be the best strategy to avoid -Wclobbered false positives as well as potential compiler bugs.
* Got rid of VFORK_USABLE checks in function parameter lists. Now the `child_sigmask` parameter is always present, but is NULL if vfork() is not available or usable. The type is `void *` to avoid a hard dependence on sigset_t, which other CPython source files appear to avoid.
* Disabled vfork() in case setuid()/setgid()/setgroups() is needed.
* Added more comments explaining vfork()-related stuff.
* Added commit message and NEWS entry.

Potential improvements:

* To avoid repeating long parameter lists in several functions, we can move them to a struct. The downside is that we'd need to convert child_exec() to use that struct all over the place. I don't have a strong preference here.
* Are any documentation changes needed? We don't change any interfaces, so I'm not sure.

Potential future directions:

* If desired, a vfork() opt-out may be implemented. But it'd need to disable posix_spawn() on Linux as well, since it might be implemented via vfork(), and we have no good way to check that.

--
___ Python tracker <https://bugs.python.org/issue35823> ___
[issue35823] Use vfork() in subprocess on Linux
Alexey Izbyshev added the comment:

Well, much later than promised, but I'm picking it up.

Since in the meantime support for setting uid/gid/groups was merged, and I'm aware of potential issues with calling the corresponding C library functions in a vfork() child, I asked a question on the musl mailing list: https://www.openwall.com/lists/musl/2020/10/12/1

So, it seems we'll need to fall back to fork() if set*id() is needed, which is in line with our previous discussion about avoidance of vfork() in privileged processes anyway.

I'm also discussing the -Wclobbered warnings with a GCC developer. I wouldn't like to restructure code just to avoid GCC false positives, so currently I'm leaning towards disabling this warning entirely for subprocess_fork_exec() and documenting that arbitrary stores to local variables between vfork() and child_exec() are not allowed due to stack sharing, but we'll see if a better solution emerges.

--
assignee: -> izbyshev
___ Python tracker <https://bugs.python.org/issue35823> ___
[issue35823] Use vfork() in subprocess on Linux
Alexey Izbyshev added the comment:

I'd really like to get this merged eventually because a vfork()-based solution is fundamentally more generic than posix_spawn(). Apart from having no issue with close_fds=True, it will also continue to allow subprocess to add any process context tweaks without waiting for availability of corresponding features in posix_spawn().

If there is no objection from Gregory or other devs, I can pick up from where I left off in two weeks.

--
___ Python tracker <https://bugs.python.org/issue35823> ___
[issue34977] Release Windows Store app containing Python
Change by Alexey Izbyshev : -- pull_requests: +16379 pull_request: https://github.com/python/cpython/pull/5812 ___ Python tracker <https://bugs.python.org/issue34977> ___
[issue31904] Python should support VxWorks RTOS
Change by Alexey Izbyshev : -- nosy: +izbyshev ___ Python tracker <https://bugs.python.org/issue31904> ___
[issue36046] support dropping privileges when running subprocesses
Alexey Izbyshev added the comment:

> 1) This is intentional, this is for dropping privileges before running some (possibly untrusted) command, we do not want to leave a path for the subprocess to gain root back. If there is a subprocess that needs root for some operations, it would presumably have the ability to drop privileges itself, and would not need the python script to do it before running it.
> 2) While POSIX leaves it unspecified what changes are permitted for an unprivileged process, these are permitted for a privileged process, which is the main use case for this functionality. In the case the OS does not support it for an unprivileged process, the syscall would fail with EPERM, which can be handled from the calling python code.

Thanks for your explanation. In case of a privileged process, the behavior of setreuid/setregid/setgroups does seem well-defined. But setuid/setgid change all ids (real, effective, saved) too in this case. Do you prefer setreuid/setregid because they provide stricter semantics in non-privileged processes compared to setuid/setgid? (The latter change the effective id only, potentially preserving the process's ability to switch ids later.)

> I am fine removing the workaround, and allowing it to fail with EPERM. Perhaps we could find another way around this for running the tests in an unprivileged environment, or just leave the test only running the EPERM case...
> I could change the API to have group= and supp_groups= if you prefer.

Personally, I do prefer separate arguments because subprocess is a relatively low-level module and tends to expose underlying operations, so it seems logical to have a knob that maps to setgid() and another one that maps to setgroups().

@gregory.p.smith: what do you think about it?

--
___ Python tracker <https://bugs.python.org/issue36046> ___
[issue36046] support dropping privileges when running subprocesses
Alexey Izbyshev added the comment:

Patrick, could you provide more background that would explain your choice of the setreuid/setregid functions and the desired handling of supplementary groups? I'm not a security expert, so I may not have sufficient expertise to judge on that, but maybe my questions could be helpful for others:

1) setreuid/setregid, as used in your PR, will set the real, effective and saved ids to the specified value [1]. So this precludes the use case where a user wants to temporarily switch the effective id to the real id ("drop privileges") and then switch it back to the old effective (saved) id in the child. This use case is supported for non-root processes by the POSIX_SPAWN_RESETIDS flag used with posix_spawn() (the flag is implemented by simply calling setuid(getuid()) in the child). Is it okay that the proposed API doesn't support this?

2) While setreuid/setregid on Linux permit setting the real id to either the effective or the real id, POSIX leaves it unspecified [2]. Are we okay with potential portability problems?

3) setgroups() always requires privileges on Linux [3]. So, if we always call setgroups() when subprocess.Popen(groups=...) is used, we preclude the use case where a non-privileged process wants to switch its gid but doesn't want to touch its supplementary groups. Is this a valid use case we want to care about?

The current workaround for the above is to call setgroups() only if geteuid() == 0, but I'm not sure it's a good one:

(a) ISTM we're basically guessing the intent here: what if a process really wants to change its supplementary groups, but a user ran it without appropriate privileges by mistake? I think such a process would like to get an exception instead of a silently ignored call to setgroups().

(b) The geteuid() == 0 test is not precise. The Linux man page documents that the caller needs the CAP_SETGID capability, which doesn't necessarily imply that the effective id is zero.

Another solution would be to split groups into two arguments: gid and supplementary groups. This way users can explicitly tell us whether they want to switch supplementary groups or not.

Overall, I'd really like to have security experts review this PR if possible.

[1] http://man7.org/linux/man-pages/man2/setreuid.2.html
[2] http://pubs.opengroup.org/onlinepubs/9699919799/functions/setreuid.html
[3] http://man7.org/linux/man-pages/man2/setgroups.2.html

--
assignee: gregory.p.smith ->
___ Python tracker <https://bugs.python.org/issue36046> ___
[issue36046] support dropping privileges when running subprocesses
Change by Alexey Izbyshev : -- assignee: -> gregory.p.smith ___ Python tracker <https://bugs.python.org/issue36046> ___
[issue36067] subprocess terminate() "invalid handle" error when process is gone
Alexey Izbyshev added the comment:

> Interesting. Because both errors/conditions are mapped to ERROR_INVALID_HANDLE we need the creation time. I can work on a patch for that.

I don't understand why any patch for CPython is needed at all. Using invalid handles is a serious programming bug (it's similar to using freed memory), so CPython doesn't really have to attempt to detect the programmer's error, at least not if this attempt significantly complicates the existing code. In my opinion, the CI failure linked in the first comment simply indicates a bug in psutil and is unrelated to CPython at all.

--
nosy: +izbyshev
___ Python tracker <https://bugs.python.org/issue36067> ___
[issue36046] support dropping privileges when running subprocesses
Change by Alexey Izbyshev : -- nosy: +izbyshev ___ Python tracker <https://bugs.python.org/issue36046> ___
[issue36034] Suprise halt caused by -Werror=implicit-function-declaration in ./Modules/posixmodule.c
Alexey Izbyshev added the comment:

I don't know what you mean by "in-line" pre-processing output, but you can use the -E option to get the normal preprocessor output. Line directives will tell you where those functions come from on a system where there is no compilation error. Of course, if those functions are defined in some other headers on AIX 6.1, it won't help you, so you might as well just grep /usr/include :)

--
nosy: +izbyshev
___ Python tracker <https://bugs.python.org/issue36034> ___
[issue35984] test__xxsubinterpreters leaked [3, 4, 3] memory blocks, sum=1
Alexey Izbyshev added the comment:

Thank you for your introduction about _xxsubinterpreters, Eric.

This particular leak is easy: it's right in _channel_send(). I've submitted a PR.

I've also done a quick scan of neighboring code, and it seems there are other leaks as well, e.g.:

* PyThread_free_lock() is not called at https://github.com/python/cpython/blob/dcb68f47f74b0cc8a1896d4a4c5a6b83c0bbeeae/Modules/_xxsubinterpretersmodule.c#L761 (and below)
* data is not released and freed at https://github.com/python/cpython/blob/dcb68f47f74b0cc8a1896d4a4c5a6b83c0bbeeae/Modules/_xxsubinterpretersmodule.c#L1387

Do you think it'd make sense to go through the module to find and fix leaks? Or is this code at too early a stage for such cleanup?

As a side note, such leaks should be easily found by static analyzers such as Coverity (assuming it understands CPython allocation functions like PyMem_NEW), so it might make sense to check out its reports on the module.

--
___ Python tracker <https://bugs.python.org/issue35984> ___
[issue35984] test__xxsubinterpreters leaked [3, 4, 3] memory blocks, sum=1
Change by Alexey Izbyshev : -- keywords: +patch pull_requests: +11876 stage: test needed -> patch review ___ Python tracker <https://bugs.python.org/issue35984> ___
[issue35984] test__xxsubinterpreters leaked [3, 4, 3] memory blocks, sum=1
Alexey Izbyshev added the comment:

I'll look into it later today. An obvious guess is that my test simply exposed an existing leak because the exception code path wasn't tested before AFAIK, but I need to check it.

--
assignee: -> izbyshev
___ Python tracker <https://bugs.python.org/issue35984> ___
[issue35972] _xxsubinterpreters: channel_send() may truncate ints on 32-bit platforms
Alexey Izbyshev added the comment:

"long long" is mandated to be at least 64-bit by C99 (5.2.4.2.1 Sizes of integer types). If it were 32-bit, no warnings would have been issued.

--
___ Python tracker <https://bugs.python.org/issue35972> ___
[issue35972] _xxsubinterpreters: channel_send() may truncate ints on 32-bit platforms
Change by Alexey Izbyshev : -- keywords: +patch pull_requests: +11852 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue35972> ___
[issue35972] _xxsubinterpreters: channel_send() may truncate ints on 32-bit platforms
New submission from Alexey Izbyshev :

Victor Stinner pointed out that on the x86 Gentoo Installed with X 3.x buildbot, there is a compiler warning:

Python/pystate.c:1483:18: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]

(https://buildbot.python.org/all/#/builders/103/builds/2067)

This warning reveals a bug:

```
static int
_long_shared(PyObject *obj, _PyCrossInterpreterData *data)
{
    int64_t value = PyLong_AsLongLong(obj);
    if (value == -1 && PyErr_Occurred()) {
        if (PyErr_ExceptionMatches(PyExc_OverflowError)) {
            PyErr_SetString(PyExc_OverflowError, "try sending as bytes");
        }
        return -1;
    }
    data->data = (void *)value;
```

A 64-bit value is converted to void *, which is 32-bit on 32-bit platforms. I've implemented a PR that uses Py_ssize_t instead to match the pointer size and to preserve the ability to work with negative numbers.

--
assignee: izbyshev
components: Extension Modules
messages: 335272
nosy: eric.snow, izbyshev, vstinner
priority: normal
severity: normal
status: open
title: _xxsubinterpreters: channel_send() may truncate ints on 32-bit platforms
type: behavior
versions: Python 3.8
___ Python tracker <https://bugs.python.org/issue35972> ___
[issue35823] Use vfork() in subprocess on Linux
Alexey Izbyshev added the comment:

I've been struggling with fixing spurious -Wclobbered GCC warnings. Originally, I've got the following:

```
/scratch2/izbyshev/cpython/Modules/_posixsubprocess.c: In function ‘subprocess_fork_exec’:
/scratch2/izbyshev/cpython/Modules/_posixsubprocess.c:612:15: warning: variable ‘gc_module’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
     PyObject *gc_module = NULL;
               ^
/scratch2/izbyshev/cpython/Modules/_posixsubprocess.c:616:15: warning: variable ‘preexec_fn_args_tuple’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
     PyObject *preexec_fn_args_tuple = NULL;
               ^
/scratch2/izbyshev/cpython/Modules/_posixsubprocess.c:621:17: warning: variable ‘cwd’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
     const char *cwd;
                 ^~~
/scratch2/izbyshev/cpython/Modules/_posixsubprocess.c:623:9: warning: variable ‘need_to_reenable_gc’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
     int need_to_reenable_gc = 0;
         ^~~
/scratch2/izbyshev/cpython/Modules/_posixsubprocess.c:624:38: warning: variable ‘argv’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
     char *const *exec_array, *const *argv = NULL, *const *envp = NULL;
                                      ^~~~
/scratch2/izbyshev/cpython/Modules/_posixsubprocess.c:624:59: warning: variable ‘envp’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
     char *const *exec_array, *const *argv = NULL, *const *envp = NULL;
                                                           ^~~~
/scratch2/izbyshev/cpython/Modules/_posixsubprocess.c:626:9: warning: variable ‘need_after_fork’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
     int need_after_fork = 0;
         ^~~
/scratch2/izbyshev/cpython/Modules/_posixsubprocess.c:627:9: warning: variable ‘saved_errno’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
     int saved_errno = 0;
         ^~~
```

I've checked that all warnings are spurious: all flagged variables are either not modified in the child or modified and used only by the child.

A simple way to suppress the warnings would be "volatile", but I don't want to spray "volatile" over the huge declaration block of subprocess_fork_exec(). Another way is to move vfork() to a separate function and ensure that this function does as little as possible with its local variables. I've implemented two versions of this approach, both are ugly in some sense. I've pushed the first into the PR branch and the second into a separate branch https://github.com/izbyshev/cpython/tree/single-do-fork-exec.

Yet another way would be to simply disable this diagnostic for _posixsubprocess (e.g. via #pragma GCC diagnostic), but I'm not sure we want to do that -- maybe it'll be fixed in the future or a real defect will be introduced into our code.

--
___ Python tracker <https://bugs.python.org/issue35823> ___
[issue35823] Use vfork() in subprocess on Linux
Alexey Izbyshev added the comment:

Thank you for the review and your thoughts, Gregory.

> With this in place we may want to make the _use_posix_spawn() logic in subprocess.py stricter? That could be its own followup PR.

Yes, I think we can always use vfork() on Linux unless we find some issues. (Victor Stinner doesn't seem to be against: https://github.com/python/cpython/pull/11668#issuecomment-457637207). Though C libraries are in a better position to provide safe vfork/exec, by using our implementation we will:

* avoid breaking QEMU user-mode (and preserve existing subprocess behavior)
* automatically support fast process creation on musl (which has a good posix_spawn, but doesn't have easy means (macros) to detect that we're on musl and thus avoid unacceptable posix_spawn)
* avoid having different code paths (and thus potential surprises) depending on arguments of subprocess.Popen()

> This is a scary issue. But I think a reasonable approach could be to never use vfork when running as whatever we choose to define a "privileged user" to be. getuid() or geteuid() return 0? don't use vfork.

I thought about something like this, but initially dismissed it because I (mistakenly) decided that an application may concurrently switch between arbitrary uids back and forth, so that checking getuid() becomes useless (if we're already talking about an exotic app that calls setuid() concurrently with spawning children, why stop there?). But now I realize that an app may use *arbitrary* uids only if one of its real, effective or saved uids is zero (that is, if we don't take capabilities into account), so at least we could call getresuid() to get all 3 uids in a race-free way and check whether the app is *definitely* privileged...

> the concept of "privileged user" can obviously mean a lot more than that and likely goes beyond what we should attempt to ascertain ourselves.

...but you're making a good point here. So I'm not sure that we want such checks, but if we do, we'd probably need to allow users to disable them -- what if some heavy privileged daemon wants a faster subprocess?

> How about also providing a disable-only global setting so that someone writing code they consider to have elevated privileges can prevent its use entirely. subprocess.disable_use_of_vfork() and subprocess.is_vfork_enabled() calls perhaps (just setting/reading a static int vfork_disallowed = 0 flag within _posixsubprocess.c).

I think it's reasonable. I'll look into it when I'm done with current issues.

> True setuid vs vfork attack security would suggest code needs to opt-in to vfork() or posix_spawn() rather than opt-out. Which would destroy the benefit for most users (who won't bother) for the sake of an issue that just is not what most code ever does (setuid/setgid/etc. calls are very uncommon for most software).

Agree, especially since we're not talking about "just" using setuid() but rather about using setuid() *in a multithreaded process in a racy manner while spawning children*. Honestly, I'm not sure such apps can blame Python if they get a security hole :)

--
___ Python tracker <https://bugs.python.org/issue35823> ___
[issue35823] Use vfork() in subprocess on Linux
Alexey Izbyshev added the comment:

I've checked subprocess.Popen() error reporting in QEMU user-mode and WSL and confirm that it works both with my patch (vfork/exec) and the traditional fork/exec, but doesn't work with glibc's posix_spawn().

The first command below uses posix_spawn() internally because close_fds=False. Under QEMU, posix_spawn() from glibc doesn't report errors because it relies on the address-space sharing of clone(CLONE_VM|CLONE_VFORK), which is not emulated by QEMU. The second command uses vfork()/exec() in _posixsubprocess, but the lack of address-space sharing doesn't matter because the error data is transferred via a pipe.

$ qemu-x86_64 --version
qemu-x86_64 version 2.11.1
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
$ ldd --version
ldd (GNU libc) 2.27
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
$ qemu-x86_64 ./python3 -c 'import subprocess; subprocess.call("/xxx", close_fds=False)'
$ qemu-x86_64 ./python3 -c 'import subprocess; subprocess.call("/xxx")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/izbyshev/install/python-3.8a/lib/python3.8/subprocess.py", line 324, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/home/izbyshev/install/python-3.8a/lib/python3.8/subprocess.py", line 830, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/home/izbyshev/install/python-3.8a/lib/python3.8/subprocess.py", line 1648, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'xxx'

For WSL, I've tested Ubuntu 18.04 (glibc 2.27) on Windows 10 1709 (Fall Creators Update) and 1803. In 1709, the result is the same as for QEMU (i.e. the first command silently succeeds).
In 1803, both commands raise the exception because address-space sharing was fixed for clone(CLONE_VM|CLONE_VFORK) and vfork() in WSL: https://github.com/Microsoft/WSL/issues/1878

I've also run all subprocess tests under QEMU user-mode (by making sys.executable point to a wrapper that runs python under QEMU) for the following code bases:

* current master (fork/exec or posix_spawn can be used)
* classic (fork/exec always)
* my patch + master (vfork/exec or posix_spawn)
* vfork/exec always

All tests pass in all four flavors, which indicates that we don't have a test for error reporting with close_fds=False :)

Out of curiosity I also did the same on WSL in 1803. Only 3 groups of tests fail, all because of WSL bugs:

* It's not possible to send signals to zombies, e.g.:

==
ERROR: test_terminate_dead (test.test_subprocess.POSIXProcessTestCase)
--
Traceback (most recent call last):
  File "/home/test/cpython/Lib/test/test_subprocess.py", line 1972, in test_terminate_dead
    self._kill_dead_process('terminate')
  File "/home/test/cpython/Lib/test/test_subprocess.py", line 1941, in _kill_dead_process
    getattr(p, method)(*args)
  File "/home/test/cpython/Lib/subprocess.py", line 1877, in terminate
    self.send_signal(signal.SIGTERM)
  File "/home/test/cpython/Lib/subprocess.py", line 1872, in send_signal
    os.kill(self.pid, sig)
ProcessLookupError: [Errno 3] No such process
--

* The getdents64 syscall doesn't work properly (it doesn't return the rest of the entries on the second call, so close_fds=True doesn't fully work with a large number of open descriptors):

==
FAIL: test_close_fds_when_max_fd_is_lowered (test.test_subprocess.POSIXProcessTestCase)
Confirm that issue21618 is fixed (may fail under valgrind).
--
Traceback (most recent call last):
  File "/home/test/cpython/Lib/test/test_subprocess.py", line 2467, in test_close_fds_when_max_fd_is_lowered
    self.assertFalse(remaining_fds & opened_fds,
AssertionError: {34, 35, 36, 37, 38, 39, 40, 41, 42} is not false : Some fds were left open.
--

* Signal masks in /proc/self/status are not correct (always zero):

==
FAIL: test_restore_signals (test.test_subprocess.POSIXProcessTestCase)
--
Traceback (most recent call last):
  File "/home/test
[issue35823] Use vfork() in subprocess on Linux
Alexey Izbyshev added the comment:

> W.r.t. closing all file descriptors > 2: posix_spawn_file_actions_addclose
> can do this when using posix_spawn. That would have a performance cost,
> you'd basically have to resort to closing all possible file descriptors and
> cannot use the smarter logic used in _posixsubprocess.

This is costly to the point of people reporting bugs: bpo-35757. If that really causes a 0.1s delay like the reporter said, it would defeat the purpose of using posix_spawn() in the first place.

> However, the smarter closing code in _posixsubprocess is not safe w.r.t.
> vfork according to the comment above _close_open_fds_maybe_unsafe: that
> function uses some functions that aren't async-safe and one of those calls
> malloc.

No, it's not so on Linux: https://github.com/python/cpython/blob/62c35a8a8ff5854ed470b1c16a7a14f3bb80368c/Modules/_posixsubprocess.c#L314

Moreover, as I already argued in msg332235, using malloc() after vfork() should be *safer* than after fork() for a simple reason: all memory-based locks will still work as though the child were merely a thread in the parent process. This is true even for things like futex(FUTEX_WAKE_PRIVATE), despite that operation being mistakenly documented as "process-private" in the man pages. It's actually more like "private to tasks sharing the same address space". This is in contrast with fork(): if it's called while another thread is holding some mutex in malloc(), nobody would unlock it in the child unless the C library has precautions against that.

--

___ Python tracker <https://bugs.python.org/issue35823> ___
[issue35537] use os.posix_spawn in subprocess
Alexey Izbyshev added the comment:

> Is sys.platform equal to 'linux' on WSL? Sorry, I don't know WSL. If it's
> equal, is it possible to explicitly exclude WSL in the subprocess test,
> _use_posix_spawn()?

I don't have immediate access to WSL right now, but I'll try to get one and investigate what we have there w.r.t. posix_spawn() and vfork().

> So the bug only affects "QEMU User Space". I never used that before.

Yes, I was specifically referring to QEMU user-space emulation. One use case that heavily relies on it is Tizen. Its packages are built in a chroot jail containing the build environment with binaries native to the target architecture, creating the illusion that packages are built on the target system and are not cross-compiled. So the binaries are run under QEMU user-space emulation. But in reality, because of the unacceptable performance of binary translation, many frequently-used binaries, like coreutils and compilers, are replaced with host-native binaries in a way transparent to the build system (so it has no idea whether it runs host-native or target-native binaries).

> Does the difference matter? The bug only occurs in some very specific cases:
> * WSL
> * QEMU User Emulation
> Are these use cases common enough to block the whole idea of using
> posix_spawn() on Linux? I don't think so. I really want to use posix_spawn()
> for best performance! Moreover, I expect that the glibc implementation of
> posix_spawn() is safer than Python's _posixsubprocess. The glibc has access
> to low-level stuff like its internal signals, cancellation points, etc.
> _posixsubprocess is more generic and doesn't worry about such low-level
> stuff, whereas they might cause bad surprises.

It's true that a C library is in a better position to implement something like posix_spawn(), but I still think that vfork()/exec() is worth considering at least on Linux. See bpo-35823, which should also work under QEMU user-mode and WSL (but needs testing).
--

___ Python tracker <https://bugs.python.org/issue35537> ___
[issue35823] Use vfork() in subprocess on Linux
Change by Alexey Izbyshev:

--
keywords: +patch, patch
pull_requests: +11484, 11485
stage: -> patch review

___ Python tracker <https://bugs.python.org/issue35823> ___
[issue35823] Use vfork() in subprocess on Linux
Change by Alexey Izbyshev:

--
keywords: +patch
pull_requests: +11484
stage: -> patch review

___ Python tracker <https://bugs.python.org/issue35823> ___
[issue35823] Use vfork() in subprocess on Linux
Change by Alexey Izbyshev:

--
keywords: +patch, patch, patch
pull_requests: +11484, 11485, 11486
stage: -> patch review

___ Python tracker <https://bugs.python.org/issue35823> ___
[issue35823] Use vfork() in subprocess on Linux
New submission from Alexey Izbyshev:

This issue is to propose a (complementary) alternative to the usage of posix_spawn() in subprocess (see bpo-35537). As mentioned by Victor Stinner in msg332236, posix_spawn() has the potential of being faster and safer than the fork()/exec() approach. However, some of the currently available implementations of posix_spawn() have technical problems (this mostly summarizes discussions in bpo-35537):

* In glibc < 2.24 on Linux, posix_spawn() doesn't report errors to the parent properly, breaking existing subprocess behavior.
* In glibc >= 2.25 on Linux, posix_spawn() doesn't report errors to the parent in certain environments, such as QEMU user-mode emulation and Windows Subsystem for Linux.
* In FreeBSD, as of this writing, posix_spawn() doesn't block signals in the child process, so a signal handler executed between vfork() and execve() may change memory shared with the parent [1].

Regardless of implementation, posix_spawn() is also unsuitable for some subprocess use cases:

* posix_spawnp() can't be used directly to implement the file-searching logic of subprocess because of different semantics, requiring workarounds.
* posix_spawn() has no standard way to specify the current working directory for the child.
* posix_spawn() has no way to close all file descriptors > 2 in the child, which is the *default* mode of operation of subprocess.Popen().

Maybe even more importantly, posix_spawn() will always be fundamentally less flexible than the fork()/exec() approach. Any additions will have to go through POSIX standardization or be unportable. Even if approved, a change will take years to reach actual users because of the requirement to update the C library, which may be more than a decade behind in enterprise Linux distros. This is in contrast to having an addition implemented in CPython. For example, a setrlimit() action for posix_spawn() is currently rejected in POSIX [2], despite being trivial to add.
I'm interested in avoiding posix_spawn() problems on Linux while still delivering comparable performance and safety. To that end I've studied the implementations of posix_spawn() in glibc [3] and musl [4], which use a vfork()/execve()-like approach, and investigated the challenges of using vfork() safely on Linux (e.g. [5]) -- all of that for the purpose of using vfork()/exec() instead of fork()/exec() or posix_spawn() in subprocess where possible.

The unique property of vfork() is that the child shares the address space (including heap and stack) as well as thread-local storage with the parent, which means that the child must be very careful not to surprise the parent by changing the shared resources under its feet. The parent is suspended until the child performs execve(), _exit() or dies in any other way. The safest way to use vfork() is if one has access to the C library internals and can do the following:

1) Disable thread cancellation before vfork() to ensure that the parent thread is not suddenly cancelled by another thread with pthread_cancel() while in the middle of child creation.

2) Block all signals before vfork(). This ensures that no signal handlers are run in the child. But the signal mask is preserved by execve(), so the child must restore the original signal mask. To do that safely, it must reset the dispositions of all non-ignored signals to the default, ensuring that no signal handlers are executed in the window between restoring the mask and execve(). Note that libc-internal signals should be blocked too, in particular, to avoid the "setxid problem" [5].

3) Use a separate stack for the child via clone(CLONE_VM|CLONE_VFORK), which has exactly the same semantics as vfork() but allows the caller to provide a separate stack. This way potential compiler bugs arising from the fact that vfork() returns twice to the same stack frame are avoided.

4) Call only async-signal-safe functions in the child.

In an application, only (1) and (4) can be done easily.
One can't disable internal libc signals for (2) without using syscall(), which requires knowledge of the kernel ABI for the particular architecture. clone(CLONE_VM) can't be used at least before glibc 2.24 because it corrupts the glibc pid/tid cache in the parent process [6,7]. (As may be guessed, this problem was solved by glibc developers when they implemented posix_spawn() via clone().) Even now, the overall message seems to be that clone() is a low-level function not intended to be used by applications.

Even with the above, I still think that in the context of subprocess/CPython, sufficient vfork()-safety requirements are provided by the following. Despite being easy, (1) seems to be unnecessary: CPython never uses pthread_cancel() internally, so Python code can't do that. A non-Python thread in an embedding app could try, but cancellation, to my knowledge, is not supported by CPython
[issue35537] use os.posix_spawn in subprocess
Alexey Izbyshev added the comment:

Another problem with posix_spawn() on glibc: it doesn't report errors to the parent process when run under QEMU user-space emulation and Windows Subsystem for Linux. This is because, starting with commit [1] (glibc 2.25), posix_spawn() relies on the address-space sharing semantics of clone(CLONE_VM) to report errors, but that isn't implemented in QEMU and WSL, so clone(CLONE_VM) and vfork() behave like fork(). See also [2], [3]. This can be easily demonstrated:

$ cat test.c
#include <spawn.h>

int main(int argc, char *argv[], char *envp[]) {
    return posix_spawn(0, "non-existing", 0, 0, argv, envp);
}
$ gcc test.c
$ ./a.out
$ echo $?
2
$ aarch64-linux-gnu-gcc -static test.c
$ qemu-aarch64 ./a.out
$ echo $?
0

There is no problem with musl (it doesn't rely on address-space sharing).

[1] https://sourceware.org/git/?p=glibc.git;a=commit;h=4b4d4056bb154603f36c6f8845757c1012758158
[2] https://bugs.launchpad.net/qemu/+bug/1673976
[3] https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg04890.html

--

___ Python tracker <https://bugs.python.org/issue35537> ___
[issue35537] use os.posix_spawn in subprocess
Alexey Izbyshev added the comment:

>> * pass_fds: there is no API to mark a fd as inheritable (clear the
>> O_CLOEXEC flag)
> POSIX has a bug for this [5]. It's marked fixed, but the current POSIX docs
> don't reflect the changes. The idea is to make
> posix_spawn_file_actions_adddup2() clear O_CLOEXEC if the source and the
> destination are the same (unlike dup2(), which would do nothing). musl has
> already implemented the new behavior [6], but glibc hasn't.

Update: glibc has implemented it for 2.29 (https://sourceware.org/bugzilla/show_bug.cgi?id=23640).

--

___ Python tracker <https://bugs.python.org/issue35537> ___
[issue35755] Remove current directory from posixpath.defpath to enhance security
Alexey Izbyshev added the comment:

Thanks for the info on CS_PATH, Victor. IMHO it'd make sense to use the libc-provided default PATH at least in shutil.which(), since its intent is to emulate "which" from the default shell.

--

___ Python tracker <https://bugs.python.org/issue35755> ___
[issue35537] use os.posix_spawn in subprocess
Alexey Izbyshev added the comment:

> It should be compared to the current code. Currently, _posixsubprocess uses
> a loop calling execv(). I don't think that calling posix_spawn() in a loop
> until one doesn't fail is more inefficient.
> The worst case would be when applying process attributes and running file
> actions would dominate performance... I don't think that such a case exists.
> Syscalls like dup2() and close() should be way faster than the final
> successful execv() in the overall posix_spawn() call. I'm talking about the
> case when the program exists.

I had vfork(), used internally by posix_spawn(), in mind when I considered repeatedly calling it "prohibitively expensive". While vfork() is much cheaper than fork(), that doesn't mean its performance is comparable to dup2() and close(). But on systems where posix_spawn() is a syscall, the overhead could indeed be smaller. Anyway, it should be measured.

>> Iterate over PATH entries and perform a quick check for common exec errors
>> (ENOENT, ENOTDIR, EACCES) manually (e.g. by stat()).
> I dislike this option. There is a high risk of race condition if the program
> is created, deleted or modified between the check and exec. It can cause
> subtle bugs, or worse: security vulnerabilities. IMHO the only valid check,
> to test if a program exists, is to call exec().

While exec() is of course cleaner, IMHO races don't matter in this case. If our stat()-like check fails, we could as well imagine that it was exec() that failed (while doing the same checks as our stat()), so it doesn't matter what happens with the tested entry afterwards. If the check succeeds, we simply call posix_spawn(), which will perform the same checks again. If any race condition occurs, the problem will be detected by posix_spawn().

> Alexey: Do you want to work on a PR to reimplement the "executable_list" and
> loop used by subprocess with _posixsubprocess?
I'm currently focused on researching whether it's possible to use a vfork()-like approach instead of fork() on Linux, so if anyone wants to implement the PR you suggested, feel free to do it.

--

___ Python tracker <https://bugs.python.org/issue35537> ___
[issue35537] use os.posix_spawn in subprocess
Alexey Izbyshev added the comment:

Thank you for the answers, Kyle!

> I'll be preparing a patch for our posix_spawn's signal handling.

Great!

> My mistake in my setuid assessment was pointed out to me -- it doesn't seem
> like a highly likely attack vector, but it is indeed possible.

The specific scenario with non-synchronized posix_spawn()/setuid() is certainly not good practice and could probably be considered an application bug (such an application effectively spawns a child with "random" privileges -- depending on whether setuid() completed before or after vfork()). That said, in Linux C libraries protection from that scenario is usually achieved automatically if all signals are blocked before vfork(). This is the result of a Linux-specific detail: at the syscall level, setuid() affects a single thread only, but the setuid() library function must affect the process as a whole. This forces C libraries to signal all threads when setuid() is called and wait until all threads perform the setuid() syscall. This waiting can't complete until vfork() returns (because signals are blocked), so setuid() is effectively postponed. I don't know how setuid() behaves on FreeBSD (so the above may not be applicable at all).

> If the child errored out, they will indeed be properly reapable at that
> point due to how vfork is implemented -- any observation to the contrary is
> to be considered a bug.

Ah, that's good, thanks for the confirmation!

--

___ Python tracker <https://bugs.python.org/issue35537> ___
[issue35755] Remove current directory from posixpath.defpath to enhance security
Alexey Izbyshev added the comment:

Would it make sense to use os.confstr('CS_PATH') instead of a hardcoded path, or is identical behavior on all POSIX platforms preferred to that?

--
nosy: +izbyshev

___ Python tracker <https://bugs.python.org/issue35755> ___