[issue42388] subprocess.check_output(['echo', 'test'], text=True, input=None) fails

2020-11-22 Thread Alexey Izbyshev

Alexey Izbyshev  added the comment:

It seems that allowing `input=None` to mean "redirect stdin to a pipe and send 
an empty string there" in `subprocess.check_output` was an accident(?), and 
this behavior is inconsistent with `subprocess.run` and `communicate`, where 
`input=None` has the same effect as not specifying it at all.
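
A minimal reproduction sketch contrasting the two behaviors (using the command 
from the issue title; `echo` assumes a POSIX system):

    import subprocess

    # On affected versions, check_output() maps input=None to b'' (it only
    # checks universal_newlines, not text), so a bytes sentinel leaks into a
    # text-mode Popen.communicate() and the call fails:
    subprocess.check_output(['echo', 'test'], text=True, input=None)

    # run() and communicate() treat input=None the same as not passing it,
    # so stdin is simply not redirected and this works:
    subprocess.run(['echo', 'test'], text=True, input=None)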

The docs for `subprocess.check_output` say:

The arguments shown above are merely some common ones. The full function 
signature is largely the same as that of run() - most arguments are passed 
directly through to that interface. However, explicitly passing input=None to 
inherit the parent’s standard input file handle is not supported.

They don't make clear what the effect of passing `input=None` is, though. I 
also couldn't find any tests that check that effect.

Since we can't just forbid `input=None` for `check_output` at this point (we 
probably can't even limit that to the case when `text` is used, since `text` 
was only added in 3.7), I think we need to extend the check pointed to by 
ThiefMaster to cover `text`, clarify the docs, and add a test.

--
nosy: +gregory.p.smith, izbyshev
versions: +Python 3.10, Python 3.8

___
Python tracker 
<https://bugs.python.org/issue42388>
___



[issue42606] Support POSIX atomicity guarantee of O_APPEND on Windows

2021-01-19 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

Could anybody provide their thoughts on this RFE? Thanks.

--

___
Python tracker 
<https://bugs.python.org/issue42606>
___



[issue42606] Support POSIX atomicity guarantee of O_APPEND on Windows

2021-01-21 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> I don't know what you mean by default access rights.

I meant the access rights of the handle created by _wopen(). In my PR I 
basically assume that _wopen() uses GENERIC_READ/GENERIC_WRITE access rights, 
but _wopen() doesn't have a contractual obligation to do exactly that AFAIU. 
For example, if it got some extra access rights, then my code would "drop" them 
while switching FILE_WRITE_DATA off.

> For example, if FILE_WRITE_DATA isn't granted, then open() can't open for 
> appending. A direct CreateFileW() call can remove FILE_WRITE_DATA from the 
> desired access.

Indeed, I hadn't thought about that. Are you aware of a common scenario where 
a regular file allows appending but not writing?

But, at least, this is not a regression: currently open()/os.open() can't open 
such files for appending either.

> An opener could also work like your PR via os.open(), msvcrt.get_osfhandle(), 
> _winapi.GetFileType(), _winapi.DuplicateHandle(), os.close(), and 
> msvcrt.open_osfhandle().

True, but it still falls into the "etc" category of "ctypes/pywin32/etc" :) 
Certainly doable, but it seems better to have consistent O_APPEND behavior 
across platforms out of the box.

--

___
Python tracker 
<https://bugs.python.org/issue42606>
___



[issue42606] Support POSIX atomicity guarantee of O_APPEND on Windows

2021-01-22 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> FYI, here are the access rights applicable to files

Thanks, I checked that mapping in the headers when I was writing 
_Py_wopen_noraise() as well. But I found a catch via Process Hacker: 
CreateFile() with GENERIC_WRITE (or FILE_GENERIC_WRITE) additionally grants 
FILE_READ_ATTRIBUTES for some reason. This is why I add FILE_READ_ATTRIBUTES to 
FILE_GENERIC_WRITE for DuplicateHandle() -- otherwise, the duplicated handle 
didn't have the same rights in my testing. So basically I have to deal with (a) 
_wopen() not contractually guaranteeing any specific access rights and (b) 
CreateFile() not respecting the specified access rights exactly. This is where 
my distrust and my original question about "default" access rights come from.

> I overlooked a case that's a complication. For O_TEMPORARY, the open uses 
> FILE_FLAG_DELETE_ON_CLOSE; adds DELETE to the requested access; and adds 
> FILE_SHARE_DELETE to the share mode. 
> _Py_wopen_noraise() can easily keep the required DELETE access. The 
> complication is that you have to be careful not to close the original file 
> descriptor until after you've successfully created the duplicate file 
> descriptor. If it fails, you have to return the original file descriptor from 
> _wopen(). If you close all handles for the kernel File and fail the call, the 
> side effect of deleting the file is unacceptable. 

Good catch! But now I realize that the problem with undoing the side effect 
applies to O_CREAT and O_TRUNC too: we can create and/or truncate the file, but 
then fail. Even if we could assume that DuplicateHandle() can't fail, 
_open_osfhandle() can still fail with EMFILE. And since there is no public 
function in MSVCRT to replace an underlying handle of an existing fd, we can't 
preallocate an fd to avoid this. There would be no problem if we could just 
reduce access rights of an existing handle, but it seems there is no such API.

I don't like the idea of silently dropping the atomic append guarantee in case 
of a failure, so I'm not sure how to proceed with the current approach.

Moreover, the same issue would apply even to a direct implementation of 
os.open()/open() via CreateFile(), because we still need to wrap the handle 
with an fd, and this may fail with EMFILE. For O_CREAT/O_TRUNC, it seems like 
it could be reasonably avoided:

* Truncation can simply be deferred until we have the fd and then performed 
manually.

* To undo the file creation, we could use GetLastError() to learn whether 
CreateFile() with OPEN_ALWAYS actually created the file, and then delete it on 
failure (still non-atomic, and deletion can fail, but probably better than 
nothing).

But I still don't know how to deal with O_TEMPORARY, unless there is a way to 
unset FILE_DELETE_ON_CLOSE on a handle.

As an aside, it's also very surprising to me that O_TEMPORARY is allowed for 
existing files at all. If not for that, there would be no issue on failure 
apart from non-atomicity.

Maybe I should forgo the idea of supporting O_APPEND for os.open(), and 
instead just support it in FileIO via the WriteFile()-with-OVERLAPPED 
approach...
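
For reference, the Win32 mechanism alluded to above: WriteFile() performs an 
atomic append when both OVERLAPPED offsets are set to all ones. A rough ctypes 
illustration of the idea (not the eventual FileIO code):

    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

    class OVERLAPPED(ctypes.Structure):
        _fields_ = (('Internal', ctypes.c_void_p),
                    ('InternalHigh', ctypes.c_void_p),
                    ('Offset', wintypes.DWORD),
                    ('OffsetHigh', wintypes.DWORD),
                    ('hEvent', wintypes.HANDLE))

    def write_at_eof(handle, data):
        # Offset == OffsetHigh == 0xFFFFFFFF means "write at end of file";
        # the kernel positions and writes as a single atomic step.
        ov = OVERLAPPED(Offset=0xFFFFFFFF, OffsetHigh=0xFFFFFFFF)
        written = wintypes.DWORD()
        if not kernel32.WriteFile(wintypes.HANDLE(handle), data, len(data),
                                  ctypes.byref(written), ctypes.byref(ov)):
            raise ctypes.WinError(ctypes.get_last_error())
        return written.value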

--

___
Python tracker 
<https://bugs.python.org/issue42606>
___



[issue42606] Support POSIX atomicity guarantee of O_APPEND on Windows

2021-01-21 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> It's possible to query the granted access of a kernel handle via 
> NtQueryObject: ObjectBasicInformation

Ah, thanks for the info. But it wouldn't help for option (1) that I had in mind 
because open() and os.open() currently set only msvcrt-level O_APPEND.

It could probably be used in my current PR instead of just assuming the 
default access rights for file handles, but I'm not sure whether it makes 
sense: can a new handle have non-default access rights? Or could the default 
change at this point in Windows history?

> Python could implement its own umask value in Windows. os.umask() would set 
> the C umask value as well, but only for the sake of consistency with C 
> extensions and embedding.

Would it be a long shot to ask MS to add the needed functionality to MSVCRT 
instead? Or at least to declare the bug with _umask_s() that I described to be 
a feature. Maybe Steve Dower could help.

> Open attribute flags could also be supported, such as O_ATTR_HIDDEN and 
> O_ATTR_SYSTEM. These are needed because a hidden or system file is required 
> to remain as such when it's overwritten, else CreateFileW fails.

I didn't know that, thanks. Indeed, a simple open(path, 'w') fails on a hidden 
file with "Permission denied".

> If it's important enough, msvcrt.open() and msvcrt.O_TEXT could be provided.

Yes, I'd be glad to move support for the more obscure MSVCRT flags to 
msvcrt.open() -- the fewer MSVCRT details we leak in os.open(), the more 
freedom we have to implement it via proper Win32 APIs.

===

Anyway, even if the blockers for implementing open()/os.open() via CreateFile() 
are solved, my current PR doesn't seem to conflict with such an implementation 
(it could just be replaced in the future).

Currently, the only way I know of to achieve atomic append in Python on 
Windows is to use a custom opener that calls CreateFile() with the right 
arguments via ctypes/pywin32/etc. (see the sketch below).
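
A sketch of such an opener (flag values are standard Win32; minimal error 
handling; illustrative rather than production code):

    import ctypes
    import msvcrt
    import os
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    kernel32.CreateFileW.restype = wintypes.HANDLE

    FILE_APPEND_DATA = 0x0004        # note: *without* FILE_WRITE_DATA
    SHARE_ALL = 0x01 | 0x02 | 0x04   # FILE_SHARE_READ/WRITE/DELETE
    OPEN_ALWAYS = 4
    FILE_ATTRIBUTE_NORMAL = 0x80
    INVALID_HANDLE_VALUE = wintypes.HANDLE(-1).value

    def atomic_append_opener(path, flags):
        # Requesting only FILE_APPEND_DATA access makes the kernel perform
        # every write at end-of-file atomically (the POSIX O_APPEND
        # guarantee).
        h = kernel32.CreateFileW(path, FILE_APPEND_DATA, SHARE_ALL, None,
                                 OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, None)
        if h == INVALID_HANDLE_VALUE:
            raise ctypes.WinError(ctypes.get_last_error())
        return msvcrt.open_osfhandle(h, os.O_APPEND)

    f = open('log.txt', 'ab', opener=atomic_append_opener)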

--

___
Python tracker 
<https://bugs.python.org/issue42606>
___



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-01-21 Thread Alexey Izbyshev


Change by Alexey Izbyshev :


--
nosy: +izbyshev

___
Python tracker 
<https://bugs.python.org/issue42969>
___



[issue42888] Not installed “libgcc_s.so.1” causes exit crash.

2021-01-21 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

Thank you for testing. I've added a NEWS entry to the PR, so it's ready for 
review by the core devs.

Note that PyThread_exit_thread() can still be called by daemon threads if they 
try to take the GIL after Py_Finalize(), and also via C APIs like 
PyEval_RestoreThread() (see bpo-42969), so, in general, libgcc_s is still 
required for CPython.

--
versions: +Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue42888>
___



[issue42888] Not installed “libgcc_s.so.1” causes exit crash.

2021-01-18 Thread Alexey Izbyshev


Change by Alexey Izbyshev :


--
keywords: +patch
pull_requests: +23063
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/24241

___
Python tracker 
<https://bugs.python.org/issue42888>
___



[issue42888] Not installed “libgcc_s.so.1” causes exit crash.

2021-01-18 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

I've made a PR to remove most calls to pthread_exit().

@xxm: could you test it in your environment?

--

___
Python tracker 
<https://bugs.python.org/issue42888>
___



[issue42780] os.set_inheritable() fails for O_PATH file descriptors on Linux

2020-12-29 Thread Alexey Izbyshev


Change by Alexey Izbyshev :


--
components: +Library (Lib)
nosy: +vstinner
versions:  -Python 3.6, Python 3.7

___
Python tracker 
<https://bugs.python.org/issue42780>
___



[issue43113] os.posix_spawn errors with wrong information when shebang points to not-existing file

2021-02-03 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> IMO the fix is simple: only create OSError from the errno, never pass a 
> filename.

This would remove a normally helpful piece of the error message in exchange 
for being marginally less confusing in the rare case of a non-existing 
interpreter (the user will still be left wondering why "No such file or 
directory" is reported even though the file they tried to run exists). So the 
only "win" here would be for CPython developers, because users would be less 
likely to report a bug like this one.

> posix_spawn() is really complex function which can fail in many different 
> ways.

This issue is not specific to posix_spawn(): subprocess and os.execve() report 
the filename too. Any fix would need to change those as well for consistency.

--
nosy: +gregory.p.smith

___
Python tracker 
<https://bugs.python.org/issue43113>
___



[issue42606] Support POSIX atomicity guarantee of O_APPEND on Windows

2021-01-26 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> I think truncation via TRUNCATE_EXISTING (O_TRUNC, with O_WRONLY or O_RDWR) 
> or overwriting with CREATE_ALWAYS (O_CREAT | O_TRUNC) is at least tolerable 
> because the caller doesn't care about the existing data. 

Yes, I had the thought that creating or truncating a file when asked is in 
some sense "less wrong" than deleting an existing file on open() failure, but 
I'm still not comfortable with it. It would be nice, for example, if an open() 
with O_CREAT | O_EXCL that failed really did leave no file behind, in all 
cases.

>> Truncation can simply be deferred until we have the fd and then performed 
>> manually.

> If the file itself has to be overwritten (i.e. the default, anonymous data 
> stream), as opposed to a named data stream, it would have to delete all named 
> data streams and extended attributes in the file. Normally that's all 
> implemented atomically in the filesystem. 

> In contrast, TRUNCATE_EXISTING (O_TRUNC) is simple to emulate, since 
> CreateFileW implements it non-atomically with a subsequent 
> NtSetInformationFile: FileAllocationInformation system call. 

Oh. So CREATE_ALWAYS for an existing file has very different semantics from 
TRUNCATE_EXISTING, which means we can't easily use OPEN_ALWAYS with deferred 
manual truncation, I see.

>> But I still don't know how to deal with O_TEMPORARY, unless there is a 
>> way to unset FILE_DELETE_ON_CLOSE on a handle.

> For now, that's possible with NTFS and the Windows API in all supported 
> versions of Windows by using a second kernel File with DELETE access, which 
> is opened before the last handle to the first kernel File is closed. After 
> you close the first open, use the second one to call SetFileInformation: 
> FileDispositionInfo to undelete the file.

So the idea is to delete the file for a brief period, but then undelete it. As 
I understand it, any attempt to open the file while it's in the deleted state 
(but still has a directory entry) will fail. This is probably not critical, 
since it could happen only on an unlikely failure of _open_osfhandle(), but 
it's still less than perfect.

> Windows 10 supports additional flags with FileDispositionInfoEx (21), or 
> NTAPI FileDispositionInformationEx [1]. This provides a better way to disable 
> or modify the delete-on-close state per kernel File object, if the filesystem 
> supports it.

This is nice and would reduce our non-atomicity to just the following: if 
_wopen() would have failed before even reaching CreateFile() (e.g. due to 
EMFILE), our reimplementation could still create a handle with 
FILE_DELETE_ON_CLOSE, so if our process is terminated before we unset it, 
we'll still lose the file. But that seems like the best we can get in any 
situation where we need to wrap a handle with an fd.

===

Regarding the WriteFile()-with-OVERLAPPED approach in FileIO, I've looked at 
the edge cases of error remapping in MSVCRT write() again. If the underlying 
WriteFile() fails, it only remaps ERROR_ACCESS_DENIED to EBADF, presumably to 
deal with write() on an O_RDONLY fd. But if WriteFile() writes zero bytes and 
does not fail, it gets more interesting:

* If the file type is FILE_TYPE_CHAR and the first byte to write was 26 
(Ctrl-Z), it's treated as success. I don't think I understand this: it *seems* 
to be handling something like writing to the *input* of a console, but I'm not 
sure that's even possible in this manner.

* Anything else is a failure with ENOSPC. This is mysterious too.

I've checked how java.io.FileOutputStream deals with WriteFile() succeeding 
with zero size (just to compare with another language runtime) and haven't 
found any special handling[1]. Moreover, it seems like 
FileOutputStream.write(bytes[]) will enter an infinite loop if WriteFile() 
always returns 0 for a non-empty byte array[2]. So it's unclear to me what 
MSVCRT is up to here and whether FileIO should do the same.

[1] 
https://github.com/openjdk/jdk/blob/b4ace3e9799dab5970d84094a0fd1b2d64c7f923/src/java.base/windows/native/libjava/io_util_md.c#L522
[2] 
https://github.com/openjdk/jdk/blob/b4ace3e9799dab5970d84094a0fd1b2d64c7f923/src/java.base/share/native/libjava/io_util.c#L195

--

___
Python tracker 
<https://bugs.python.org/issue42606>
___



[issue43069] Python fails to read a script whose path is `/dev/fd/X`

2021-01-30 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

I would suggest starting to dig from the following piece of code in 
`maybe_pyc_file()` (Python/pythonrun.c):

    int ispyc = 0;
    if (ftell(fp) == 0) {
        if (fread(buf, 1, 2, fp) == 2 &&
            ((unsigned int)buf[1]<<8 | buf[0]) == halfmagic)
            ispyc = 1;
        rewind(fp);
    }

--
nosy: +izbyshev

___
Python tracker 
<https://bugs.python.org/issue43069>
___



[issue43113] os.posix_spawn errors with wrong information when shebang points to not-existing file

2021-02-03 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> Ideally, the error would say:

> FileNotFoundError: ./demo: /usr/bin/hugo: bad interpreter: No such file or 
> directory

The kernel simply returns ENOENT on an attempt to execve() a file with a 
non-existing hash-bang interpreter. The same occurs on an attempt to run a 
dynamically linked ELF executable whose INTERP program header contains a 
non-existing path. And, of course, the same error is returned if the 
executable file itself doesn't exist, so there is no simple way to distinguish 
such cases.
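
A minimal demonstration of the ambiguity, reusing the './demo' example from 
this issue (Linux):

    import os

    # A script whose shebang points at a missing interpreter:
    with open('demo', 'w') as f:
        f.write('#!/usr/bin/hugo\n')
    os.chmod('demo', 0o755)

    # Raises FileNotFoundError naming './demo' -- the same error as when
    # './demo' itself is missing:
    os.posix_spawn('./demo', ['./demo'], dict(os.environ))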

Bash simply assumes[1] that if a file contains a hash-bang and the error from 
execve() is not recognized otherwise, it's a "bad interpreter".

Note that all of the above is completely unrelated to os.posix_spawn(): 
subprocess or os.execve() would produce the same message.

[1] 
https://git.savannah.gnu.org/cgit/bash.git/tree/execute_cmd.c?h=bash-5.1#n5854

--
nosy: +izbyshev

___
Python tracker 
<https://bugs.python.org/issue43113>
___



[issue43113] os.posix_spawn errors with wrong information when shebang points to not-existing file

2021-02-03 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> FileNotFoundError: [Errno 2] No such file or directory: Either './demo' or 
> the interpreter of './demo' not found.

This doesn't sound good to me because a very probable reason and a very 
improbable one are combined without any distinction. Another possible issue is 
that using the word "interpreter" in this narrow sense might be non-obvious to 
users.

ISTM that the most minimal way to check for the possibility of an interpreter 
issue would be to do something like `access(exe_path, X_OK)` in case of 
ENOENT: if it succeeds, then a "bad interpreter" condition is likely (a sketch 
follows below). But in the case of os.posix_spawnp(), the search in PATH is 
performed by libc, so CPython doesn't know exe_path. OTOH, subprocess and 
os.exec*p do perform their own search in PATH, but in the case of subprocess 
it happens in the context of the child process, so we'd probably need to 
devise a separate error code to report to the parent via the error pipe to 
distinguish this condition.
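
A hypothetical sketch of that check at the Python level (the wrapper and its 
message are made up for illustration):

    import errno
    import os

    def spawn_with_hint(path, argv, env):
        # On ENOENT, probe whether the executable itself is present and
        # executable; if so, a "bad interpreter" cause is likely.
        try:
            return os.posix_spawn(path, argv, env)
        except FileNotFoundError:
            if os.access(path, os.X_OK):
                raise FileNotFoundError(
                    errno.ENOENT, 'possibly a bad interpreter', path
                ) from None
            raise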

So far, I'm not convinced that the result is worth it, but I do agree that such 
mysterious "No such file" errors are not user-friendly.

--

___
Python tracker 
<https://bugs.python.org/issue43113>
___



[issue43113] os.posix_spawn errors with wrong information when shebang points to not-existing file

2021-02-10 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

How do you propose to approach documenting such behavior? The underlying cause 
is the ambiguity of the ENOENT error code returned by the kernel from 
execve(), so it applies to all places where Python can call execve(), 
including os.posix_spawn(), os.execve() and subprocess, so it's not clear to 
me where such documentation should be placed. And, of course, this behavior is 
not specific to CPython.

The Linux man pages mention various causes of this error[1], though POSIX 
doesn't[2].

While ENOENT ambiguity is indeed confusing, one of the top results of my DDG 
search on "linux no such file or directory but script exists" is this link[3].

[1] https://man7.org/linux/man-pages/man2/execve.2.html
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/execve.html
[3] 
https://stackoverflow.com/questions/3949161/no-such-file-or-directory-but-it-exists

--

___
Python tracker 
<https://bugs.python.org/issue43113>
___



[issue43113] os.posix_spawn errors with wrong information when shebang points to not-existing file

2021-02-10 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

I generally agree, but getting a good, short error message seems to be the hard 
part here. I previously complained[1] about the following proposal by @hroncok:

FileNotFoundError: [Errno 2] No such file or directory: Either './demo' or the 
interpreter of './demo' not found.

But maybe it's just me. Does anybody else feel that mentioning "the 
interpreter" in this way could be confusing in the prevalent case where the 
actual problem is that './demo' itself is missing? If we can come up with a 
good message, I can look into turning it into a PR.

The error message above also reads to me like there are no other possible 
reasons for ENOENT. On Linux, binfmt_misc[2] provides a way to run arbitrary 
code on execve(). This is used, for example, to transparently run binaries for 
foreign arches via qemu-user, so ENOENT would probably be returned if QEMU 
itself is missing. QEMU *may* be thought of as a kind of interpreter here, 
though it's completely unrelated to a hash-bang or an ELF interpreter.

But I don't know about other POSIX platforms. As a theoretical example, if the 
dynamic library loader were implemented in the kernel, the system call could 
return ENOENT in case of a missing library. Do we need to worry about that? 
Does anybody know about the situation on macOS, where posix_spawn() is a 
system call too?

[1] https://bugs.python.org/issue43113#msg386210
[2] https://en.wikipedia.org/wiki/Binfmt_misc

--

___
Python tracker 
<https://bugs.python.org/issue43113>
___



[issue32592] Drop support of Windows Vista and Windows 7

2021-03-23 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

As far as I understand, commit [1] explicitly prevents CPython from running on 
Windows 7, and it's included in 3.9. So it seems to be too late to complain, 
even though, according to Wikipedia, more than 15% of all Windows PCs are 
still running Windows 7 [2].

[1] 
https://github.com/izbyshev/cpython/commit/6a65eba44bfd82ccc8bed4b5c6dd6637549955d5
[2] https://en.wikipedia.org/wiki/Windows_7

--

___
Python tracker 
<https://bugs.python.org/issue32592>
___



[issue32592] Drop support of Windows Vista and Windows 7

2021-03-24 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> If we had a dedicated maintainer who was supporting Win7 and making releases 
> for it, then we (i.e. they) could support it. But then, there's nothing to 
> stop someone doing that already, and even to stop them charging money for it 
> if they want (which they wouldn't be able to do under the auspices of 
> python-dev). So I suspect nobody is really that motivated ;)

That's totally fair.

> (Also, a "little bit of complaining" here is totally worthwhile, as it helps 
> us gauge community sentiment. Without it, we're limited to trawling forums 
> and Twitter - especially without conferences to actually meet and hear 
> directly from people - and so our inputs are biased. Having polite, informed, 
> discussion on the tracker is a great way for us to see and capture these 
> positions.)

I agree. In my comment, I only intended to contrast "complaining" with 
stepping up to do the work. I didn't mean to imply that it's inappropriate per 
se.

--

___
Python tracker 
<https://bugs.python.org/issue32592>
___



[issue32592] Drop support of Windows Vista and Windows 7

2021-03-23 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> If Win8-only calls are not used, then presumably it should still build and 
> run on Windows 7, presumably with the flag flipped back to Win7. And if there 
> are Win8-only calls used and the flag is set to Win7+, I assume that the MSVC 
> compiler will catch any instances of Win8-only calls at build time? Is this 
> understanding correct?

It's the latter. In the patch, calls to functions like 
PathCchCanonicalizeEx(), which had previously been conditional on the runtime 
availability of those functions, were made direct. Those functions are not 
from the UCRT (so you can't just distribute the UCRT to regain Windows 7 
support) and are not available on Windows 7.

There was also another patch that adds a check to the Python 3.9+ installer 
that prevents installation on Windows 7 (but again, even if you take the 
Windows "embeddable" distro instead of the installer, it won't run, due to the 
Win8-only functions).

> If the latter is true, its very likely a lost cause. However, if the former 
> is the case, then at least devs using Python 3.9+ on Windows 7 will be able 
> to easily build their own versions with support, though that could change at 
> any time.

For now, it's not an entirely lost cause, in the sense that the number and 
size of the Win8-specific patches are very small (I'm not aware of anything 
not already mentioned, but I may not be up to date). So an interested party 
can probably revert them locally without much hassle.

> For those in the know, have there been a lot of reports/discontent 
> surrounding the dropping of support in Py3.9, by anyone aware of that? 

I semi-casually follow the bug tracker and remember only bpo-41412 and 
bpo-42027.

Personally, I do consider tying Windows support in CPython to arbitrary 
support periods set by Microsoft unfortunate, and it's also true that the 
Win8-specific changes I'm aware of are not critical and don't remove any 
significant maintenance burden. For now, both relative and absolute usage of 
Windows 7 is huge and hard to ignore, so developers who use CPython in their 
own software are faced with an unpleasant choice between sticking to 3.8 and 
dropping Win 7. (I'm one of them, since software I work on has to run in build 
environments set up for other projects, and such environments routinely stick 
to old OS versions due to breakage on updates.) But this is what we have 
according to PEP 11, and a little grumbling here certainly won't change that :)

--

___
Python tracker 
<https://bugs.python.org/issue32592>
___



[issue43308] subprocess.Popen leaks file descriptors opened for DEVNULL or PIPE stdin/stdout/stderr arguments

2021-02-24 Thread Alexey Izbyshev


Change by Alexey Izbyshev :


--
nosy: +gregory.p.smith

___
Python tracker 
<https://bugs.python.org/issue43308>
___



[issue42738] subprocess: don't close all file descriptors by default (close_fds=False)

2021-10-27 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> As a concrete example, we have a (non-Python) build system and task runner 
> that orchestrates many tasks to run in parallel. Some of those tasks end up 
> invoking Python scripts that use subprocess.run() to run other programs. Our 
> task runner intentionally passes an inheritable file descriptor that is 
> unique to each task as a form of a keep-alive token; if the child processes 
> continue to pass inheritable file descriptors to their children, then we can 
> determine whether all of the processes spawned from a task have terminated by 
> checking whither the last open handle to that file descriptor has been 
> closed. This is particularly important when a processes exits before its 
> children, sometimes uncleanly due to being force killed by the system or by a 
> user.

I don't see how such a scheme could be usable in a general-purpose build 
system. Python is just one example of a program that closes file descriptors 
for child processes; there may be arbitrarily many more. In general, there is 
no guarantee that a descriptor will be inherited across a process tree if it 
contains at least one process running code that you don't fully control. To 
properly control process trees, I'd suggest considering 
prctl(PR_SET_CHILD_SUBREAPER) or PID namespaces on Linux and job objects on 
Windows (I don't know about other OSes); see the sketch below.
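
For example, on Linux a task runner can mark itself as a subreaper, so that 
orphaned descendants reparent to it and their exits can be observed (a minimal 
ctypes sketch; the constant comes from linux/prctl.h):

    import ctypes

    PR_SET_CHILD_SUBREAPER = 36  # from linux/prctl.h

    libc = ctypes.CDLL(None, use_errno=True)
    if libc.prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), 'prctl failed')
    # Orphaned descendants now reparent to this process rather than to
    # init, so their termination can be observed with os.wait().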

> Regarding security, PEP 446 already makes it so that any files opened from 
> within a Python program are non-inheritable by default, which I agree is a 
> good default. One can make the argument that it's not Python's job to enforce 
> a security policy on file descriptors that a Python process has inherited 
> from a parent process, since Python cannot distinguish from descriptors that 
> were accidentally or intentionally inherited.

While I agree that such an argument could be made, closing FDs for children 
also affects FDs opened by C extensions, which are not subject to PEP 446. And 
this is important not only for security: for example, it's easy to create a 
deadlock if one process is blocked on a read() from a pipe which it expects to 
be closed at the proper moment, but the pipe's write end is leaked to some 
unrelated long-running process that keeps it alive indefinitely without being 
aware of it (sketched below).
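
A minimal sketch of that deadlock pattern (Unix; 'sleep' stands in for the 
unrelated long-running process):

    import os
    import subprocess

    r, w = os.pipe()
    os.set_inheritable(w, True)

    # The unrelated long-lived process accidentally inherits the write end:
    subprocess.Popen(['sleep', '3600'], close_fds=False)

    os.close(w)    # we close our copy of the write end...
    os.read(r, 1)  # ...but this blocks forever: 'sleep' still holds w open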

And again, even if subprocess didn't close FDs by default (or somehow closed 
only non-inherited FDs), how could this be used in the wider picture? Any 
third-party library (even one written in Python) could still decide to close, 
e.g., some range of descriptors, so one wouldn't be able to rely on FD 
inheritance in a general setting, just as one can't rely on inheritance of 
environment variables (it works most of the time, but no promises). 
Traditional Unix things like inheritance across fork()/exec() and no O_CLOEXEC 
by default are convenient for quick hacks, but my impression is that many 
consider such defaults to be design bugs in a modern setting.

--

___
Python tracker 
<https://bugs.python.org/issue42738>
___



[issue44656] Dangerous mismatch between MAXPATHLEN and MAX_PATH on Windows

2021-07-16 Thread Alexey Izbyshev


New submission from Alexey Izbyshev :

In PC/getpathp.c CPython uses buffers of length MAXPATHLEN+1, which is 257 on 
Windows[1]. On Windows 7, where the PathCch* functions are not available, 
CPython <= 3.8 falls back to PathCombineW()/PathCanonicalizeW()[2]. Those 
functions assume that the destination buffer has room for MAX_PATH (260) 
characters. This creates a dangerous setup: for example, gotlandmark()[3] can 
overflow the destination if `filename` is long enough, and `filename` can be 
user-controlled.

I couldn't devise a simple way to trigger a buffer overflow in a default 
Python installation, though it is possible if one, for example, makes sure 
that the landmark file ("lib\os.py") can't be found in the default locations 
and then supplies their own, long enough paths via e.g. the PYTHONPATH 
environment variable, which eventually end up in gotlandmark(). Even when such 
a buffer overflow was triggered on my machine, I couldn't notice any change in 
behavior, probably because a 3-byte overrun is small enough not to overwrite 
anything important.

However, I'm not comfortable with this. Could we just raise MAXPATHLEN from 
256 to 260 on Windows to rule out this kind of issue for sure?

Please also note that while the issue described above affects only Python <= 
3.8 on Windows 7, I think it would make sense to increase MAXPATHLEN in newer 
versions too, to avoid any similar situations in the future (i.e. if two 
pieces of code interact and one of them uses MAX_PATH while the other uses 
MAXPATHLEN).

[1] 
https://github.com/python/cpython/blob/0389426fa4af4dfc8b1d7f3f291932d928392d8b/Include/osdefs.h#L13
[2] 
https://github.com/python/cpython/blob/0389426fa4af4dfc8b1d7f3f291932d928392d8b/PC/getpathp.c#L278
[3] 
https://github.com/python/cpython/blob/0389426fa4af4dfc8b1d7f3f291932d928392d8b/PC/getpathp.c#L333

--
components: Windows
messages: 397655
nosy: izbyshev, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: Dangerous mismatch between MAXPATHLEN and MAX_PATH on Windows
type: security
versions: Python 3.6, Python 3.7, Python 3.8

___
Python tracker 
<https://bugs.python.org/issue44656>
___



[issue47260] os.closerange() can be no-op in a seccomp sandbox

2022-04-08 Thread Alexey Izbyshev


Change by Alexey Izbyshev :


--
keywords: +patch
pull_requests: +30443
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/32418

___
Python tracker 
<https://bugs.python.org/issue47260>
___



[issue47260] os.closerange() can be no-op in a seccomp sandbox

2022-04-08 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> It's been years now and that hasn't happened, even with more recent flag 
> additions. I think it's safe to say it won't, and such a fallback upon error 
> won't put us back into a bogus pre-close_range situation where we're 
> needlessly close()ing a bunch of closed fds.

Yes, I also find it unlikely that close_range() will start reporting an error 
if it fails to close some fd, especially if not given some new flag. Such error 
reporting doesn't seem very useful to userspace because there is no way to 
associate the error with the fd.

But even if such a change happens, it will be just a performance regression, 
not a potential correctness/security issue.

--
keywords:  -patch
stage: patch review -> 

___
Python tracker 
<https://bugs.python.org/issue47260>
___



[issue47260] os.closerange() can be no-op in a seccomp sandbox

2022-04-08 Thread Alexey Izbyshev


Change by Alexey Izbyshev :


--
keywords: +patch
stage:  -> patch review

___
Python tracker 
<https://bugs.python.org/issue47260>
___



[issue47245] potential undefined behavior with subprocess using vfork() on Linux

2022-04-07 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

In short: both this bug report and [1] are invalid.

The reason why doing syscall(SYS_vfork) is illegal is explained by Florian 
Weimer in [2]:

>The syscall function in glibc does not protect the on-stack return address 
>against overwriting, so it can't be used to call SYS_vfork on x86.

This is off-topic here because CPython calls vfork() via libc, but I'll still 
expand on Florian's comment. Suppose one wants to write a my_vfork() wrapper 
over the vfork syscall. When the vfork syscall returns the first time, 
my_vfork() has to return to its caller. This necessarily involves knowing the 
return address. On some architectures this return address is stored on the 
stack by the caller (e.g. x86). The problem is that any data in the my_vfork() 
stack frame can be overwritten by its caller once it returns the first time. 
Then, when the vfork syscall returns the second time, my_vfork() could be 
unable to return to its caller because the data it fetches from its (now 
invalid) stack frame is garbage. This is precisely what happens when one 
implements my_vfork() as syscall(SYS_vfork). To avoid this, the most common 
strategy is to store the return address into a register that's guaranteed to 
be preserved around the syscall by the OS ABI. For example, the x86-64 musl 
implementation [3] stores the return address in rdx (which is preserved around 
the syscall) and then restores it after the syscall (both on the first and the 
second return of the syscall).

Now back to CPython. The real problem with stack sharing between the child and 
the parent could be due to compiler bugs, e.g. if a variable stored on the 
stack is modified in the child branch of "if (vfork())", but the compiler 
assumes that some other variable sharing the stack slot with the first one is 
*not* modified in the parent branch (i.e. after vfork() returns the second 
time). But all production compilers handle vfork() (and setjmp(), which has a 
similar issue) in a special way to avoid this, and GCC has 
__attribute__((returns_twice)) that a programmer could use for custom functions 
behaving in this way (my_vfork() would have to use this attribute).

Regarding a separate stack for the child and clone(CLONE_VM|CLONE_VFORK): I 
considered this in #35823, but it has its own problems. The most important one 
is that CPython would be in the business of choosing the correct size for the 
child's stack, but CPython is *not* in control of all the code executing in 
the child because it calls into libc. In practice, people use various 
LD_PRELOAD-based software that wraps libc functions (e.g. the Scratchbox 2 
build environment for Sailfish OS is an LD_PRELOAD-based sandbox), so even 
common-sense assumptions like "execve() in libc can't use a lot of stack!" 
might turn out to be wrong. CPython *could* work around that by using direct 
syscalls for everything in the child, or by using some "large" size that 
"should be enough for everybody", but I wouldn't want to see that unless we 
have a real problem with vfork(). Note that vfork()-based posix_spawn() 
implementations in C libraries do *not* have this problem because they fully 
control all code in the child (e.g. they would use a non-interposable execve() 
symbol or a direct syscall).

> Immediate action item: Add a way for people to disable vfork at runtime by 
> setting a flag in the subprocess module, just in case.

I don't think any action is needed at all, and I think this issue should be 
closed.

> Our current assumptions around the use of vfork() are very much glibc 
> specific.

Could you clarify which glibc-specific assumptions you mean? In #35823 I tried 
to make as few assumptions as possible.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=215813
[2] https://bugzilla.kernel.org/show_bug.cgi?id=215813#c2
[3] 
https://git.musl-libc.org/cgit/musl/tree/src/process/x86_64/vfork.s?id=ced75472d7e3d73d5b057e36ccbc7b7fcba95104

--
nosy: +izbyshev

___
Python tracker 
<https://bugs.python.org/issue47245>
___



[issue35823] Use vfork() in subprocess on Linux

2022-04-08 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

The preceding comment is wrong; see the discussion in #47245 and 
https://bugzilla.kernel.org/show_bug.cgi?id=215813#c14 for an explanation of 
why that bug report is irrelevant for CPython.

--

___
Python tracker 
<https://bugs.python.org/issue35823>
___



[issue47245] potential undefined behavior with subprocess using vfork() on Linux?

2022-04-08 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> As for glibc specifics, I'm mostly thinking of the calls we do in the child.

> According to the "Standard Description (POSIX.1)" calls to anything other 
> than `_exit()` or `exec*()` are not allowed.  But the longer "Linux 
> Description" in that vfork(2) man page does not say that.  Which implies 
> merely by omission that calls to other things are okay so long as you 
> understand everything they do to the process heap/stack/state.  (I wish it 
> were *explicit* about that)

If we're talking about the kernel side of things, sure, we rely on Linux being 
"sane" here, though I suppose on *BSDs the situation is similar.

> Some of the calls we do from our child_exec() code... many are likely "just" 
> syscall shims and thus fine - but that is technically up to libc.

Yes, but I wouldn't say that "being just syscall shims" is specific to glibc. 
It's a "natural" property that just about any libc is likely to possess. 
(Yeah, I know, those are vague words, but in my experience "glibc-specific" is 
usually applied to some functionality/bug that is present in glibc and absent 
in other libcs, and I don't think we rely on anything like that.)

Of course, there are also LD_PRELOAD things that could be called instead of 
libc, but the good news here is that we don't create new constraints for them 
(CPython is not the only software that uses vfork()), and they're on their own 
otherwise.

> A few others are Py functions that go elsewhere in CPython and while they may 
> be fine for practical reasons today with dangerous bits on conditional 
> branches that technically should not be possible to hit given the state by 
> the time we're at this point in _posixsubprocess, pose a future risk - anyone 
> touching implementations of those is likely unaware of vfork'ed child 
> limitations that must be met.

We already have an async-signal-safety requirement for all such code because 
of fork(). The requirements of vfork() are a bit stricter, but at least the 
set of functions we have to watch for dangerous changes is the same. And I 
suspect that most practical violations of vfork()-safety also violate 
async-signal-safety.

> For example if one of the potential code paths that trigger an indirect 
> Py_FatalError() is hit... that fatal exit code is definitely not 
> post-vfork-child safe.  The pre-exec child dying via that could screw up the 
> vfork parent process's state.

Yeah, and it can break the fork parent too, at least because it uses exit() 
(not _exit()), so stdio buffers will be flushed twice, in the child and in the 
parent.

--

___
Python tracker 
<https://bugs.python.org/issue47245>
___



[issue47245] potential undefined behavior with subprocess using vfork() on Linux?

2022-04-08 Thread Alexey Izbyshev


Alexey Izbyshev  added the comment:

> 3. We have to fix error-path in order not to change heap state (contents and 
> allocations), possibly do not touch locks. During vfork() child execution - 
> the only parent THREAD (not the process) is blocked. For example, it's not 
> allowed to touch GIL. Child process may die unexpectedly and leave GIL 
> locked. Is it possible to rewrite children path for vfork() case without any 
> Py* calls ? As an idea, we can prepare all low-level things (all the pointers 
> to strings and plain values) before vfork(), so child code will use only that 
> data.

What specifically do you propose to fix? There is no problem with the GIL if 
the child dies, because the GIL is locked and unlocked only by the parent, and 
the child never touches it. Similarly, only Py_* calls known to be safe are 
used. As for "pointers to strings", it's not clear to me what you mean, but if 
you mean allocations, they are already done before (v)fork(), since the child 
code is required to be async-signal-safe even when plain fork() is used.

--

___
Python tracker 
<https://bugs.python.org/issue47245>
___



[issue47260] os.closerange() can be no-op in a seccomp sandbox

2022-04-08 Thread Alexey Izbyshev


New submission from Alexey Izbyshev :

After #40422 _Py_closerange() assumes that close_range() closes all file 
descriptors even if it returns an error (other than ENOSYS):

    if (close_range(first, last, 0) == 0 || errno != ENOSYS) {
        /* Any errors encountered while closing file descriptors are ignored;
         * ENOSYS means no kernel support, though,
         * so we'll fallback to the other methods. */
    }
    else
        /* fallbacks */


This assumption can be wrong on Linux if a seccomp sandbox denies the 
underlying syscall, pretending that it returns EPERM or EACCES. In this case 
_Py_closerange() won't close any descriptors at all, which in the worst case 
can be a security issue.

I propose to fix this by falling back to the other methods in case of *any* 
close_range() error (a sketch of the intended logic follows the references 
below). Note that the fallbacks will not be triggered by problems with closing 
individual file descriptors, because close_range() is documented to ignore 
such errors on both Linux[1] and FreeBSD[2].

[1] https://man7.org/linux/man-pages/man2/close_range.2.html
[2] https://www.freebsd.org/cgi/man.cgi?query=close_range&sektion=2
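
A Python-level sketch of the intended control flow (the actual fix is in C, in 
_Py_closerange(); this only illustrates the idea):

    import os

    def closerange_robust(first, last):
        # Trust close_range() only on success; on *any* error (ENOSYS, or
        # EPERM/EACCES faked by a seccomp sandbox), fall back to closing
        # descriptors one by one.
        try:
            os.close_range(first, last)  # new in Python 3.10
            return
        except OSError:
            pass
        for fd in range(first, last + 1):
            try:
                os.close(fd)
            except OSError:
                pass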

--
assignee: izbyshev
components: Library (Lib)
keywords: 3.10regression
messages: 416986
nosy: gregory.p.smith, izbyshev, kevans, kevans91
priority: normal
severity: normal
status: open
title: os.closerange() can be no-op in a seccomp sandbox
type: behavior
versions: Python 3.10, Python 3.11

___
Python tracker 
<https://bugs.python.org/issue47260>
___


