[issue34780] Hang on startup if stdin refers to a pipe with an outstanding concurrent operation on Windows

2020-01-19 Thread Zachary Ware


Change by Zachary Ware:


--
versions: +Python 3.9 -Python 2.7, Python 3.6

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com




2018-10-15 Thread Alexey Izbyshev


Alexey Izbyshev added the comment:

Ping!

Thanks to @eryksun for providing feedback here and for the patch review.





2018-09-23 Thread Eryk Sun


Eryk Sun added the comment:

> lseek() succeeds on pipes on Windows, but is nearly useless

lseek isn't meaningful for pipe and character files (i.e. FILE_TYPE_PIPE and 
FILE_TYPE_CHAR). While SEEK_SET and SEEK_CUR operations trivially succeed in 
these cases, the underlying device simply ignores the current file position. I 
think it would be reasonable to fail these cases instead of succeeding 
misleadingly.
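
As a rough illustration of what "failing these cases" could look like at the
Python level, here is a minimal sketch. The name strict_lseek() is
hypothetical and not part of CPython, and the sketch assumes that os.fstat()
reports S_IFIFO or S_IFCHR for such descriptors on the platform in question.

```python
import errno
import os
import stat


def strict_lseek(fd, offset, whence=os.SEEK_SET):
    """Like os.lseek(), but refuse pipes and character devices.

    Hypothetical helper for illustration; not part of CPython.
    """
    mode = os.fstat(fd).st_mode
    if stat.S_ISFIFO(mode) or stat.S_ISCHR(mode):
        # Mirror Unix, where lseek() on a pipe fails with ESPIPE.
        raise OSError(errno.ESPIPE, os.strerror(errno.ESPIPE))
    return os.lseek(fd, offset, whence)
```

Regular files still seek as usual, while a descriptor returned by os.pipe()
raises OSError with errno set to ESPIPE.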

When a file is opened for synchronous access, its FilePositionInformation is 
managed by the I/O manager, not the device or file system. All the I/O manager 
does is get or set the CurrentByteOffset value in the File object [1]. It 
doesn't matter whether the device actually uses this information. 

Regarding the observed SEEK_END behavior, the named-pipe file system (NPFS) 
supports querying the FileStandardInformation of a pipe, for which it sets the 
EndOfFile value to the number of bytes available to be read from the pipe's 
inbound (server-side) queue. So SEEK_END (or WinAPI FILE_END) does give us some 
information, but it's misleading because the seek itself is meaningless.

[1]: https://msdn.microsoft.com/en-us/library/windows/hardware/ff545834





2018-09-23 Thread Alexey Izbyshev


Change by Alexey Izbyshev:


--
keywords: +patch
pull_requests: +8924
stage:  -> patch review





2018-09-23 Thread Alexey Izbyshev


New submission from Alexey Izbyshev:

In the following code, inspired by a production issue I had to debug recently, 
subprocess.call() never returns:

import os
import subprocess
import sys
import time

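# Both children inherit the read end of the same pipe as stdin.
r, w = os.pipe()
# p1 blocks in a synchronous read on the pipe.
p1 = subprocess.Popen([sys.executable, '-c',
                       'import sys; sys.stdin.read()'],
                      stdin=r)

# Give p1 time to start reading.
time.sleep(1)
# Hangs on Windows: the second interpreter blocks on stdin during startup.
subprocess.call([sys.executable, '-c', ''], stdin=r)

os.close(w)
p1.wait()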

The underlying reason is the same as in #22976. Python performs certain 
operations on stdin during its initialization (different in 2.7 and 3.x), 
which block because there is an outstanding ReadFile() on the pipe end that 
stdin refers to. Assuming that subprocess.call() runs some app that doesn't use 
stdin at all, if a developer doesn't control how the app is run (which was my 
case), I don't see any way to work around this in pure Python. (An obvious 
workaround is to make a wrapper which closes stdin or redirects it to something 
else, but this wrapper can't be run with CPython.)
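
For the case where the caller does control the launch, the redirection can be
done directly at the call site. The following is a minimal sketch for 3.x, and
call_without_stdin() is a hypothetical helper name. Because the child is handed
the null device instead of the shared pipe, its startup should not touch the
pipe at all.

```python
import subprocess
import sys


def call_without_stdin(args, **kwargs):
    """Run args with stdin redirected to the null device.

    Hypothetical helper: only applicable when the caller controls how the
    child process is launched.
    """
    return subprocess.call(args, stdin=subprocess.DEVNULL, **kwargs)


# The child does not inherit the shared pipe as its stdin.
rc = call_without_stdin([sys.executable, '-c', ''])
```

This does not help when the launch is done by third-party code, which is the
situation described above.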

I propose to fix this in CPython. The details are slightly different for 2.7 
and 3.x.

2.7 calls fstat(stdin) in dircheck() (Objects/fileobject.c). This hangs because 
msvcrt calls PeekNamedPipe() if stdin refers to a pipe. Ironically, this 
fstat() call is completely useless on Windows because msvcrt never sets S_IFDIR 
in st_mode (it can't distinguish between a file and a directory because it uses 
GetFileType() and doesn't perform extra checks). I've implemented a PR that 
skips dircheck() on Windows. (If we do want to add a proper dircheck() to 2.7, 
it should do something similar to 3.x).

3.x performs the directory check without relying on fstat(), but it also calls 
lseek() in _buffered_init() (Modules/_io/bufferedio.c); if that call is 
removed, there is another one in _io_TextIOWrapper___init___impl() 
(Modules/_io/textio.c). msvcrt calls SetFilePointerEx(), which hangs too. This 
is somewhat surprising because its docs [1] say:

You cannot use the SetFilePointerEx function with a handle to a nonseeking 
device such as a pipe or a communications device.

The wording is unclear though -- it doesn't say what happens if I try. lseek() 
docs [2] contain the following:

On devices incapable of seeking (such as terminals and printers), the return 
value is undefined.

In practice, lseek() succeeds on pipes on Windows, but is nearly useless:

Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:59:51) [MSC v.1914 64 bit 
(AMD64)] on win32
>>> import os
>>> r, w = os.pipe()
>>> os.write(w, b'xyz')
3
>>> os.lseek(r, 0, os.SEEK_CUR)
0
>>> os.lseek(r, 0, os.SEEK_END)
3
>>> os.lseek(r, 2, os.SEEK_SET)
2
>>> os.read(r, 1)
b'x'
>>> os.lseek(r, 0, os.SEEK_CUR)
2
>>> os.read(r, 1)
b'y'
>>> os.lseek(r, 0, os.SEEK_CUR)
2
>>> os.lseek(r, 0, os.SEEK_END)
1

So lseek() can be used to check how many bytes are currently buffered in the 
pipe, and that seems to be about it. Given the above, I suggest two solutions 
for the hang on Windows:

1) Make lseek() fail on pipes on Windows, as it does on Unix. A number of 
projects have already done that:

https://referencesource.microsoft.com/#mscorlib/system/io/filestream.cs,1029
https://go.googlesource.com/go/+/ce58a39fca067a19c505220c0c907ccf32793427/src/syscall/syscall_windows.go#374
https://trac.ffmpeg.org/ticket/986 (workaround: 
https://lists.ffmpeg.org/pipermail/ffmpeg-cvslog/2012-June/051590.html)
https://github.com/erikd/libsndfile/blob/123cb9f9a5a356b951a23e9e2ab8527f967425cc/src/file_io.c#L266

2) Delay lseek() until it's really needed. In both cases (BufferedIO and 
TextIO), lseek() is used to set some cached fields, so ISTM it's not necessary 
to do it during initialization. This would also be an optimization (skip 
lseek() syscall until a user really wants to tell()/seek()). This can be done 
as a sole fix or can be combined with the above (as an optimization).
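
To make option 2 concrete, here is a pure-Python sketch of the idea. The real
code lives in C in Modules/_io, so this is only an illustration, and the class
name LazyPositionReader is hypothetical. The cached position stays unknown
until tell() is first called, so constructing the object and reading from it
never call lseek().

```python
import os


class LazyPositionReader:
    """Raw reader that defers lseek() until tell() is first called.

    Hypothetical illustration only; CPython's buffered and text layers
    are implemented in C and differ in detail.
    """

    def __init__(self, fd):
        self._fd = fd
        self._pos = None  # unknown until someone asks for it

    def read(self, size):
        data = os.read(self._fd, size)
        if self._pos is not None:
            # Keep the cache in sync once it has been initialized.
            self._pos += len(data)
        return data

    def tell(self):
        if self._pos is None:
            # First use: the only place that calls lseek().
            self._pos = os.lseek(self._fd, 0, os.SEEK_CUR)
        return self._pos
```

With this shape, a process that never calls tell() or seek() on stdin never
issues the seek that hangs in the scenario above.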

I'd like to hear other people's opinions before doing anything for Python 3.

[1] 
https://docs.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-setfilepointerex
[2] 
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/lseek-lseeki64

--
components: IO, Windows
messages: 326175
nosy: eryksun, izbyshev, paul.moore, steve.dower, tim.golden, vstinner, 
zach.ware
priority: normal
severity: normal
status: open
title: Hang on startup if stdin refers to a pipe with an outstanding concurrent 
operation on Windows
type: behavior
versions: Python 2.7, Python 3.6, Python 3.7, Python 3.8
