Guido van Rossum added the comment:

I'm trying to let go of the AIX hang. Here's a brain dump of what I've figured 
out so far.

* There were a lot of red herrings in the early discussion. This hang doesn't 
seem to have anything to do with nonblocking connect() or sockets, nor even 
signals.

* Summary of what the test (test_subprocess_interactive) tries to do: it starts 
an echo subprocess, writes a string to it, reads the string back, writes 
another string to it, reads that back, and then closes the transport.

* The test hangs after seeing the first string echoed back but not the second, 
and in between somehow the stdin pipe is broken.

* If I read David's truss log correctly, the following things have happened:

- the parent wrote 'Python ' to the pipe for the subprocess's stdin (this is 
not shown in the extract but it must have happened because we see the string 
arrive in the subprocess)
- the echo.py subprocess started and began to read from stdin
- the subprocess read 'Python ' from its stdin
- the subprocess wrote 'Python ' back to its stdout
- poll() in the parent woke up
- the parent allocated some memory and read 'Python ' from the pipe for the 
subprocess's stdout

At this point the pipe for the subprocess's stdin apparently got closed, so the 
subprocess received an EOF (over and over, because the read loop in echo.py was 
missing the EOF test and break).
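
For illustration, here is a minimal sketch of an echo loop that does have the 
EOF test and break. It is not necessarily the exact code in echo.py; the 
function name echo_loop and its parameters are assumptions made for this 
example. The point is only that os.read() returns an empty bytes object at EOF, 
and without the check the loop spins forever on that empty read.

```python
import os

def echo_loop(in_fd, out_fd, bufsize=1024):
    """Copy bytes from in_fd to out_fd until in_fd reaches EOF.

    Hypothetical helper for illustration; an echo script would call
    echo_loop(0, 1) to copy stdin to stdout.
    """
    while True:
        buf = os.read(in_fd, bufsize)
        if not buf:
            # os.read() returns b'' once the writer has closed the pipe.
            # Without this test the loop would keep reading EOF forever.
            break
        os.write(out_fd, buf)
```

With the check in place the subprocess exits as soon as its stdin is closed, 
which should also make the truss output far shorter.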

We also know that the parent now hangs in the last run_until_complete() call, 
which means that it has at least attempted to write 'The Winner'. There is no 
evidence of that write in the truss extract, so the string may still be sitting 
in the transport's write buffer. It is also possible that David simply missed 
it in the endless stream of ineffective calls caused by the looping bug.
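
As a point of comparison, the exchange the test expects can be reproduced 
without any event loop. The sketch below uses plain subprocess.Popen and 
blocking reads; the names ECHO_CHILD, read_exactly and round_trip are made up 
for this example, and the inline child script is a stand-in for echo.py, not 
the file in the repo. If this works on the affected machine, the pipes 
themselves are fine and the problem is in how the parent drives them.

```python
import os
import subprocess
import sys

# Inline stand-in for echo.py (an assumption, not the file in the repo).
ECHO_CHILD = (
    "import os\n"
    "while True:\n"
    "    buf = os.read(0, 1024)\n"
    "    if not buf:\n"
    "        break\n"
    "    os.write(1, buf)\n"
)

def read_exactly(fd, n):
    """Read n bytes from fd, stopping early only at EOF."""
    chunks = []
    while n > 0:
        chunk = os.read(fd, n)
        if not chunk:
            break
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def round_trip(messages):
    """Send each message to the echo child and collect what comes back."""
    proc = subprocess.Popen(
        [sys.executable, "-c", ECHO_CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        bufsize=0,  # unbuffered pipes, so each write goes out immediately
    )
    echoed = []
    try:
        for msg in messages:
            proc.stdin.write(msg)
            echoed.append(read_exactly(proc.stdout.fileno(), len(msg)))
    finally:
        proc.stdin.close()   # the child sees EOF and leaves its loop
        proc.stdout.close()
        returncode = proc.wait()
    return echoed, returncode
```

Calling round_trip([b'Python ', b'The Winner']) performs the same two 
write/read steps as the test and then closes the child's stdin.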

I'm actually curious why it seems that poll() keeps returning 0 in the parent 
-- shouldn't it have an infinite timeout, since there's nothing left to do?
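
To make the question concrete: at the C level poll() returns 0 when its timeout 
expires with nothing ready, and Python's select.poll reports the same situation 
as an empty list. The sketch below (the helper name poll_once is an assumption 
for this example) shows both outcomes on an ordinary pipe. A timeout of None 
would instead block until an event arrives, which is what one would expect from 
a loop with nothing else scheduled.

```python
import os
import select

def poll_once(fd, timeout_ms):
    """Poll fd for readability; return the list of ready (fd, event) pairs."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return poller.poll(timeout_ms)

r, w = os.pipe()
idle = poll_once(r, 10)    # nothing written yet: times out, empty list
os.write(w, b"x")
ready = poll_once(r, 10)   # data pending: returns without waiting
os.close(r)
os.close(w)
```

So a steady stream of poll() calls returning 0 suggests the parent is passing a 
finite timeout each time, which in turn suggests something is still scheduled.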

Another theory is that one or more *connection_lost() methods on the protocol 
are actually being called, but the test stubbornly keeps waiting until 
proto.got_data[1] becomes set.
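
This theory can be sketched as follows. SketchProtocol is a hypothetical 
stand-in, not the protocol class used by the test; the only assumption carried 
over from the test is that got_data[1] is an event that gets set when data 
arrives on fd 1. If the connection-lost callback records the loss but never 
sets that event, a caller waiting on the event never wakes up. The sketch uses 
a timeout so that it terminates.

```python
import asyncio

class SketchProtocol:
    """Hypothetical stand-in for the test's protocol, not the real class."""

    def __init__(self):
        self.got_data = {1: asyncio.Event()}
        self.disconnected = asyncio.Event()

    def pipe_data_received(self, fd, data):
        self.got_data[fd].set()

    def pipe_connection_lost(self, fd, exc):
        # Records the loss but does not touch got_data[fd].
        self.disconnected.set()

async def wait_for_data(proto, timeout):
    """Return True if data arrived, False if the wait timed out."""
    try:
        await asyncio.wait_for(proto.got_data[1].wait(), timeout)
        return True
    except asyncio.TimeoutError:
        return False

async def demo():
    proto = SketchProtocol()
    proto.pipe_connection_lost(1, None)  # pipe closes, no data ever arrives
    got = await wait_for_data(proto, 0.05)
    return got, proto.disconnected.is_set()
```

Here asyncio.run(demo()) reports that the wait timed out even though the loss 
was recorded. Without the timeout, that wait would be a hang of the kind seen 
in the test.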

I'd be very interested in the truss output with the fix to echo.py in place 
(which is now in the repo).

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19293>
_______________________________________