[issue18120] multiprocessing: garbage collector fails to GC Pipe() end when spawning child process

2013-06-03 Thread spresse1

spresse1 added the comment:

Oooh, thanks.  I'll use that.

> But really, this sounds rather fragile.

Absolutely.  I concur there is no good way to do this.

--

___
Python tracker 
<http://bugs.python.org/issue18120>



[issue18120] multiprocessing: garbage collector fails to GC Pipe() end when spawning child process

2013-06-03 Thread spresse1

spresse1 added the comment:

> I don't see how using os.fork() would make things any easier.  In either 
> case you need to prepare a list of fds which the child process should 
> close before it starts, or alternatively a list of fds *not* to close.

With fork() I control where the processes diverge much more readily.  I could 
create the pipe in the main process, fork, close unnecessary fds, then call 
into the class that represents the operation of the subprocess (i.e., do it the 
C way).  This way the class never needs to know about pipes it doesn't care 
about, and I can ensure that unnecessary pipes get closed.  So I get the clean, 
understandable semantics I was after and my pipes get closed.  The only thing I 
lose is Windows interoperability.
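
Concretely, something like this is what I have in mind (a rough sketch; Worker 
is a hypothetical stand-in for the class that represents the operation of the 
subprocess):

    import os
    from multiprocessing import Pipe

    class Worker:
        # Hypothetical stand-in for the subprocess's logic.
        def run(self, reader):
            try:
                while True:
                    print("child got:", reader.recv())
            except EOFError:
                pass  # every write end is closed: clean shutdown

    reader, writer = Pipe(False)   # create the pipe in the main process

    pid = os.fork()
    if pid == 0:
        writer.close()             # child drops the end it must not hold open
        Worker().run(reader)       # then call into the subprocess class
        os._exit(0)
    else:
        reader.close()             # parent drops the end it doesn't use
        writer.send("work item")
        writer.close()             # EOF can now reach the child
        os.waitpid(pid, 0)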

I could reimplement the close_all_fds_except() call in straight Python, using 
os.closerange().  That seems like a reasonable solution, if a bit of a hack.  
However, given that pipes are exposed by multiprocessing, it might make sense 
to try to get this function incorporated into multiprocessing itself?
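
For illustration, a reimplementation might look roughly like this (my own 
sketch; capping the range with SC_OPEN_MAX is an assumption about how to bound 
the fd numbers):

    import os

    def close_all_fds_except(keep):
        # Close every fd except those listed in `keep`.  Sorting the kept
        # fds and closing the gaps between them with os.closerange() avoids
        # one os.close() call per possible descriptor.
        maxfd = os.sysconf("SC_OPEN_MAX")
        prev = -1
        for fd in sorted(set(keep)) + [maxfd]:
            os.closerange(prev + 1, fd)   # closes [prev + 1, fd), skips fd
            prev = fd

    # e.g. in a freshly forked child, keep only stdio and the pipe's fd:
    # close_all_fds_except([0, 1, 2, reader.fileno()])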

I also think that with introspection it would be possible for the 
multiprocessing module to be aware of which file descriptors are still actively 
referenced (i.e., treat 0, 1 and 2 as always referenced, then introspect 
through objects in the child to see if they have a fileno() method).  However, 
I can't state this as a certainty without going off and actually implementing 
such a version.  Additionally, I can make absolutely no promises as to the 
speed of this.  Perhaps, if it functioned, it would be an option one could turn 
on for cases like mine.
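
A crude sketch of the introspection idea, just to make it concrete (this walks 
the garbage collector's view of live objects and is certainly not fast; all of 
it is my speculation, not anything multiprocessing does today):

    import gc

    def referenced_fds():
        # fds 0, 1 and 2 (stdio) are always treated as referenced.
        fds = {0, 1, 2}
        for obj in gc.get_objects():
            try:
                fileno = getattr(obj, "fileno", None)
                if callable(fileno):
                    fds.add(fileno())
            except Exception:
                pass  # closed, or fileno() unsupported on this object
        return fds

    # Anything open in the process but absent from referenced_fds()
    # would be a candidate for closing in the child.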

--

___
Python tracker 
<http://bugs.python.org/issue18120>



[issue18120] multiprocessing: garbage collector fails to GC Pipe() end when spawning child process

2013-06-02 Thread spresse1

spresse1 added the comment:

I'm actually a *nix programmer by trade, so I'm pretty familiar with that 
behavior =p  However, I'm also used to inheriting some way to refer to these 
fds, so that I can close them.  Perhaps I've just missed a call somewhere to 
ask the process for a list of its open fds?  That would, to me, be an 
acceptable workaround - I could close all the fds I didn't wish to inherit.
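
On Linux, the closest thing I know of is reading /proc/self/fd, along these 
lines (my own sketch, not a multiprocessing API; keep_fd is a hypothetical 
name):

    import os

    def open_fds():
        # Each entry in /proc/self/fd is an fd currently open in this
        # process (Linux-specific; the fd used to read the directory
        # listing itself also shows up and can be ignored).
        return sorted(int(name) for name in os.listdir("/proc/self/fd"))

    # In the child: close everything except stdio and the ends we use, e.g.
    # for fd in open_fds():
    #     if fd > 2 and fd != keep_fd:
    #         os.close(fd)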

What's really bugging me is that the fd remains open and I can't fetch a 
reference to it.  If I could do either of those, I'd be happy.

Maybe this is more an issue with the semantics of multiprocessing?  In that 
this behavior is perfectly reasonable with os.fork(), but causes difficulty 
here.

Perhaps I really want to be implementing with os.fork().  Sigh, I was trying to 
save myself some effort...

--

___
Python tracker 
<http://bugs.python.org/issue18120>



[issue18120] multiprocessing: garbage collector fails to GC Pipe() end when spawning child process

2013-06-02 Thread spresse1

spresse1 added the comment:

>> So you're telling me that when I spawn a new child process, I have to 
>> deal with the entirety of my parent process's memory staying around 
>> forever?
>
> With a copy-on-write implementation of fork() this is quite likely to use 
> less memory than starting a fresh process for the child process.  And 
> it is certainly much faster.

Fair enough.

>> I would have expected this to call to fork(), which gives the child 
>> plenty of chance to clean up, then call exec() which loads the new 
>> executable.
>
> There is an experimental branch (http://hg.python.org/sandbox/sbt) 
> which optionally behaves like that.  Note that "clean up" means close 
> all fds not explicitly passed, and has nothing to do with garbage 
> collection.

I appreciate the pointer, but I am writing code intended for distribution - 
using an experimental branch isn't useful.

What I'm still trying to grasp is why Python explicitly leaves the parent 
process's info around in the child.  It seems like there is no benefit 
(besides, perhaps, speed), and that this choice leads to non-intuitive 
behavior - like this.

--

___
Python tracker 
<http://bugs.python.org/issue18120>



[issue18120] multiprocessing: garbage collector fails to GC Pipe() end when spawning child process

2013-06-02 Thread spresse1

spresse1 added the comment:

So you're telling me that when I spawn a new child process, I have to deal with 
the entirety of my parent process's memory staying around forever?  I would 
have expected this to call fork(), which gives the child plenty of chance to 
clean up, then call exec(), which loads the new executable.  Either that, or 
the same instance of the Python interpreter is used, just with the knowledge 
that it should execute the child function and then exit.  Keeping all the state 
that will never be used in the second case seems sloppy on the part of Python.

The semantics in this case are much better if the pipe gets GC'd.  I see no 
reason my child process should have to know about pipe ends it never uses in 
order to close them.

--

___
Python tracker 
<http://bugs.python.org/issue18120>



[issue18120] multiprocessing: garbage collector fails to GC Pipe() end when spawning child process

2013-06-02 Thread spresse1

spresse1 added the comment:

The difference is that nonfunctional.py does not pass the write end of the 
parent's pipe to the child.  functional.py does, and closes it immediately 
after breaking into a new process.  This is what you mentioned to me as a 
workaround.  Corrected code (for indentation) attached.
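
A minimal sketch of that workaround, as I understand it (names are 
illustrative, not the actual contents of functional.py; Python 3 syntax):

    import multiprocessing

    class Reader(multiprocessing.Process):
        # The write end is passed in solely so the child can close it.
        def __init__(self, reader, writer):
            super().__init__()
            self.reader = reader
            self.writer = writer

        def run(self):
            self.writer.close()   # the workaround: drop the inherited write end
            try:
                while True:
                    print("got:", self.reader.recv())
            except EOFError:
                pass              # reachable now that no write end survives

    if __name__ == "__main__":
        reader, writer = multiprocessing.Pipe(False)
        p = Reader(reader, writer)
        p.start()
        writer.send("hello")
        writer.close()            # parent drops its write end too
        p.join()                  # child sees EOFError and exits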

Why SHOULDN'T I expect this pipe to be closed automatically in the child?  Per 
the documentation for multiprocessing.Connection.close():
"This is called automatically when the connection is garbage collected."

The write end of that pipe goes out of scope and has no references in the child 
process.  Therefore, per my understanding, it should be garbage collected (in 
the child process).  Where am I wrong about this?

--
Added file: http://bugs.python.org/file30449/bugon.tar.gz

___
Python tracker 
<http://bugs.python.org/issue18120>



[issue18120] multiprocessing: garbage collector fails to GC Pipe() end when spawning child process

2013-06-02 Thread spresse1

spresse1 added the comment:

Now also tested with source-built Python 3.3.2.  Issue still exists, same 
example files.

--

___
Python tracker 
<http://bugs.python.org/issue18120>



[issue18120] multiprocessing: garbage collector fails to GC Pipe() end when spawning child process

2013-06-02 Thread spresse1

New submission from spresse1:

[Code demonstrating issue attached]

When subclassing multiprocessing.Process and using pipes, a pipe end created in 
the parent is not properly garbage collected in the child.  This causes the 
write end of the pipe to be held open with no reference to it in the child 
process, and therefore no way to close it.  As a result, the read end can never 
raise EOFError.

Expected behavior:
1. Create a pipe with multiprocessing.Pipe(False)
2. Pass read end to a class which subclasses multiprocessing.Process
3. Close write end in parent process
4. Receive EOFError from read end

Actual behavior:
1. Create a pipe with multiprocessing.Pipe(False)
2. Pass read end to a class which subclasses multiprocessing.Process
3. Close write end in parent process
4. Never receive EOFError from read end
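
The failing pattern boils down to something like the following (an illustrative 
sketch, not the attached file itself; the step numbers match the lists above, 
and Python 3 syntax is used):

    import multiprocessing

    class Child(multiprocessing.Process):
        # Subclass that is only ever handed the read end.
        def __init__(self, reader):
            super().__init__()
            self.reader = reader

        def run(self):
            try:
                while True:
                    print("got:", self.reader.recv())
            except EOFError:
                print("EOF")  # never reached: the forked child still holds
                              # a duplicate of the write end, with no usable
                              # reference to it

    if __name__ == "__main__":
        reader, writer = multiprocessing.Pipe(False)   # step 1
        p = Child(reader)                              # step 2
        p.start()
        writer.send("hello")
        writer.close()                                 # step 3
        p.join()                                       # step 4: blocks forever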

Examining the processes in /proc/[pid]/fd/ indicates that a write pipe is 
still open in the child process, though none should be.  Additionally, no write 
pipe is open in the parent process.  It is my belief that this is the write 
pipe created in the parent, remaining around incorrectly in the child, even 
though there are no references to it.

Tested on Python 2.7.3 and 3.2.3.

--
components: Library (Lib)
files: bugon.tar.gz
messages: 190492
nosy: spresse1
priority: normal
severity: normal
status: open
title: multiprocessing: garbage collector fails to GC Pipe() end when spawning child process
versions: Python 2.7, Python 3.2
Added file: http://bugs.python.org/file30448/bugon.tar.gz

___
Python tracker 
<http://bugs.python.org/issue18120>