Josh Rosenberg added the comment:

The nature of a Pool precludes assumptions about the availability of specific 
objects in a forked worker process (particularly now that there are alternate 
start methods for launching workers). Since the workers are spun up when the 
pool is created, objects created or modified after that point have to be 
serialized by some mechanism anyway to reach them.
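
To illustrate the timing issue, here is a minimal sketch (hypothetical names 
`data` and `lookup`, and assuming the fork start method, which is the default 
on Linux): only state that exists when the Pool is created is visible to the 
workers.

    import multiprocessing

    data = {}  # module-level state; forked workers snapshot whatever is here at Pool() time

    def lookup(key):
        # Runs in a worker process: it only sees the snapshot taken when the Pool was created.
        return data.get(key, "missing")

    if __name__ == "__main__":
        data["before"] = "created before the pool"
        with multiprocessing.Pool(2) as pool:
            data["after"] = "created after the pool"  # parent-only; workers never see this
            print(pool.map(lookup, ["before", "after"]))
            # With the fork start method: ['created before the pool', 'missing']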

While the Pool documentation doesn't describe this explicitly, there are 
multiple references to the behavior (e.g. 
https://docs.python.org/3/library/multiprocessing.html#all-start-methods 
mentions that inheriting is more efficient than pickling/unpickling, though you 
have to design with inheritance in mind; pickling, particularly for the task 
dispatch used in the "Futures" model, can't be generalized into an inheritance 
problem when the workers follow a producer/consumer model).

Point is, this is expected behavior. You need some means of transferring 
objects between processes, and pickling is Python's standard serialization 
mechanism. The inability to serialize a 4+ GB bytes object is a problem, I 
assume (I don't know whether a bug exists for that), but pickling as the 
mechanism is the only obvious general-purpose way to do it. If you want to 
avoid pickling the data, it's up to you to ensure the root process creates the 
necessary bytes objects *before* creating the Pool, and to give the workers 
some way of finding them in their own (inherited) memory (say, a dict of int 
keys to your bytes object data).
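
A rough sketch of that approach (hypothetical names `BLOBS` and `digest`, and 
again assuming the fork start method so the workers actually inherit the 
parent's memory):

    import hashlib
    import multiprocessing

    # Registry filled in the parent *before* the Pool is created, so the forked
    # workers inherit the large bytes objects; only small int keys ever cross
    # the process boundary via pickle.
    BLOBS = {}

    def digest(key):
        # Worker finds the data in its own (inherited) memory; nothing huge is pickled.
        return key, hashlib.sha1(BLOBS[key]).hexdigest()

    if __name__ == "__main__":
        BLOBS[0] = b"\x01" * (16 * 1024 * 1024)  # stand-ins for the multi-GB payloads
        BLOBS[1] = b"\x02" * (16 * 1024 * 1024)
        with multiprocessing.Pool() as pool:     # workers fork after BLOBS is populated
            print(dict(pool.map(digest, BLOBS)))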

----------
nosy: +josh.r

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23979>
_______________________________________