New submission from jakirkham <jakirk...@gmail.com>:

In Python 3.8, pickle protocol 5 (PEP 574) was added, which supports out-of-band buffer collection [1]. The idea is that when pickling an object with a large amount of data attached to it (like an array, dataframe, etc.), one can collect that data alongside the normal pickled data without causing a copy. This is particularly important when serializing data for communication between two Python instances. IOW, this is quite valuable when using a `multiprocessing.pool.Pool` [2] or a `concurrent.futures.ProcessPoolExecutor` [3]. However, AFAICT neither of these leverages this functionality [4][5]. To ensure zero-copy processing of large data, it would be helpful for pickle protocol 5 to be used in both of these pools.

[1] https://docs.python.org/3/library/pickle.html#pickle-oob
[2] https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool
[3] https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor
[4] https://github.com/python/cpython/blob/16b5bc68964c6126845f4cdd54b24996e71ae0ba/Lib/multiprocessing/queues.py#L372
[5] https://github.com/python/cpython/blob/16b5bc68964c6126845f4cdd54b24996e71ae0ba/Lib/multiprocessing/queues.py#L245

----------
components: IO, Library (Lib)
messages: 402736
nosy: jakirkham
priority: normal
severity: normal
status: open
title: Supporting out-of-band buffers (pickle protocol 5) in multiprocessing
type: performance
versions: Python 3.10, Python 3.11, Python 3.8, Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue45304>
_______________________________________
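
For illustration, the out-of-band collection described in the submission can be sketched with the standard `pickle` API alone. This is a minimal sketch, not how `multiprocessing` or `concurrent.futures` behave or would need to be changed: the `payload` bytearray is a hypothetical stand-in for the data of a large array, and both dumping and loading happen in a single process.

```python
import pickle

# Hypothetical stand-in for the data behind a large array or dataframe.
payload = bytearray(b"x" * 1_000_000)

# Without a buffer_callback, the data is serialized in-band,
# i.e. copied into the pickle byte stream.
in_band = pickle.dumps(pickle.PickleBuffer(payload), protocol=5)

# With a buffer_callback, the pickler hands each PickleBuffer to the
# callback instead. list.append returns None (a false value), which
# tells the pickler to keep the buffer out-of-band.
buffers = []
out_of_band = pickle.dumps(
    pickle.PickleBuffer(payload), protocol=5, buffer_callback=buffers.append
)

# The consumer must supply the same buffers, in order, when loading.
restored = pickle.loads(out_of_band, buffers=buffers)

print(len(in_band), len(out_of_band), len(buffers))
```

In this sketch the out-of-band stream carries only a small placeholder, and `restored` still refers to the memory of `payload` rather than a copy. Transporting the collected buffers between processes without copying is a separate problem that this example does not address.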