New submission from jakirkham <jakirk...@gmail.com>:

Python 3.8 added pickle protocol 5 (PEP 574), which supports out-of-band 
buffer collection[1]. The idea is that when pickling an object with a large 
amount of data attached to it (like an array, dataframe, etc.), one can collect 
that data alongside the normal pickle stream without copying it. This matters 
in particular when serializing data for communication between two Python 
instances. IOW this is quite valuable when using a 
`multiprocessing.pool.Pool`[2] or a 
`concurrent.futures.ProcessPoolExecutor`[3]. However, AFAICT neither of these 
leverages this functionality[4][5]. To ensure zero-copy processing of large 
data, it would be helpful for pickle protocol 5 to be used in both of these 
pools.
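
For illustration, here is a minimal sketch of what out-of-band collection looks 
like with the `pickle` module alone. It does not touch `multiprocessing`; the 
`payload` bytearray is just a stand-in I picked for an array or dataframe, and 
the variable names are my own.

```python
import pickle

# Stand-in for a large array or dataframe (hypothetical 1 MB payload).
payload = bytearray(b"x" * 1_000_000)

# In-band: without a buffer_callback the payload bytes are copied
# into the pickle stream itself.
in_band = pickle.dumps(pickle.PickleBuffer(payload), protocol=5)

# Out-of-band: the callback receives the PickleBuffer, and the pickle
# stream only records that a buffer is expected at this position.
buffers = []
out_of_band = pickle.dumps(
    pickle.PickleBuffer(payload), protocol=5, buffer_callback=buffers.append
)

# The receiver must be handed the same buffers, in the same order.
restored = pickle.loads(out_of_band, buffers=buffers)

print(len(in_band) > len(payload))       # stream contains the payload
print(len(out_of_band) < 100)            # stream is only a few bytes
print(memoryview(restored).obj is payload)  # same memory, no copy made
```

In a pool, the collected buffers would still have to be transferred to the 
other process separately from the pickle stream; how best to do that is the 
open question here.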


[1] https://docs.python.org/3/library/pickle.html#pickle-oob
[2] https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool
[3] https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor
[4] https://github.com/python/cpython/blob/16b5bc68964c6126845f4cdd54b24996e71ae0ba/Lib/multiprocessing/queues.py#L372
[5] https://github.com/python/cpython/blob/16b5bc68964c6126845f4cdd54b24996e71ae0ba/Lib/multiprocessing/queues.py#L245

----------
components: IO, Library (Lib)
messages: 402736
nosy: jakirkham
priority: normal
severity: normal
status: open
title: Supporting out-of-band buffers (pickle protocol 5) in multiprocessing
type: performance
versions: Python 3.10, Python 3.11, Python 3.8, Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue45304>
_______________________________________
_______________________________________________
Python-bugs-list mailing list