New submission from prayerslayer:

Hi!

I think ThreadPoolExecutor should allow users to set the maximum size of the 
underlying work queue.

The situation I ran into recently was that I used ThreadPoolExecutor to 
parallelize AWS API calls; I had to move data from one S3 bucket to another 
(~150M objects). Contrary to what I expected, the underlying queue is 
unbounded by default. Thus my process ended up consuming gigabytes of memory, 
because it put items into the queue faster than the threads could work them 
off: the queue just kept growing. (It ran on K8s and the pod was rightfully 
killed eventually.)

Of course there are ways to work around this. One could use more threads, to 
some extent. Or you could use your own queue with a defined maximum size. But 
I think that's more work for users of Python than necessary.
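As a sketch of the second workaround: instead of swapping the executor's 
queue, one can gate submissions with a semaphore so that submit() blocks once 
a chosen number of items are pending. (BoundedExecutor and its parameters are 
my own illustrative names, not stdlib API.)

```python
import concurrent.futures
import threading


class BoundedExecutor:
    """Wraps ThreadPoolExecutor so at most `bound` extra work items
    can be queued; submit() blocks until a slot frees up."""

    def __init__(self, max_workers, bound):
        self._executor = concurrent.futures.ThreadPoolExecutor(max_workers)
        # Allow `bound` queued items on top of the ones being executed.
        self._semaphore = threading.BoundedSemaphore(bound + max_workers)

    def submit(self, fn, *args, **kwargs):
        self._semaphore.acquire()  # blocks when the "queue" is full
        try:
            future = self._executor.submit(fn, *args, **kwargs)
        except Exception:
            self._semaphore.release()
            raise
        future.add_done_callback(lambda _: self._semaphore.release())
        return future

    def shutdown(self, wait=True):
        self._executor.shutdown(wait)
```

This keeps memory bounded, but it is exactly the kind of boilerplate a 
max_queue_size parameter on ThreadPoolExecutor would make unnecessary.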

----------
messages: 288043
nosy: prayerslayer
priority: normal
pull_requests: 104
severity: normal
status: open
title: Expose max_queue_size in ThreadPoolExecutor
type: enhancement

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29595>
_______________________________________