I'm sorry, but I truly fail to see the complication:

    sem = Semaphore(10)                                    # line 1: somewhere near executor creation
    sem.acquire()                                          # line 2: right before submit
    future = executor.submit(...)
    future.add_done_callback(lambda f: sem.release())      # line 3: right after submit
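For concreteness, here is a minimal, runnable sketch of those three lines in context; the 'process' function, the worker count, and the bound of 10 are illustrative placeholders rather than part of the original suggestion:

    from concurrent.futures import ThreadPoolExecutor
    from threading import Semaphore

    def process(item):
        # Stand-in for the real work submitted to the executor.
        return item * item

    def main():
        futures = []
        with ThreadPoolExecutor(max_workers=4) as executor:
            sem = Semaphore(10)                    # line 1: at most 10 jobs queued or running
            for item in range(1000):
                sem.acquire()                      # line 2: blocks once 10 jobs are in flight
                future = executor.submit(process, item)
                future.add_done_callback(lambda f: sem.release())  # line 3: free a slot on completion
                futures.append(future)
        return [f.result() for f in futures]

    if __name__ == "__main__":
        main()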
The pattern itself is only 3 lines of code: barely noticeable, quite clear, and with minimal overhead. You can start turning this into a decorator or a general wrapper, but you'll only achieve a higher degree of complication. I've programmed with asyncio in that way for years, and these are almost the exact same lines I use, as asyncio futures are the same, supporting add_done_callback and so on. You may even inject the Semaphore into the executor if you wish to save yourself a local variable, but I doubt it's needed.

I just think the general idea might be a premature optimisation. If anything, I would either change the signature of the executor constructor to take a new 'max_queue' keyword-only argument, or allow passing the queue in as an argument, but I still believe that any attempt to create such an interface will just cause more complications than the simple 3-line solution.

-- Bar Harel

On Thu, Sep 5, 2019, 12:42 AM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:

> On Sep 4, 2019, at 08:54, Dan Sommers <2qdxy4rzwzuui...@potatochowder.com> wrote:
> >
> > How does blocking the submit call differ from setting max_workers
> > in the call to ThreadPoolExecutor?
>
> Here's a concrete example from my own code:
>
> I need to create thousands of images, each of which is about 1MB
> uncompressed, but compressed down to a 40KB PNG that I save to disk.
>
> Compressing and saving takes 20-80x as long as creating, so I want to do
> that in parallel, so my program runs 16x as fast.
>
> But since 16 < 20, the main thread will still get ahead of the workers,
> and eventually I'll have a queue with thousands of 1MB pixmaps in it, at
> which point my system goes into swap hell and slows to a crawl.
>
> If I bound the queue at length 16, the main thread automatically blocks
> whenever it gets too far ahead, and now I have a fixed memory use of about
> 33MB instead of unbounded GB, and my program really does run almost 16x as
> fast as the original serial version.
>
> And the proposal in this thread would allow me to do that with just a
> couple lines of code: construct an executor with a max queue length at the
> top, and replace the call to the compress-and-write function with a submit
> of that call, and I'm done.
>
> Could I instead move the pixmap creation into the worker tasks and
> rearrange the calculations and add locking so they could all share the
> accumulator state correctly? Sure, but it would be a lot more complicated,
> and probably a bit slower (since parallelizing code that isn't in a
> bottleneck, and then adding locks to it, is a pessimization).
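Applying the same bounding idea to the image pipeline described in the quoted message gives a hedged sketch like the one below; create_pixmap, compress_and_save, and the file names are hypothetical stand-ins, and only the bounding logic is the point:

    from concurrent.futures import ThreadPoolExecutor
    from threading import Semaphore

    def create_pixmap(i):
        # Hypothetical stand-in: build one ~1MB uncompressed image in the main thread.
        return bytes(1024 * 1024)

    def compress_and_save(pixmap, path):
        # Hypothetical stand-in: compress to PNG and write to disk in a worker thread.
        pass

    def run(n_images=1000):
        sem = Semaphore(16)                        # at most 16 pixmaps queued or being compressed
        with ThreadPoolExecutor(max_workers=16) as executor:
            for i in range(n_images):
                pixmap = create_pixmap(i)          # main thread keeps creating serially
                sem.acquire()                      # blocks once the workers fall 16 jobs behind
                fut = executor.submit(compress_and_save, pixmap, "img%05d.png" % i)
                fut.add_done_callback(lambda f: sem.release())

    if __name__ == "__main__":
        run()

The semaphore keeps memory use roughly fixed at the 16 in-flight pixmaps plus the one currently being created, instead of letting thousands pile up in the executor's queue.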