On 9/4/19 5:38 PM, Andrew Barnert via Python-ideas wrote:

> On Sep 4, 2019, at 08:54, Dan Sommers <2qdxy4rzwzuui...@potatochowder.com> wrote:

>> How does blocking the submit call differ from setting max_workers in
>> the call to ThreadPoolExecutor?

> Here’s a concrete example from my own code:

Aha.  Thanks.
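
(An aside for the archives: max_workers only caps how many tasks run
at once; in CPython, Executor.submit() itself never blocks, because
the pool feeds an unbounded internal work queue.  A minimal toy
demonstration, with made-up numbers:

    from concurrent.futures import ThreadPoolExecutor
    import time

    def slow(x):
        time.sleep(0.01)
        return x

    with ThreadPoolExecutor(max_workers=2) as pool:
        # All 1,000 submit() calls return immediately; only two tasks
        # run at a time, while the rest -- and their arguments -- sit
        # in memory on the executor's unbounded work queue.
        futures = [pool.submit(slow, i) for i in range(1_000)]

So max_workers bounds concurrency, not memory.)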

> I need to create thousands of images, each of which is about 1MB
> uncompressed, but compressed down to a 40KB PNG that I save to disk.

> Compressing and saving takes 20-80x as long as creating, so I want to
> do that in parallel, so my program runs 16x as fast.

Without knowing anything else, I would wonder why you've combined
compressing (an apparently CPU-bound operation) and saving (an
apparently I/O-bound operation) together, but left creating separate.

Please don't answer that; it's not related to Python.

> But since 16 < 20, the main thread will still get ahead of the
> workers, and eventually I’ll have a queue with thousands of 1MB
> pixmaps in it, at which point my system goes into swap hell and slows
> to a crawl.

Yes, you need some way to produce "back pressure" from downstream to
upstream, and to stop making new work (with new memory consumption)
until there's a place to put it.
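
A minimal sketch of that back pressure using just the stdlib -- the
worker count and ~1MB size come from Andrew's numbers, but
create_pixmap and compress_and_write are stand-ins of my own, not his
code:

    import queue
    import threading

    WORKERS = 16

    def create_pixmap(i):            # stand-in for the cheap, serial step
        return bytes(1_000_000)      # ~1MB, like the uncompressed images

    def compress_and_write(pixmap):  # stand-in for the slow, parallel step
        pass

    work = queue.Queue(maxsize=WORKERS)  # the bound *is* the back pressure

    def worker():
        while True:
            pixmap = work.get()
            if pixmap is None:           # sentinel: shut down
                return
            compress_and_write(pixmap)

    threads = [threading.Thread(target=worker) for _ in range(WORKERS)]
    for t in threads:
        t.start()

    for i in range(10_000):
        work.put(create_pixmap(i))   # blocks once 16 pixmaps are pending

    for _ in threads:
        work.put(None)
    for t in threads:
        t.join()

The producer can never get more than one queue's worth ahead of the
consumers, so memory stays flat no matter how many images there are.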

> If I bound the queue at length 16, the main thread automatically
> blocks whenever it gets too far ahead, and now I have a fixed memory
> use of about 33MB instead of unbounded GB, and my program really does
> run almost 16x as fast as the original serial version.

> And the proposal in this thread would allow me to do that with just a
> couple lines of code: construct an executor with a max queue length at
> the top, and replace the call to the compress-and-write function with
> a submit of that call, and I’m done.
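
Nothing in today's concurrent.futures takes a maximum queue length,
but for anyone who wants this now, a semaphore around submit() is a
common way to approximate the proposal.  BoundedExecutor is my name
for the sketch, not an existing class:

    import threading
    from concurrent.futures import ThreadPoolExecutor

    class BoundedExecutor:
        """Blocks submit() once `bound` tasks are in flight."""

        def __init__(self, bound, max_workers):
            self._sem = threading.Semaphore(bound)
            self._pool = ThreadPoolExecutor(max_workers=max_workers)

        def submit(self, fn, *args, **kwargs):
            self._sem.acquire()              # block while "queue" is full
            try:
                future = self._pool.submit(fn, *args, **kwargs)
            except BaseException:
                self._sem.release()
                raise
            future.add_done_callback(lambda f: self._sem.release())
            return future

        def shutdown(self, wait=True):
            self._pool.shutdown(wait=wait)

    # 16 running plus 16 queued, mirroring the example above:
    pool = BoundedExecutor(bound=32, max_workers=16)

With that, the main loop is just pool.submit(compress_and_write,
pixmap), and it blocks by itself whenever the workers fall too far
behind.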

> Could I instead move the pixmap creation into the worker tasks and
> rearrange the calculations and add locking so they could all share the
> accumulator state correctly? Sure, but it would be a lot more
> complicated, and probably a bit slower (since parallelizing code that
> isn’t in a bottleneck, and then adding locks to it, is a
> pessimization).

"[T]he accumulator?"  Is there only one data store to manage multiple
instances of three different/separate operations?  Ouch.  At least
that's my reaction to the limited amount of insight I have right now.

Now we're discussing how to design a concurrent application to maximize
resource use and minimize complexity and overall run time.  Yes, if you
build your application one way, and run into issues, then some ways of
addressing the issues (e.g., going back to the design phase) will cost
more to implement than others (e.g., tweaking a supporting library).
It's happened to all of us.  :-)

I'm not against tweaking the standard library (and even if I were, my
vote shouldn't count for much).  For *this case*, it seemed to me that
changing the standard library was Less Better™ than considering the
concurrency issues earlier on in the development process.

There's also an old software engineer inside me that wants most of this
control up near the top (where I can see it) as opposed to way down
inside the supporting libraries (where it becomes magic that has to be
debugged when it's not doing what I thought it would).  That's great for
new applications, but it doesn't always stay that way over time.