[Python-ideas] Re: SerialExecutor for concurrent.futures + Convenience constructor

Kyle Stanley Sat, 15 Feb 2020 21:02:48 -0800

> I've never felt the need for either of these myself, nor have I observed
it in others I worked with. In general I feel the difference between
processes and threads is so large that I can't believe a realistic
application would work with either.


Also, ThreadPoolExecutor and ProcessPoolExecutor both have their specific
purposes in concurrent.futures: TPE for IO-bound parallelism, and PPE for
CPU-bound parallelism. What niche would the proposed SerialExecutor fall
under? Fake/dummy parallelism? If so, I personally don't see that as being
worth the cost of adding it and then maintaining it in the standard
library. But, that's not to say that it wouldn't have a place on PyPI.

> (Then again I've never had much use for ProcessExecutor period.)

I've also made use of TPE far more times than PPE, but I've definitely seen
several interesting and useful real-world applications of PPE, particularly
with image processing. I can also imagine it being quite useful for
scientific computing, although I've not personally used it for that purpose.

> IOW I'm rather lukewarm about this -- even if you (Jonathan) have found
use for it, I'm not sure how many other people would use it, so I doubt
it's worth adding it to the stdlib. (The only thing the stdlib might grow
could be a public API that makes implementing this feasible without
overriding private methods.)

Expanding a bit upon the public API for the cf.Future class would likely
allow something like this to be possible without accessing any private
members. In particular, I believe there would have to be a public means of
accessing the state of the future without having to go through the
condition (currently, this can only be done with ``future._state``), and
accessing a constant for each of the possible states: PENDING, RUNNING,
CANCELLED, CANCELLED_AND_NOTIFIED, and FINISHED.

Since that would actually be quite useful for debugging purposes (I had to
access ``future._state`` several times while testing the new
*cancel_futures* parameter of ``Executor.shutdown()``), I'd be willing to
work on implementing something like this.
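
As a rough illustration of the gap (a sketch, not a proposed API): using only
the public methods ``cancelled()``, ``running()``, and ``done()``, the state
can be inferred approximately, but CANCELLED cannot be told apart from
CANCELLED_AND_NOTIFIED. The helper name ``public_state`` is hypothetical, and
``set_running_or_notify_cancel()`` is called directly here only to drive the
future through its states for demonstration.

```python
from concurrent.futures import Future


def public_state(future):
    """Best-effort state name inferred from public Future methods only.

    Hypothetical helper for illustration; not part of concurrent.futures.
    """
    if future.cancelled():
        # Covers both CANCELLED and CANCELLED_AND_NOTIFIED; the public API
        # does not distinguish between them.
        return "CANCELLED"
    if future.running():
        return "RUNNING"
    if future.done():
        return "FINISHED"
    return "PENDING"


f = Future()
# The private attribute is printed only for comparison with the inferred state.
print(public_state(f), f._state)

f.set_running_or_notify_cancel()  # normally called by executors and tests
print(public_state(f), f._state)

f.set_result(42)
print(public_state(f), f._state)
```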


On Sat, Feb 15, 2020 at 10:16 PM Guido van Rossum <gu...@python.org> wrote:

> Having tried my hand at a simpler version for about 15 minutes, I see the
> reason for the fiddly subclass of Future -- it seems over-engineered
> because concurrent.futures is complicated.
>
> I've never felt the need for either of these myself, nor have I observed
> it in others I worked with. In general I feel the difference between
> processes and threads is so large that I can't believe a realistic
> application would work with either. (Then again I've never had much use for
> ProcessExecutor period.)
>
> The "Serial" variants somehow remind me of the "dummy_thread.py" module we
> had in Python 2. It was removed in Python 3, mostly because we ran out of
> cases where real threads weren't an option.
>
> IOW I'm rather lukewarm about this -- even if you (Jonathan) have found
> use for it, I'm not sure how many other people would use it, so I doubt
> it's worth adding it to the stdlib. (The only thing the stdlib might grow
> could be a public API that makes implementing this feasible without
> overriding private methods.)
>
> On Sat, Feb 15, 2020 at 3:16 PM Jonathan Crall <erote...@gmail.com> wrote:
>
>> This implementation is a proof-of-concept that I've been using for a while
>> <https://gitlab.kitware.com/computer-vision/ndsampler/blob/master/ndsampler/util_futures.py>.
>> It's certain that any version that made it into the stdlib would have to be
>> more carefully designed than the implementation I threw together. However,
>> my implementation demonstrates the concept and there are reasons for the
>> choices I made.
>>
>> First, the choice to create a SerialFuture object that inherits from the
>> base Future was because I only wanted a process to run if the
>> SerialFuture.result method was called. The most obvious way to do that was
>> to override the `result` method to execute the function when called.
>> Perhaps there is a better way, but in an effort to KISS I just went with
>> the <100 line version that seemed to work well enough.
>>
>> The `set_result` method is overridden because in Python 3.8, the base
>> Future.set_result function raises InvalidStateError if the _state is
>> already FINISHED when it is called. In my proof-of-concept implementation
>> I had to set SerialFuture._state to FINISHED in order for `as_completed`
>> to yield it.
>> Again, there may be a better way to do this, but I don't claim to know what
>> that is yet.
>>
>> I was thinking that a factory function might be a good idea, but if I was
>> designing the system I would have put that in the abstract Executor class.
>> Maybe something like
>>
>>
>> ```
>> @classmethod
>> def create(cls, mode, max_workers=0):
>>     """Create an instance of a serial, thread, or process-based executor."""
>>     from concurrent import futures
>>     if mode == 'serial' or max_workers == 0:
>>         return futures.SerialExecutor()
>>     elif mode == 'thread':
>>         return futures.ThreadPoolExecutor(max_workers=max_workers)
>>     elif mode == 'process':
>>         return futures.ProcessPoolExecutor(max_workers=max_workers)
>>     else:
>>         raise KeyError(mode)
>> ```
>>
>> I do think that it would improve the standard lib to have something like
>> this --- again perhaps not this exact version (it does seem a bit weird to
>> give this method to an abstract class), but some common API that makes it
>> easy for the user to swap between backend Executor implementations. Even
>> though the implementation is "trivial", lots of things in the standard lib
>> are, but they reduce the boilerplate that developers would otherwise need,
>> provide examples of good practices to new developers, and provide a de facto
>> way to do something that might otherwise be implemented differently by
>> different people, so it adds value to the stdlib.
>>
>> That being said, while I will advocate for the inclusion of such a
>> factory method or wrapper class, it would only be a minor annoyance to not
>> have it. On the other hand I think a SerialExecutor is something that is
>> sorely missing from the standard library.
>>
>> On Sat, Feb 15, 2020 at 5:16 PM Andrew Barnert <abarn...@yahoo.com>
>> wrote:
>>
>>> > On Feb 15, 2020, at 13:36, Jonathan Crall <erote...@gmail.com> wrote:
>>> >
>>> > Also, there is no duck-typed class that behaves like an executor, but
>>> does its processing in serial. Oftentimes a developer will want to run a
>>> task in parallel, but depending on the environment they may want to disable
>>> threading or process execution. To address this I use a utility called a
>>> `SerialExecutor` which shares an API with
>>> ThreadPoolExecutor/ProcessPoolExecutor but executes processes sequentially
>>> in the same python thread:
>>>
>>> This makes sense. I think most futures-and-executors frameworks in other
>>> languages have a serial/synchronous/immediate/blocking executor just like
>>> this. (And the ones that don’t, it’s usually because they have a different
>>> way to specify the same functionality—e.g., in C++, you only use executors
>>> via the std::async function, and you can just pass a launch option instead
>>> of an executor to run synchronously.)
>>>
>>> And I’ve wanted this, and even built it myself at least once—it’s a
>>> great way to get all of the logging in order to make things easier to
>>> debug, for example.
>>>
>>> However, I think you may have overengineered this.
>>>
>>> Why can’t you use the existing Future type as-is? Yes, there’s a bit of
>>> unnecessary overhead, but your reimplementation seems to add almost the
>>> same unnecessary overhead. And does it make enough difference in practice
>>> to be worth worrying about anyway? (It doesn’t for my uses, but maybe
>>> yours are different.)
>>>
>>> Also, why are you overriding set_result to restore pre-3.8 behavior? The
>>> relevant change here seems to be the one where 3.8 prevents executors from
>>> finishing already-finished (or canceled) futures; why does your executor
>>> need that?
>>>
>>> Finally, why do you need a wrapper class that constructs one of the
>>> three types at initialization and then just delegates all methods to it?
>>> Why not just use a factory function that constructs and returns an instance
>>> of one of the three types directly? And, given how trivial that factory
>>> function is, does it even need to be in the stdlib?
>>>
>>> I may well be missing something that makes some of these choices
>>> necessary or desirable. But otherwise, I think we’d be better off adding a
>>> SerialExecutor (that works with the existing Future type as-is) but not
>>> adding or changing anything else.
>>>
>>>
>>>
>>
>> --
>> -Jon
>> _______________________________________________
>> Python-ideas mailing list -- python-ideas@python.org
>> To unsubscribe send an email to python-ideas-le...@python.org
>> https://mail.python.org/mailman3/lists/python-ideas.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-ideas@python.org/message/AG3AXJFU4R2CU6JPWCQ2BYHUPH75MKUM/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>
> --
> --Guido van Rossum (python.org/~guido)
> *Pronouns: he/him **(why is my pronoun here?)*
> <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
>
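
For reference, here is a minimal sketch of what Andrew describes above: a
SerialExecutor that uses the existing Future type as-is, plus a plain factory
function. This is not Jonathan's implementation (his runs the task lazily when
``result()`` is called, whereas this one runs it eagerly inside ``submit()``),
and the names ``SerialExecutor`` and ``make_executor`` are assumptions for
illustration only; neither exists in concurrent.futures.

```python
from concurrent.futures import (
    Executor,
    Future,
    ProcessPoolExecutor,
    ThreadPoolExecutor,
    as_completed,
)


class SerialExecutor(Executor):
    """Hypothetical executor that runs each task in the calling thread."""

    def submit(self, fn, /, *args, **kwargs):
        future = Future()
        # Move the fresh future from PENDING to RUNNING; it cannot have been
        # cancelled yet because nobody else holds a reference to it.
        future.set_running_or_notify_cancel()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            # Store the error so that result() re-raises it, as with TPE/PPE.
            future.set_exception(exc)
        return future


def make_executor(mode, max_workers=None):
    """Hypothetical factory returning one of the three executor types."""
    if mode == "serial":
        return SerialExecutor()
    if mode == "thread":
        return ThreadPoolExecutor(max_workers=max_workers)
    if mode == "process":
        return ProcessPoolExecutor(max_workers=max_workers)
    raise ValueError(f"unknown executor mode: {mode!r}")


# The inherited context-manager protocol, map(), and as_completed() all work
# because the futures are ordinary, already-finished Future objects.
with make_executor("serial") as executor:
    futures = [executor.submit(pow, 2, n) for n in range(4)]
    results = sorted(f.result() for f in as_completed(futures))
print(results)
```

Since every future is already finished by the time ``submit()`` returns, no
private state needs to be touched and ``set_result`` does not need overriding.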

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/6GQZJWKHLAJ5P7ZJCASP6IFV6H55OOK5/
Code of Conduct: http://python.org/psf/codeofconduct/
