[Python-ideas] Re: SerialExecutor for concurrent.futures + Convenience constructor

Jonathan Crall Sat, 15 Feb 2020 15:17:48 -0800

This implementation is a proof-of-concept that I've been using for a while
<https://gitlab.kitware.com/computer-vision/ndsampler/blob/master/ndsampler/util_futures.py>.
It's certain that any version that made it into the stdlib would have to be
more carefully designed than the implementation I threw together. However,
my implementation demonstrates the concept and there are reasons for the
choices I made.

First, the choice to create a SerialFuture object that inherits from the
base Future was because I only wanted a process to run if the
SerialFuture.result method was called. The most obvious way to do that was
to override the `result` method to execute the function when called.
Perhaps there is a better way, but in an effort to KISS I just went with
the <100 line version that seemed to work well enough.
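
To make the idea concrete, here is a minimal sketch of a future that defers
its work until `result` is called. It is not the code from the link above:
the name `LazyFuture` and its private attributes are made up for
illustration, and it assumes Python 3.8 or later.

```python
from concurrent.futures import Future


class LazyFuture(Future):
    """Hypothetical future that runs its function on the first result() call."""

    def __init__(self, func, *args, **kwargs):
        super().__init__()
        self._func = func
        self._args = args
        self._kwargs = kwargs
        self._ran = False

    def _run(self):
        # Run at most once, and never run a future that was cancelled.
        if self._ran or self.cancelled():
            return
        self._ran = True
        try:
            value = self._func(*self._args, **self._kwargs)
        except Exception as exc:
            self.set_exception(exc)
        else:
            self.set_result(value)

    def result(self, timeout=None):
        self._run()
        return super().result(timeout)

    def exception(self, timeout=None):
        self._run()
        return super().exception(timeout)
```

A future like this stays pending (`done()` returns False) until someone asks
for its result, so helpers that wait for completion will not see it as
finished on their own.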

`set_result` is overridden because in Python 3.8 the base
Future.set_result method checks that _state is not FINISHED when it is
called (it raises InvalidStateError otherwise). In my proof-of-concept
implementation I had to set SerialFuture._state to FINISHED in order for
`as_completed` to yield it. Again, there may be a better way to do this,
but I don't claim to know what that is yet.
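
For reference, the 3.8 behavior can be seen with a plain Future. This is
only a small illustration of the stock class, assuming Python 3.8 or later;
the variable names are arbitrary.

```python
from concurrent.futures import Future, InvalidStateError

future = Future()
future.set_result(1)  # PENDING -> FINISHED

try:
    # A second call on an already-finished future is rejected in 3.8+.
    future.set_result(2)
except InvalidStateError:
    second_call_rejected = True
else:
    second_call_rejected = False
```

So a future that is marked FINISHED up front cannot later receive its value
through the stock `set_result`, which is why an override was needed in my
version.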

I was thinking that a factory function might be a good idea, but if I was
designing the system I would have put that in the abstract Executor class.
Maybe something like

```
@classmethod
def create(cls, mode, max_workers=0):
    """Create an instance of a serial, thread, or process-based executor."""
    from concurrent import futures
    if mode == 'serial' or max_workers == 0:
        return futures.SerialExecutor()
    elif mode == 'thread':
        return futures.ThreadPoolExecutor(max_workers=max_workers)
    elif mode == 'process':
        return futures.ProcessPoolExecutor(max_workers=max_workers)
    else:
        raise KeyError(mode)
```

I do think that it would improve the standard lib to have something like
this. Again, perhaps not this exact version (it does seem a bit weird to
give this method to an abstract class), but some common API that makes it
easy for the user to swap between backend Executor implementations. Even
though the implementation is "trivial", so are lots of things in the
standard lib, and they still add value: they reduce the boilerplate that
developers would otherwise need, provide examples of good practice to new
developers, and provide a de facto way to do something that might otherwise
be implemented differently by different people.

That being said, while I will advocate for the inclusion of such a factory
method or wrapper class, it would only be a minor annoyance to not have it.
On the other hand I think a SerialExecutor is something that is sorely
missing from the standard library.
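
As a rough sketch of how small the two pieces could be, the following uses
the existing Future type as-is and runs each task eagerly in the calling
thread. Both `SerialExecutor` and `create_executor` are hypothetical names;
neither exists in concurrent.futures today, and this is not the
implementation linked above.

```python
from concurrent.futures import (
    Executor,
    Future,
    ProcessPoolExecutor,
    ThreadPoolExecutor,
)


class SerialExecutor(Executor):
    """Hypothetical executor that runs each task immediately, in order."""

    def submit(self, fn, *args, **kwargs):
        # The call happens right here, so the returned future is already done.
        future = Future()
        try:
            value = fn(*args, **kwargs)
        except Exception as exc:
            future.set_exception(exc)
        else:
            future.set_result(value)
        return future


def create_executor(mode, max_workers=0):
    """Hypothetical module-level factory mirroring the classmethod above."""
    if mode == 'serial' or max_workers == 0:
        return SerialExecutor()
    elif mode == 'thread':
        return ThreadPoolExecutor(max_workers=max_workers)
    elif mode == 'process':
        return ProcessPoolExecutor(max_workers=max_workers)
    else:
        raise KeyError(mode)
```

Because every future is finished by the time `submit` returns, the inherited
`map`, the `with` statement, and `as_completed` should all work without
touching Future internals. The trade-off is that work is no longer deferred
until `result` is called.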

On Sat, Feb 15, 2020 at 5:16 PM Andrew Barnert <[email protected]> wrote:

> > On Feb 15, 2020, at 13:36, Jonathan Crall <[email protected]> wrote:
> >
> > Also, there is no duck-typed class that behaves like an executor, but
> does its processing in serial. Oftentimes a developer will want to run a
> task in parallel, but depending on the environment they may want to disable
> threading or process execution. To address this I use a utility called a
> `SerialExecutor` which shares an API with
> ThreadPoolExecutor/ProcessPoolExecutor but executes processes sequentially
> in the same python thread:
>
> This makes sense. I think most futures-and-executors frameworks in other
> languages have a serial/synchronous/immediate/blocking executor just like
> this. (And the ones that don’t, it’s usually because they have a different
> way to specify the same functionality—e.g., in C++, you only use executors
> via the std::async function, and you can just pass a launch option instead
> of an executor to run synchronously.)
>
> And I’ve wanted this, and even built it myself at least once—it’s a great
> way to get all of the logging in order to make things easier to debug, for
> example.
>
> However, I think you may have overengineered this.
>
> Why can’t you use the existing Future type as-is? Yes, there’s a bit of
> unnecessary overhead, but your reimplementation seems to add almost the
> same unnecessary overhead. And does it make enough difference in practice
> to be worth worrying about anyway? (It doesn’t for my uses, but maybe
> yours are different.)
>
> Also, why are you overriding set_result to restore pre-3.8 behavior? The
> relevant change here seems to be the one where 3.8 prevents executors from
> finishing already-finished (or canceled) futures; why does your executor
> need that?
>
> Finally, why do you need a wrapper class that constructs one of the three
> types at initialization and then just delegates all methods to it? Why not
> just use a factory function that constructs and returns an instance of one
> of the three types directly? And, given how trivial that factory function
> is, does it even need to be in the stdlib?
>
> I may well be missing something that makes some of these choices necessary
> or desirable. But otherwise, I think we’d be better off adding a
> SerialExecutor (that works with the existing Future type as-is) but not
> adding or changing anything else.

-- 
-Jon

