On Sat, Feb 15, 2020 at 21:00 Kyle Stanley <aeros...@gmail.com> wrote:
> > I've never felt the need for either of these myself, nor have I observed
> > it in others I worked with. In general I feel the difference between
> > processes and threads is so large that I can't believe a realistic
> > application would work with either.
>
> Also, ThreadPoolExecutor and ProcessPoolExecutor both have their specific
> purposes in concurrent.futures: TPE for IO-bound parallelism, and PPE for
> CPU-bound parallelism. What niche would the proposed SerialExecutor fall
> under? Fake/dummy parallelism? If so, I personally don't see that as being
> worth the cost of adding it and then maintaining it in the standard
> library. But that's not to say that it wouldn't have a place on PyPI.
>
> > (Then again I've never had much use for ProcessPoolExecutor period.)
>
> I've also made use of TPE far more times than PPE, but I've definitely
> seen several interesting and useful real-world applications of PPE,
> particularly with image processing. I can also imagine it being quite
> useful for scientific computing, although I've not personally used it for
> that purpose.
>
> > IOW I'm rather lukewarm about this -- even if you (Jonathan) have found
> > use for it, I'm not sure how many other people would use it, so I doubt
> > it's worth adding it to the stdlib. (The only thing the stdlib might grow
> > could be a public API that makes implementing this feasible without
> > overriding private methods.)
>
> Expanding a bit upon the public API for the cf.Future class would likely
> allow something like this to be possible without accessing any private
> members. In particular, I believe there would have to be a public means of
> accessing the state of the future without having to go through the
> condition (currently, this can only be done with ``future._state``), and
> accessing a constant for each of the possible states: PENDING, RUNNING,
> CANCELLED, CANCELLED_AND_NOTIFIED, and FINISHED.
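
For reference, here is a rough sketch of how far the existing public
methods on Future get you without touching ``future._state``. The helper
name ``describe_state`` is made up for illustration and is not part of
concurrent.futures. Note that the public methods cannot tell CANCELLED
apart from CANCELLED_AND_NOTIFIED, which is part of the gap described
above.

```python
from concurrent.futures import Future


def describe_state(fut: Future) -> str:
    """Best-effort state name using only public Future methods.

    Hypothetical helper for illustration; not a concurrent.futures API.
    """
    # cancelled() must be checked before done(), because done() is also
    # True for cancelled futures.
    if fut.cancelled():
        return "CANCELLED"  # may really be CANCELLED_AND_NOTIFIED
    if fut.done():
        return "FINISHED"
    if fut.running():
        return "RUNNING"
    return "PENDING"


fut = Future()
print(describe_state(fut))            # PENDING
fut.set_running_or_notify_cancel()    # what an executor does before running
print(describe_state(fut))            # RUNNING
fut.set_result(42)
print(describe_state(fut))            # FINISHED
```

This only reads the state; it is no substitute for public state constants
or a public accessor.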
>
> Since that would actually be quite useful for debugging purposes (I had to
> access ``future._state`` several times while testing the new
> *cancel_futures*), I'd be willing to work on implementing something like
> this.
>

Excellent!

> On Sat, Feb 15, 2020 at 10:16 PM Guido van Rossum <gu...@python.org>
> wrote:
>
>> Having tried my hand at a simpler version for about 15 minutes, I see the
>> reason for the fiddly subclass of Future -- it seems over-engineered
>> because concurrent.futures is complicated.
>>
>> I've never felt the need for either of these myself, nor have I observed
>> it in others I worked with. In general I feel the difference between
>> processes and threads is so large that I can't believe a realistic
>> application would work with either. (Then again I've never had much use for
>> ProcessPoolExecutor period.)
>>
>> The "Serial" variants somehow remind me of the "dummy_thread.py" module
>> we had in Python 2. It was removed in Python 3, mostly because we ran out
>> of cases where real threads weren't an option.
>>
>> IOW I'm rather lukewarm about this -- even if you (Jonathan) have found
>> use for it, I'm not sure how many other people would use it, so I doubt
>> it's worth adding it to the stdlib. (The only thing the stdlib might grow
>> could be a public API that makes implementing this feasible without
>> overriding private methods.)
>>
>> On Sat, Feb 15, 2020 at 3:16 PM Jonathan Crall <erote...@gmail.com>
>> wrote:
>>
>>> This implementation is a proof-of-concept that I've been using for
>>> a while
>>> <https://gitlab.kitware.com/computer-vision/ndsampler/blob/master/ndsampler/util_futures.py>.
>>> It's certain that any version that made it into the stdlib would have to be
>>> more carefully designed than the implementation I threw together. However,
>>> my implementation demonstrates the concept and there are reasons for the
>>> choices I made.
>>>
>>> First, the choice to create a SerialFuture object that inherits from the
>>> base Future was because I only wanted the task to run if the
>>> SerialFuture.result method was called. The most obvious way to do that was
>>> to override the `result` method to execute the function when called.
>>> Perhaps there is a better way, but in an effort to KISS I just went with
>>> the <100 line version that seemed to work well enough.
>>>
>>> The `set_result` method is overridden because in Python 3.8, the base
>>> Future.set_result function checks that the _state is not already FINISHED
>>> when it is called. In my proof-of-concept implementation I had to set
>>> SerialFuture._state to FINISHED in order for `as_completed` to yield it.
>>> Again, there may be a better way to do this, but I don't claim to know what
>>> that is yet.
>>>
>>> I was thinking that a factory function might be a good idea, but if I
>>> was designing the system I would have put that in the abstract Executor
>>> class. Maybe something like
>>>
>>> ```
>>> @classmethod
>>> def create(cls, mode, max_workers=0):
>>>     """Create an instance of a serial, thread, or process-based executor."""
>>>     from concurrent import futures
>>>     if mode == 'serial' or max_workers == 0:
>>>         return futures.SerialExecutor()
>>>     elif mode == 'thread':
>>>         return futures.ThreadPoolExecutor(max_workers=max_workers)
>>>     elif mode == 'process':
>>>         return futures.ProcessPoolExecutor(max_workers=max_workers)
>>>     else:
>>>         raise KeyError(mode)
>>> ```
>>>
>>> I do think that it would improve the standard lib to have something like
>>> this --- again perhaps not this exact version (it does seem a bit weird to
>>> give this method to an abstract class), but some common API that makes it
>>> easy for the user to swap between backend Executor implementations.
>>> Even though the implementation is "trivial", lots of things in the
>>> standard lib are, but they reduce the boilerplate that developers would
>>> otherwise need, provide examples of good practices to new developers, and
>>> provide a de facto way to do something that might otherwise be implemented
>>> differently by different people, so it adds value to the stdlib.
>>>
>>> That being said, while I will advocate for the inclusion of such a
>>> factory method or wrapper class, it would only be a minor annoyance to not
>>> have it. On the other hand I think a SerialExecutor is something that is
>>> sorely missing from the standard library.
>>>
>>> On Sat, Feb 15, 2020 at 5:16 PM Andrew Barnert <abarn...@yahoo.com>
>>> wrote:
>>>
>>>> > On Feb 15, 2020, at 13:36, Jonathan Crall <erote...@gmail.com> wrote:
>>>> >
>>>> > Also, there is no duck-typed class that behaves like an executor, but
>>>> > does its processing in serial. Oftentimes a developer will want to run a
>>>> > task in parallel, but depending on the environment they may want to
>>>> > disable threading or process execution. To address this I use a utility
>>>> > called a `SerialExecutor` which shares an API with
>>>> > ThreadPoolExecutor/ProcessPoolExecutor but executes tasks sequentially
>>>> > in the same Python thread:
>>>>
>>>> This makes sense. I think most futures-and-executors frameworks in
>>>> other languages have a serial/synchronous/immediate/blocking executor just
>>>> like this. (And the ones that don’t, it’s usually because they have a
>>>> different way to specify the same functionality—e.g., in C++, you only use
>>>> executors via the std::async function, and you can just pass a launch
>>>> option instead of an executor to run synchronously.)
>>>>
>>>> And I’ve wanted this, and even built it myself at least once—it’s a
>>>> great way to get all of the logging in order to make things easier to
>>>> debug, for example.
>>>>
>>>> However, I think you may have overengineered this.
>>>>
>>>> Why can’t you use the existing Future type as-is? Yes, there’s a bit of
>>>> unnecessary overhead, but your reimplementation seems to add almost the
>>>> same unnecessary overhead. And does it make enough difference in practice
>>>> to be worth worrying about anyway? (It doesn’t for my uses, but maybe
>>>> yours are different.)
>>>>
>>>> Also, why are you overriding set_result to restore pre-3.8 behavior?
>>>> The relevant change here seems to be the one where 3.8 prevents executors
>>>> from finishing already-finished (or canceled) futures; why does your
>>>> executor need that?
>>>>
>>>> Finally, why do you need a wrapper class that constructs one of the
>>>> three types at initialization and then just delegates all methods to it?
>>>> Why not just use a factory function that constructs and returns an instance
>>>> of one of the three types directly? And, given how trivial that factory
>>>> function is, does it even need to be in the stdlib?
>>>>
>>>> I may well be missing something that makes some of these choices
>>>> necessary or desirable. But otherwise, I think we’d be better off adding a
>>>> SerialExecutor (that works with the existing Future type as-is) but not
>>>> adding or changing anything else.
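
To make the "existing Future type as-is" suggestion concrete, here is a
minimal sketch of what such an executor could look like. It is an
assumption-laden illustration, not anyone's actual implementation from this
thread: the class below is hypothetical, and it runs each task eagerly
inside submit(), which differs from the proof-of-concept that defers the
call until result() is invoked.

```python
from concurrent.futures import Executor, Future, as_completed


class SerialExecutor(Executor):
    """Hypothetical executor that runs every task in the calling thread."""

    def submit(self, fn, *args, **kwargs):
        fut = Future()
        # Move PENDING -> RUNNING through the documented hook meant for
        # executor implementations, instead of assigning fut._state.
        if not fut.set_running_or_notify_cancel():
            return fut
        try:
            result = fn(*args, **kwargs)
        except Exception as exc:
            # Store the error on the future, like the pool executors do.
            fut.set_exception(exc)
        else:
            fut.set_result(result)
        return fut


# map(), shutdown() and the context-manager protocol are inherited from
# the Executor base class, which builds map() on top of submit().
with SerialExecutor() as ex:
    futs = [ex.submit(pow, 2, n) for n in range(4)]
    # as_completed() does not promise an order, so sort for display.
    print(sorted(f.result() for f in as_completed(futs)))  # [1, 2, 4, 8]
    print(list(ex.map(abs, [-1, -2, 3])))                  # [1, 2, 3]
```

Because the future is already finished when submit() returns, it works
with `as_completed` without overriding `set_result` or touching private
attributes. The trade-off is that nothing is lazy: the work happens at
submit time.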
>>>>
>>>
>>> --
>>> -Jon
>>> _______________________________________________
>>> Python-ideas mailing list -- python-ideas@python.org
>>> To unsubscribe send an email to python-ideas-le...@python.org
>>> https://mail.python.org/mailman3/lists/python-ideas.python.org/
>>> Message archived at
>>> https://mail.python.org/archives/list/python-ideas@python.org/message/AG3AXJFU4R2CU6JPWCQ2BYHUPH75MKUM/
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>> *Pronouns: he/him **(why is my pronoun here?)*
>> <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
>> Message archived at
>> https://mail.python.org/archives/list/python-ideas@python.org/message/ICJKHZ4BPIUMOPIT2TDTBIW2EH4CPNCP/
>>
>

--
--Guido (mobile)
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/VHJ4YALO2N366XG7SLJOZBVW5Q3W74L7/