> for any IO-bound call with a variable time where async isn't an option
(either because it's not available, standardized, widespread, etc.), I'd
advise using loop.run_in_executor()/to_thread() preemptively.

Clarification: this pretty much applies to any non-async IO-bound call that
can block the event loop. You can definitely get away with ignoring some
that have a consistently negligible duration, but I would not *directly*
call any that could vary significantly in duration (or that are
consistently long-running) within a coroutine. Otherwise, it's a complete
gamble as to how long the call stalls the rest of the program, which is
generally undesirable, to say the least.
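
To make that concrete, here's a minimal sketch (not taken from any real
codebase) of the difference between calling a blocking function directly in
a coroutine and offloading it to the default thread pool. blocking_io(),
ticker(), and count_ticks() are hypothetical names, and time.sleep() is just
a stand-in for a blocking IO-bound call like socket.gethostbyname():

```python
import asyncio
import time


def blocking_io(delay):
    # Stand-in for a blocking IO-bound call (an assumption for this demo);
    # time.sleep() blocks whichever thread calls it.
    time.sleep(delay)


async def ticker(counter):
    # Bumps the counter every ~10ms, but only when the event loop is free.
    while True:
        counter["ticks"] += 1
        await asyncio.sleep(0.01)


async def count_ticks(offload, delay=0.2):
    counter = {"ticks": 0}
    task = asyncio.create_task(ticker(counter))
    await asyncio.sleep(0)  # yield once so the ticker gets its first tick
    if offload:
        # Run in the default thread pool; the event loop keeps ticking.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, blocking_io, delay)
    else:
        # Called directly; nothing else runs until it returns.
        blocking_io(delay)
    task.cancel()
    return counter["ticks"]


if __name__ == "__main__":
    print("direct:   ", asyncio.run(count_ticks(offload=False)), "tick(s)")
    print("offloaded:", asyncio.run(count_ticks(offload=True)), "tick(s)")
```

In the direct case the ticker is starved for the whole duration of the
call; in the offloaded case it keeps running. On 3.9+,
`await asyncio.to_thread(blocking_io, delay)` should be a roughly equivalent
replacement for the run_in_executor() line.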

On Sun, Jun 14, 2020 at 1:42 AM Kyle Stanley <aeros...@gmail.com> wrote:

> > IOW the solution to the problem is to use threads. You can see here
> > why I said what I did: threads specifically avoid this problem and the
> > only way for asyncio to avoid it is to use threads.
>
> In the case of the above example, I'd say it's more so "use coroutines by
> default and threads as needed" rather than just using threads, but fair
> enough. I'll concede that point.
>
> > For instance, maybe during testing (with debug=True), your
> > DNS lookups are always reasonably fast, but then some time after
> > deployment, you find that they're stalling you out. How much effort is
> > it to change this over? How many other things are going to be slow,
> > and can you find them all?
>
> That's very situationally dependent, but for any IO-bound call with a
> variable time where async isn't an option (either because it's not
> available, standardized, widespread, etc.), I'd advise using
> loop.run_in_executor()/to_thread() preemptively. This is easier said than
> done, of course, and it's very possible for some calls to be overlooked.
> If one is missed though, I don't think it takes much effort to change it
> over; IMO the main challenge is locating all of them in production for a
> large, existing codebase.
>
> > 3) Steven D'Aprano is terrified of them and will rail on you for using
> > threads.
>
> Haha, I've somehow completely missed that. I CC'd Steven in the response,
> since I'm curious as to what he has to say about that.
>
> > Take your pick. Figure out what your task needs. Both exist for good
> > reasons.
>
> Completely agreed. Threads and coroutines are two completely different
> approaches, with neither one being clearly superior for all situations.
> Even as someone who's invested a significant amount of time in helping to
> improve asyncio recently, I'll admit that I fairly often encounter users
> who would be better off using threads, particularly for code that isn't
> performance or resource critical, or that involves a reasonably small
> number of concurrent operations that aren't expected to scale in volume
> significantly. The fine-grained control over context switching (which can
> be a pro or a con), shorter switch delay, and lower resource usage of
> coroutines aren't always worth the added code complexity.
>
>
>
> On Sun, Jun 14, 2020 at 12:43 AM Chris Angelico <ros...@gmail.com> wrote:
>
>> On Sun, Jun 14, 2020 at 2:16 PM Kyle Stanley <aeros...@gmail.com> wrote:
>> >
>> > > If you're fine with invisible context switches, you're probably
>> > > better off with threads, because they're not vulnerable to
>> > > unexpectedly blocking actions (a common culprit being name lookups
>> > > before network transactions - you can connect sockets asynchronously,
>> > > but gethostbyname will block the current thread).
>> >
>> > These "unexpectedly blocking actions" can be identified in asyncio's
>> > debug mode. Specifically, any callback or task step that has a duration
>> > greater than 100ms will be logged. Then, the user can take a closer look at
>> > the offending long running step. If it's like socket.gethostbyname() and is
>> > a blocking IO-bound function call, it can be executed in a thread pool
>> > using loop.run_in_executor(None, socket.gethostbyname, hostname) to avoid
>> > blocking the event loop. In 3.9, there's also a roughly equivalent
>> > higher-level function that doesn't require access to the event loop:
>> > asyncio.to_thread(socket.gethostbyname, hostname).
>> >
>> > With the default duration of 100ms, it likely wouldn't pick up on
>> > socket.gethostbyname(), but it can rather easily be adjusted via the
>> > modifiable loop.slow_callback_duration attribute.
>> >
>> > Here's a quick, trivial example:
>> > ```
>> > import asyncio
>> > import socket
>> >
>> > async def main():
>> >     loop = asyncio.get_running_loop()
>> >     loop.slow_callback_duration = .01 # 10ms
>> >     socket.gethostbyname("python.org")
>> >
>> > asyncio.run(main(), debug=True)
>> > # If asyncio.run() is not an option, debug mode can also be enabled via:
>> > #     loop.set_debug(True)
>> > #     running Python with -X dev
>> > #     setting the PYTHONASYNCIODEBUG=1 env var
>> > #     PYTHONASYNCIODEBUG env var
>> > ```
>> > Output (3.8.3):
>> > Executing <Task finished name='Task-1' coro=<main() done, defined at
>> > asyncio_debug_ex.py:5> result=None created at
>> > /usr/lib/python3.8/asyncio/base_events.py:595> took 0.039 seconds
>> >
>> > This is a bit more involved than it is for working with threads; I just
>> > wanted to demonstrate one method of addressing the problem, as it's a
>> > decently common issue. For more details about asyncio's debug mode, see
>> > https://docs.python.org/3/library/asyncio-dev.html#debug-mode.
>> >
>>
>> IOW the solution to the problem is to use threads. You can see here
>> why I said what I did: threads specifically avoid this problem and the
>> only way for asyncio to avoid it is to use threads. (Yes, you can
>> asynchronously do a DNS lookup rather than using gethostbyname, but
>> the semantics aren't identical, and you may seriously annoy someone
>> who uses other forms of name resolution. So that doesn't count.) As an
>> additional concern, you don't always know which operations are going
>> to be slow. For instance, maybe during testing (with debug=True), your
>> DNS lookups are always reasonably fast, but then some time after
>> deployment, you find that they're stalling you out. How much effort is
>> it to change this over? How many other things are going to be slow,
>> and can you find them all?
>>
>> That's why threads are so convenient for these kinds of jobs.
>>
>> Disadvantages of threads:
>> 1) Overhead. If you make one thread for each task, your maximum
>> simultaneous tasks can potentially be capped. Irrelevant if each task
>> is doing things with far greater overhead anyway.
>> 2) Unexpected context switching. Unless you use locks, a context
>> switch can occur at any time. The GIL ensures that this won't corrupt
>> Python's internal data structures, but you have to be aware of it with
>> any mutable globals or shared state.
>> 3) Steven D'Aprano is terrified of them and will rail on you for using
>> threads.
>>
>> Disadvantages of asyncio:
>> 1) Code complexity. You have to explicitly show which things are
>> waiting on which others.
>> 2) Unexpected LACK of context switching. Unless you use await, a
>> context switch cannot occur.
>>
>> Take your pick. Figure out what your task needs. Both exist for good
>> reasons.
>>
>> ChrisA
>> _______________________________________________
>> Python-ideas mailing list -- python-ideas@python.org
>> To unsubscribe send an email to python-ideas-le...@python.org
>> https://mail.python.org/mailman3/lists/python-ideas.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-ideas@python.org/message/AJ2EOLSWSOAPSUG7BOM5MF3CHP3BHS3H/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7EY33HD56HN2WNP4AKG74PBELPFK3DKD/
Code of Conduct: http://python.org/psf/codeofconduct/
