Thanks a bundle, Jesse!

On Wed, Dec 16, 2015 at 4:24 PM, A. Jesse Jiryu Davis <je...@emptysquare.net
> wrote:

> Committed here:
>
>
> https://github.com/python/asyncio/commit/39c135baf73762830148236da622787052efba19
>
> Yury would you please update the CPython repo when the time is right?
>
>
> On Wednesday, December 9, 2015 at 6:05:46 PM UTC-5, A. Jesse Jiryu Davis
> wrote:
>>
>> Done:
>>
>> https://github.com/python/asyncio/pull/302
>>
>> On Wednesday, December 9, 2015 at 2:58:07 PM UTC-5, Guido van Rossum
>> wrote:
>>>
>>> It'll go much quicker if you send a PR to the asyncio github project.
>>> Thanks!
>>>
>>> --Guido (mobile)
>>> On Dec 9, 2015 11:56 AM, "A. Jesse Jiryu Davis" <je...@emptysquare.net>
>>> wrote:
>>>
>>>> Thanks for your kind words about the crawler chapter. =) I think we
>>>> didn't hit this problem there because the crawler only looked up one
>>>> domain, or a few of them. On Mac, subsequent calls are cached at the OS
>>>> layer for a few minutes, so getaddrinfo is very fast, even though it's
>>>> serialized by the getaddrinfo lock. Besides, I don't think the crawler has
>>>> a connection timeout, so if it *were* looking up hundreds of domains it
>>>> would wait as long as necessary for the lock.
>>>>
>>>> In this bug report, on the other hand, there are hundreds of coroutines
>>>> all waiting to look up different domains, and some of those domains take
>>>> several seconds to resolve. Resolving hundreds of different domains, plus
>>>> the getaddrinfo lock, plus Motor's 20-second connection timeout, all
>>>> combine to create this bad behavior.
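To make the failure mode described above concrete, here is a minimal, self-contained sketch (all names invented, not Motor or asyncio APIs) that models the serialized getaddrinfo behavior as a one-thread executor: several slow lookups queue ahead of a fast "localhost" lookup, and a `wait_for`-style connect timeout expires while the fast lookup is still waiting its turn.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Model the per-process getaddrinfo lock as a one-thread executor:
# lookups run strictly one at a time, in submission order.
resolver = ThreadPoolExecutor(max_workers=1)

def fake_getaddrinfo(host, delay):
    time.sleep(delay)  # pretend this lookup takes `delay` seconds
    return host

async def connect(host, delay, timeout):
    # Like wrapping open_connection in wait_for: the timer also covers
    # the time spent queued behind every earlier lookup.
    loop = asyncio.get_running_loop()
    try:
        await asyncio.wait_for(
            loop.run_in_executor(resolver, fake_getaddrinfo, host, delay),
            timeout)
        return "connected"
    except asyncio.TimeoutError:
        return "timed out"

async def main():
    # Five slow lookups queue ahead of a fast "localhost" lookup that
    # has a one-second "connect timeout".
    slow = [connect("host%d.example" % i, 0.3, 60) for i in range(5)]
    fast = connect("localhost", 0.01, 1.0)
    results = await asyncio.gather(*slow, fast)
    return results[-1]

print(asyncio.run(main()))  # "timed out": localhost never got its turn in time
```

The "localhost" lookup itself would take 10ms, but it cannot start until the 1.5 seconds of queued lookups ahead of it drain, so the 1-second timer fires first.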
>>>>
>>>> I like option #4 also; that gives me the freedom to either cache
>>>> lookups in Motor, or to treat "localhost" specially, or at least to prevent
>>>> a slow getaddrinfo from appearing to be a connection timeout. I haven't
>>>> decided which is the best approach, but #4 allows me flexibility. Would you
>>>> like me to write the patch or will you?
>>>>
>>>> I'm curious about #5 too, but I think that's a longer term project.
>>>>
>>>> On Wednesday, December 9, 2015 at 12:57:53 PM UTC-5, Guido van Rossum
>>>> wrote:
>>>>>
>>>>> 4. Modify asyncio's getaddrinfo() so that if you pass it something
>>>>> that looks like a numerical address (IPv4 or IPv6 syntax, looking exactly
>>>>> like what getaddrinfo() returns) it skips calling socket.getaddrinfo().
>>>>> I've wanted this for a while but hadn't run into this queue issue, so this
>>>>> is my favorite.
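A sketch of option #4 (`numeric_addrinfo` is an invented name, not the eventual asyncio API): use `socket.inet_pton()` to detect a numeric IPv4/IPv6 address and, on a match, build the getaddrinfo-style result directly instead of calling `socket.getaddrinfo()`.

```python
import socket

def numeric_addrinfo(host, port):
    """Return a getaddrinfo()-style result list if `host` is already a
    numeric IPv4/IPv6 address, else None (meaning: fall back to the
    real socket.getaddrinfo())."""
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            socket.inet_pton(family, host)
        except (OSError, ValueError):
            continue
        if family == socket.AF_INET:
            sockaddr = (host, port)
        else:
            sockaddr = (host, port, 0, 0)  # no flowinfo / scope id
        return [(family, socket.SOCK_STREAM, socket.IPPROTO_TCP, '', sockaddr)]
    return None

print(numeric_addrinfo('127.0.0.1', 27017))
print(numeric_addrinfo('localhost', 27017))  # None: needs a real lookup
```

create_connection() could consult a check like this first and, when it matches, skip the executor (and therefore the lock) entirely.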
>>>>>
>>>>> 5. Do the research needed to prove that socket.getaddrinfo() on OS X
>>>>> (or perhaps on sufficiently recent versions of OS X) is thread-safe and
>>>>> submit a patch that avoids the getaddrinfo lock on those versions of OS X.
>>>>>
>>>>> FWIW, IIRC my crawler example (which your editing made so much
>>>>> better!) calls getaddrinfo() for every connection and I hadn't experienced
>>>>> any slowness there. But maybe in the Motor example you're hitting a slow
>>>>> DNS server? (I'm guessing there are lots of system configuration 
>>>>> parameters
>>>>> that may make the system's getaddrinfo() slower or faster, and in your
>>>>> setup it may well be slower.)
>>>>>
>>>>> On Wed, Dec 9, 2015 at 7:05 AM, A. Jesse Jiryu Davis <
>>>>> je...@emptysquare.net> wrote:
>>>>>
>>>>>> Thanks Guido, this all makes sense.
>>>>>>
>>>>>> One problem, though, is that even if I call getaddrinfo myself in
>>>>>> Motor and wrap a cache around it, I *still* can't use the event
>>>>>> loop's create_connection() call. create_connection always
>>>>>> calls getaddrinfo, and even though it "should be quick", that doesn't
>>>>>> matter in this scenario: the time is spent waiting for the getaddrinfo
>>>>>> lock. So I would need one of:
>>>>>>
>>>>>> 1. Copy the body of create_connection into Motor so I can customize
>>>>>> the getaddrinfo call. I especially don't like this because I'd have to 
>>>>>> call
>>>>>> the loop's private _create_connection_transport() from Motor's
>>>>>> customized create_connection(). Perhaps we could
>>>>>> make _create_connection_transport public, or otherwise make a public API
>>>>>> for separating getaddrinfo from actually establishing the connection?
>>>>>>
>>>>>> 2. Make getaddrinfo customizable in asyncio (
>>>>>> https://github.com/python/asyncio/issues/160). This isn't ideal,
>>>>>> since it requires Motor users on Mac / BSD to change configuration for 
>>>>>> the
>>>>>> whole event loop just so Motor's specific create_connection calls behave
>>>>>> correctly.
>>>>>>
>>>>>> 3. Back to the original proposal: add a connection timeout parameter
>>>>>> to create_connection. =)
>>>>>>
>>>>>> On Tuesday, December 8, 2015 at 4:30:04 PM UTC-5, Guido van Rossum
>>>>>> wrote:
>>>>>>
>>>>>>> On Tue, Dec 8, 2015 at 7:13 AM, A. Jesse Jiryu Davis <
>>>>>>> je...@emptysquare.net> wrote:
>>>>>>>
>>>>>>>> Hi, a Motor user began an interesting discussion on the
>>>>>>>> MongoDB-user list:
>>>>>>>>
>>>>>>>>
>>>>>>>> https://groups.google.com/d/topic/mongodb-user/2oK6C3BrVKI/discussion
>>>>>>>>
>>>>>>>> The summary is this: he's fetching hundreds of URLs concurrently
>>>>>>>> and inserting the results into MongoDB with Motor. Motor throws lots of
>>>>>>>> connection-timeout errors. The problem is getaddrinfo: on Mac, Python 
>>>>>>>> only
>>>>>>>> allows one getaddrinfo call at a time. With hundreds of HTTP fetches in
>>>>>>>> progress, there's a long queue waiting for the getaddrinfo lock. 
>>>>>>>> Whenever
>>>>>>>> Motor wants to grow its connection pool it has to call getaddrinfo on
>>>>>>>> "localhost", and it spends so long waiting for that call, it times out 
>>>>>>>> and
>>>>>>>> thinks it can't reach MongoDB.
>>>>>>>>
>>>>>>>
>>>>>>> If it's really looking up "localhost" over and over, maybe wrap a
>>>>>>> cache around getaddrinfo()?
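A naive cache along those lines might look like this (`make_cached` is an invented helper; a real one would need TTL-based expiry, should not cache failures, and should deduplicate concurrent misses):

```python
import asyncio

def make_cached(resolve):
    """Wrap an async resolve(host, port) coroutine function in an
    unbounded cache keyed on (host, port)."""
    cache = {}
    async def cached(host, port):
        key = (host, port)
        if key not in cache:
            cache[key] = await resolve(host, port)
        return cache[key]
    return cached
```

Used e.g. as `cached = make_cached(lambda h, p: loop.getaddrinfo(h, p, type=socket.SOCK_STREAM))`, so repeated "localhost" lookups hit the dict instead of the lock.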
>>>>>>>
>>>>>>>
>>>>>>>> Motor's connection-timeout implementation in asyncio is sort of
>>>>>>>> wrong:
>>>>>>>>
>>>>>>>>     coro = asyncio.open_connection(host, port)
>>>>>>>>     sock = yield from asyncio.wait_for(coro, timeout)
>>>>>>>>
>>>>>>>> The timer runs during the call to getaddrinfo, as well as the call
>>>>>>>> to the loop's sock_connect(). This isn't the intention: the timeout 
>>>>>>>> should
>>>>>>>> apply only to the connection.
>>>>>>>>
>>>>>>>> A philosophical digression: The "connection timeout" is a
>>>>>>>> heuristic. "If I've waited N seconds and haven't established the
>>>>>>>> connection, I probably never will. Give up." Based on what they know 
>>>>>>>> about
>>>>>>>> their own networks, users can tweak the connection timeout. In a fast
>>>>>>>> network, a server that hasn't responded in 20ms is probably down; but 
>>>>>>>> on a
>>>>>>>> global network, 10 seconds might be reasonable. Regardless, the 
>>>>>>>> heuristic
>>>>>>>> only applies to the actual TCP connection. Waiting for getaddrinfo is 
>>>>>>>> not
>>>>>>>> related; that's up to the operating system.
>>>>>>>>
>>>>>>>> In a multithreaded client like PyMongo we distinguish the two
>>>>>>>> phases:
>>>>>>>>
>>>>>>>>     for res in socket.getaddrinfo(host, port, family,
>>>>>>>>                                   socket.SOCK_STREAM):
>>>>>>>>         af, socktype, proto, dummy, sa = res
>>>>>>>>         sock = socket.socket(af, socktype, proto)
>>>>>>>>         try:
>>>>>>>>             sock.settimeout(connect_timeout)
>>>>>>>>
>>>>>>>>             # THE TIMEOUT ONLY APPLIES HERE.
>>>>>>>>             sock.connect(sa)
>>>>>>>>             sock.settimeout(None)
>>>>>>>>             return sock
>>>>>>>>         except socket.error:
>>>>>>>>             # Connection refused, or not established within
>>>>>>>>             # the timeout.
>>>>>>>>             sock.close()
>>>>>>>>
>>>>>>>> Here, the call to getaddrinfo isn't timed at all, and each distinct
>>>>>>>> attempt to connect on a different address is timed separately. So this 
>>>>>>>> kind
>>>>>>>> of code matches the idea of a "connect timeout" as a heuristic for 
>>>>>>>> deciding
>>>>>>>> whether the server is down.
>>>>>>>>
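An asyncio translation of that blocking loop, with getaddrinfo left untimed and the timer restarted for each address, might be sketched like this (`open_with_connect_timeout` is an invented name, not an asyncio API):

```python
import asyncio
import socket

async def open_with_connect_timeout(host, port, timeout):
    loop = asyncio.get_running_loop()
    # The lookup is not timed at all, matching the blocking version.
    infos = await loop.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    last_exc = None
    for family, socktype, proto, _, sockaddr in infos:
        sock = socket.socket(family, socktype, proto)
        sock.setblocking(False)
        try:
            # THE TIMEOUT ONLY APPLIES HERE, afresh for each address.
            await asyncio.wait_for(loop.sock_connect(sock, sockaddr), timeout)
            return sock
        except (OSError, asyncio.TimeoutError) as exc:
            sock.close()
            last_exc = exc
    raise last_exc or OSError("getaddrinfo returned no addresses")
```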
>>>>>>>> Two questions:
>>>>>>>>
>>>>>>>> 1. Should asyncio.open_connection support a connection timeout that
>>>>>>>> acts like the blocking version above? That is, a connection timeout 
>>>>>>>> that
>>>>>>>> does not include getaddrinfo, and restarts for each address we attempt 
>>>>>>>> to
>>>>>>>> connect to?
>>>>>>>>
>>>>>>>
>>>>>>> Hm, I don't really like adding timeouts to every API. As you
>>>>>>> describe everyone has different needs. IMO if you don't want the 
>>>>>>> timeout to
>>>>>>> cover the getaddrinfo() call, call getaddrinfo() yourself and pass the 
>>>>>>> host
>>>>>>> address into the create_connection() call. That way you also have 
>>>>>>> control
>>>>>>> over whether to e.g. implement "happy eyeballs". (It will still call
>>>>>>> socket.getaddrinfo(), but it should be quick -- it's not going to a DNS
>>>>>>> server or even /etc/hosts to discover that 127.0.0.1 maps to 127.0.0.1.)
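Resolving first and handing the numeric address to open_connection, as suggested above, might be sketched like so (`connect_resolved` is an invented helper; a fuller version would try all returned addresses, which is also where "happy eyeballs" logic would live):

```python
import asyncio
import socket

async def connect_resolved(host, port):
    loop = asyncio.get_running_loop()
    # Do the (possibly slow, lock-serialized) lookup ourselves, once.
    infos = await loop.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    family, _, _, _, sockaddr = infos[0]
    # sockaddr[0] is numeric now, so open_connection's own
    # getaddrinfo() call has only trivial work left to do.
    return await asyncio.open_connection(sockaddr[0], port, family=family)
```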
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> 2. Why does Python lock around getaddrinfo on Mac and Windows
>>>>>>>> anyway? The code comment says these are "systems on which 
>>>>>>>> getaddrinfo() is
>>>>>>>> believed to not be thread-safe". Has this belief ever been confirmed?
>>>>>>>>
>>>>>>>>
>>>>>>>> https://hg.python.org/cpython/file/d2b8354e87f5/Modules/socketmodule.c#l185
>>>>>>>>
>>>>>>>
>>>>>>> I don't know -- the list of ifdefs seems to indicate this is a
>>>>>>> generic BSD issue, which is OS X's heritage. Maybe someone can do an
>>>>>>> experiment, or review the source code used by Apple (if it's still open
>>>>>>> source)? While I agree that if this really isn't an issue we shouldn't
>>>>>>> bother with the lock, I'd also much rather be safe than sorry when it 
>>>>>>> comes
>>>>>>> to races in core Python.
>>>>>>>
>>>>>>> --
>>>>>>> --Guido van Rossum (python.org/~guido)
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> --Guido van Rossum (python.org/~guido)
>>>>>
>>>>


-- 
--Guido van Rossum (python.org/~guido)
