On Thu, Jun 10, 2021 at 8:21 AM Tobias S. Josefowitz
<[email protected]> wrote:
>
> On Wed, Jun 9, 2021 at 10:49 PM Stephen R. van den Berg <[email protected]> wrote:
> >
> > I'd say, the way forward should be changing the connecting functions to
> > iterate through the IP addresses and find the first that actually connects.
> >
> > >Needn't revert completely. If it's decided that the preference should
> > >continue to be A, then it's just a matter of reversing the order of
> > >returned values - a simple matter of switching two numbers.
> >
> > That would be an acceptable workaround until the connection code is fixed,
> > I think.
>
> Maybe I am the one missing something here, but even if "the connection
> code" were to learn to try to establish connections to all IPv4+IPv6
> addresses resolved for a hostname, it could not do that on top of
> Protocols.DNS.async_host_to_ip, as that returns a single result.

True, but if you call Protocols.HTTP.get_url(), the way that it does
its DNS lookups is an implementation detail.

> Changing it to result in something other than a single result will
> break every single user of Protocols.DNS.async_host_to_ip.
>
> As Protocols.DNS.async_host_to_ip is thus constrained to providing a
> single response, changing it to respond with an IPv6 IP preferred at
> this stage of IPv6 deployment is breaking more than it fixes. I would
> assume there are still much more IPv4-only than IPv6 hosts in the
> world, and the amount of IPv6-only hosts will be severely limited - at
> least some 6to4 translation should normally be available. Responding
> with IPv6 addresses first, esp. without applying any heuristics
> whatsoever to determine if a system might be IPv6-enabled, can thus
> only be seen as a severe regression.
>
> Responding with an IPv6-address if no IPv4-address can be resolved
> might not be a regression, but is neither terribly consistent or
> elegant, so it might be best to not do that either.
>
> All in all, I think the solution - beyond restoring useful and
> expected behaviour of Protocols.DNS.async_host_to_ip - would be to
> introduce a more suitable API - if necessary - and to start using that
> instead.

Okay. Here's a proposal:

1) Have a Protocols.DNS.prefer_ipv6() that chooses whether IPv6 is
preferred over IPv4 for simple lookups. Calling prefer_ipv6(1) keeps
the current behaviour (IPv6 first); calling prefer_ipv6(0) restores
the previous behaviour (IPv4 first). This only changes the order of
responses, nothing else.

2) Create host_to_ips() which returns (or in the case of a promise,
yields) an array with all of the results. By definition, host_to_ip()
is then host_to_ips()[0], or "" if the array is empty.

3) Change Protocols.HTTP.Query() to use async_host_to_ips(). This will
change its hostname_cache to carry arrays instead of strings. Is that
cache used externally? I found one reference in Protocols.HTTP.Session
and one in Web.Crawler, but they're just synchronizing caches, and
they don't care what's actually stored in them.

4) Possibly change Protocols.HTTP.dns_lookup to use Protocols.DNS for
consistency? Currently that one uses gethostbyname, so there's a
notable behavioural difference between sync_request and thread_request
(which use dns_lookup and gethostbyname) and async_request (which uses
dns_lookup_async and Protocols.DNS). Otherwise, just tweak dns_lookup
to store an array in the cache.

5) Add another phase to asynchronous HTTP connection after obtaining
DNS results: it maintains an array of IPs and attempts to connect to
them sequentially. The success callback will be the same; the failure
callback will attempt the next IP.
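
To make points 1, 2 and 5 concrete, here is a rough sketch of how they
could fit together. It is written in Python purely as a language-neutral
illustration, not as Pike, and none of it is existing Protocols.DNS or
Protocols.HTTP code. The names order_addresses and connect_first, and
the injected resolve and start_connect callables, are hypothetical
stand-ins. Reporting failure once every address has been tried is my
assumption about what should happen when the list runs out.

```python
import ipaddress
from typing import Callable, List

_prefer_ipv6 = False  # hypothetical module-level preference


def prefer_ipv6(flag: bool) -> None:
    """Point 1: choose whether IPv6 results are ordered ahead of IPv4."""
    global _prefer_ipv6
    _prefer_ipv6 = bool(flag)


def order_addresses(addrs: List[str]) -> List[str]:
    """Reorder only, never drop results; order within a family is kept."""
    first = 6 if _prefer_ipv6 else 4
    # False sorts before True, and sorted() is stable.
    return sorted(addrs, key=lambda a: ipaddress.ip_address(a).version != first)


def host_to_ips(host: str, resolve: Callable[[str], List[str]]) -> List[str]:
    """Point 2: all results, in preference order."""
    return order_addresses(resolve(host))


def host_to_ip(host: str, resolve: Callable[[str], List[str]]) -> str:
    """The single-result call is just the first entry, or "" if none."""
    ips = host_to_ips(host, resolve)
    return ips[0] if ips else ""


def connect_first(ips, start_connect, on_success, on_failure) -> None:
    """Point 5: try each IP in turn.

    start_connect(ip, ok, fail) is a stand-in for the real async connect;
    it must call ok() or fail() exactly once.
    """
    remaining = list(ips)

    def attempt() -> None:
        if not remaining:
            on_failure()  # assumption: fail only after the list is exhausted
            return
        ip = remaining.pop(0)
        # Same success callback as before; failure moves on to the next IP.
        start_connect(ip, lambda: on_success(ip), attempt)

    attempt()
```

With this shape, existing callers of host_to_ip() still get a single
string, and only the connection code needs to know about the array.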

I'm part-way inclined to start with #4, since there's a weird
inconsistency there already. If you mix and match
sync_request/thread_request and async_request, the behaviour may
depend on which one populates the cache.
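
A toy model of that cache problem, again in Python rather than Pike and
with made-up names throughout: system_resolver stands in for the
gethostbyname path, dns_module_resolver for the Protocols.DNS path, and
the two addresses are documentation examples chosen only so that the
resolvers visibly disagree.

```python
from typing import Callable, Dict

# Stand-in for the shared hostname cache.
hostname_cache: Dict[str, str] = {}


def cached_lookup(host: str, resolver: Callable[[str], str]) -> str:
    """Return the cached address, resolving only on a cache miss."""
    if host not in hostname_cache:
        hostname_cache[host] = resolver(host)
    return hostname_cache[host]


def system_resolver(host: str) -> str:
    """Hypothetical stand-in for the sync/thread path."""
    return "192.0.2.1"


def dns_module_resolver(host: str) -> str:
    """Hypothetical stand-in for the async path."""
    return "2001:db8::1"
```

Whichever path looks the host up first decides what every later
request sees, regardless of which resolver that later request would
have used on its own.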

Is this worth working on?

ChrisA
