I glanced at the code, and I'm pretty sure you're wrong.
Can you point to where it actually dequeues the qid from the global table
or closes the file descriptor of the prior server when it goes to the
next server on a timeout? Or where it somehow stops listening for a
response from the old server at any point prior to the entire query ending?
As far as I can tell, it's still going to accept a reply from the prior
server even when sending the query to the next ...
Looking at how it processes timeouts:
https://github.com/c-ares/c-ares/blob/main/src/lib/ares_process.c#L556
next_server doesn't close the fd
https://github.com/c-ares/c-ares/blob/main/src/lib/ares_process.c#L757
The qid doesn't appear to get rewritten during sending to the next server
https://github.com/c-ares/c-ares/blob/main/src/lib/ares_process.c#L799
Lookups for responses from any open fd match on qid only; there
is no server expectation:
https://github.com/c-ares/c-ares/blob/main/src/lib/ares_process.c#L591
The qid is generated in ares_send(), which is only ever called once per
query and doesn't change on re-sends (whether to the same server or others).
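To make the point concrete, here is a toy model of qid-only matching. It is
not c-ares code: the struct and the toy_* function names are made up for
illustration, and it only mirrors the behavior described above (the qid is
assigned once, moving to the next server does not change it, and a reply is
matched without checking which server sent it).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical types and names, not c-ares internals. */
struct toy_query {
    unsigned short qid; /* assigned once when the query is created */
    int server;         /* index of the server most recently sent to */
    int answered;       /* set to 1 once any matching reply arrives */
};

/* On a timeout, move on to the next server. The qid is left unchanged and
 * nothing here stops replies from earlier servers from being read. */
static void toy_next_server(struct toy_query *q, int nservers)
{
    assert(q != NULL && nservers > 0);
    q->server = (q->server + 1) % nservers;
}

/* Match a reply by qid alone. from_server is deliberately ignored, which
 * models "no server expectation". Returns NULL if nothing matches. */
static struct toy_query *toy_match_reply(struct toy_query *queries, size_t n,
                                         unsigned short qid, int from_server)
{
    (void)from_server;
    for (size_t i = 0; i < n; i++) {
        if (!queries[i].answered && queries[i].qid == qid) {
            queries[i].answered = 1;
            return &queries[i];
        }
    }
    return NULL;
}
```

In this model, a reply carrying the right qid from server 0 still completes
the query after toy_next_server() has moved it on to server 1.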
Again, I'm pretty sure all that needs to be done to meet your needs
within c-ares is:
1) Reduce the query response timeout from the 5s default to something more
reasonable like 200ms
2) Have the ability to set an overall query timeout (rather than relying
on number of tries) -- this will properly handle high latency connections
3) Add a feedback loop that re-sorts the server list any time either (a) we
receive a hard error trying to reach a server or (b) we receive a
successful response from a server. (a) would sort that server to the
bottom of the server list; (b) would sort it to the top.
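A rough sketch of the re-sort in item 3, assuming the server list is a plain
array. The toy_server struct and the toy_* functions are hypothetical names
for illustration only; nothing like this exists in c-ares today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical server entry; a real one would carry a sockaddr, fd, etc. */
struct toy_server {
    const char *addr;
};

/* Move servers[idx] to the front (to_top != 0) or to the back (to_top == 0),
 * keeping the relative order of all other entries. */
static void toy_resort(struct toy_server *servers, size_t n, size_t idx,
                       int to_top)
{
    assert(servers != NULL && idx < n);
    struct toy_server moved = servers[idx];
    if (to_top) {
        /* shift entries 0..idx-1 down by one slot */
        memmove(&servers[1], &servers[0], idx * sizeof(servers[0]));
        servers[0] = moved;
    } else {
        /* shift entries idx+1..n-1 up by one slot */
        memmove(&servers[idx], &servers[idx + 1],
                (n - idx - 1) * sizeof(servers[0]));
        servers[n - 1] = moved;
    }
}

/* (b) successful response: sort the server to the top of the list. */
static void toy_on_success(struct toy_server *servers, size_t n, size_t idx)
{
    toy_resort(servers, n, idx, 1);
}

/* (a) hard error: sort the server to the bottom of the list. */
static void toy_on_hard_error(struct toy_server *servers, size_t n, size_t idx)
{
    toy_resort(servers, n, idx, 0);
}
```

Keeping the relative order of the untouched servers means a single failure
only demotes the failing server and leaves the rest of the configured order
alone.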
On 1/20/22 2:24 PM, Dmitry Karpov via c-ares wrote:
> I'm pretty sure that c-ares is already doing this next server as a
parallel query, just the default timeout isn't where you expect. If
you set it lower, it will start a second request at that point the
timeout is hit, but if the first request responds,
> it will still use that response if the next server on the list
hasn't yet responded .... it's been a while since I looked at the code,
but that seems to be what I recall.
Nope. C-ares iterates name servers sequentially and waits until DNS
timeout occurs before switching to the other name server in the list.
This matches the expected behavior for resolv.conf on Linux, which
prescribes that the resolver iterate name servers sequentially.
For resolv.conf, c-ares honors only the “rotate” option, which allows
starting from a server other than the first one in the name server
list, and no other options.
While the sequential approach makes sense in general, it doesn’t work
well with bad name servers (either dual or single stack) when fast name
resolution is critical, and it also makes the overall DNS timeout
non-deterministic, because it depends on the number of bad servers in
the list.
So, for such cases we need either internal sorting that puts good
servers on top, or some kind of parallel approach.
Thanks,
Dmitry Karpov
*From:* Brad House <b...@brad-house.com>
*Sent:* Wednesday, January 19, 2022 4:24 PM
*To:* c-ares discussions <c-ares@lists.haxx.se>
*Cc:* Dmitry Karpov <dkar...@roku.com>
*Subject:* Re: Feature request for parallel queries for name servers
from different protocol families (IPv4 vs IPv6)
I'm pretty sure that c-ares is already doing this next server as a
parallel query, just the default timeout isn't where you expect. If
you set it lower, it will start a second request at that point the
timeout is hit, but if the first request responds, it will still use
that response if the next server on the list hasn't yet responded ....
it's been a while since I looked at the code, but that seems to be what
I recall. What c-ares does NOT have is an overall query timeout ...
that has been requested previously, but it doesn't currently exist
(though I agree it should). The logic for retries once it hits the
end of the list of nameservers is a bit weird so predicting when a
query will return a failed result is basically impossible from what I
recall. So this seems to be converging on what I originally suggested
then, except now it sounds like also adding the ability to set an
overall query timeout.
On 1/19/22 7:04 PM, Dmitry Karpov via c-ares wrote:
> Again, there's a reason happy eyeballs doesn't just hammer all
endpoints returned from getaddrinfo() simultaneously, I'd think
the same reasoning would go for DNS servers ... be kind ... start
a second query after a short delay if we haven't received a
response yet (e.g. 200ms).
> It doesn't make sense to hammer more than 1 DNS server if
they're all responsive, you just doubled the network load for DNS
for no reason.
Very true! But in my parallel approach, I didn’t mean to start all
parallel queries simultaneously.
I haven’t nailed down the details, but obviously such an approach
should be similar to Happy Eyeballs, even for single stacks.
So, queries in the parallel approach should be started with small
delays between them, like the 200ms in Happy Eyeballs, but the whole
name resolution should be controlled by one constant, deterministic
timeout (e.g. 5s) that doesn’t depend on the number of name servers
in the list, as is currently the case with c-ares.
In my use cases, using c-ares with libcurl, I see different name
resolution timeouts (5s, 15s, …) depending on the number of bad name
servers in the list, which causes some of my time-critical services
to fail.
And we can’t just use 200ms as a DNS timeout per name server and
iterate name servers sequentially, because there are high-latency
satellite links with big RTTs, which require 2s and sometimes more
for name resolutions.
That’s why the parallel approach (with delays between parallel
queries) seems to me a better solution for bad name servers
than the sequential one.
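The difference can be put in rough numbers with a simplified model. It
assumes each server is tried once, with no retries or timeout backoff, and
that the first good server sits behind some number of servers that never
answer. The function names are hypothetical and this is only arithmetic, not
how c-ares schedules queries.

```c
#include <assert.h>

/* Sequential: every bad server costs a full per-server timeout before the
 * next server is tried; the good server then answers after one round trip. */
static long toy_sequential_ms(int bad, long per_server_timeout_ms, long rtt_ms)
{
    assert(bad >= 0 && rtt_ms <= per_server_timeout_ms);
    return (long)bad * per_server_timeout_ms + rtt_ms;
}

/* Staggered: the next server is started after stagger_ms while earlier
 * queries stay outstanding; one overall timeout bounds the whole lookup.
 * Returns -1 if no answer arrives within overall_timeout_ms. */
static long toy_staggered_ms(int bad, long stagger_ms, long rtt_ms,
                             long overall_timeout_ms)
{
    assert(bad >= 0);
    long t = (long)bad * stagger_ms + rtt_ms;
    return t <= overall_timeout_ms ? t : -1;
}
```

With two dead servers ahead of a good one on a 2s satellite link,
toy_sequential_ms(2, 5000, 2000) gives 12000, while
toy_staggered_ms(2, 200, 2000, 5000) gives 2400.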
But as I said, any improvements in this area would be very welcome
c-ares extensions, especially if they help libcurl with c-ares,
which is used by a lot of people, handle bad name servers
better.
Thanks,
Dmitry Karpov
*From:* Brad House <b...@brad-house.com>
*Sent:* Wednesday, January 19, 2022 2:37 PM
*To:* c-ares discussions <c-ares@lists.haxx.se>
*Cc:* Dmitry Karpov <dkar...@roku.com>
*Subject:* Re: Feature request for parallel queries for name
servers from different protocol families (IPv4 vs IPv6)
I guess it always depends on the design of whatever is using
c-ares. In my own use cases, I have a single ares_channel running
on an event loop and enqueue my lookups to there ... so it keeps
state. Nothing with thread local storage or anything, just
dispatching to that event loop for any DNS queries that need to be
performed. The single ares_channel can handle multiple
simultaneous DNS queries.
Also, since there is a proposed feedback loop, if a DNS server is
no longer reachable, it will re-sort the list for any future
requests, so it would only impact a single request (ok, well,
whatever number of requests came in before the timeout or error
occurred).
Again, there's a reason happy eyeballs doesn't just hammer all
endpoints returned from getaddrinfo() simultaneously, I'd think
the same reasoning would go for DNS servers ... be kind ... start
a second query after a short delay if we haven't received a
response yet (e.g. 200ms). It doesn't make sense to hammer more
than 1 DNS server if they're all responsive, you just doubled the
network load for DNS for no reason.
On 1/19/22 5:25 PM, Dmitry Karpov via c-ares wrote:
> I wasn't suggesting this be outside of c-ares, I was
talking about implementing this inside of c-ares as a simpler
alternative to your proposal.
OK, I got it now. :)
Pre-sorting name servers based on reachability from previous
queries and/or protocol family may help in some cases, but the
sequential approach, even with sorting, will still have some
issues that the parallel approach can solve more
efficiently.
For example, the first query, when nothing is sorted yet, may
cause critical connection timeouts that abort some applications.
Storing the name server “reachability metrics” that the sort is
based on will also require either thread local storage (thus
requiring each thread to go through the same “name server
discovery” procedure as the other app threads using c-ares) or
some global access to the metrics data with proper read/write
synchronization, as needed by multi-threaded apps.
Also, if run-time conditions have changed since the previous
query, the list may no longer be sorted correctly for the
current conditions, so a suboptimal or even a bad server may be
tried first, increasing name resolution time.
The parallel approach, on the other hand, will provide the
fastest name resolution regardless of previous queries, so it
doesn’t need to store any name server metrics or pre-process
the name server list from the OS.
But I agree that implementing the parallel approach may not be
very easy, and any improvement in this area would be a very
welcome extension anyway.
So, if you think that an updated sequential approach with smart
sorting is much easier to implement than the parallel one,
then hopefully we can get it in the next c-ares updates.
Thanks,
Dmitry Karpov
*From:* Brad House <b...@brad-house.com>
*Sent:* Wednesday, January 19, 2022 12:10 PM
*To:* c-ares discussions <c-ares@lists.haxx.se>
*Cc:* Dmitry Karpov <dkar...@roku.com>
*Subject:* Re: Feature request for parallel queries for name
servers from different protocol families (IPv4 vs IPv6)
Commenting below ...
On 1/19/22 2:51 PM, Dmitry Karpov via c-ares wrote:
> In fact, Happy Eyeballs itself doesn't always do parallel
connection attempts; it's an implementation-defined delay
before also attempting the next address in the list.
In the case of Happy Eyeballs, the delay between IPv4 and IPv6
connections is constant and typically relatively short:
200-300ms.
But non-functional IPv6 name servers in the server list
may create dynamic delays in connection establishment
which can be very large.
By default, c-ares uses a 5s timeout per name server, so it
may take 5s or more (if several IPv6 name servers are in
the list) just to reach the Happy Eyeballs connection phase,
far longer than the expected 200-300ms.
I would assume that, as part of this patch set, this timer
would be reduced.
> It would be much easier to stay closer to happy eyeballs
and just sort the dns server list using prior result
success/fail (even upfront sorting using some algorithm to
interleave ipv6/ipv4 in a pattern would help,
> maybe using logic such as that from RFC6724 sec 2.1,
like we do in ares_getaddrinfo for returned addresses, but
applied instead to the nameservers themselves).
Yes, of course, a c-ares client could implement some kind of
name server sorting/filtering logic outside of c-ares and
just pass a list of “good” name servers to c-ares, but in
that case it has to be more involved in the name
resolution business than is desirable.
I wasn't suggesting this be outside of c-ares, I was talking
about implementing this inside of c-ares as a simpler
alternative to your proposal.
-Brad
--
c-ares mailing list
c-ares@lists.haxx.se
https://lists.haxx.se/listinfo/c-ares