Re: [squid-dev] [RFC] dns_wait_for_all

2016-09-20 Thread Alex Rousskov
On 09/20/2016 04:41 AM, Amos Jeffries wrote:
> On 16/09/2016 3:35 a.m., Alex Rousskov wrote:
>> On 09/15/2016 03:50 AM, Amos Jeffries wrote:
>>> The serverDestinations not changing (yet).

>> I am pretty sure we have to change that field to implement
>> dns_wait_for_all functionality correctly. For example:
>>
>> * we need to make its updates (from different jobs!) safe, with
>> easy-to-trace debugging.
>>
>> * we need to support EOF-like semantics to signal whether more
>> destinations may still be provided.


> I think you are mistaking serverDestinations for part of the DNS
> operations. It is not.

I am not thinking in those terms. serverDestinations today is populated
by peer selection code, based on DNS lookups (including DNS cache
lookups). Those DNS lookups create two problems that dns_wait_for_all is
solving (i.e., waiting for the second DNS query in each address
resolution transaction and waiting for all peer address resolution
transactions).


> serverDestinations is state in the FwdStateData 'Job'; the IP addresses
> used by it have to go through the peer selection 'Job' processing, with
> related access controls applied to whatever comes out of an ipcache
> lookup 'Job'.

Agreed. serverDestinations state is shared among peer selection and
FwdState code. Today, the sharing is sequential: After peer selection
fills serverDestinations, FwdState starts using serverDestinations. The
sequential nature of this approach creates problems; the
dns_wait_for_all directive solves some of them.


> DNS lookups are buried 3+ 'Job' layers away on the other side of
> ipcache. All they do is feed that cache with data.

Yes, there are many layers.

The DNS code itself does not quite feed the cache (it just sends an
answer to the requester which happens to be the cache), and the cache is
not quite a layer: A fresh DNS answer is given to both the cache and the
ipcache_nbgethostbyname() callback. There might be some unwelcome
dependencies in the code that require that fresh answer to be cached,
but they should not exist and they are unimportant when we discuss high
level concepts.


> If you are intending to alter serverDestinations operations, then please
> don't call the option dns_*. It is a tcp_outgoing_* thing.

I do not recommend using the tcp_outgoing name prefix because the
directive has nothing to do with TCP (and everything to do with DNS).
However, I am not going to fight you on this -- we usually disagree on
how things should be named, and there usually are not enough formal
arguments that can be made to support a given name. If nobody speaks up,
you will be able to rename the directive however you want. I am
certainly not a big fan of dns_wait_for_all!

Needless to say, the directive name should reflect what the feature does
from the Squid admin's point of view (the code changes implementing the
feature are irrelevant when it comes to naming). The proposed feature
determines whether Squid waits for certain DNS lookups when forwarding
transactions to peers.


Cheers,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [RFC] dns_wait_for_all

2016-09-20 Thread Amos Jeffries
On 16/09/2016 3:35 a.m., Alex Rousskov wrote:
> On 09/15/2016 03:50 AM, Amos Jeffries wrote:
>> On 15/09/2016 5:11 p.m., Alex Rousskov wrote:
>>> On 09/14/2016 07:26 PM, Amos Jeffries wrote:
>>>> On 15/09/2016 8:15 a.m., Alex Rousskov wrote:
>>>>> Any better ideas or objections to adding dns_wait_for_all?
>>>
>>>
>>>> In principle okay. However, I was intending to redesign the object we
>>>> store DNS RR results in to achieve this in a more useful way.
>>>
>>> If you are talking about Comm::ConnectionList/serverDestinations, then
>>> that object will have to be redesigned to support this proposal, of
>>> course. The object would also have to be safely shared across FwdState,
>>> peer selection code, and DNS code for the proposed feature to work.
>>
>> I mean Dns::LookupDetails, ipcache_addrs and rfc1035_rr.
> 
>> Something like:
>>
>> * ipcache_addrs::in_addrs becomes a list of rfc1035_rr instead of an
>> array of IP addresses (or an equivalent type storing the same details
>> in an easier-to-access form than 'rdata' uses).
>>
>> * ipcache_addrs::bad_mask moved into rfc1035_rr for a mask on the
>> records in each RR separately.
>>
>> * ipcache_addrs merged into Dns::LookupDetails.
> 
> That sounds mostly unrelated to the problem at hand. If we need to alter
> Dns::LookupDetails, ipcache_addrs, or rfc1035_rr in some significant
> way, then we will come back to this discussion. Otherwise, it stays as a
> separate/independent not-yet-discussed project.
> 
> 
>> The serverDestinations not changing (yet).
> 
> I am pretty sure we have to change that field to implement
> dns_wait_for_all functionality correctly. For example:
> 
> * we need to make its updates (from different jobs!) safe, with
> easy-to-trace debugging.
> 
> * we need to support EOF-like semantics to signal whether more
> destinations may still be provided.


I think you are mistaking serverDestinations for part of the DNS
operations. It is not.

serverDestinations is state in the FwdStateData 'Job'; the IP addresses
used by it have to go through the peer selection 'Job' processing, with
related access controls applied to whatever comes out of an ipcache
lookup 'Job'.

DNS lookups are buried 3+ 'Job' layers away on the other side of
ipcache. All they do is feed that cache with data.

If you are intending to alter serverDestinations operations, then please
don't call the option dns_*. It is a tcp_outgoing_* thing.

Amos



Re: [squid-dev] [RFC] dns_wait_for_all

2016-09-15 Thread Alex Rousskov
On 09/15/2016 03:50 AM, Amos Jeffries wrote:
> On 15/09/2016 5:11 p.m., Alex Rousskov wrote:
>> On 09/14/2016 07:26 PM, Amos Jeffries wrote:
>>> On 15/09/2016 8:15 a.m., Alex Rousskov wrote:
>>>> Any better ideas or objections to adding dns_wait_for_all?
>>
>>
>>> In principle okay. However, I was intending to redesign the object we
>>> store DNS RR results in to achieve this in a more useful way.
>>
>> If you are talking about Comm::ConnectionList/serverDestinations, then
>> that object will have to be redesigned to support this proposal, of
>> course. The object would also have to be safely shared across FwdState,
>> peer selection code, and DNS code for the proposed feature to work.
> 
> I mean Dns::LookupDetails, ipcache_addrs and rfc1035_rr.

> Something like:
> 
> * ipcache_addrs::in_addrs becomes a list of rfc1035_rr instead of an
> array of IP addresses (or an equivalent type storing the same details
> in an easier-to-access form than 'rdata' uses).
> 
> * ipcache_addrs::bad_mask moved into rfc1035_rr for a mask on the
> records in each RR separately.
> 
> * ipcache_addrs merged into Dns::LookupDetails.

That sounds mostly unrelated to the problem at hand. If we need to alter
Dns::LookupDetails, ipcache_addrs, or rfc1035_rr in some significant
way, then we will come back to this discussion. Otherwise, it stays as a
separate/independent not-yet-discussed project.


> The serverDestinations not changing (yet).

I am pretty sure we have to change that field to implement
dns_wait_for_all functionality correctly. For example:

* we need to make its updates (from different jobs!) safe, with
easy-to-trace debugging.

* we need to support EOF-like semantics to signal whether more
destinations may still be provided.
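The EOF-like semantics could look roughly like this (a minimal C++ sketch with invented names; not actual Squid code):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of a shared destinations list with EOF-like
// semantics. Producer jobs (peer selection, DNS lookups) append
// destinations as answers arrive; an explicit "final" mark lets the
// consumer (FwdState) distinguish "no destination yet, keep waiting"
// from "no destination will ever come".
struct DestinationList {
    std::vector<std::string> paths; // resolved destination addresses
    bool finalized = false;         // EOF-like flag: producers are done

    void append(const std::string &addr) {
        assert(!finalized); // updates are not safe after the EOF mark
        paths.push_back(addr);
    }

    void markFinal() { finalized = true; }

    // Consumer view: may more destinations still be provided?
    bool moreMayCome() const { return !finalized; }
};
```

A consumer that has exhausted paths would give up only once moreMayCome() turns false, instead of assuming the list was pre-filled.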


Thank you,

Alex.



Re: [squid-dev] [RFC] dns_wait_for_all

2016-09-14 Thread Alex Rousskov
On 09/14/2016 07:26 PM, Amos Jeffries wrote:
> On 15/09/2016 8:15 a.m., Alex Rousskov wrote:
>> Any better ideas or objections to adding dns_wait_for_all?


> In principle okay. However, I was intending to redesign the object we
> store DNS RR results in to achieve this in a more useful way.

If you are talking about Comm::ConnectionList/serverDestinations, then
that object will have to be redesigned to support this proposal, of
course. The object would also have to be safely shared across FwdState,
peer selection code, and DNS code for the proposed feature to work.


> * slow DNS responses could add to the chain while earlier results are
> still in the Async queue pending delivery to a caller.

... and while the caller is going through the TCP connect steps. That
TCP connect may be unsuccessful, requiring another attempt, possibly
using a different address. We already have the retrying code in place
and I am not proposing to change those algorithms during this project,
but the timing of new destinations arrival will change.


>  - this would obsolete the need for dns_wait_for_all to be configurable.

I have nothing specific against making the proposed algorithm the only
one, but I am not sure that _all_ side effects of the current
resolve-everything-first code are useless in all environments. AFAICT,
supporting both algorithms is easy (it is supporting the proposed
algorithm that is difficult).


> * ip/fqdn caches can store older data up to some timeout in TTL / LRU
> sequence.

I am not sure the DNS cache should be affected by the new lookup results
handling algorithm or the new serverDestinations storage of those
results. AFAICT, the DNS cache should store each result independently
while serverDestinations stores an ordered collection of results (some
of which may come from the DNS cache).

If you think that caching must change as a part of this project, can you
detail the related caching changes you would expect or foresee? If
caching changes are mostly independent, then we can discuss them
later/separately (to stay focused).


Thank you,

Alex.



Re: [squid-dev] [RFC] dns_wait_for_all

2016-09-14 Thread Amos Jeffries
On 15/09/2016 8:15 a.m., Alex Rousskov wrote:
> 
> Any better ideas or objections to adding dns_wait_for_all?
> 

In principle okay. However, I was intending to redesign the object we
store DNS RR results in to achieve this in a more useful way: instead of
returning an array of IP addresses, it would return an object storing
the query label, then a chain of RR objects for that label.

That would give several other useful benefits beyond that speedup:

* slow DNS responses could add to the chain while earlier results are
still in the Async queue pending delivery to a caller.
 - this would obsolete the need for dns_wait_for_all to be configurable.
Always respond on first result and append data if it comes in early enough.

* ip/fqdn caches can store older data up to some timeout in TTL / LRU
sequence.
 - resolving many DNS rotation issues, including certain types of Host
verify failures.
 - merging the IP and FQDN caches ??

* extensibility of the RR types the DNS code could look up and present
to callers.
 - needed by DANE and SRV lookups.

* one less memcpy cycle per IP address returned by DNS.
 - paid for by more chain dereferencing overheads.
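A rough shape for the redesign described above might be (all type and member names here are hypothetical illustrations, not actual Squid code):

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <string>

// Hypothetical sketch: instead of a bare array of IP addresses, a DNS
// lookup result keeps the query label plus a chain of RR entries, each
// carrying its own TTL and "known bad" mark.
struct RrEntry {
    std::string rdata;  // e.g., one A or AAAA address, already decoded
    uint32_t ttl = 0;   // per-record TTL, enabling per-RR expiry
    bool bad = false;   // replaces the shared ipcache_addrs::bad_mask
};

struct LookupResult {
    std::string queryLabel;     // the name that was looked up
    std::list<RrEntry> records; // chain: a slow AAAA answer can append
                                // here after the A results were delivered
};
```

The chain form is what allows late answers to be appended and per-record badness to be tracked without a separate parallel mask.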

Amos



[squid-dev] [RFC] dns_wait_for_all

2016-09-14 Thread Alex Rousskov
Hello,

Currently, when connecting to an origin server, Squid sends
concurrent DNS A and AAAA queries and waits for both answers before
proceeding with the HTTP transaction. If the authoritative DNS server
(or something on its path) breaks or significantly delays IPv6 (AAAA)
transactions, then Squid waits until timeout, even if Squid already has
a usable IPv4 address from a successful A query. This naturally leads to
admins disabling IPv6, willingly or under external pressure.

As Happy Eyeballs algorithms and related discussions/measurements have
established, the best migration path to IPv6 requires making sure that
enabling IPv6 does not create [user-visible] problems. Once that is
accomplished, software can and should prefer IPv6 over IPv4 by default.
I hope that we do not need to revisit those discussions to accept that
principle.

Currently, Squid violates that principle -- enabling IPv6 leads to
user-visible problems, and those problems lead to IPv6 being disabled.


This will not fix all IPv6 problems, but I propose to modify Squid so
that it starts connecting after receiving the first usable DNS response:

> dns_wait_for_all on|off
> 
> Determines whether Squid resolves domain names of all possible
> destinations in all supported address families before deciding which
> IP address to try first when contacting an origin server or cache_peer.
> 
> Before Squid can connect to a peer, it needs an IP address. Obtaining an
> IP address often requires a DNS lookup. Squid often makes two concurrent
> DNS lookups: An "A" query for an IPv4 address and an "AAAA" query for an
> IPv6 address. This directive does not affect the number of DNS queries
> sent or the side-effects of those queries (e.g., IP cache updates), but
> if two concurrent lookups are initiated, and this directive is off, then
> Squid proceeds immediately after receiving the first usable DNS answer.
> 
> This directive does not affect forwarding retries. For example, if
> dns_wait_for_all is off, and Squid gets an IPv4 address first, but the
> TCP connection to that IPv4 address fails, Squid will wait for the IPv6
> address resolution to complete (if it has not yet) and will then connect
> to an IPv6 address (if possible).
> 
> Furthermore, this directive does not affect the number of peer domain
> names that Squid will attempt to resolve or peer addresses that Squid
> may connect to. If Squid is allowed to forward a transaction to two
> peers, then Squid will resolve both peer names and, if failures make it
> necessary, will connect to all IP addresses of both peers (subject to
> other restrictions such as connect_retries).
> 
> See also: dns_v4_first

I suggest enabling this option by default because it will help with
IPv6 adoption, but I certainly do not insist on that default.
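In squid.conf terms, the proposal would presumably boil down to a one-line toggle (a sketch; as this thread shows, the directive name and its default are still open):

```
# hypothetical squid.conf fragment
# off: proceed after the first usable DNS answer (the proposed behavior)
# on:  resolve everything first (the current behavior)
dns_wait_for_all off
```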


While we call both queries "concurrent", Squid sends the AAAA query just
before sending the A query. All other factors being equal, IPv6 will
usually win the DNS race. However, even if AAAA loses, Squid will use
IPv6 the next time it needs to connect to the same server.


From a development point of view, supporting this feature properly means
creating an AsyncJob that will initiate DNS queries and update the
destinations list as the answers come in while informing the caller (if
it is still alive) of any new answers. Today, FwdState does
approximately this:

  1. Call peerSelect(, fwdPeerSelectionComplete)
 and wait for the fwdPeerSelectionComplete callback.

  2. When fwdPeerSelectionComplete is called,
 start iterating over pre-filled serverDestinations.

To support dns_wait_for_all, FwdState will do _approximately_ this:

  1. Call peerSelect(serverDestinations, fwdPeerSelected)
 and wait for the first fwdPeerSelected subscription callback.

  2. Every time fwdPeerSelected is called,
 start or resume iterating still-unused serverDestinations
 if we were actually waiting for the next destination to try.
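The subscription-style flow above can be modeled in a few lines (a single-threaded C++ sketch; names like noteDestination are invented here, and Squid's real FwdState/peer selection APIs differ):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the proposed flow: the fwdPeerSelected-style
// callback fires once per usable DNS answer, and FwdState resumes its
// destination iteration only when it was actually waiting for one.
struct FwdStateSketch {
    std::vector<std::string> serverDestinations;
    std::size_t nextToTry = 0;
    bool waitingForDestination = true; // idle until something to try
    std::vector<std::string> attempts; // connect attempts we would start

    // Subscription callback: a new destination became available.
    void noteDestination(const std::string &addr) {
        serverDestinations.push_back(addr);
        if (waitingForDestination)
            tryNext();
    }

    // Start (or resume, e.g., after a failed connect) the iteration.
    void tryNext() {
        if (nextToTry < serverDestinations.size()) {
            waitingForDestination = false;
            attempts.push_back(serverDestinations[nextToTry++]);
        } else {
            waitingForDestination = true; // wait for the next callback
        }
    }
};
```

In this model, the first callback (say, the A answer) triggers a connect immediately; a later AAAA answer is merely appended and is used only if a retry asks for the next destination.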

The DNS code dealing with concurrent A and AAAA queries will need to be
adjusted to report the first answer while waiting for the second one.

It is questionable whether the new AsyncJob should continue running
(i.e., resolving more peer names) after FwdState is gone (or no more
retries are needed). However, I do not want to complicate this
discussion by introducing that side effect. We can decide to optimize
that later, with or without another configuration option to control the
behavior.

Once this new infrastructure is in place, we can accommodate IPv6
further by experimenting with limited semi-concurrent TCP connections
and other Happy Eyeballs-inspired tricks (with proxy specifics in mind).


Any better ideas or objections to adding dns_wait_for_all?


Thank you,

Alex.