Re: resolvers hold valid clarification

2021-09-23 Thread Michał Pasierb
I dug deeper into the code and found out that server health checks also
trigger DNS resolution. Before the change at
https://github.com/haproxy/haproxy/commit/f50e1ac4442be41ed8b9b7372310d1d068b85b33
hold.valid was used to trigger the resolution. After the change it no longer
is, so the resolution is now driven only by timeout resolve. In other words,
hold.valid no longer has much meaning.
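
To illustrate what I mean, here is a minimal sketch of the resolvers section,
assuming my reading of the code is right. The comments describe my
understanding, not documented behaviour:

```
resolvers mydns
  nameserver dns1 192.168.122.202:53

  # Assumption: after the commit above, this is the only interval that
  # schedules periodic resolutions for servers.
  timeout resolve 5s

  # Assumption: no longer triggers a resolution from health checks, so
  # changing this value should not change how often DNS is queried.
  hold valid      1h
```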

When a server is in a layer 4 connection error state, tasks are inserted into
the scheduler queue which, since the change, never do any work when executed.
This is suboptimal, but I guess there are not many installations with servers
in this state for a long time.

Can anyone confirm my understanding of this setting?

Wed, 22 Sep 2021 at 19:52 Michał Pasierb wrote:

> I would like to clarify how *hold valid* is used by resolvers. I have
> this configuration:
>
> resolvers mydns
>   nameserver dns1 192.168.122.202:53
>   accepted_payload_size 8192
>
>   timeout resolve 5s
>   timeout retry   2s
>   resolve_retries 3
>
>   hold other    30s
>   hold refused  120s
>   hold nx       30s
>   hold timeout  10s
>   hold valid    1h
>   hold obsolete 0s
>
> The *valid* setting is a bit confusing. I cannot find a good explanation
> of it in the documentation. From what I see in the code of version 2.0.25,
> it is used only when doing DNS queries in the tcp/http path:
>
> http-request do-resolve(txn.myip,mydns,ipv4) str(services.example.com)
>
> Only when requests arrive in parallel do I see fewer queries to DNS servers
> than HTTP requests. When requests are sent in sequence, I see the same
> number of DNS requests as HTTP requests. For example, when I send 3000
> requests to HAProxy with 3 clients in parallel, there are about 2600
> requests to DNS servers.
>
> So it doesn't look like a proper cache to me. The whole HAProxy instance
> becomes unusable 10 seconds (hold timeout) after the DNS servers stop
> responding, because every server that uses a DNS SRV record is put into
> maintenance state due to a resolution error.
>
> Is this a proper assessment of the current state, and is this what was
> intended?
>
> Regards,
> Michal
>


resolvers hold valid clarification

2021-09-22 Thread Michał Pasierb
I would like to clarify how *hold valid* is used by resolvers. I have this
configuration:

resolvers mydns
  nameserver dns1 192.168.122.202:53
  accepted_payload_size 8192

  timeout resolve 5s
  timeout retry   2s
  resolve_retries 3

  hold other    30s
  hold refused  120s
  hold nx       30s
  hold timeout  10s
  hold valid    1h
  hold obsolete 0s

The *valid* setting is a bit confusing. I cannot find a good explanation of
it in the documentation. From what I see in the code of version 2.0.25, it
is used only when doing DNS queries in the tcp/http path:

http-request do-resolve(txn.myip,mydns,ipv4) str(services.example.com)

Only when requests arrive in parallel do I see fewer queries to DNS servers
than HTTP requests. When requests are sent in sequence, I see the same
number of DNS requests as HTTP requests. For example, when I send 3000
requests to HAProxy with 3 clients in parallel, there are about 2600
requests to DNS servers.

So it doesn't look like a proper cache to me. The whole HAProxy instance
becomes unusable 10 seconds (hold timeout) after the DNS servers stop
responding, because every server that uses a DNS SRV record is put into
maintenance state due to a resolution error.
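
If hold timeout really is what controls this, a larger value should delay the
moment the servers go to maintenance. An untested sketch, assuming that
reading is correct; the 5m value is only an illustrative choice:

```
resolvers mydns
  nameserver dns1 192.168.122.202:53

  timeout resolve 5s
  timeout retry   2s
  resolve_retries 3

  # Assumption: the last valid answer is kept through 5 minutes of DNS
  # timeouts instead of 10 seconds (illustrative value).
  hold timeout    5m
```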

Is this proper assessment of current state and is this what was intended ?

Regards,
Michal