Re: BUG/MINOR: dns: false positive downgrade of accepted_payload_size

Baptiste Wed, 21 Feb 2018 09:06:59 -0800

On Wed, Feb 21, 2018 at 11:07 AM, Lukas Tribus <lu...@ltri.eu> wrote:


> Hello Baptiste,
>
>
> On 21 February 2018 at 08:45, Baptiste <bed...@gmail.com> wrote:
> >> Is this downgrade at good thing in the first place? Doesn't it hide
> >> configuration and network issues, make troubleshooting more complex
> >> and the haproxy behavior less predictable?
> >
> >
> > It is an rfc recommendation (rfc number is commented somewhere in the
> source
> > code, but I am on a mobile and can't access it).
> > Its purpose is to hide networking issues when responses has to cross the
> > internet and behavior is not predictable.
>
> And I can see how this would be useful in end-user situations,
> browser, smartphone apps, etc.
>
> However in Haproxy the administrator *explicitly* configures a higher
> payload size, because this higher payload size is probably actually
> *required* to get all backend servers. Silently downgrading the
> payload size is harmful in my opinion here.
>
>
Maybe there is a misunderstood here.
HAProxy already downgrade "silently" the accepted payload size. I mean,
this "feature" already exists.
This patch simply ensure the downgrade happens in timeout cases only.



> >> When we see a response with the TC flag set, do we drop it or do we
> >> still consider the DNS response?
> >
> > Haproxy ignores tc flag for now.
>
> Ok, but you can see how ignoring the TC flag, combined with automatic
> payload size downgrade will make the backend servers number fluctuate
> with a little bit of packet loss? So with 40 servers in DNS and the
> loss of a single packet we will downgrade the entire backend to
> whatever fitted in the 1280 byte downgraded response.
>
> I would call this behavior highly undesirable.
>
>
You can play with "hold obsolete <timer>"  to ensure that unseen records
are kept for <timer> time.



>
> Note: networks/firewalls/router may more likely drop fragmented IP
> packets than "normal" IP packets (as they may try to reassemble them
> to actually apply layer 4+ ACLs, which requires buffers, which can
> overflow) and this makes it even worse.
>
>
>
> > Later, it will have to trigger a failover to tcp.
>
> Failing over to TCP on TC responses is the proper solution to this
> all, but in the meantime I would argue that we should make the
> behavior as predictable and stable as we can.
>

Let me know what you think,
>
>
I think that until we can do DNS over TCP, it is safer to downgrade the
announced accepted payload size in case of a timeout and ensure we'll still
receive answer than never ever downgrade and stop receiving answers.

As I explained above, this "feature" (downgrade) already exists in HAProxy,
but is applied to too many cases.
This patch simply ensure that it is applied to timeouts only.
So I don't see what's wrong with this patch.

Now, I agree the "silent" application of the downgrade could be problematic
and we could imagine some solutions to fix this:
- emit a WARNING message / log when it happens
- create a new command on the CLI to display current accepted payload
- create a new command on the CLI to set the accepted payload to a value
decided by the admin (so he can perform an upgrade in case a downgrade
happened)

Any other suggestion is welcome.

Baptiste

Re: BUG/MINOR: dns: false positive downgrade of accepted_payload_size

Reply via email to