Re: [Dnsmasq-discuss] [PATCH] TCP client timeout setting

2023-05-25 Thread Petr Menšík
This problem is best demonstrated by an example, taken from [2] and 
slightly modified.


Let's create a hypothetical network issue with one forwarder, which 
worked fine a while ago.


$ sudo iptables -I INPUT -i lo -d 127.0.0.255 -j DROP

Now start dnsmasq and send a TCP query to it:

$ dnsmasq -d --log-queries --port 2053 --no-resolv --conf-file=/dev/null \
    --server=127.0.0.255 --server=127.0.0.1

$ dig +tcp @localhost -p 2053 test

;; communications error to ::1#2053: timed out
;; communications error to ::1#2053: timed out
;; communications error to ::1#2053: timed out
;; communications error to 127.0.0.1#2053: timed out

; <<>> DiG 9.18.15 <<>> +tcp @localhost -p 2053 test
; (2 servers found)
;; global options: +cmd
;; no servers could be reached

Because dig waits a much shorter time than dnsmasq does, it never 
receives any reply, even though the other server is responding just 
fine. Isn't that the main advantage of having a local cache running? It 
should improve things!


Now let's be persistent and keep trying:

$ time for TRY in {1..6}; do dig +tcp @localhost -p 2053 test; done

After a few timeouts, it will finally notice something is wrong and 
also try the second server, which answers fast. However, this works only 
with dnsmasq -d, which is not used in production. If I replace it with 
dnsmasq -k, it will not answer at all!


$ dnsmasq -k --log-queries --port 2053 --no-resolv --conf-file=/dev/null \
    --server=127.0.0.255 --server=127.0.0.1

$ time for TRY in {1..8}; do dig +tcp @localhost -p 2053 test; done

...
;; communications error to ::1#2053: timed out
;; communications error to ::1#2053: timed out
;; communications error to ::1#2053: timed out
;; communications error to 127.0.0.1#2053: timed out

; <<>> DiG 9.18.15 <<>> +tcp @localhost -p 2053 test
; (2 servers found)
;; global options: +cmd
;; no servers could be reached


real    5m20,602s
user    0m0,094s
sys    0m0,115s

This is because with -k it spawns TCP workers, which always start with 
whatever last_server was prepared by the last UDP query. Until a UDP 
query arrives to save the day, each worker will stubbornly try the 
non-responding server first, even when the other one answers in 
milliseconds. Notice it has been trying for 5 minutes without success.
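
To make the difference between -d and -k concrete, here is a toy model
of the behaviour described above. It is only a sketch, not dnsmasq's
code: the Daemon class, handle_tcp_query() and run() are hypothetical
names, each forwarder timeout is collapsed into a single step, and the
client is assumed to give up after the first forwarder fails.

```python
import copy
from dataclasses import dataclass


@dataclass
class Daemon:
    """Toy stand-in for the main process state (not dnsmasq's structs)."""
    servers: list
    last_server: int = 0  # index of the forwarder to try first


def handle_tcp_query(state, reachable, client_patience=1):
    """Try forwarders starting at last_server; remember the one that answers.

    The client only sees the answer if it arrived within
    `client_patience` attempts; otherwise it has already timed out.
    """
    n = len(state.servers)
    for attempt in range(n):
        idx = (state.last_server + attempt) % n
        if state.servers[idx] in reachable:
            state.last_server = idx
            return state.servers[idx] if attempt < client_patience else None
    return None


def run(daemon, reachable, queries, fork):
    """Serve `queries` TCP queries, optionally in forked workers."""
    answers = []
    for _ in range(queries):
        # A forked worker gets a copy of the state; its updates are lost.
        state = copy.copy(daemon) if fork else daemon
        answers.append(handle_tcp_query(state, reachable))
    return answers


servers = ["127.0.0.255", "127.0.0.1"]
print(run(Daemon(list(servers)), {"127.0.0.1"}, 3, fork=False))
# [None, '127.0.0.1', '127.0.0.1']  (like -d: recovers after a timeout)
print(run(Daemon(list(servers)), {"127.0.0.1"}, 3, fork=True))
# [None, None, None]                (like -k: never recovers)
```

In the forked case the parent's last_server never changes, so every new
worker starts from the dead forwarder again.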


I think this has to be fixed somehow. This is a corner case, because TCP 
queries are usually caused by UDP replies with the TC bit set. But there 
are real-world examples where a TCP-only query makes sense, and dnsmasq 
does not handle them well. I summarized this at [3].


My proposal: when forwarding a query over TCP fails, the worker sends a 
UDP query with an EDNS0 header to the main process, which can then 
re-check forwarder responsiveness and change last_server to a working 
one. That way subsequent attempts do not fall into the black hole again 
and again. The EDNS0 header would be there to increase the chance of a 
positive reply from upstream, which can be cached.
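
A rough sketch of how that feedback loop could behave, again as a toy
model rather than a patch. MainProcess, on_worker_failure() and
tcp_worker() are hypothetical names; the re-probe stands in for the UDP
query with an EDNS0 header, and the worker is assumed to try only the
inherited last_server before the client gives up.

```python
from dataclasses import dataclass


@dataclass
class MainProcess:
    """Toy main process holding the forwarder list (hypothetical)."""
    servers: list
    last_server: int = 0

    def on_worker_failure(self, reachable):
        """Re-probe forwarders and remember the first one that answers.

        Stands in for the main process receiving a UDP query from the
        failed worker and forwarding it to all servers.
        """
        for idx, server in enumerate(self.servers):
            if server in reachable:
                self.last_server = idx
                return


def tcp_worker(main, reachable, notify):
    """One TCP worker: tries only the inherited last_server."""
    first = main.servers[main.last_server]
    if first in reachable:
        return first
    if notify:
        main.on_worker_failure(reachable)  # the proposed feedback path
    return None


main = MainProcess(["127.0.0.255", "127.0.0.1"])
print([tcp_worker(main, {"127.0.0.1"}, notify=True) for _ in range(3)])
# [None, '127.0.0.1', '127.0.0.1']
```

Only the first client pays for the dead forwarder; without the
notification every query would keep failing.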


Do you have other ideas for how to solve this problem?

Cheers,
Petr

[2] https://bugzilla.redhat.com/show_bug.cgi?id=2160466#c6
[3] https://bugzilla.redhat.com/show_bug.cgi?id=2160466#c13

On 19. 05. 23 13:40, Petr Menšík wrote:
When analysing report [1] about non-responding queries over TCP, I 
found that forwarded TCP connections have quite a high timeout. If, for 
whatever reason, the forwarder currently set as the last used forwarder 
is dropping packets without rejecting them, the TCP connection takes 
about 120 seconds to time out on my system. That is way too much; I 
think any TCP client will give up long before that. This is just a 
quick workaround to improve the situation, not a final fix.
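
The "about 120 seconds" figure is consistent with exponential backoff
of unanswered SYN packets. A back-of-the-envelope sketch, assuming the
common Linux defaults of a 1 second initial retransmission timeout and
net.ipv4.tcp_syn_retries=6 (both values are assumptions about the
system, not taken from the report):

```python
def connect_timeout_seconds(syn_retries=6, initial_rto=1.0):
    """Time until connect() gives up when every SYN is silently dropped.

    Assumes the retransmission timeout doubles after each attempt: the
    initial SYN plus `syn_retries` retransmissions each wait one RTO.
    """
    return sum(initial_rto * 2 ** i for i in range(syn_retries + 1))


print(connect_timeout_seconds())               # 127.0
print(connect_timeout_seconds(syn_retries=2))  # 7.0
```

Under these assumptions, lowering the number of SYN retries for the
forwarding socket shortens the wait considerably.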


...

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2160466


--
Petr Menšík
Software Engineer, RHEL
Red Hat, http://www.redhat.com/
PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB



___
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss


Re: [Dnsmasq-discuss] dhcp-lease-max is only for DHCPv4?

2023-05-25 Thread Petr Menšík
Yes, dhclient stores its generated DUID in its lease file. Either add 
-lf /var/lib/dhclient/dhclient-$I.leases or just remove the lease file 
after each dhclient run. The parameter -D LLT might help too.
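
One way to apply this in a test loop is sketched below as a dry run
that only builds the command lines. The parent interface eth0, the
mv<N> interface names and the pid-file path are hypothetical, and the
dhclient flags should be checked against dhclient(8) on your system.

```python
import shlex


def dhclient_commands(index, parent_if="eth0", lease_dir="/var/lib/dhclient"):
    """Return the commands for one test client without running them.

    `parent_if`, the mv<index> name and the pid file are illustrative;
    the lease path follows the pattern suggested above.
    """
    iface = f"mv{index}"
    lease = f"{lease_dir}/dhclient-{index}.leases"
    pid = f"/run/dhclient-{index}.pid"
    return [
        ["ip", "link", "add", "link", parent_if, "name", iface,
         "type", "macvlan"],
        ["ip", "link", "set", iface, "up"],
        # -6: DHCPv6, -1: try once, -lf: per-client lease file (this is
        # where the DUID is kept), -D LLT: DUID type to generate.
        ["dhclient", "-6", "-1", "-D", "LLT", "-lf", lease, "-pf", pid,
         iface],
        # -x: stop the running client without releasing the lease.
        ["dhclient", "-6", "-x", "-pf", pid, iface],
        ["ip", "link", "delete", iface],
        ["rm", "-f", lease],
    ]


for cmd in dhclient_commands(1):
    print(shlex.join(cmd))
```

With a fresh lease file per run, each client should generate a new
DUID instead of reusing the stored one.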


The logs should show which IPv6 address it is offering to the client. 
Does it change?


Petr

On 5/23/23 10:11, Simon Kelley wrote:
In DHCPv6, the unique identifier for a client is NOT the MAC address, 
it's a client ID which sometimes contains the MAC address.
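
For illustration, here is a sketch of how the two link-layer based DUID
types are laid out, following my reading of RFC 8415 section 11
(DUID-LLT is type 1, DUID-LL is type 3, hardware type 1 is Ethernet).
The function names are made up for this example. Note that a client
normally generates the DUID once and stores it, so it need not match
the MAC address of the interface in use later.

```python
import struct

DUID_LLT, DUID_LL = 1, 3
HW_ETHERNET = 1
# DUID-LLT time counts seconds since 2000-01-01 00:00:00 UTC, mod 2**32.
DUID_EPOCH_OFFSET = 946684800


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def duid_ll(mac):
    """DUID-LL: type, hardware type, link-layer address."""
    return struct.pack("!HH", DUID_LL, HW_ETHERNET) + mac_bytes(mac)


def duid_llt(mac, unix_time):
    """DUID-LLT: type, hardware type, time, link-layer address."""
    secs = (int(unix_time) - DUID_EPOCH_OFFSET) % 2 ** 32
    return struct.pack("!HHI", DUID_LLT, HW_ETHERNET, secs) + mac_bytes(mac)


print(duid_ll("02:00:00:00:00:01").hex())
# 00030001020000000001
print(duid_llt("02:00:00:00:00:01", DUID_EPOCH_OFFSET + 1).hex())
# 0001000100000001020000000001
```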


I suspect that dhclient is using the exact same client-id for each 
trial, and just renewing the existing lease. You will need to delete 
all the dhclient state after killing the process.


Simon.


On 23/05/2023 08:43, Linyih Teng wrote:

As for the test: I'm just curious, there is no other reason.

On the client side, I wrote a simple script that runs dhclient 512 
times in sequence. (The number 512 is not a magic value; the same thing 
happens with other values.)


Steps of the script:

    1. Create a macvlan interface (this gives each client a different
       MAC address).

    2. Run dhclient on the macvlan interface.

    3. Get an IP from the DHCPv6 server.

    4. Kill dhclient and remove the macvlan interface.

    5. Go back to step 1 and repeat.


Results:

    After the script finishes, if a 513th client arrives, the server
    still gives it an IP. This is not limited to client number
    (lease max + 1): every client after the 512th can get an IP from
    the server.
    At that point the number of lease entries stays at 512, and the
    later clients do not appear in the lease file.



Thanks,
Lin



On Tue, May 23, 2023 at 1:59 PM, Geert Stappers <stapp...@stappers.nl> 
wrote:


    On Tue, May 23, 2023 at 12:05:08AM +0100, Simon Kelley wrote:
    > On 22/05/2023 12:18, Linyih Teng wrote:
    > > In the manual page is written:
    > > > -X, --dhcp-lease-max=<number>
    > > >        Limits dnsmasq to the specified maximum number of DHCP
    > > >        leases. The default is 1000. This limit is to prevent DoS
    > > >        attacks from hosts which create thousands of leases and use
    > > >        lots of memory in the dnsmasq process.
    > >
    > > Hello,
    > >
    > > I'm using dnsmasq2.89 and testing the maximum lease count of the
    > > DHCPv6 server with the *dhcp-lease-max* option.
    > >
    > > For the testing, I'm using below configuration:
    > >
    > >     *dhcp-lease-max* = 512
    > >     *dhcp-range*=tag:pool0,2022::1,2022::1f:::fffe,64,120m
    > >     tag-if=set:pool0,tag:intfv0
    > >
    > >
    > > However, when the number of clients reaches the maximum number, the
    > > server still provides IPs to clients. Is this the expected behavior
    > > of DHCPv6?
    > >
    > There's a possible difference between the number of clients and the
    > number of DHCP leases, since leases can expire to be deleted by the
    > client.
    >
    > Are you saying that the number of simultaneous DHCP leases increases
    > without bound, or that the 513th client gets a lease? Have you checked
    > the number of leases in the dnsmasq.leases file?

    Original Poster has yet to say what the expected behaviour should be.

    Thing I am saying: Why limit dhcp-range by dhcp-lease-max?

    Regards
    Geert Stappers
    --     Silence is hard to parse

--
Petr Menšík
Software Engineer, RHEL
Red Hat, https://www.redhat.com/
PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB

