Thank you very much for the information below. Having that saved me a load of time.

The problem, as ever, is linked lists and

https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=930428fb970f4991e5c2933fd5a5d2504c18a551

fixes things for me.

To preempt the next question, I intend to make a 2.88 release fairly soon. I'm working through a backlog of patches from before 2.87, and once they are done in a week or so, 2.88 will go into the release sausage-grinder.

Cheers,

Simon.


On 16/10/2022 22:25, Christopher J. Madsen wrote:
I tried building dnsmasq 2.87 with a patch that reverts commit 553c4c99, and that does seem to fix the problem.

Using dbus-monitor (thanks, I hadn't been aware of that), I was able to create 2 dbus-send commands that reproduce the problem without having to set up a VPN or openresolv:

dbus-send --system --dest=uk.org.thekelleys.dnsmasq /uk/org/thekelleys/dnsmasq uk.org.thekelleys.SetDomainServers array:string:"/example.com/10.3.10.24","/example.com/10.3.10.26","/example.com/10.3.10.25","/example.org/10.3.10.24","/example.org/10.3.10.26","/example.org/10.3.10.25","/lan.example.net/192.168.1.1","/lan.example.net/fd00::1"

dbus-send --system --dest=uk.org.thekelleys.dnsmasq /uk/org/thekelleys/dnsmasq uk.org.thekelleys.SetDomainServers array:string:"/lan.example.net/192.168.1.1","/lan.example.net/fd00::1"

(Yes, I did use example domains when running the commands.  It breaks lookups for those domains, since those nameservers don't exist, but other domains still work fine.)
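
For anyone repeating this with their own domains, the array argument can be generated instead of typed by hand. This is only a sketch: build_domain_servers_arg is a made-up helper name, not part of dnsmasq or openresolv, and it assumes one "domain server" pair per input line, separated by whitespace.

```shell
#!/bin/sh
# Hypothetical helper (not part of dnsmasq or openresolv): reads
# "domain server" pairs from stdin, one pair per line, and prints the
# array:string:... argument in the same form as the dbus-send commands above.
build_domain_servers_arg() {
    awk '
        BEGIN   { printf "array:string:" }
        NF == 2 { printf "%s/%s/%s", sep, $1, $2; sep = "," }
        END     { print "" }
    '
}

# Example use (commented out so that sourcing this file changes nothing):
#   arg=$(printf '%s\n' 'lan.example.net 192.168.1.1' 'lan.example.net fd00::1' |
#         build_domain_servers_arg)
#   dbus-send --system --dest=uk.org.thekelleys.dnsmasq \
#       /uk/org/thekelleys/dnsmasq uk.org.thekelleys.SetDomainServers "$arg"
```

With the two lan.example.net pairs shown in the comment, the helper prints the same argument as the second dbus-send command above, minus the shell quoting.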

If I start dnsmasq 2.87 and watch the debug log, the first command just adds the domain-specific nameservers to the global ones, but the second command sets only domain-specific nameservers and removes the global ones.  The same commands on 2.86 (or the patched 2.87) work fine.

However, if I remove ',"/lan.example.net/fd00::1"' from the end of each dbus-send command, then I don't see the problem.  I'm not sure if it's the IPv6 address or the number of nameservers, but the problem only happens when lan.example.net has both IPv4 and IPv6 nameservers.

Hopefully, this will help you track down the issue.  Thanks for your help.

On 10/13/22 09:36, Simon Kelley wrote:
On 10/10/2022 00:21, Christopher J. Madsen wrote:
I have configured dnsmasq and openresolv as described in https://unix.stackexchange.com/a/575449/2421 so that the DNS servers provided by the VPN are only used for the domains on that network.

With dnsmasq 2.86 and openresolv 3.12.0 this was working great, but I was setting up a new computer the same way and discovered that DNS lookups broke when I disconnected from the VPN (causing resolvconf to remove the private DNS servers).  I soon realized that the new machine had gotten dnsmasq 2.87, which I hadn't yet upgraded to on the old machine (it had dnsmasq 2.86).

The symptom is that all DNS requests (except those for other machines on my LAN) are refused by dnsmasq:

     $ nslookup www.google.com
     Server:        ::1
     Address:    ::1#53

     ** server can't find www.google.com: REFUSED

Restarting dnsmasq fixes the problem until the next time I disconnect the VPN.

I installed dnsmasq 2.86 on the new machine and the problem went away. If I put 2.87 back, the problem also comes back.  It seems that something in 2.87 breaks with my setup.  BTW, openresolv 3.12.0 uses DBus to add/remove nameservers instead of editing the dnsmasq config files.

I turned on debug logging.  When I connect the VPN, I see this in the log:

Oct  9 16:40:15 dnsmasq[105349]: setting upstream servers from DBus
Oct  9 16:40:15 dnsmasq[105349]: using nameserver 192.168.1.1#53
Oct  9 16:40:15 dnsmasq[105349]: using nameserver fd...::1#53
Oct  9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.24#53 for domain example.com
Oct  9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.26#53 for domain example.com
Oct  9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.25#53 for domain example.com
Oct  9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.24#53 for domain example.org
Oct  9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.26#53 for domain example.org
Oct  9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.25#53 for domain example.org
Oct  9 16:40:15 dnsmasq[105349]: using nameserver 192.168.1.1#53 for domain lan.example.net
Oct  9 16:40:15 dnsmasq[105349]: using nameserver fd...::1#53 for domain lan.example.net
Oct  9 16:40:15 dnsmasq[105349]: read /etc/hosts - 0 addresses

I have redacted the IPv6 address, but it is exactly the same in all log entries.  I have also redacted the domains.  The VPN provides example.com and example.org, and lan.example.net is my LAN.  This part of the log looks exactly the same in 2.86 and 2.87; only the timestamps change.

Here is what dnsmasq 2.86 reports when I disconnect the VPN:

Oct  9 16:40:43 dnsmasq[105349]: setting upstream servers from DBus
Oct  9 16:40:43 dnsmasq[105349]: using nameserver 192.168.1.1#53
Oct  9 16:40:43 dnsmasq[105349]: using nameserver fd...::1#53
Oct  9 16:40:43 dnsmasq[105349]: using nameserver 192.168.1.1#53 for domain lan.example.net
Oct  9 16:40:43 dnsmasq[105349]: using nameserver fd...::1#53 for domain lan.example.net
Oct  9 16:40:43 dnsmasq[105349]: read /etc/hosts - 0 addresses

Here is what dnsmasq 2.87 reports when I disconnect the VPN:

Oct  9 16:46:21 dnsmasq[105730]: setting upstream servers from DBus
Oct  9 16:46:21 dnsmasq[105730]: using nameserver 192.168.1.1#53 for domain lan.example.net
Oct  9 16:46:21 dnsmasq[105730]: using nameserver fd...::1#53 for domain lan.example.net
Oct  9 16:46:21 dnsmasq[105730]: read /etc/hosts - 0 addresses
Oct  9 16:46:22 dnsmasq[105730]: query[A] ipv4only.arpa from ::1
Oct  9 16:46:22 dnsmasq[105730]: config error is REFUSED (EDE: not ready)

Notice that 2.87 does not show any "using nameserver" lines that don't also say "for domain".  As a result, I can only look up hosts under the lan.example.net domain.  Everything else is refused.
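
That check can also be done mechanically on a saved debug log. The following is a rough sketch, not anything dnsmasq ships: last_update_has_global_nameserver is a made-up name, and it assumes the log wording is exactly as in the excerpts above, with one entry per line.

```shell
#!/bin/sh
# Hypothetical check (assumes the log wording shown in the excerpts above).
# Reads a dnsmasq debug log on stdin and succeeds (exit status 0) only if the
# most recent "setting upstream servers from DBus" update was followed by at
# least one "using nameserver" line that is not restricted by "for domain".
last_update_has_global_nameserver() {
    awk '
        /setting upstream servers from DBus/ { seen = 0 }   # new update: start over
        /using nameserver/ && !/for domain/  { seen = 1 }   # a global nameserver
        END { exit !seen }
    '
}

# Example use (dnsmasq.log is a placeholder file name):
#   if last_update_has_global_nameserver < dnsmasq.log; then
#       echo "global nameservers present"
#   else
#       echo "only domain-specific nameservers left"
#   fi
```

Fed the 2.86 disconnect log above it should succeed, and fed the 2.87 disconnect log it should fail, since that one contains only "for domain" entries.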

I don't know how to see the DBus messages that openresolv is sending to dnsmasq, but I would assume they're the same in both cases.  The only thing that changed is the version of dnsmasq. But for whatever reason, dnsmasq 2.87 isn't setting up generic nameservers when the VPN disconnects, but 2.86 is.



I've stared at this for a while, but not found an obvious problem yet. An obvious commit in 2.87 that should be looked at is

https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=553c4c99cca173e9964d0edbd0676ed96c30f62b

Maybe the massive confusion is not as resolved as we thought. If you can build a test binary which reverts that change, and see if it fixes things, that would be very useful.

Another useful bit of data would be to see the DBus messages being sent by openresolv. dbus-monitor should enable you to get that.


Cheers,

Simon.


_______________________________________________
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss
