Thank you very much for the information below. Having that saved me a
load of time.
The problem, as ever, is linked lists and
https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=930428fb970f4991e5c2933fd5a5d2504c18a551
fixes things for me.
To preempt the next question, I intend to make a 2.88 release fairly
soon. I'm working through a backlog of patches from before 2.87, and
once they are done in a week or so, 2.88 will go into the release
sausage-grinder.
Cheers,
Simon.
On 16/10/2022 22:25, Christopher J. Madsen wrote:
I tried building dnsmasq 2.87 with a patch that reverts commit 553c4c99,
and that does seem to fix the problem.
Using dbus-monitor (thanks, I hadn't been aware of that), I was able to
create 2 dbus-send commands that reproduce the problem without having to
set up a VPN or openresolv:
dbus-send --system --dest=uk.org.thekelleys.dnsmasq \
  /uk/org/thekelleys/dnsmasq uk.org.thekelleys.SetDomainServers \
  array:string:"/example.com/10.3.10.24","/example.com/10.3.10.26","/example.com/10.3.10.25","/example.org/10.3.10.24","/example.org/10.3.10.26","/example.org/10.3.10.25","/lan.example.net/192.168.1.1","/lan.example.net/fd00::1"

dbus-send --system --dest=uk.org.thekelleys.dnsmasq \
  /uk/org/thekelleys/dnsmasq uk.org.thekelleys.SetDomainServers \
  array:string:"/lan.example.net/192.168.1.1","/lan.example.net/fd00::1"
(Yes, I did use example domains when running the commands. It breaks
lookups for those domains, since those nameservers don't exist, but
other domains still work fine.)
If I start dnsmasq 2.87 and watch the debug log, the first command just
adds the domain-specific nameservers to the global ones, but the second
command sets only domain-specific nameservers and removes the global
ones. The same commands on 2.86 (or the patched 2.87) work fine.
However, if I remove ',"/lan.example.net/fd00::1"' from the end of each
dbus-send command, then I don't see the problem. I'm not sure if it's
the IPv6 address or the number of nameservers, but the problem only
happens when lan.example.net has both IPv4 and IPv6 nameservers.
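For anyone who wants to try variations of the reproduction (for example, dropping the IPv6 server), here is a minimal Python sketch that builds the same dbus-send argument lists. The helper name set_domain_servers_argv and the variable names are hypothetical; the destination, object path, and method string are copied from the commands above. It only constructs and prints the commands; it does not send anything.

```python
import shlex

# Values copied from the dbus-send commands in this message.
DEST = "uk.org.thekelleys.dnsmasq"
OBJECT_PATH = "/uk/org/thekelleys/dnsmasq"
METHOD = "uk.org.thekelleys.SetDomainServers"


def set_domain_servers_argv(servers):
    """Build a dbus-send argv from (domain, address) pairs (hypothetical helper)."""
    entries = ",".join(f"/{domain}/{addr}" for domain, addr in servers)
    return [
        "dbus-send", "--system", f"--dest={DEST}",
        OBJECT_PATH, METHOD, f"array:string:{entries}",
    ]


# Domain-specific servers supplied by the VPN, and the LAN servers.
VPN = [(d, a) for d in ("example.com", "example.org")
       for a in ("10.3.10.24", "10.3.10.26", "10.3.10.25")]
LAN = [("lan.example.net", "192.168.1.1"), ("lan.example.net", "fd00::1")]

connect = set_domain_servers_argv(VPN + LAN)           # first command
disconnect = set_domain_servers_argv(LAN)              # second command
ipv4_only_disconnect = set_domain_servers_argv(LAN[:1])  # variant without fd00::1

if __name__ == "__main__":
    for argv in (connect, disconnect):
        print(shlex.join(argv))
```

Passing one of the lists to subprocess.run would send the message; that step is left out so the sketch can be run safely on a machine without dnsmasq.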
Hopefully, this will help you track down the issue. Thanks for your help.
On 10/13/22 09:36, Simon Kelley wrote:
On 10/10/2022 00:21, Christopher J. Madsen wrote:
I have configured dnsmasq and openresolv as described in
https://unix.stackexchange.com/a/575449/2421 so that the DNS servers
provided by the VPN are only used for the domains on that network.
With dnsmasq 2.86 and openresolv 3.12.0 this was working great, but I
was setting up a new computer the same way and discovered that DNS
lookups broke when I disconnected from the VPN (causing resolvconf to
remove the private DNS servers). I soon realized that the new
machine had gotten dnsmasq 2.87, while the old machine was still
running dnsmasq 2.86.
The symptom is that all DNS requests (except those for other machines
on my LAN) are refused by dnsmasq:
$ nslookup www.google.com
Server: ::1
Address: ::1#53
** server can't find www.google.com: REFUSED
Restarting dnsmasq fixes the problem until the next time I disconnect
the VPN.
I installed dnsmasq 2.86 on the new machine and the problem went
away. If I put 2.87 back, the problem also comes back. It seems that
something in 2.87 breaks with my setup. BTW, openresolv 3.12.0 uses
DBus to add/remove nameservers instead of editing the dnsmasq config
files.
I turned on debug logging. When I connect the VPN, I see this in the
log:
Oct 9 16:40:15 dnsmasq[105349]: setting upstream servers from DBus
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 192.168.1.1#53
Oct 9 16:40:15 dnsmasq[105349]: using nameserver fd...::1#53
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.24#53 for domain example.com
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.26#53 for domain example.com
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.25#53 for domain example.com
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.24#53 for domain example.org
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.26#53 for domain example.org
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.25#53 for domain example.org
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 192.168.1.1#53 for domain lan.example.net
Oct 9 16:40:15 dnsmasq[105349]: using nameserver fd...::1#53 for domain lan.example.net
Oct 9 16:40:15 dnsmasq[105349]: read /etc/hosts - 0 addresses
I have redacted the IPv6 address, but it is exactly the same in all
log entries. I have also redacted the domains. The VPN provides
example.com and example.org, and lan.example.net is my LAN. This
part of the log looks exactly the same in 2.86 and 2.87; only the
timestamps change.
Here is what dnsmasq 2.86 reports when I disconnect the VPN:
Oct 9 16:40:43 dnsmasq[105349]: setting upstream servers from DBus
Oct 9 16:40:43 dnsmasq[105349]: using nameserver 192.168.1.1#53
Oct 9 16:40:43 dnsmasq[105349]: using nameserver fd...::1#53
Oct 9 16:40:43 dnsmasq[105349]: using nameserver 192.168.1.1#53 for domain lan.example.net
Oct 9 16:40:43 dnsmasq[105349]: using nameserver fd...::1#53 for domain lan.example.net
Oct 9 16:40:43 dnsmasq[105349]: read /etc/hosts - 0 addresses
Here is what dnsmasq 2.87 reports when I disconnect the VPN:
Oct 9 16:46:21 dnsmasq[105730]: setting upstream servers from DBus
Oct 9 16:46:21 dnsmasq[105730]: using nameserver 192.168.1.1#53 for domain lan.example.net
Oct 9 16:46:21 dnsmasq[105730]: using nameserver fd...::1#53 for domain lan.example.net
Oct 9 16:46:21 dnsmasq[105730]: read /etc/hosts - 0 addresses
Oct 9 16:46:22 dnsmasq[105730]: query[A] ipv4only.arpa from ::1
Oct 9 16:46:22 dnsmasq[105730]: config error is REFUSED (EDE: not ready)
Notice that 2.87 does not show any "using nameserver" lines that
don't also say "for domain". As a result, I can only look up hosts
under the lan.example.net domain. Everything else is refused.
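The difference between the two disconnect logs can also be checked mechanically. The following is a rough Python sketch, not part of dnsmasq or openresolv, that splits "using nameserver" log entries into global and domain-specific servers. The function name split_servers and the regular expression are my own, and it assumes each log entry sits on a single line (wrapped lines joined first). The sample lines are the disconnect logs from this message, with the IPv6 address still redacted.

```python
import re

# Matches "using nameserver ADDR#PORT" with an optional "for domain NAME" suffix.
NAMESERVER = re.compile(
    r"using nameserver (?P<server>\S+?)#\d+(?: for domain (?P<domain>\S+))?$"
)


def split_servers(log_lines):
    """Return (global_servers, per_domain) parsed from dnsmasq debug log lines."""
    global_servers, per_domain = [], {}
    for line in log_lines:
        match = NAMESERVER.search(line.strip())
        if not match:
            continue  # not a "using nameserver" entry
        if match["domain"]:
            per_domain.setdefault(match["domain"], []).append(match["server"])
        else:
            global_servers.append(match["server"])
    return global_servers, per_domain


LOG_286 = [
    "Oct 9 16:40:43 dnsmasq[105349]: setting upstream servers from DBus",
    "Oct 9 16:40:43 dnsmasq[105349]: using nameserver 192.168.1.1#53",
    "Oct 9 16:40:43 dnsmasq[105349]: using nameserver fd...::1#53",
    "Oct 9 16:40:43 dnsmasq[105349]: using nameserver 192.168.1.1#53 for domain lan.example.net",
    "Oct 9 16:40:43 dnsmasq[105349]: using nameserver fd...::1#53 for domain lan.example.net",
]
LOG_287 = [
    "Oct 9 16:46:21 dnsmasq[105730]: setting upstream servers from DBus",
    "Oct 9 16:46:21 dnsmasq[105730]: using nameserver 192.168.1.1#53 for domain lan.example.net",
    "Oct 9 16:46:21 dnsmasq[105730]: using nameserver fd...::1#53 for domain lan.example.net",
]

if __name__ == "__main__":
    for name, log in (("2.86", LOG_286), ("2.87", LOG_287)):
        global_servers, per_domain = split_servers(log)
        print(name, "global:", global_servers, "per-domain:", per_domain)
```

An empty global list together with a non-empty per-domain mapping corresponds to the failing state described above.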
I don't know how to see the DBus messages that openresolv is sending
to dnsmasq, but I would assume they're the same in both cases. The
only thing that changed is the version of dnsmasq. But for whatever
reason, dnsmasq 2.87 isn't setting up generic nameservers when the
VPN disconnects, but 2.86 is.
I've stared at this for a while, but not found an obvious problem
yet. An obvious commit on 2.87 that should be looked at is
https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=553c4c99cca173e9964d0edbd0676ed96c30f62b
Maybe the massive confusion is not as resolved as we thought. If you
can build a test binary which reverts that change and see if it fixes
things, that would be very useful.
Another useful bit of data would be to see the DBUS messages being
sent by openresolv. dbus-monitor should enable you to get that.
Cheers,
Simon.
_______________________________________________
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss