On Fri 05 May 2023 15:17:37 +0000, Jens Meißner wrote:
> dnsmasq on bookworm fails to start after installation because the dns port 53 
> is already is use by systemd-resolved.
> After stopping systemd-resolved dnsmasq will start but refuses all dns 
> queries with the Extended DNS Error Code 14 "Not Ready".
> This error is reproducible on new installation.

First of all, this should block dnsmasq.service (binary package "dnsmasq"), but
it should NOT block /usr/sbin/dnsmasq (binary package "dnsmasq-base").
The latter is needed by things like libvirtd and network-manager!



Here is how I solved this on my Debian 11 router:

  1. in /etc/dnsmasq.d/cyber-kludges.conf

      # Don't fight nsd and systemd-resolved for control over ports.
      # Also don't shit yourself at boot time if dnsmasq starts before the ifaces are up.
      # The combination of options is a little confusing.
      # "--ignore-address=203.7.155.4" does something COMPLETELY unrelated, so
      # instead we need to whitelist the OTHER dmz address (203.7.155.1).
      # Then we need to whitelist the other ifaces, else it would bind ONLY to 203.7.155.1.
      bind-dynamic
      interface=lo
      interface=byod
      interface=lan
      listen-address=203.7.155.1
      no-dhcp-interface=dmz
      listen-address=10.194.71.1
      no-dhcp-interface=vpn
      except-interface=internet


      # Proxy DNSv4/DNSv6 from the internet.
      # HARD CODE the upstream servers.
      # We SHOULD get them dynamically from systemd-networkd (the DHCPv4/RA/DHCPv6 client on the "internet" interface).
      # However, that requires third-party software like https://gitlab.com/craftyguy/networkd-dispatcher
      # Since Aussie Broadband rarely (if ever) change these, hard-coding them is Good Enough(TM).
      # I considered also/instead adding the Cloudflare and/or Google anycast DNS servers from here:
      #     https://github.com/systemd/systemd/blob/main/docs/DISTRO_PORTING.md
      # ...but those DNS servers will direct us to more distant hosts.
      # For example, "deb.debian.org" is
      #  8ms away using the address from AB or CF, but
      # 22ms away using the address from Google.
      no-resolv
      all-servers
      cache-size=8192
      server=202.142.142.142
      server=202.142.142.242
      server=2403:5800:100:1::142
      server=2403:5800:1:5::242

      # THIS BIT IS ONLY NEEDED BECAUSE I *ALSO* RUN NSD.
      # IT IS NOT NEEDED FOR systemd-resolved + dnsmasq.
      # Don't go out to the internet and back in, for our own domains.
      # This also means e.g. "logserv" still works when the internet is down.
      server=/cyber.com.au/155.7.203.in-addr.arpa/203.7.155.4

  2. in /etc/systemd/network/00-dmz.network, tell systemd-networkd (and thus resolved) about dnsmasq

      [Match]
      Name=dmz
      [Link]
      RequiredForOnline=no
      [Network]
      Domains=cyber.com.au
      Address=203.7.155.1/26
      Address=203.7.155.4/26
      Address=203.7.155.49/26
      # THESE NEXT TWO LINES ARE THE RELEVANT ONES FOR 1035568
      Domains=cyber.com.au ~155.7.203.in-addr.arpa
      DNS=203.7.155.1

  3. install libnss-resolve and make this link

      lrwxrwxrwx 1 root root 24 Feb 24  2021 /etc/resolv.conf -> /lib/systemd/resolv.conf
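
A rough way to check that steps 2 and 3 took effect. This is a sketch only: the function name check_resolved_link is made up for illustration, and it assumes resolvectl (systemd 239 or later) is installed.

```shell
# Print what resolved knows about one link, then where /etc/resolv.conf
# really points.  Defaults to the "dmz" link from step 2.
check_resolved_link() {
    link="${1:-dmz}"
    resolvectl dns "$link"        # DNS servers resolved uses for this link
    resolvectl domain "$link"     # search/routing domains for this link
    readlink -f /etc/resolv.conf  # target of the symlink from step 3
}
```

Call it as "check_resolved_link dmz". On a setup like the one above, the first two lines should mention 203.7.155.1 and cyber.com.au respectively.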

In other words, what I have is:

  a. local nss users go

       libnss_resolve
       -> resolved     (via socket)
          -> dnsmasq on 203.7.155.1 (for cyber.com.au and 155.7.203.in-addr.arpa)
          -> whatever systemd-networkd got from upstream DHCP/DHCPv6 (for every other domain)

  b. local /etc/resolv.conf users go

       -> resolved on 127.0.0.53 (via UDP)
       [rest as above]
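
To see whether paths (a) and (b) really agree, something like the following can help. It is a minimal sketch: compare_paths is a hypothetical helper name, dig is assumed to be installed (bind9-dnsutils on Debian), and the two addresses are the ones from this particular router.

```shell
# Ask each resolution path for the first IPv4 address of one name and
# print the answers side by side.
compare_paths() {
    name="$1"
    # (a) glibc NSS path: libnss_resolve -> resolved
    printf 'nss:     %s\n' "$(getent ahostsv4 "$name" | awk 'NR==1 {print $1}')"
    # (b) resolv.conf path: resolved's stub listener over UDP
    printf 'stub:    %s\n' "$(dig +short @127.0.0.53 "$name" A | head -n1)"
    # dnsmasq asked directly, bypassing resolved
    printf 'dnsmasq: %s\n' "$(dig +short @203.7.155.1 "$name" A | head -n1)"
}
```

For example "compare_paths logserv.cyber.com.au". For names outside cyber.com.au the dnsmasq line can legitimately differ, because dnsmasq and resolved forward to different upstream servers.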

Because of the quirky way this has to be coded in dnsmasq,
there is no good way to write a general default dnsmasq.conf that hooks it up this way.

The other potential way to hook this up is to simply tell resolved not to listen on 127.0.0.53:53 (DNSStubListener=no in /etc/systemd/resolved.conf).
HOWEVER, name resolution is then different for glibc (nss) versus everyone else, because

  a. local nss users go

     libnss_resolve
     -> resolved (via socket)
     -> whatever systemd-networkd got from upstream DHCP/DHCPv6 (for all domains)

     ...and so NEVER see RRs in dnsmasq.

  b. local resolv.conf users cannot go to resolved, because it now only listens on an AF_UNIX socket, not on a UDP socket (AF_INET + SOCK_DGRAM).

     So resolv.conf either points directly upstream (typical legacy setup in dhclient) and bypasses BOTH dnsmasq and resolved; or
     it's set to 127.0.0.1 (i.e. dnsmasq) and bypasses resolved.

     Note that networkd has NO WAY to tell dnsmasq what DNS server(s) are supplied by upstream .network files / DHCP responses.
     networkd can only tell resolved that (I last checked back in v247).


PS: I have also seen deeply inconsistent results when there are unqualified names in /etc/hosts (e.g. "10.1.2.3 alice"),
    because libnss_files.so, dnsmasq, and resolved treat those differently.
    In essence, libnss_files.so is a third "path" on top of (a) and (b) above.

    My solution was to stop using /etc/hosts and
    instead use /etc/hosts.dnsmasq-only (dnsmasq --addn-hosts=), then
    make all name resolution paths pass through *AT LEAST* dnsmasq.
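
    To find the entries that are likely to cause this, a small helper along these lines can list the unqualified names in a hosts file. It is only a sketch: unqualified_hosts is a made-up name, and it simply treats any name without a dot as unqualified, so "localhost" shows up too.

```shell
# List "address name" pairs from a hosts(5)-style file where the name
# contains no dot.  Defaults to /etc/hosts.
unqualified_hosts() {
    awk '
        { sub(/#.*/, "") }              # strip comments; fields are re-split
        NF >= 2 {
            for (i = 2; i <= NF; i++)   # field 1 is the address
                if ($i !~ /\./) print $1, $i
        }
    ' "${1:-/etc/hosts}"
}
```

    Run "unqualified_hosts" against /etc/hosts before moving entries to /etc/hosts.dnsmasq-only, or pass another file as the first argument.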

    This is part of why the knee-jerk answer of "FFS, just patch IP_FREEBIND into dnsmasq" probably isn't a comprehensive fix.


PPS: for the record, here is the "ip -c -4 a" of the host the above config comes from.
     It only has legacy IP at the ISP, so I haven't even considered solving this for IPv6 :-(

       bash5$ ip -c -4 a
       1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
           inet 127.0.0.1/8 scope host lo
              valid_lft forever preferred_lft forever
       2: byod: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
           inet 203.7.155.65/26 brd 203.7.155.127 scope global byod
              valid_lft forever preferred_lft forever
           inet 203.7.155.193/26 brd 203.7.155.255 scope global byod
              valid_lft forever preferred_lft forever
       4: lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
           inet 203.7.155.129/26 brd 203.7.155.191 scope global lan
              valid_lft forever preferred_lft forever
       6: internet: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
           inet 119.17.136.37/22 brd 119.17.139.255 scope global dynamic internet
              valid_lft 1066sec preferred_lft 1066sec
       7: dmz: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
           inet 203.7.155.1/26 brd 203.7.155.63 scope global dmz
              valid_lft forever preferred_lft forever
           inet 203.7.155.4/26 brd 203.7.155.63 scope global secondary dmz
              valid_lft forever preferred_lft forever
           inet 203.7.155.49/26 brd 203.7.155.63 scope global secondary dmz
              valid_lft forever preferred_lft forever
       8: vpn: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
           inet 10.194.71.1/24 brd 10.194.71.255 scope global vpn
              valid_lft forever preferred_lft forever
