Re: [Dnsmasq-discuss] REFUSED PTR queries without recursion desired

2019-07-10 Thread Petr Mensik
Hi Chiang,

I discovered the same issue and even posted a patch on 2019-04-12 [1].
Queries without the RD flag are always forwarded to the "upstream" server,
not answered locally. The REFUSED usually comes from the server dnsmasq
points to; dnsmasq is just passing it on to you. It should be fixed, but
there has been no reply to the patch yet.
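
To make the distinction concrete: the two kinds of query differ only in the RD bit (0x0100) of the flags word in the DNS header. The following is a minimal sketch in Python, standard library only; build_query and encode_name are hypothetical helper names, not part of dnsmasq or of any DNS library.

```python
import struct

RD_FLAG = 0x0100  # "recursion desired" bit in the DNS header flags word (RFC 1035)

def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(name, qtype, recursion_desired, query_id=0x1234):
    """Build a one-question DNS query; qtype 1 is A, 12 is PTR."""
    flags = RD_FLAG if recursion_desired else 0
    # Header: id, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    return header + question

with_rd = build_query("4.3.2.10.in-addr.arpa", 12, True)
without_rd = build_query("4.3.2.10.in-addr.arpa", 12, False)
```

Sending both packets to the same dnsmasq instance over UDP and comparing the replies is roughly what dig +rec and dig +norec do.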

I think it should work on an authoritative interface, but that has to be
a different interface from the one used for the normal DNS cache.

[1] http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2019q2/013013.html
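
For the PTR case in the report quoted below, it may help to see that the reverse name never contains the auth-zone domain, even though the address itself lies inside the subnet given to auth-zone. A minimal sketch using Python's ipaddress module; it only checks names and addresses from the report and says nothing about dnsmasq's actual code path.

```python
import ipaddress

host = ipaddress.ip_address("10.2.3.4")            # address from host-record
zone_subnet = ipaddress.ip_network("10.0.0.0/8")   # subnet from auth-zone

ptr_name = host.reverse_pointer                    # name a PTR query asks for
in_zone_subnet = host in zone_subnet               # address is inside the subnet
ends_with_zone = ptr_name.endswith("example.com")  # but the name is not in the zone
```

So any code that selects the authoritative path by comparing the query name against example.com alone would not match the PTR query; it would have to map the reverse name back to the subnet.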

On 7/9/19 12:24 PM, Chiang Fong Lee wrote:
> Hello,
> 
> I’m having some trouble getting dnsmasq to respond to PTR queries without 
> recursion desired, even when authoritative mode is enabled.
> 
> Given the following config:
> domain-needed
> bogus-priv
> no-resolv
> no-hosts
> port=10053
> server=/example.com/
> log-queries
> host-record=host1.example.com,10.2.3.4
> 
> Observed results:
> Query host1.example.com A (with recursion) - NOERROR, returns answer
> Query host1.example.com A (without recursion) - REFUSED
> Query 4.3.2.10.in-addr.arpa PTR (with recursion) - NOERROR, returns answer
> Query 4.3.2.10.in-addr.arpa PTR (without recursion) - REFUSED
> 
> Given the above config, plus the following two lines to enable authoritative 
> mode:
> auth-server=ns1.example.com
> auth-zone=example.com,10.0.0.0/8
> 
> Observed results:
> Query host1.example.com A (with recursion) - NOERROR, returns answer
> Query host1.example.com A (without recursion) - NOERROR, returns answer
> Query 4.3.2.10.in-addr.arpa PTR (with recursion) - NOERROR, returns answer
> Query 4.3.2.10.in-addr.arpa PTR (without recursion) - REFUSED
> 
> Expected results:
> Enabling auth mode for the zone, and specifying the subnet, would result in 
> the last PTR query being accepted instead of refused.
> 
> The log lines seen when the REFUSED occurs are:
> dnsmasq_1  | Jul  9 09:42:06 dnsmasq[1]: query[PTR] 4.3.2.10.in-addr.arpa from 172.19.0.1
> dnsmasq_1  | Jul  9 09:42:06 dnsmasq[1]: config error is REFUSED
> 
> Version info:
> Dnsmasq version 2.80  Copyright (c) 2000-2018 Simon Kelley
> Compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 
> no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify dumpfile
> 
> I was looking through the source and I’m guessing that PTR queries don’t ever 
> trigger the auth zone path, since the query ends in “in-addr.arpa” instead of 
> the auth-zone domain like “example.com”. Once it reaches the regular 
> answer_request path, it immediately returns since the RD flag is not set, 
> without checking host-records, and proceeds to forward the query instead.
> 
> Is this intended behaviour? The 2.79 CHANGELOG states that this 
> always-SERVFAIL (or forward, in 2.80) behaviour for queries without recursion 
> desired should always happen “UNLESS acting as an authoritative DNS server”, 
> without a caveat that it only works for non-reverse DNS queries.
> 
> Thanks,
> Chiang Fong
> 
> 
> ___
> Dnsmasq-discuss mailing list
> Dnsmasq-discuss@lists.thekelleys.org.uk
> http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss
> 

-- 
Petr Menšík
Software Engineer
Red Hat, http://www.redhat.com/
email: pemen...@redhat.com  PGP: 65C6C973


Re: [Dnsmasq-discuss] NXDOMAIN on exisiting A record

2019-07-10 Thread Petr Mensik
Hello Alex,

I would try removing the all-servers and clear-on-reload statements. I
would use just one server for testing, then retest each of the others for
the same behaviour. When you do not know which server is used, it is hard
to debug.

I think the leading dots in server=/.X/ are not necessary and may even be
misleading. Try it without them, just server=/X/ip

I think a one-second timeout is too short. Use only localhost in
/etc/resolv.conf and debug what happens with dnsmasq. Record which
queries are sent to dnsmasq and what dnsmasq forwards to the configured
servers.

Note that I have already discovered that requests without the recursion
desired bit set are always forwarded and are never answered from local
records. That should not be the issue here, but try dig +rec and
dig +norec to rule it out.
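
When comparing dig or tcpdump output for the two cases, the interesting fields are all in the 12-byte DNS header. A minimal sketch in Python (standard library only) that decodes them; parse_header is a hypothetical helper, and the sample headers are made up for illustration rather than captured from dnsmasq.

```python
import struct

# Response codes from RFC 1035 that appear in this thread.
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}

def parse_header(packet):
    """Decode the fixed 12-byte DNS header at the start of a packet."""
    qid, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!HHHHHH", packet[:12]
    )
    rcode = flags & 0x000F
    return {
        "id": qid,
        "is_response": bool(flags & 0x8000),  # QR bit
        "rd": bool(flags & 0x0100),           # recursion desired
        "ra": bool(flags & 0x0080),           # recursion available
        "rcode": RCODES.get(rcode, str(rcode)),
        "answers": ancount,
    }

# Made-up headers: a reply to a recursive query that got NXDOMAIN,
# and a reply to a non-recursive query that got REFUSED.
nxdomain = parse_header(struct.pack("!HHHHHH", 0x1234, 0x8183, 1, 0, 0, 0))
refused = parse_header(struct.pack("!HHHHHH", 0x1235, 0x8005, 1, 0, 0, 0))
```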

Regards,
Petr

On 7/7/19 10:28 PM, Alex Litvak wrote:
> (lack of sleep, fixing some mistakes in text)
> 
> Hello everyone,
> 
> I run consul services on my network where services are registered with
> .service.consul when they start.  All containers and bare metal
> hosts are running dnsmasq 2.80.
> I noticed that if I restart one of the containers, one of the hosts
> continues failing to resolve the service name.  I assume that dnsmasq is
> the culprit because:
> 
> 1. I can resolve service xyz.service.consul against standard dns servers
> with dig.
> 2. Dnsmasq listening on 127.0.0.1 is the first line in resolv.conf
> and when I run tcpdump against port 53 on interface lo I see it returns
> NXDOMAIN on each A record query for service in question.
> 3. If I restart dnsmasq everything is back to normal again.  Even more
> weird, if I send SIGHUP to dnsmasq, which only causes a reread of
> /etc/hosts file, everything is back to normal as far as service
> resolution goes.
> 
> This problem happens only on some hosts, without any pattern I can
> recognize.  For example, I have two nodes with the same config, os,
> kernel version, dnsmasq version, etc ... and one of them has the problem
> 100% of the time after service xyz.service.consul restarts and the other
> does not.
> 
> Where do I start troubleshooting? Any ideas are welcome.
> 
> Here is a standard dnsmasq configuration.
> 
> port=53
> domain-needed
> bogus-priv
> interface=lo
> listen-address=127.0.0.1
> no-dhcp-interface=127.0.0.1
> #bind-interfaces
> no-resolv
> all-servers
> dns-forward-max=500
> 
> # If you don't want dnsmasq to read /etc/hosts, uncomment the
> # following line.
> #no-hosts
> # or if you want it to read another file, as well as /etc/hosts, use
> # this.
> #addn-hosts=/etc/banner_add_hosts
> 
> #log-queries=extra
> #log-facility=/var/log/dnsmasq.log
> log-async=25
> 
> # Set the cachesize here.
> cache-size=1
> min-cache-ttl=5
> #neg-ttl=3600
> 
> # If you want to disable negative caching, uncomment this.
> #no-negcache
> 
> # For debugging purposes, log each DNS query as it passes through
> # dnsmasq.
> #log-queries
> clear-on-reload
> 
> server=10.0.48.12
> server=10.0.48.11
> server=10.0.21.63
> server=10.0.21.61
> 
> server=/.la.consul/10.0.73.43
> server=/.la.consul/10.0.73.40
> server=/.la.consul/10.0.73.28
> server=/.chi-pbx.consul/10.1.73.1
> server=/.chi-pbx.consul/10.1.73.2
> server=/.chi-pbx.consul/10.1.73.3
> server=/.consul/10.0.73.43
> server=/.consul/10.0.73.40
> server=/.consul/10.0.73.28
> 
> Resolver config
> 
> search ''
> options  timeout:1 attempts:1
> nameserver 127.0.0.1
> nameserver 10.0.48.11
> nameserver 10.0.48.12
> nameserver 10.0.21.63



Re: [Dnsmasq-discuss] [PATCH] Issues with TCP queries on recreated interfaces.

2019-07-10 Thread Petr Mensik
Hi Vladislav

On 7/9/19 10:00 PM, Vladislav Grishenko wrote:
> Hi Petr,
> 
> Regarding 0002-Compare-address-and-interface-index-for-allowed-inte.patch, 
> does it support case with different valid interfaces with the same address?
> For example:
>   eth0 192.168.1.1/24
>   tun0 192.168.1.1/16 (created/destroyed dynamically)

I have not tested this specific case, but I think it should be handled
correctly, unlike in the previous code. Because it now also compares the
interface index, it will mark an existing entry as found only if the
interface index also matches. If it does not, a new entry is created with
the correct index instead. So it should work: unlike the previous code, it
should keep both interface addresses stored separately.

If tun0 is often destroyed and recreated, the number of interface records
might grow. That is the reason for patch #3, which removes dropped
interfaces after creating new ones.
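
The matching and cleanup described above can be modelled in a few lines. This is a simplified Python model, not dnsmasq's C code: IfaceRecord, note_interface and enumerate_once are hypothetical names, the interface indexes are invented for the example, and the model ignores the separate "done" flag that the real patch also checks.

```python
from dataclasses import dataclass

@dataclass
class IfaceRecord:
    name: str
    index: int
    address: str
    found: bool = False

def note_interface(records, name, index, address):
    """Mark a record as found only if address AND index match; else add one."""
    for rec in records:
        if rec.address == address and rec.index == index:
            rec.found = True
            return rec
    rec = IfaceRecord(name, index, address, found=True)
    records.append(rec)
    return rec

def enumerate_once(records, current):
    """One enumeration pass: reset marks, note live interfaces, drop the rest."""
    for rec in records:
        rec.found = False
    for name, index, address in current:
        note_interface(records, name, index, address)
    records[:] = [rec for rec in records if rec.found]

records = []
# eth0 and tun0 share an address; the indexes are invented for illustration.
enumerate_once(records, [("eth0", 2, "192.168.1.1"), ("tun0", 5, "192.168.1.1")])
# tun0 is destroyed and recreated, so it comes back with a new index.
enumerate_once(records, [("eth0", 2, "192.168.1.1"), ("tun0", 6, "192.168.1.1")])
live = [(rec.name, rec.index) for rec in records]
```

In this model the stale tun0 record (index 5) is dropped in the second pass and both live interfaces keep separate records. It does not model listener sockets at all.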
> 
> Regarding appearance, seems newly added code doesn’t fully follow dnsmasq 
> code style in several places:
> * indentation (should be indent == 2 spaces, 8 spaces == \t)
> * brackets on the same code lines
OK, I forgot to follow the style in the 3rd patch. Attached is a version
with fixed formatting and without the debug log on interface removal.
> * function args on the next line are not aligned with the first argument
> * prettyprint_addr() result is forcibly ignored with (void) unlike other 
> places
I think it is better to state explicitly that the return value is not used.
> 
> Best Regards, Vladislav Grishenko
> 
> -Original Message-
> From: Dnsmasq-discuss  On 
> Behalf Of Petr Mensik
> Sent: Tuesday, July 9, 2019 5:31 PM
> To: dnsmasq-discuss@lists.thekelleys.org.uk
> Subject: [Dnsmasq-discuss] [PATCH] Issues with TCP queries on recreated 
> interfaces.
> 
> Hello Simon and others,
> 
> we have discovered issues with TCP DNS queries in dnsmasq when running in
> bind-dynamic or bind-interfaces mode. dnsmasq scans for new interfaces
> automatically, or, in the second case, does so on a new query. However,
> because of a speedup that compares only IP addresses in the iface_allowed
> function, it never picks up the updated index of an interface.
> 
> When a named interface is destroyed and created again, TCP queries on that
> interface are dropped. They are checked against the incoming interface
> number; if that number is not found in the interfaces list, the query is
> denied.
> 
> Luckily, there was a bug in the check that hid this problem in the usual
> configuration. If an IPv6 address is enabled on the new device, a new iface
> entry is created, because the scope_id of its sockaddr_in6 does not match
> the previous one. That makes even IPv4 queries succeed.
> 
> Bug on bugzilla [1] is partly private.
> 
> I propose three changes. The first is just a helper to log what happens
> with listeners in a bind-dynamic configuration.
> 
> The second is the most important: create a new interface entry every time
> the index changes, and also test the address family of an incoming TCP
> query when checking allowed clients.
> 
> The third is cleanup of unused interfaces. On some virtual machine hosts,
> interfaces may often be created and destroyed. That might have a negative
> effect on walking through the interfaces list. I think listeners should
> also be garbage collected in a bind-interfaces configuration. But for now,
> release memory for unused interfaces at least for bind-dynamic.
> 
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1721668
> 

From 46a77df93b9e5b04f84a031aede0954c0641fe10 Mon Sep 17 00:00:00 2001
From: Petr Mensik 
Date: Tue, 9 Jul 2019 14:05:59 +0200
Subject: [PATCH 3/3] Cleanup interfaces no longer available

Clean addresses and interfaces not found after enumerate. Free unused
records to speed up checking active interfaces and reduce used memory.
---
 src/network.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/src/network.c b/src/network.c
index f487617..44bb757 100644
--- a/src/network.c
+++ b/src/network.c
@@ -533,7 +533,30 @@ static int iface_allowed_v4(struct in_addr local, int if_index, char *label,
 
   return iface_allowed((struct iface_param *)vparam, if_index, label, &addr, netmask, prefix, 0);
 }
-   
+
+/*
+ * Clean old interfaces no longer found.
+ */
+static void clean_interfaces()
+{
+  struct irec *iface;
+  struct irec **up = &daemon->interfaces;
+
+  for (iface = *up; iface; iface = *up)
+  {
+    if (!iface->found && !iface->done)
+      {
+        *up = iface->next;
+        free(iface->name);
+        free(iface);
+      }
+    else
+      {
+        up = &iface->next;
+      }
+  }
+}
+
 int enumerate_interfaces(int reset)
 {
   static struct addrlist *spare = NULL;
@@ -631,6 +654,7 @@ int enumerate_interfaces(int reset)
 	 in OPT_CLEVERBIND mode, that at listener will just disappear after
 	 a call to enumerate_interfaces, this is checke

Re: [Dnsmasq-discuss] [PATCH] Issues with TCP queries on recreated interfaces.

2019-07-10 Thread Vladislav Grishenko
Hi Petr,

> I have not tested this specific case, but I think it should be handled
> correctly, unlike in the previous code. Because it now also compares the
> interface index, it will mark an existing entry as found only if the
> interface index also matches. If it does not, a new entry is created with
> the correct index instead.

I checked; unfortunately, the interface index comparison breaks things.
If there are 2+ interfaces with the same address on startup, no error occurs:
a single TCP socket and multiple UDP sockets (unlike before) are created. When
any one of those interfaces is shut down, that single TCP socket is closed
(unlike before), so nothing listens on TCP:53 after that.
If there is only one interface on startup, single TCP and UDP sockets are
created. When another interface with the same address comes up later, a bind
error is raised and only an additional UDP socket is created (unlike before).

At the moment, the dnsmasq logic expects a single TCP/UDP socket per address,
even for multiple interfaces.
For example, the comment in iface_allowed() states:
/* check whether the interface IP has been added already
 * we call this routine multiple times */
So I'm afraid the proposed changes do not play well with that.
What do you think, can it be solved too?
Reproducing this case is quite easy: just create a dummy interface with the
same address (different netmask) and bring it up and down.
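
The bind error in the second scenario is the ordinary behaviour of two TCP sockets competing for the same address and port. A minimal sketch in Python, assuming a host with a usable loopback interface; it uses plain sockets with no socket options set and an ephemeral port on 127.0.0.1, so it illustrates the general rule rather than dnsmasq's own listener setup.

```python
import errno
import socket

# First listener: let the kernel pick a free port on loopback.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))
first.listen(1)
port = first.getsockname()[1]

# Second socket tries to bind the very same address and port.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))
    bind_error = None
except OSError as exc:
    bind_error = exc.errno
finally:
    second.close()
    first.close()
```

Here bind_error ends up as errno.EADDRINUSE, which is why a design with one TCP listener per address cannot simply open a second listener for a second interface carrying that address.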

> OK, I forgot to follow the style in the 3rd patch. Attached is a version
> with fixed formatting and without the debug log on interface removal.

Thanks. FYI, sed -r 's/[ ]{8}/\t/' was missed too.
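
For reference, that sed expression replaces the first run of eight spaces on each line with a tab (there is no g flag). The same substitution as a Python sketch; tabify_first is a hypothetical name.

```python
import re

def tabify_first(line):
    """Replace the first run of eight spaces in a line with one tab."""
    return re.sub(r"[ ]{8}", "\t", line, count=1)

indented = tabify_first(" " * 8 + "free(iface);")
deeper = tabify_first(" " * 10 + "free(iface);")
shallow = tabify_first("  struct irec *iface;")
```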

> I think it is better to state explicitly that the return value is not used.

I think it would be better to strip it out of the functional patch and make it
a separate, complete patch covering all prettyprint_* instances, not just
selected ones.
On the other hand, without __attribute__((warn_unused_result)) it will not
generate a warning anyway.

Best Regards, Vladislav Grishenko

-Original Message-
From: Petr Mensik  
Sent: Wednesday, July 10, 2019 3:01 PM
To: Vladislav Grishenko ; 
dnsmasq-discuss@lists.thekelleys.org.uk
Subject: Re: [Dnsmasq-discuss] [PATCH] Issues with TCP queries on recreated 
interfaces.

[...]

Re: [Dnsmasq-discuss] NXDOMAIN on exisiting A record

2019-07-10 Thread Sasha Litvak
Petr,

Thank you very much for your help.   I will follow your advice and report
my findings to the list.

On Wed, Jul 10, 2019, 4:47 AM Petr Mensik  wrote:

> [...]