Re: [Dnsmasq-discuss] Internal error in cache

2021-12-24 Thread e9hack
On Fri, 24 Dec 2021 at 20:12, Simon Kelley <si...@thekelleys.org.uk> wrote:

>
> Reassurance that the bug is fixed for you too would be appreciated.
>

It looks like it's fixed now. Previously, it took ~12h to trigger the
issue. That may be related to my configuration: 300 cache entries and an
adblock list with 50k entries like 'address=/googleanalytics.com/'. With
Steve Gibson's DNS benchmark utility, though, the issue was triggered
immediately. The utility sends ~350 DNS queries to the local DNS
server/resolver, of which ~100 should fail with NXDOMAIN.

Regards and Merry Christmas,
Hartmut
___
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss


Re: [Dnsmasq-discuss] Internal error in cache

2021-12-24 Thread Simon Kelley



On 24/12/2021 09:24, Hartmut Birr wrote:
> It looks like
>
> Commit: 1ce1c6beae9f683bec54cba4c0d375f85b209b95
> Caching cleanup. Use cached NXDOMAIN to answer queries of any type.
>
> introduces the error.
>
> These earlier commits are fine:
> 
> Commit: 51d56df7a3a125e117b3278cab16281c85500287
> Add RFC 4833 DHCP options "posix-timezone" and "tzdb-timezone".
> 
> Commit: cac9ca38f62437c65464f58fc54342c7f294c40b
> Treat ANY queries the same as CNAME queries WRT to DNSSEC on CNAME targets.
> 
> Regards,
> Hartmut
> 

Nice work finding that.


My hypothesis on this goes like this.

1) The "internal error" is triggered during cache insertion when the
cache is full and a record has to be deleted. cache_scan_free() gets
called with the contents of the least recently used record in the cache
and deletes all instances of it (so, all A records of the correct
name, or all AAAA records, or whatever).

2) Since at least one record should have been deleted by this (the
least recently used record that started the process), there should
then be at least one free cache record, and the insertion can be
retried and should succeed. If nothing gets deleted by
cache_scan_free(), there will again be no free records, and rather
than going into an infinite loop, the internal error gets logged and
insertion is abandoned.

3) The commit you found changes the way NXDOMAIN records are stored.
These used to be stored with a type: if a query for an A record returned
NXDOMAIN, then a cache record would be stored with F_NXDOMAIN and F_IPV4
set in the flags. This is a historical anachronism. If the domain
doesn't exist, it doesn't exist for any query type. The code therefore
now stores a cache entry with only F_NXDOMAIN set, and that's good to
answer a query of any type.

4) The problem is that cache_scan_free() fails to delete a cache record
with only F_NXDOMAIN set, so if such a record falls to the end of the LRU
list and then needs to be deleted, the deletion will fail and the
internal error is triggered.
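
As a rough illustration of the four steps above, here is a toy Python model of the failure mode. It is not dnsmasq's code: the ToyCache class, the count_errors() helper, and the flag bit values are all hypothetical, and the type-matching rule in scan_free() is only an assumption about how a typed deletion could miss an untyped NXDOMAIN record.

```python
from collections import OrderedDict

# Flag names borrowed from the message above; the bit values are arbitrary
# stand-ins for this sketch, not dnsmasq's real definitions.
F_IPV4 = 1
F_IPV6 = 2
F_NXDOMAIN = 4


class ToyCache:
    """Hypothetical LRU cache that mimics the failure mode described above."""

    def __init__(self, size, typed_nxdomain):
        self.size = size
        # True models the old scheme (F_NXDOMAIN plus a type flag),
        # False models the new scheme (F_NXDOMAIN alone).
        self.typed_nxdomain = typed_nxdomain
        self.entries = OrderedDict()  # oldest entry first
        self.internal_errors = 0

    def scan_free(self, name, flags):
        """Delete entries for name whose type bits match; return the count."""
        type_bits = flags & (F_IPV4 | F_IPV6)
        doomed = [key for key in self.entries
                  if key[0] == name and key[1] & type_bits]
        for key in doomed:
            del self.entries[key]
        return len(doomed)

    def insert_nxdomain(self, name, query_type=F_IPV4):
        flags = F_NXDOMAIN | (query_type if self.typed_nxdomain else 0)
        if len(self.entries) >= self.size:
            lru_name, lru_flags = next(iter(self.entries))
            if self.scan_free(lru_name, lru_flags) == 0:
                # Nothing was freed: give up rather than loop forever.
                self.internal_errors += 1
                return False
        self.entries[(name, flags)] = True
        return True


def count_errors(size, queries, typed_nxdomain):
    """Insert `queries` distinct NXDOMAIN names and count failed insertions."""
    cache = ToyCache(size, typed_nxdomain)
    for i in range(queries):
        cache.insert_nxdomain(f"missing{i}.example")
    return cache.internal_errors
```

In this model, count_errors(3, 4, True) reports no errors, while count_errors(3, 4, False) reports one: the untyped record at the old end of the LRU list matches no type bits, so nothing is freed and the insertion is abandoned.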


Given the above, I found a way to reproduce the bug: start dnsmasq with
a small cache, then make more queries with NXDOMAIN answers than the
cache has entries. The (cache_size+1)th query triggers the bug.
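
A reproduction along these lines could be scripted roughly as follows. This is only a sketch under assumptions: it presumes a test instance of dnsmasq is already listening on a local address and port of your choosing with a small cache (for example, started with --cache-size=5 and --port=5353), and that the upstream resolver answers names under the reserved .invalid TLD with NXDOMAIN. The function names build_query(), rcode() and flood_nxdomain() are made up for this example.

```python
import random
import socket
import struct


def build_query(name, qtype=1):
    """Encode a minimal DNS query (qtype 1 = A, class IN, recursion desired)."""
    header = struct.pack("!HHHHHH", random.randrange(0x10000), 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def rcode(response):
    """Return the RCODE from a DNS response; 3 means NXDOMAIN."""
    return response[3] & 0x0F


def flood_nxdomain(server, port, count):
    """Send `count` queries for distinct names that should not exist."""
    results = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(5)
        for i in range(count):
            name = f"no-such-host-{i}.invalid"
            sock.sendto(build_query(name), (server, port))
            response, _ = sock.recvfrom(512)
            results.append(rcode(response))
    return results
```

With a cache size of 5, calling flood_nxdomain("127.0.0.1", 5353, 6) sends one more NXDOMAIN query than the cache can hold. The dnsmasq log is then the place to check whether the internal error appears.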


The fix is tiny, and fixes the problem for me, at least for my method of
reproduction.

Please see
https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=ea33a0130366d316f01be4c891e4f5b247f97171

Reassurance that the bug is fixed for you too would be appreciated.

Cheers, and Happy Christmas.

Simon.




Re: [Dnsmasq-discuss] Internal error in cache

2021-12-24 Thread Hartmut Birr
It looks like

Commit: 1ce1c6beae9f683bec54cba4c0d375f85b209b95
Caching cleanup. Use cached NXDOMAIN to answer queries of any type.

introduces the error.

These earlier commits are fine:

Commit: 51d56df7a3a125e117b3278cab16281c85500287
Add RFC 4833 DHCP options "posix-timezone" and "tzdb-timezone".

Commit: cac9ca38f62437c65464f58fc54342c7f294c40b
Treat ANY queries the same as CNAME queries WRT to DNSSEC on CNAME targets.

Regards,
Hartmut