ѽ҉ᶬḳ℠ via Unbound-users wrote:
Whilst concurring on the abuse statement I am not sure why DNS tunnel users should actually be wary of /caching/. The caching related to the DNS tunnelling is bloating the cache, especially NULL records not serving any legitimate purpose in DNS. But to detect such users I would reckon that analytics are not looking at the resolver's cache but rather the resolver's log (dnstap)?
i think those fears differ slightly. most RDNS servers do not log their transactions, though that's changing due to dnstap and analytics which can leverage dnstap. all RDNS servers have a cache which can be dumped. so, even though dns tunnels usually utilize the qname as a data carrier in the stub-to-authority direction and so the qname won't be predictable enough for others to query it, any RDNS operator who sees evidence of dns tunneling can dump her cache to analyze tunnel traffic in detail. that's a more-real fear simply because it is more common.
however, those concerns are in a way off topic for this mailing list, so allow me to ask a more direct unbound question. why does the cache bloat? you're using LRU replacement, and these records are never accessed. therefore while they can push other more vital things out of the cache, decreasing cache hit rate, they should be primary targets for replacement whenever other data is looking for a place to land. i understand that this cache churn has a cost, in bandwidth and in CPU, but not in memory -- once the cache reaches its working set maximum, it ought to grow no further. what could i be misunderstanding about this?
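to make the expectation above concrete, here is a minimal sketch of LRU replacement under a flood of never-reused names. it is a toy model, not unbound's implementation: the class `LRUCache`, the function `simulate`, the example names, and the entry-count capacity are all assumptions made for illustration (a real resolver would bound the cache by bytes, not entries).

```python
from collections import OrderedDict


class LRUCache:
    """toy LRU cache bounded by entry count (hypothetical, not unbound's code)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest (least recently used) first
        self.evictions = 0

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # an access makes the entry most recent
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used
            self.evictions += 1


def simulate(capacity=100, tunnel_queries=10_000):
    """flood the cache with one-shot tunnel names while hot names stay in use."""
    cache = LRUCache(capacity)
    hot = [f"www{i}.example.com." for i in range(10)]
    for name in hot:
        cache.put(name, "A 192.0.2.1")
    peak = len(cache.entries)
    for i in range(tunnel_queries):
        # each tunnel qname is unique and is never asked for again
        cache.put(f"chunk{i:06d}.t.example.net.", "NULL <payload>")
        # legitimate traffic keeps touching the hot names
        cache.get(hot[i % len(hot)])
        peak = max(peak, len(cache.entries))
    hot_survivors = sum(1 for name in hot if name in cache.entries)
    return peak, hot_survivors, cache.evictions
```

in this model the entry count never exceeds the configured capacity, and the hot names survive because each one is touched again long before it reaches the cold end of the list. the cost shows up only as evictions (churn), which matches the bandwidth-and-CPU-but-not-memory reasoning above. if the hot set were touched less often than the flood fills the cache, the hot names would be pushed out too, which is the hit-rate cost already mentioned.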
a second unbound-related topic is cache management itself. it is unusual for the splay between a name and its descendants to number in the millions. it happens for arpa, and popular TLDs such as COM, NET, ORG, and DE. as a cache management strategy, consider whether to more rapidly discard descendants of a high splay apex, unless they are accessed at least once. and in defiance of my fear-related argument above, when the cache is full beyond some threshold like 90%, consider using "splay is high, subsequent access of descendants is zero" as a signal to (a) not cache new descendant data, and (b) syslog it. there isn't a dnstap message-tag for this condition yet, but there ought to be. splay is easy to keep track of unless your cache is flat.
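the admission rule suggested above could be sketched roughly as follows. everything here is hypothetical: `SplayAwareCache`, `parent_of`, the `high_splay` threshold of 1000, and the choice to treat only the immediate parent as the apex are assumptions for illustration, python's `logging` module stands in for syslog, and eviction is left out entirely to keep the sketch short.

```python
import logging
from collections import defaultdict

log = logging.getLogger("splay")  # stand-in for syslog in this sketch


def parent_of(name):
    """strip the leftmost label: 'a.t.example.net.' -> 't.example.net.'"""
    _, _, rest = name.partition(".")
    return rest or "."


class SplayAwareCache:
    """toy cache that refuses new descendants of a high-splay, never-hit apex."""

    def __init__(self, capacity, high_splay=1000, full_fraction=0.9):
        self.capacity = capacity
        self.high_splay = high_splay        # assumed threshold, tune to taste
        self.full_fraction = full_fraction  # the "like 90%" threshold
        self.data = {}
        self.splay = defaultdict(int)  # apex -> distinct descendants cached
        self.hits = defaultdict(int)   # apex -> cache hits on its descendants
        self.flagged = set()           # apexes already logged once
        self.refused = 0

    def lookup(self, name):
        if name in self.data:
            self.hits[parent_of(name)] += 1
            return self.data[name]
        return None

    def should_cache(self, name):
        apex = parent_of(name)
        nearly_full = len(self.data) >= self.full_fraction * self.capacity
        suspicious = (self.splay[apex] >= self.high_splay
                      and self.hits[apex] == 0)
        return not (nearly_full and suspicious)

    def store(self, name, rrset):
        apex = parent_of(name)
        if not self.should_cache(name):
            self.refused += 1
            if apex not in self.flagged:  # log once per apex, not per query
                self.flagged.add(apex)
                log.warning("high splay, zero re-access under %s; "
                            "not caching new descendants", apex)
            return False
        if name not in self.data:
            self.splay[apex] += 1
        self.data[name] = rrset
        return True
```

as a usage example, a cache with `capacity=2000` that already holds 900 names under `example.com.` would accept the first 1000 unique names under `t.example.net.` and then refuse the rest, since by then the cache is past 90% full, the apex has reached the splay threshold, and none of its descendants has ever been hit. names under a busy apex, or under any apex with at least one hit, are still cached as usual.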
-- P Vixie
