Re: [Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

2021-04-16 Thread Tony Ambardar
Hi Simon, Kevin,

I wrote a test program that emulates dnsmasq execution when loading
large blocklists and then forking children to handle TCP DNS requests.
It allocates memory and initializes it, then sets it read-only, and
finally forks a number of child processes. At each step, it prints
committed memory from /proc/meminfo, tracking the risk of OOM. The
option "--private" uses private anonymous memory and reproduces normal
dnsmasq behaviour; it can easily trigger OOM on small memory systems
(e.g. 128MB). The other option "--shared" demonstrates how using
shared anonymous memory reduces the overall commit and can work on
even small systems.

I also captured some log files from running on a "big" 8GB system and
a "small" 128MB OpenWrt system. Using shared memory on the "small"
system, I can allocate 64 MB without issue, while doing the same using
private memory is a reliable way to trigger OOM.
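
The commit-accounting difference can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the test program from the repository linked below: the function names are made up here, the read-only step is left out (Python's mmap module has no mprotect() wrapper), and Committed_AS is a system-wide figure, so unrelated activity adds noise to any measurement.

```python
import mmap
import os

def committed_kb(meminfo_text):
    """Return the Committed_AS value (kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("Committed_AS:"):
            return int(line.split()[1])
    raise ValueError("Committed_AS not found")

def allocate(size, shared):
    """Map anonymous memory, touch every page, and return the mapping."""
    flags = mmap.MAP_SHARED if shared else mmap.MAP_PRIVATE
    region = mmap.mmap(-1, size, flags=flags | mmap.MAP_ANONYMOUS)
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1  # fault the page in, as loading a blocklist would
    return region

def commit_growth_after_forks(size, shared, children=4):
    """Fork idle children and return the change in Committed_AS (kB)."""
    region = allocate(size, shared)
    with open("/proc/meminfo") as f:
        before = committed_kb(f.read())
    read_end, write_end = os.pipe()
    pids = []
    for _ in range(children):
        pid = os.fork()
        if pid == 0:
            os.close(write_end)
            os.read(read_end, 1)  # block until the parent closes the pipe
            os._exit(0)
        pids.append(pid)
    with open("/proc/meminfo") as f:
        after = committed_kb(f.read())
    os.close(write_end)           # EOF on the pipe releases the children
    os.close(read_end)
    for pid in pids:
        os.waitpid(pid, 0)
    region.close()
    return after - before
```

On Linux, comparing commit_growth_after_forks(64 << 20, shared=False) with the shared=True case should show the private mapping being charged again for each child, while the shared mapping is charged once. The exact numbers will vary from run to run.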

For more details see the head commit and related files here:
https://github.com/guidosarducci/dnsmasq/commits/oom_alloc_test/

Best regards,
Tony

___
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss


Re: [Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

2021-04-16 Thread Kevin 'ldir' Darbyshire-Bryant


> On 14 Apr 2021, at 00:34, Simon Kelley  wrote:
> 
> Tagging onto the end of the thread just to report the results of my
> research.
> 
> This started because of problems with the OOM killer in a
> resource-constrained system that was prompting OOM death when it spawned
> sub-processes to handle TCP connections. I proposed a trick of putting
> the large in-memory dataset in a sub-process, so that the main dnsmasq
> process which forks to handle TCP requests stays small. Reading around,
> that probably won't work: the OOM killer weighs the size of all children
> when looking for a victim. I think that trying to outsmart the OOM
> killer is probably a hiding to nothing. It's possible for the OS to
> protect critical daemons using oom_adj and friends, and supporting that
> in OpenWrt looks like a better way to go.
> 
> We then moved onto the fact that adblocking involves thousands of lines of
> 
> local=/example.com/
> 
> and the code supporting that didn't really envisage such large numbers.
> I think improvements can be made there, and I'll look at doing that in
> more detail.

Hi Simon,

Thanks for picking this up again.  There are multiple interlinked problems:

(Ab)use of local=/example.com/ or similar for adblock lists, which leads to
memory usage and (apparently) slower response times due to the linear search of
the ‘servers’ handling the local domain.  I’m sure the ‘local’ list could be put
into some sort of tree/hash structure to speed that up, but the memory
consumption will inherently be there to some (hopefully optimised) extent.  The
list has to exist :-)
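
As a rough illustration of the hash idea (a sketch only, not dnsmasq's code): keep the blocked domains in a hash set and test the query name plus each of its parent domains. The helper names are hypothetical, and the parser handles only the single-domain form of the config line.

```python
def load_blocklist(lines):
    """Collect domains from 'local=/domain/' or 'address=/domain/' lines."""
    blocked = set()
    for line in lines:
        parts = line.strip().split("/")
        # Expected shape: ['local=' or 'address=', 'domain', optional target]
        if len(parts) >= 3 and parts[0] in ("local=", "address=") and parts[1]:
            blocked.add(parts[1].lower())
    return blocked

def is_blocked(name, blocked):
    """True if name, or any parent domain of name, is in the set."""
    labels = name.lower().rstrip(".").split(".")
    # Try 'a.b.example.com', then 'b.example.com', 'example.com', 'com'.
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))
```

The cost per query is one set lookup per label in the name, regardless of how many entries the list holds. The memory for the list itself is still needed, as noted above.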

TCP requests causing a fork of dnsmasq with all that ‘local/server’ list memory
usage, up to 21 times.  I’m looking at my APU2 running OpenWrt at the moment
with a 46000-line (small) adblock ‘address=/foo.bar/’ list - dnsmasq consuming
circa 12MB.  21*12MB = 252MB isn’t going to cause my APU to sweat memory-wise, but
a lesser device could very well ask ‘Where am I going to find this extra 240MB
of memory from then?’.

I don’t agree that your ‘large dataset sub-process with small TCP-handling
children’ won’t work.  Yes, the Linux OOM killer takes into account all the
children, but the TCP children are going to be much (much!) smaller, presumably
the size of a basic dnsmasq instance, most of which will be program text.
A base dnsmasq on said APU2 takes 2.5MB (the above example would then total
12MB + 20*2.5MB = 62MB).  That’s quite a difference on a constrained system:
asking for lumps of 12MB vs 2.5MB.  Whether this approach makes sense from a
latency perspective I don’t know… I’m assuming that the large process is acting as
an ‘address validator’ and hence the requesting children will need to wait for
its ‘yes/no’ answer, subject to process scheduling etc.
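
The arithmetic behind the two scenarios, written out. The sizes are the ones quoted in this message and MAX_PROCS = 20 is the default mentioned elsewhere in the thread; the variable names are just for illustration.

```python
MAX_PROCS = 20   # default limit on TCP-handling children
FULL_MB = 12.0   # dnsmasq with the 46000-line list loaded
BASE_MB = 2.5    # base dnsmasq without the list

# Today: the parent plus every child is accounted at full size.
all_full = (1 + MAX_PROCS) * FULL_MB
extra_over_parent = all_full - FULL_MB

# Proposed split: one large process plus small TCP children.
split = FULL_MB + MAX_PROCS * BASE_MB

print(f"all full-size: {all_full:.0f}MB (extra {extra_over_parent:.0f}MB)")
print(f"split:         {split:.0f}MB")
# prints:
#   all full-size: 252MB (extra 240MB)
#   split:         62MB
```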

Tony mentioned another possible solution which avoids the sub-process malarkey:
marking the ‘server list’ memory in some way so that the OS knows it’s shared
and therefore doesn’t need to have ‘real memory’ for all those TCP sub-processes.
I’m sure he’ll be along in a minute to explain further.

Cheers,

Kevin D-B

gpg: 012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A





Re: [Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

2021-04-05 Thread Gordon Shawn
I did indeed look into Pi-hole's dnsmasq changes a while ago and tried to use
them to replace dnsmasq (for the SQLite3 and CNAME support, etc.). However, that
turned out to be too challenging, as they're really geared towards Pi-hole
specifically, especially the way it forks dnsmasq. It would be great if Pi-hole's
dnsmasq changes could be used standalone (e.g. a dnsmasq variant with
sqlite3/cname-nesting etc.).

The performance issue I mentioned in my last reply about
local/address/cname parsing remains the same, though: you will have to use hosts
files to get a quick reload when you have large blocklists.

Thanks,
Gordon


Re: [Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

2021-04-02 Thread e9hack

On 02.04.2021 at 10:56, Kevin 'ldir' Darbyshire-Bryant wrote:

> The adblock package solution on OpenWrt (I’m being specific ‘cos there are a
> number of ‘adblock’ solutions with the ‘adblock’ name :-)
>
> Deny uses ‘address=/foo.bar/’ to block ‘foo.bar’ and ‘*.foo.bar’


Each such definition is stored in a variable of type struct server. Struct
server contains the member interface, with a size of 65 bytes. Since the number
of interfaces is limited, it would be nice if a separate list of interface
names could be used instead.
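
A back-of-the-envelope estimate of what that member costs. The 46000-entry figure is the list size Kevin quoted elsewhere in the thread; the one-byte index and the eight interface names are purely hypothetical, and struct padding is ignored.

```python
ENTRIES = 46000      # list size quoted elsewhere in the thread
IFACE_BYTES = 65     # size of the interface member, per the message above

# Every entry carries its own 65-byte interface field.
per_entry_total = ENTRIES * IFACE_BYTES
print(f"{per_entry_total} bytes, about {per_entry_total / 2**20:.2f} MiB")
# prints: 2990000 bytes, about 2.85 MiB

# Hypothetical alternative: a small index into a shared table of names.
INDEX_BYTES = 1      # assumption: fewer than 256 distinct interfaces
N_INTERFACES = 8     # assumption, for illustration only
with_table = ENTRIES * INDEX_BYTES + N_INTERFACES * IFACE_BYTES
print(f"{with_table} bytes with a shared table")
# prints: 46520 bytes with a shared table
```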

Regards,
Hartmut



Re: [Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

2021-04-02 Thread Dominik
Hey Simon,

On Thu, 2021-04-01 at 23:55 +0100, Simon Kelley wrote:
> I could do with a handle on exactly how people are configuring dnsmasq
> to do ad blocking. It's not something I have much experience of.

The situation for Pi-hole (a popular ad blocker based on dnsmasq) is the
following:

Traditionally, Pi-hole used "addn-hosts" to add HOSTS-like files containing
domains (example: 
https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts). This
list contains roughly 80,000 domains. This is doable with dnsmasq on all
platforms having at least 512 MB of memory. However, Pi-hole users
typically want to take it to extremes. They added more and more lists,
often going beyond the one-million-domain mark. This became a problem
regarding memory. I don't recall complaints about slow replies, though.

Anyway, as this became more and more an issue and since we wanted to have
something more professional than a text file (so users can easily add
comments, etc.), we amended the dnsmasq code with an interface to a SQLite3
database holding all domains to be blocked. We also added support for
regular expressions (and thereby wildcards). With this new approach, we
stopped storing anything about blocked domains in dnsmasq's cache: blocked
domains are short-circuited and replied to with a mock answer. They are
never added to the cache. This is done because we allow different lists to
be assigned to different clients, so some devices using the DNS server can
be restricted further while others may be fully open on the same process.

This works really fast because the balanced-tree (B-tree) index on the
domain is very efficient. The tree lives transparently in the page cache, so
accessing it is very fast even in the 1 million range (lookup time scales
logarithmically, typically < 5 ms on Raspberry Pis for 3 million blocked domains).
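
To illustrate the kind of indexed lookup described here, a minimal sketch using Python's sqlite3 module. The table and function names are hypothetical and this is not Pi-hole's actual schema or code; it only shows an exact-match query served by the index that SQLite creates for a primary key.

```python
import sqlite3

def open_blocklist(domains):
    """Build an in-memory database; the PRIMARY KEY gives an index."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE blocked (domain TEXT PRIMARY KEY)")
    db.executemany("INSERT OR IGNORE INTO blocked VALUES (?)",
                   ((d.lower(),) for d in domains))
    db.commit()
    return db

def db_is_blocked(db, name):
    """Exact-match lookup that uses the index instead of scanning rows."""
    row = db.execute("SELECT 1 FROM blocked WHERE domain = ? LIMIT 1",
                     (name.lower(),)).fetchone()
    return row is not None
```

With an on-disk database instead of ":memory:", the index pages would be read through the OS page cache, which is the effect mentioned above.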

Note that we are hooking into dnsmasq's code from "outside" to keep changes
in the dnsmasq codebase minimal so we can straightaway apply any patches
from dnsmasq's git.

So even though this is a bit outside of the current discussion, I thought
it'd be interesting to mention that Pi-hole used to use "addn-hosts" but
stopped doing so some time ago.

Best,
Dominik




Re: [Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

2021-04-01 Thread Simon Kelley



> 
> One other thing I saw while testing with large blocklists was a noticeable
> latency increase, likely related to lookup times. I recall some discussion
> on the ML where you mentioned work on a hash/tree solution was in
> progress. Were those changes completed?
> 


This seems to be the crucial aspect here: large blocklists. If we move
the large blocklists to a subsystem designed to handle them, then the
problem goes away.

I could do with a handle on exactly how people are configuring dnsmasq
to do ad blocking. It's not something I have much experience of.


Simon.




Re: [Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

2021-04-01 Thread marcus via Dnsmasq-discuss
Fuck you

Sent from my Galaxy

 Original message 
From: Dominik
Date: 01.04.21 09:52 (GMT+01:00)
To: Tony Ambardar, dnsmasq-discuss@lists.thekelleys.org.uk
Subject: Re: [Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems


Re: [Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

2021-04-01 Thread Dominik
Hey Tony,

On Wed, 2021-03-31 at 19:43 -0700, Tony Ambardar wrote:
> You're right that text segments are fairly small and shared; memory usage
> was dominated by storage for blocklists read from file. This makes the
> problem more general than just tiny systems, since people tend to size
> their blocklists proportional to system memory size.

I wouldn't say this. Users also try to squeeze in too-large files when they
do not have enough memory for them...

On Wed, 2021-03-31 at 19:43 -0700, Tony Ambardar wrote:
> You're also right that actual memory footprint increases only minimally
> with each fork() thanks to copy-on-write; I'm certain these OOM systems
> aren't really exhausting memory. But I do think there's confusion around
> memory usage optimizations like COW vs. memory accounting used for OOM.

OOM is just severely broken IMO, as a concept. Linux should arguably not
allow overcommitment at all; there is simply no way for software to
account for memory becoming unavailable after it was successfully allocated
some time ago.

On Wed, 2021-03-31 at 19:43 -0700, Tony Ambardar wrote:
> I recall looking at dnsmasq process statistics on OOM invocation, and
> noticed their VM set sizes were usually close to total system memory, i.e.
> COW wasn't relevant. And from a dnsmasq proc memory map, the large segment
> storing the blocklist was marked read-write. I suspect that despite COW,
> since that memory is *potentially* writable it's being accounted for at
> fork() time.

The fork technically needs to allocate as much memory as the program is
currently using, but /proc/[pid]/maps won't tell you whether the memory is
copy-on-write or not. It is for sure read-write as, otherwise, when the fork
wrote to it, it would be sent SIGSEGV. Instead, when trying to write
to a copy-on-write page, you will trigger a page fault, the page will be
duplicated and you can continue happily as if nothing had happened.
Also, the "p" (private) flag doesn't help much here because it just
distinguishes the mapping from "s" (shared) at this point.
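
For reference, a small sketch of what /proc/[pid]/maps does expose: the permission column. It sums the private writable ("rw-p") mappings, which look the same whether or not any page has actually been copied. The function name is made up for illustration.

```python
def writable_private_kb(maps_text):
    """Sum the sizes (kB) of mappings whose permissions are 'rw-p'."""
    total = 0
    for line in maps_text.splitlines():
        fields = line.split()
        # Line layout: address-range perms offset dev inode [pathname]
        if len(fields) < 2 or fields[1] != "rw-p":
            continue
        start, end = (int(x, 16) for x in fields[0].split("-"))
        total += (end - start) // 1024
    return total
```

On Linux this can be tried on the running process with writable_private_kb(open("/proc/self/maps").read()).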

It *should* be possible to extract the relevant information from
/proc/[pid]/pagemap and then check the details of the page(s) in
/proc/kpageflags for KPF_SWAPBACKED (page is backed by swap/RAM). This is
the only way I'm aware of to check if this is a copy-on-write page existing
in multiple places.

If you know a simpler way to do this, I'd be happy to learn.

On Wed, 2021-03-31 at 19:43 -0700, Tony Ambardar wrote:
> A possible fix I'd suggest is to update dnsmasq's memory handling. IIRC,
> we use the same cache structure and memory allocation for both DNS cache
> and storing static server lists read from file. Perhaps use a separate,
> page-aligned memory pool to store these lists, then after initialization
> (and before forking) use mprotect() to set the region as read-only.
> 
> Assuming it works, this would have the advantage of being a no-knobs
> solution vs. setting kludgey process or connection limits.

I like the idea of splitting the cache in two parts. Say a static and a
dynamic cache. Using mprotect() shouldn't even be necessary but helps to
ensure we're not writing to the static part of the cache anywhere in the
code.

KSM (kernel samepage merging) comes to my mind as well, but this seems to
be the wrong tool for the job. Figured I should mention it nonetheless.

On Wed, 2021-03-31 at 19:43 -0700, Tony Ambardar wrote:
> One other thing I saw while testing with large blocklists was a noticeable
> latency increase, likely related to lookup times. I recall some discussion
> on the ML where you mentioned work on a hash/tree solution was in
> progress. Were those changes completed?

Yes, dnsmasq uses hash buckets to minimize the amount of memory it has to
loop over when trying to find a name.

Best,
Dominik





Re: [Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

2021-03-31 Thread Tony Ambardar
Hi Simon,

I also hit the OOM issue a few years back, while evaluating dnsmasq
performance on OpenWrt with large server lists loaded, and spent some time
investigating. I could be hazy on some details but the following covers
the basics.

You're right that text segments are fairly small and shared; memory usage
was dominated by storage for blocklists read from file. This makes the
problem more general than just tiny systems, since people tend to size
their blocklists proportional to system memory size.

You're also right that actual memory footprint increases only minimally
with each fork() thanks to copy-on-write; I'm certain these OOM systems
aren't really exhausting memory. But I do think there's confusion around
memory usage optimizations like COW vs. memory accounting used for OOM.

I recall looking at dnsmasq process statistics on OOM invocation, and
noticed their VM set sizes were usually close to total system memory, i.e.
COW wasn't relevant. And from a dnsmasq proc memory map, the large segment
storing the blocklist was marked read-write. I suspect that despite COW,
since that memory is *potentially* writable it's being accounted for at
fork() time.

A possible fix I'd suggest is to update dnsmasq's memory handling. IIRC,
we use the same cache structure and memory allocation for both DNS cache
and storing static server lists read from file. Perhaps use a separate,
page-aligned memory pool to store these lists, then after initialization
(and before forking) use mprotect() to set the region as read-only.

Assuming it works, this would have the advantage of being a no-knobs
solution vs. setting kludgey process or connection limits.

One other thing I saw while testing with large blocklists was a noticeable
latency increase, likely related to lookup times. I recall some discussion
on the ML where you mentioned work on a hash/tree solution was in
progress. Were those changes completed?

Best regards,
Tony



Re: [Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

2021-03-27 Thread Simon Kelley
The default value of MAX_PROCS is 20, which doesn't seem excessive, my
point being that systems which run out of memory when dnsmasq forks 20
times are likely a pretty small niche, and reducing the default is not a
good idea on most systems. Note that dnsmasq doesn't exec() after
forking, so the text segments will be shared, and I'd expect that not a
lot of the memory in the TCP-handling process would be written, so
copy-on-write will share data pages too. I don't have figures, but it's
certainly not the case that each fork doubles the memory footprint.

Adding this as a run-time option is only useful if people know that
their system is vulnerable and use the option, or if a distribution always
sets the option. But if OpenWrt determines that this is a general
problem on OpenWrt systems, the best solution would be for OpenWrt
packages to be compiled with MAX_PROCS set to a lower value. Carrying a
single-line patch to src/config.h is a sensible way to do that.


Any look at the dnsmasq man page shows that we're not averse to adding
configurability, but the configurability has to have real-world uses,
and options which have to be set in ill-defined circumstances to avoid
catastrophic problems are not good options. It's a judgement call, but
my judgement whenever this was written (at least a decade ago) was that
this wasn't a useful parameter for a user to tweak. I can't help
thinking that changing that now isn't really solving the underlying
problem, but I can't offer a better solution.

Comments? How do we fix this?


Simon.



[Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

2021-03-27 Thread Ian
 
It seems that on resource-constrained routers, it's possible to execute a
non-critical denial-of-service attack against the router simply by opening
multiple TCP queries to dnsmasq, which then forks for each TCP connection up
to MAX_PROCS times, resulting in the oom-killer being invoked after the router
runs out of memory.

One could imagine a malicious app or shell script constantly spawning new
TCP connections and keeping the router out of memory as a result.

This problem came to light on the OpenWrt forum, as a user had a taxi-booking
app that opened multiple TCP connections to dnsmasq.

A simple patch adding a long-form configuration option "--max-procs="
to dnsmasq, which allows MAX_PROCS to be overridden at runtime, fixed the
user's problem.

Not sure if this is the best way of dealing with the problem, but wanted to
bring this to the list's attention.
 
Ian


200-max-procs.patch
Description: Binary data