Re: [Bug 200651] New: cgroups iptables-restor: vmalloc: allocation failure

2018-07-25 Thread Andrew Morton
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 25 Jul 2018 11:42:57 + bugzilla-dae...@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=200651
> 
> Bug ID: 200651
>Summary: cgroups iptables-restor: vmalloc: allocation failure

Thanks.  Please do note the above request.

>Product: Memory Management
>Version: 2.5
> Kernel Version: 4.14
>   Hardware: All
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: Other
>   Assignee: a...@linux-foundation.org
>   Reporter: gniko...@icdsoft.com
> Regression: No
> 
> Created attachment 277505
>   --> https://bugzilla.kernel.org/attachment.cgi?id=277505&action=edit
> iptables save
> 
> After creating a large number of cgroups and under memory pressure, the
> iptables command fails with the following error:
> 
> "iptables-restor: vmalloc: allocation failure, allocated 3047424 of 3465216
> bytes, mode:0x14010c0(GFP_KERNEL|__GFP_NORETRY), nodemask=(null)"

I'm not sure what the problem is here, apart from iptables being
over-optimistic about vmalloc()'s abilities.
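
For scale, the two byte counts in that message can be converted to pages.
A minimal sketch, assuming 4096-byte pages (the report does not state the
page size); PAGE_SIZE_ASSUMED, bytes_to_pages() and pages_short() are
illustrative names, not kernel identifiers:

```c
#include <stddef.h>

/* Assumption: 4 KiB pages; the report does not state the page size. */
#define PAGE_SIZE_ASSUMED 4096u

/* Round a byte count up to whole pages. */
static size_t bytes_to_pages(size_t bytes)
{
	return bytes / PAGE_SIZE_ASSUMED + (bytes % PAGE_SIZE_ASSUMED != 0);
}

/* Number of pages still missing when the allocation gave up. */
static size_t pages_short(size_t allocated, size_t requested)
{
	return bytes_to_pages(requested) - bytes_to_pages(allocated);
}
```

With the figures from the report, the request is 846 pages, 744 of them
were obtained, and the allocation stopped 102 pages short.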

Are cgroups having any impact on this, or is it simply vmalloc arena
fragmentation, and the iptables code should use some data structure
more sophisticated than a massive array?

Maybe all that cgroup metadata is contributing to the arena
fragmentation, but those allocations will be small and the two systems
should be able to live alongside each other, by being realistic about vmalloc.
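
As an illustration of "something more sophisticated than a massive array",
here is a userspace sketch that stores a large table as many small chunks
reached through a pointer array, so no single large contiguous allocation
is needed. This is not how iptables or x_tables stores its rules; struct
chunked_buf, the chunked_*() helpers and the 4096-byte chunk size are all
assumptions made for the example:

```c
#include <stdlib.h>

/* Hypothetical chunk size; one 4 KiB chunk per entry is an assumption. */
#define CHUNK_BYTES 4096u

struct chunked_buf {
	size_t nchunks;
	unsigned char **chunks;	/* small pointer array, one entry per chunk */
};

/* Free a buffer, including one that was only partially built. */
static void chunked_free(struct chunked_buf *b)
{
	if (!b)
		return;
	if (b->chunks) {
		for (size_t i = 0; i < b->nchunks; i++)
			free(b->chunks[i]);	/* free(NULL) is a no-op */
		free(b->chunks);
	}
	free(b);
}

/* Allocate enough zeroed chunks to hold 'bytes' bytes. */
static struct chunked_buf *chunked_alloc(size_t bytes)
{
	struct chunked_buf *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	b->nchunks = bytes / CHUNK_BYTES + (bytes % CHUNK_BYTES != 0);
	b->chunks = calloc(b->nchunks ? b->nchunks : 1, sizeof(*b->chunks));
	if (!b->chunks) {
		chunked_free(b);
		return NULL;
	}
	for (size_t i = 0; i < b->nchunks; i++) {
		b->chunks[i] = calloc(1, CHUNK_BYTES);
		if (!b->chunks[i]) {
			chunked_free(b);
			return NULL;
		}
	}
	return b;
}

/* Translate a flat byte offset into a pointer inside the right chunk. */
static unsigned char *chunked_at(struct chunked_buf *b, size_t off)
{
	if (off / CHUNK_BYTES >= b->nchunks)
		return NULL;
	return &b->chunks[off / CHUNK_BYTES][off % CHUNK_BYTES];
}
```

The cost of this layout is that a record may not straddle a chunk
boundary without extra handling, which a flat array never has to consider.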

> The system used to reproduce the bug has 2 vCPUs and 2GB of RAM, but it
> also happens on more powerful systems.
> 
> Steps to reproduce:
> 
> mkdir /cgroup
> mount cgroup -t cgroup -omemory,pids,blkio,cpuacct /cgroup
> for a in `seq 1 1000`; do for b in `seq 1 4` ; do mkdir -p "/cgroup/user/$a/$b"; done; done
> 
> Then in separate consoles
> 
> cat /dev/vda > /dev/null
> ./test
> ./test
> i=0;while sleep 0 ; do iptables-restore < iptables.save ; i=$(($i+1)); echo $i; done
> 
> Here is the source of the "test" program; iptables.save is attached. It
> also happens with a smaller iptables.save file.
> 
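> /* Header names were lost in the archive; these are the ones the code needs. */
> #include <stdlib.h>
> #include <time.h>
> 
> int main(void) {
>     srand(time(NULL));
>     int i = 0, j = 0, randnum = 0;
>     int arr[6] = { 3072, 7168, 15360, 31744, 64512, 130048 };
> 
>     /* Allocate, touch and free buffers of growing size, forever. */
>     while (1) {
>         for (i = 0; i < 6; i++) {
>             int *ptr = (int *) malloc(arr[i] * 93);
> 
>             /* Write every element so the pages are actually faulted in. */
>             for (j = 0; j < arr[i] * 93 / sizeof(int); j++) {
>                 *(ptr + j) = j + 1;
>             }
> 
>             free(ptr);
>         }
>     }
> }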
> 

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: simplify procfs code for seq_file instances

2018-04-24 Thread Andrew Morton
On Tue, 24 Apr 2018 16:23:04 +0200 Christoph Hellwig wrote:

> On Thu, Apr 19, 2018 at 09:57:50PM +0300, Alexey Dobriyan wrote:
> > > git://git.infradead.org/users/hch/misc.git proc_create
> > 
> > 
> > I want to ask if it is time to start using poorman function overloading
> > with _b_c_e(). There are millions of allocation functions for example,
> > all slightly different, and people will add more. Seeing /proc interfaces
> > doubled like this is painful.
> 
> Function overloading is totally unacceptable.
> 
> And I very much disagree with a tradeoff that keeps 5000 lines of 
> code vs a few new helpers.

OK, the curiosity and suspense are killing me.  What the heck is
"function overloading with _b_c_e()"?

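On the "_b_c_e()" question above: presumably it abbreviates GCC's
__builtin_choose_expr(), which selects one of two expressions at compile
time. Combined with __builtin_types_compatible_p() it can dispatch on the
type of an argument. A minimal userspace sketch of that idea, assuming GCC
or Clang; describe(), describe_int() and describe_str() are made-up names,
not anything proposed in the thread:

```c
#include <string.h>

/* Two "overloads" with different parameter types. */
static int describe_int(int v)
{
	return v < 0 ? -1 : 1;
}

static int describe_str(const char *s)
{
	return (int)strlen(s);
}

/* Pick the callee at compile time from the static type of x, then call it.
 * Only the function is selected inside the builtin, so the branch that is
 * not chosen never sees an argument of the wrong type.
 */
#define describe(x)							\
	__builtin_choose_expr(						\
		__builtin_types_compatible_p(__typeof__(x), int),	\
		describe_int,						\
		describe_str)(x)
```

Here describe(5) resolves to describe_int() and describe("abc") resolves
to describe_str(), with no run-time type check involved.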

Re: [netfilter-core] kernel panic: Out of memory and no killable processes... (2)

2018-02-07 Thread Andrew Morton
On Wed, 7 Feb 2018 18:44:39 +0100 Pablo Neira Ayuso <pa...@netfilter.org> wrote:

> Hi,
> 
> On Wed, Jan 31, 2018 at 09:19:16AM +0100, Michal Hocko wrote:
> [...]
> > Yeah, we do not BUG but rather fail instead. See __vmalloc_node_range.
> > > My excavation tools pointed me to "VM: Rework vmalloc code to support
> > > mapping of arbitray pages" by Christoph back in 2002. So yes, we can
> > > safely remove it finally. See below.
> > 
> > 
> > From 8d52e1d939d101b0dafed6ae5c3c1376183e65bb Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mho...@suse.com>
> > Date: Wed, 31 Jan 2018 09:16:56 +0100
> > Subject: [PATCH] net/netfilter/x_tables.c: remove size check
> > 
> > Back in 2002 vmalloc used to BUG on too large sizes. We are much better
> > behaved these days and vmalloc simply returns NULL for those. Remove
> > > the check as it is simply not needed and the comment is even misleading.
> > 
> > Suggested-by: Andrew Morton <a...@linux-foundation.org>
> > Signed-off-by: Michal Hocko <mho...@suse.com>
> > ---
> >  net/netfilter/x_tables.c | 4 
> >  1 file changed, 4 deletions(-)
> > 
> > diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
> > index b55ec5aa51a6..48a6ff620493 100644
> > --- a/net/netfilter/x_tables.c
> > +++ b/net/netfilter/x_tables.c
> > @@ -999,10 +999,6 @@ struct xt_table_info *xt_alloc_table_info(unsigned int 
> > size)
> > if (sz < sizeof(*info))
> > return NULL;
> >  
> > -   /* Pedantry: prevent them from hitting BUG() in vmalloc.c --RR */
> > -   if ((SMP_ALIGN(size) >> PAGE_SHIFT) + 2 > totalram_pages)
> > -   return NULL;
> > -
> > /* __GFP_NORETRY is not fully supported by kvmalloc but it should
> >  * work reasonably well if sz is too large and bail out rather
> >  * than shoot all processes down before realizing there is nothing
> 
> Patchwork didn't catch this patch for some reason, would you mind to
> resend?

From: Michal Hocko <mho...@suse.com>
Subject: net/netfilter/x_tables.c: remove size check

Back in 2002 vmalloc used to BUG on too large sizes.  We are much better
behaved these days and vmalloc simply returns NULL for those.  Remove the
check as it is simply not needed and the comment is even misleading.

Link: http://lkml.kernel.org/r/20180131081916.go21...@dhcp22.suse.cz
Suggested-by: Andrew Morton <a...@linux-foundation.org>
Signed-off-by: Michal Hocko <mho...@suse.com>
Reviewed-by: Andrew Morton <a...@linux-foundation.org>
Cc: Florian Westphal <f...@strlen.de>
Cc: David S. Miller <da...@davemloft.net>
Signed-off-by: Andrew Morton <a...@linux-foundation.org>
---

 net/netfilter/x_tables.c |4 
 1 file changed, 4 deletions(-)

diff -puN net/netfilter/x_tables.c~net-netfilter-x_tablesc-remove-size-check net/netfilter/x_tables.c
--- a/net/netfilter/x_tables.c~net-netfilter-x_tablesc-remove-size-check
+++ a/net/netfilter/x_tables.c
@@ -1004,10 +1004,6 @@ struct xt_table_info *xt_alloc_table_inf
 	if (sz < sizeof(*info))
 		return NULL;
 
-	/* Pedantry: prevent them from hitting BUG() in vmalloc.c --RR */
-	if ((size >> PAGE_SHIFT) + 2 > totalram_pages)
-		return NULL;
-
 	/* __GFP_NORETRY is not fully supported by kvmalloc but it should
 	 * work reasonably well if sz is too large and bail out rather
 	 * than shoot all processes down before realizing there is nothing
_



Re: [netfilter-core] kernel panic: Out of memory and no killable processes... (2)

2018-01-30 Thread Andrew Morton
On Tue, 30 Jan 2018 15:01:04 +0100 Michal Hocko wrote:

> > Well, this is not about syzkaller, it merely pointed out a potential
> > DoS... And that has to be addressed somehow.
> 
> So how about this?
> ---

argh ;)

> From d48e950f1b04f234b57b9e34c363bdcfec10aeee Mon Sep 17 00:00:00 2001
> From: Michal Hocko 
> Date: Tue, 30 Jan 2018 14:51:07 +0100
> Subject: [PATCH] net/netfilter/x_tables.c: make allocation less aggressive
> 
> syzbot has noticed that xt_alloc_table_info can allocate a lot of
> memory. This is an admin only interface but an admin in a namespace
> is sufficient as well. eacd86ca3b03 ("net/netfilter/x_tables.c: use
> kvmalloc() in xt_alloc_table_info()") has changed the opencoded
> kmalloc->vmalloc fallback into kvmalloc. It has dropped __GFP_NORETRY on
> the way because vmalloc has simply never fully supported __GFP_NORETRY
> semantic. This is still the case because e.g. page tables backing the
> vmalloc area are hardcoded GFP_KERNEL.
> 
> Revert back to __GFP_NORETRY as a poor man's defence against excessively
> large allocation request here. We will not rule out the OOM killer
> completely but __GFP_NORETRY should at least stop the large request
> in most cases.
> 
> Fixes: eacd86ca3b03 ("net/netfilter/x_tables.c: use kvmalloc() in xt_alloc_table_info()")
> Signed-off-by: Michal Hocko 
> ---
>  net/netfilter/x_tables.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
> index d8571f414208..a5f5c29bcbdc 100644
> --- a/net/netfilter/x_tables.c
> +++ b/net/netfilter/x_tables.c
> @@ -1003,7 +1003,13 @@ struct xt_table_info *xt_alloc_table_info(unsigned int size)
>   if ((SMP_ALIGN(size) >> PAGE_SHIFT) + 2 > totalram_pages)
>   return NULL;

offtopic: preceding comment here is "prevent them from hitting BUG() in
vmalloc.c".  I suspect this is ancient code and vmalloc sure as heck
shouldn't go BUG with this input.  And it should be using `sz' ;)

So I suspect and hope that this code can be removed.  If not, let's fix
vmalloc!
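
A userspace sketch of what the function would look like without that
check. This is a model, not the kernel function: struct table_info,
alloc_table_info() and the use of calloc() in place of kvmalloc() are
stand-ins chosen for the example. It keeps the overflow test on the summed
size `sz' and otherwise relies on the allocator returning NULL for a
request it cannot satisfy:

```c
#include <stdlib.h>

/* Stand-in for struct xt_table_info: a header followed by rule data. */
struct table_info {
	unsigned int size;
	unsigned char entries[];
};

static struct table_info *alloc_table_info(unsigned int size)
{
	size_t sz = sizeof(struct table_info) + size;
	struct table_info *info;

	/* Reject a sum that wrapped around. */
	if (sz < sizeof(*info))
		return NULL;

	/* No "is it bigger than RAM" pre-check: the allocator reports
	 * failure by returning NULL.
	 */
	info = calloc(1, sz);
	if (!info)
		return NULL;

	info->size = size;
	return info;
}
```

The caller only ever has to handle one failure signal, a NULL return.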

> - info = kvmalloc(sz, GFP_KERNEL);
> + /*
> +  * __GFP_NORETRY is not fully supported by kvmalloc but it should
> +  * work reasonably well if sz is too large and bail out rather
> +  * than shoot all processes down before realizing there is nothing
> +  * more to reclaim.
> +  */
> + info = kvmalloc(sz, GFP_KERNEL | __GFP_NORETRY);
>   if (!info)
>   return NULL;

checkpatch sayeth

networking block comments don't use an empty /* line, use /* Comment...

So I'll do that and shall scoot the patch Davewards.
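
For reference, the two comment layouts that checkpatch is distinguishing,
shown on placeholder functions (generic_style() and networking_style() are
made-up names used only to carry the comments):

```c
/*
 * Layout used by the patch as posted: the opening line holds only
 * the comment marker.
 */
static int generic_style(void)
{
	return 0;
}

/* Layout checkpatch asks for in networking code: the text starts on
 * the opening line.
 */
static int networking_style(void)
{
	return 0;
}
```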