Re: [Bug 200651] New: cgroups iptables-restor: vmalloc: allocation failure
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 25 Jul 2018 11:42:57 + bugzilla-dae...@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=200651
>
>             Bug ID: 200651
>            Summary: cgroups iptables-restor: vmalloc: allocation failure

Thanks.  Please do note the above request.

>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 4.14
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>           Assignee: a...@linux-foundation.org
>           Reporter: gniko...@icdsoft.com
>         Regression: No
>
> Created attachment 277505
>   --> https://bugzilla.kernel.org/attachment.cgi?id=277505=edit
> iptables save
>
> After creating large number of cgroups and under memory pressure, iptables
> command fails with following error:
>
> "iptables-restor: vmalloc: allocation failure, allocated 3047424 of 3465216
> bytes, mode:0x14010c0(GFP_KERNEL|__GFP_NORETRY), nodemask=(null)"

I'm not sure what the problem is here, apart from iptables being
over-optimistic about vmalloc()'s abilities.

Are cgroups having any impact on this, or is it simply vmalloc arena
fragmentation, and the iptables code should use some data structure
more sophisticated than a massive array?

Maybe all that cgroup metadata is contributing to the arena
fragmentation, but those allocations will be small and the two systems
should be able to live alongside each other, by being realistic about
vmalloc.

> System which is used to reproduce the bug is with 2 vcpus and 2GB of ram, but
> it happens on more powerfull systems.
>
> Steps to reproduce:
>
> mkdir /cgroup
> mount cgroup -t cgroup -omemory,pids,blkio,cpuacct /cgroup
> for a in `seq 1 1000`; do for b in `seq 1 4` ; do mkdir -p
> "/cgroup/user/$a/$b"; done; done
>
> Then in separate consoles
>
> cat /dev/vda > /dev/null
> ./test
> ./test
> i=0;while sleep 0 ; do iptables-restore < iptables.save ; i=$(($i+1)); echo
> $i;
> done
>
> Here is the source of "test" program and attached iptables.save. It happens
> also with smaller iptables.save file.
>
> #include <stdlib.h>
> #include <time.h>
>
> int main(void) {
>
>     srand(time(NULL));
>     int i = 0, j = 0, randnum=0;
>     int arr[6] = { 3072, 7168, 15360 , 31744, 64512, 130048};
>     while(1) {
>
>         for (i = 0; i < 6 ; i++) {
>
>             int *ptr = (int*) malloc(arr[i] * 93);
>
>             for(j = 0 ; j < arr[i] * 93 / sizeof(int); j++) {
>                 *(ptr+j) = j+1;
>             }
>
>             free(ptr);
>         }
>     }
> }
>

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: simplify procfs code for seq_file instances
On Tue, 24 Apr 2018 16:23:04 +0200 Christoph Hellwig wrote:

> On Thu, Apr 19, 2018 at 09:57:50PM +0300, Alexey Dobriyan wrote:
> > >     git://git.infradead.org/users/hch/misc.git proc_create
> >
> >
> > I want to ask if it is time to start using poorman function overloading
> > with _b_c_e(). There are millions of allocation functions for example,
> > all slightly difference, and people will add more. Seeing /proc interfaces
> > doubled like this is painful.
>
> Function overloading is totally unacceptable.
>
> And I very much disagree with a tradeoff that keeps 5000 lines of
> code vs a few new helpers.

OK, the curiosity and suspense are killing me.  What the heck is
"function overloading with _b_c_e()"?
Re: [netfilter-core] kernel panic: Out of memory and no killable processes... (2)
On Wed, 7 Feb 2018 18:44:39 +0100 Pablo Neira Ayuso <pa...@netfilter.org> wrote:

> Hi,
>
> On Wed, Jan 31, 2018 at 09:19:16AM +0100, Michal Hocko wrote:
> [...]
> > Yeah, we do not BUG but rather fail instead. See __vmalloc_node_range.
> > My excavation tools pointed me to "VM: Rework vmalloc code to support
> > mapping of arbitray pages"
> > by Christoph back in 2002. So yes, we can safely remove it finally. Se
> > below.
> >
> >
> > From 8d52e1d939d101b0dafed6ae5c3c1376183e65bb Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mho...@suse.com>
> > Date: Wed, 31 Jan 2018 09:16:56 +0100
> > Subject: [PATCH] net/netfilter/x_tables.c: remove size check
> >
> > Back in 2002 vmalloc used to BUG on too large sizes. We are much better
> > behaved these days and vmalloc simply returns NULL for those. Remove
> > the check as it simply not needed and the comment even misleading.
> >
> > Suggested-by: Andrew Morton <a...@linux-foundation.org>
> > Signed-off-by: Michal Hocko <mho...@suse.com>
> > ---
> >  net/netfilter/x_tables.c | 4 ----
> >  1 file changed, 4 deletions(-)
> >
> > diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
> > index b55ec5aa51a6..48a6ff620493 100644
> > --- a/net/netfilter/x_tables.c
> > +++ b/net/netfilter/x_tables.c
> > @@ -999,10 +999,6 @@ struct xt_table_info *xt_alloc_table_info(unsigned int
> > size)
> >  	if (sz < sizeof(*info))
> >  		return NULL;
> >
> > -	/* Pedantry: prevent them from hitting BUG() in vmalloc.c --RR */
> > -	if ((SMP_ALIGN(size) >> PAGE_SHIFT) + 2 > totalram_pages)
> > -		return NULL;
> > -
> >  	/* __GFP_NORETRY is not fully supported by kvmalloc but it should
> >  	 * work reasonably well if sz is too large and bail out rather
> >  	 * than shoot all processes down before realizing there is nothing
>
> Patchwork didn't catch this patch for some reason, would you mind to
> resend?

From: Michal Hocko <mho...@suse.com>
Subject: net/netfilter/x_tables.c: remove size check

Back in 2002 vmalloc used to BUG on too large sizes.  We are much better
behaved these days and vmalloc simply returns NULL for those.  Remove the
check as it simply not needed and the comment is even misleading.

Link: http://lkml.kernel.org/r/20180131081916.go21...@dhcp22.suse.cz
Suggested-by: Andrew Morton <a...@linux-foundation.org>
Signed-off-by: Michal Hocko <mho...@suse.com>
Reviewed-by: Andrew Morton <a...@linux-foundation.org>
Cc: Florian Westphal <f...@strlen.de>
Cc: David S. Miller <da...@davemloft.net>
Signed-off-by: Andrew Morton <a...@linux-foundation.org>
---

 net/netfilter/x_tables.c |    4 ----
 1 file changed, 4 deletions(-)

diff -puN net/netfilter/x_tables.c~net-netfilter-x_tablesc-remove-size-check net/netfilter/x_tables.c
--- a/net/netfilter/x_tables.c~net-netfilter-x_tablesc-remove-size-check
+++ a/net/netfilter/x_tables.c
@@ -1004,10 +1004,6 @@ struct xt_table_info *xt_alloc_table_inf
 	if (sz < sizeof(*info))
 		return NULL;

-	/* Pedantry: prevent them from hitting BUG() in vmalloc.c --RR */
-	if ((size >> PAGE_SHIFT) + 2 > totalram_pages)
-		return NULL;
-
 	/* __GFP_NORETRY is not fully supported by kvmalloc but it should
 	 * work reasonably well if sz is too large and bail out rather
 	 * than shoot all processes down before realizing there is nothing
_
Re: [netfilter-core] kernel panic: Out of memory and no killable processes... (2)
On Tue, 30 Jan 2018 15:01:04 +0100 Michal Hocko wrote:

> > Well, this is not about syzkaller, it merely pointed out a potential
> > DoS... And that has to be addressed somehow.
>
> So how about this?
> ---

argh ;)

> >From d48e950f1b04f234b57b9e34c363bdcfec10aeee Mon Sep 17 00:00:00 2001
> From: Michal Hocko
> Date: Tue, 30 Jan 2018 14:51:07 +0100
> Subject: [PATCH] net/netfilter/x_tables.c: make allocation less aggressive
>
> syzbot has noticed that xt_alloc_table_info can allocate a lot of
> memory. This is an admin only interface but an admin in a namespace
> is sufficient as well. eacd86ca3b03 ("net/netfilter/x_tables.c: use
> kvmalloc() in xt_alloc_table_info()") has changed the opencoded
> kmalloc->vmalloc fallback into kvmalloc. It has dropped __GFP_NORETRY on
> the way because vmalloc has simply never fully supported __GFP_NORETRY
> semantic. This is still the case because e.g. page tables backing the
> vmalloc area are hardcoded GFP_KERNEL.
>
> Revert back to __GFP_NORETRY as a poors man defence against excessively
> large allocation request here. We will not rule out the OOM killer
> completely but __GFP_NORETRY should at least stop the large request
> in most cases.
>
> Fixes: eacd86ca3b03 ("net/netfilter/x_tables.c: use kvmalloc() in
> xt_alloc_table_info()")
> Signed-off-by: Michal Hocko
> ---
>  net/netfilter/x_tables.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
> index d8571f414208..a5f5c29bcbdc 100644
> --- a/net/netfilter/x_tables.c
> +++ b/net/netfilter/x_tables.c
> @@ -1003,7 +1003,13 @@ struct xt_table_info *xt_alloc_table_info(unsigned int
> size)
>  	if ((SMP_ALIGN(size) >> PAGE_SHIFT) + 2 > totalram_pages)
>  		return NULL;

offtopic: preceding comment here is "prevent them from hitting BUG() in
vmalloc.c".  I suspect this is ancient code and vmalloc sure as heck
shouldn't go BUG with this input.  And it should be using `sz' ;)

So I suspect and hope that this code can be removed.  If not, let's fix
vmalloc!

>
> -	info = kvmalloc(sz, GFP_KERNEL);
> +	/*
> +	 * __GFP_NORETRY is not fully supported by kvmalloc but it should
> +	 * work reasonably well if sz is too large and bail out rather
> +	 * than shoot all processes down before realizing there is nothing
> +	 * more to reclaim.
> +	 */
> +	info = kvmalloc(sz, GFP_KERNEL | __GFP_NORETRY);
>  	if (!info)
>  		return NULL;

checkpatch sayeth networking block comments don't use an empty /* line,
use /* Comment...

So I'll do that and shall scoot the patch Davewards.