Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage

2014-01-21 Thread Dilger, Andreas
On 2014/01/17, 8:17 AM, Greg Kroah-Hartman gre...@linuxfoundation.org wrote:

On Fri, Jan 17, 2014 at 05:51:28PM +0300, Dan Carpenter wrote:
 We will want to get rid of lustre's custom allocator before this gets
 out of staging.
 
 But one feature that the lustre allocator has which is pretty neat is
 that it lets you debug how much memory the filesystem is using.  Is
 there a standard way to find this information?

Create your own mempool/slab/whatever_it's_called and look in the
debugfs or proc files for the allocator usage, depending on the memory
allocator the kernel is using.

That's how the rest of the kernel does it, no reason lustre should be
any different.
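
A minimal sketch of what Greg is suggesting, using a hypothetical
struct foo (the names are illustrative, not actual Lustre code); the
dedicated cache then shows up by name in /proc/slabinfo:

#include <linux/slab.h>

struct foo {
	int a;
	long b;
};

static struct kmem_cache *foo_cachep;

/* "lustre_foo" gets its own line in /proc/slabinfo, so the object
 * count and memory footprint are visible for this one structure. */
static int foo_caches_init(void)
{
	foo_cachep = kmem_cache_create("lustre_foo", sizeof(struct foo),
				       0, SLAB_HWCACHE_ALIGN, NULL);
	return foo_cachep ? 0 : -ENOMEM;
}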

The Lustre allocation macros track the memory usage across the whole
filesystem, not just of a single structure as a mempool/slab/whatever
would do.  This is useful to know for debugging purposes (e.g. a user
complains about not having enough RAM for their highly-tuned
application, or to check for leaks at unmount).
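
A rough sketch of the kind of wrapper being described, with
illustrative names (these are not the actual libcfs macros):

#include <linux/slab.h>
#include <linux/atomic.h>

/* One counter for the whole filesystem, bumped by every wrapped
 * allocation and dropped by every wrapped free. */
static atomic_t fs_alloc_bytes = ATOMIC_INIT(0);

#define FS_ALLOC_PTR(ptr, size)					\
do {								\
	(ptr) = kmalloc((size), GFP_KERNEL);			\
	if ((ptr))						\
		atomic_add((int)(size), &fs_alloc_bytes);	\
} while (0)

#define FS_FREE_PTR(ptr, size)					\
do {								\
	atomic_sub((int)(size), &fs_alloc_bytes);		\
	kfree(ptr);						\
} while (0)

Note the single shared atomic_t: that design choice is exactly what
draws criticism later in the thread.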

It can also log the alloc/free calls and post-process them to find
leaks easily, or to find pieces of code that allocate too much memory
and are not using dedicated slabs.  This also works if you encounter
a system with a lot of allocated memory: enable free logging, and
then unmount the filesystem.  The logs will show which structures are
being freed (assuming they are not leaked completely) and point you
to whatever is not being shrunk properly.
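
A sketch of what the logging half could look like, with the
post-processing left to a script that pairs alloc and free lines
(again illustrative names, not the real macros):

#include <linux/printk.h>

/* Emit one line per alloc/free; a script can then pair them up, and
 * any allocation that never gets a matching "free" line is a leak. */
#define FS_LOG_ALLOC(ptr, size)						\
	pr_debug("alloc %p %zu %s:%d\n", (void *)(ptr),			\
		 (size_t)(size), __func__, __LINE__)

#define FS_LOG_FREE(ptr, size)						\
	pr_debug("free %p %zu %s:%d\n", (void *)(ptr),			\
		 (size_t)(size), __func__, __LINE__)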

I don't know if there is any way to track this with regular
kmalloc(), and creating separate slabs for every data structure would
be ugly.  The generic /proc/meminfo data doesn't really tell you what
is using all the memory, and the size-* slabs give some information,
but are used all over the kernel.

I'm pretty much resigned to losing all of this functionality, but it
definitely has been very useful for finding problems.

Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division




Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage

2014-01-21 Thread Dan Carpenter
Greg meant mempools, not slabs.  You should look at it.  It does what you
need for debugging and solves a couple other problems as well.

We have a leak checker in the kernel but most people disable it.  I
forget the config name.  There are a bunch of useful debug configs.

regards,
dan carpenter


Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage

2014-01-21 Thread Dave Hansen
On 01/21/2014 12:02 PM, Dilger, Andreas wrote:
 The Lustre allocation macros track the memory usage across the whole
 filesystem, not just of a single structure as a mempool/slab/whatever
 would do.  This is useful to know for debugging purposes (e.g. a user
 complains about not having enough RAM for their highly-tuned
 application, or to check for leaks at unmount).

Urg, it does this with a global variable.  If we did this kind of thing
generically, we'd get eaten alive by all of the cacheline bouncing from
that atomic.  It's also a 32-bit atomic.  Guess it's never had more than
4GB of memory tracked in there. :)
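
For contrast, a sketch of the usual remedy, assuming the 3.13-era
percpu_counter API (it later grew a gfp_t argument): adds go to
per-cpu deltas and only fold into the shared count in batches, so
concurrent allocations don't bounce one cacheline, and the sum is
64-bit.

#include <linux/percpu_counter.h>

static struct percpu_counter fs_alloc_bytes;

static int fs_track_init(void)
{
	/* 3.13-era signature; 3.18+ adds a gfp_t parameter. */
	return percpu_counter_init(&fs_alloc_bytes, 0);
}

static void fs_track_alloc(size_t size)
{
	percpu_counter_add(&fs_alloc_bytes, size);
}

static s64 fs_track_read(void)
{
	return percpu_counter_sum(&fs_alloc_bytes); /* exact but slower */
}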

This also doesn't track overhead from things that *are* in slabs like
the inodes, or the 14 kmem_caches that lustre has, so it's far from a
complete picture of how lustre is using memory.

 It can also log the alloc/free calls and post-process them to find
 leaks easily, or to find pieces of code that allocate too much memory
 and are not using dedicated slabs.  This also works if you encounter
 a system with a lot of allocated memory: enable free logging, and
 then unmount the filesystem.  The logs will show which structures are
 being freed (assuming they are not leaked completely) and point you
 to whatever is not being shrunk properly.

This isn't perfect, but it does cover most of the ext4 call sites in my
kernel.  It would work better for a module, I'd imagine:

cd /sys/kernel/debug/tracing/events/kmem
echo -n 'call_site < 0x81e0af00 && call_site >= 0x81229cf0' > kmalloc/filter
echo 1 > kmalloc/enable
cat /sys/kernel/debug/tracing/trace_pipe

It will essentially log all the kmalloc() calls from the ext4 code.  I
got the call site locations from grepping System.map.  It would be
_really_ nice if we were able to do something like:

echo 'call_site =~ *ext4*'

but there's no way to do that which I know of.  You could probably rig
something up with the function graph tracer by triggering only on entry
to the lustre code.

 I don't know if there is any way to track this with regular
 kmalloc(), and creating separate slabs for every data structure would
 be ugly.  The generic /proc/meminfo data doesn't really tell you what
 is using all the memory, and the size-* slabs give some information,
 but are used all over the kernel.

 I'm pretty much resigned to losing all of this functionality, but it
 definitely has been very useful for finding problems.

Yeah, it is hard to find out who is responsible for leaking pages or
kmalloc()s, especially after the fact.  But, we seem to limp along just
fine.  If lustre is that bad of a memory leaker that it *NEEDS* this
feature, we have bigger problems on our hands. :)





Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage

2014-01-21 Thread Drokin, Oleg
Hello!

On Jan 21, 2014, at 3:16 PM, Dan Carpenter wrote:

 We have a leak checker in the kernel but most people disable it.  I
 forget the config name.  There are a bunch of useful debug configs.

I actually use it at times too, and it's useful (e.g. it works even if
you did not wrap the allocation in the lustre macro, and this did
happen before), but it comes with its own problems.
E.g. it gobbles up memory like there's no tomorrow (at least in my
case), it has a horrible false-failure ratio with zfs in place (and
probably that's why it uses even more memory then), and it has an
otherwise quite significant failure ratio too.
But yes, it is useful and I am glad it's there.  It does not solve
some of the other use cases outlined by Andreas, though.

Bye,
Oleg


Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage

2014-01-21 Thread Drokin, Oleg
Hello!

On Jan 21, 2014, at 4:15 PM, Dave Hansen wrote:

 On 01/21/2014 12:02 PM, Dilger, Andreas wrote:
 The Lustre allocation macros track the memory usage across the whole
 filesystem, not just of a single structure as a mempool/slab/whatever
 would do.  This is useful to know for debugging purposes (e.g. a user
 complains about not having enough RAM for their highly-tuned
 application, or to check for leaks at unmount).
 Urg, it does this with a global variable.  If we did this kind of thing
 generically, we'd get eaten alive by all of the cacheline bouncing from
 that atomic.  It's also a 32-bit atomic.  Guess it's never had more than
 4GB of memory tracked in there. :)

No, hopefully we'll never get to be this memory hungry on a single node.
Good point about the cacheline, I guess.

 This also doesn't track overhead from things that *are* in slabs like
 the inodes, or the 14 kmem_caches that lustre has, so it's far from a
 complete picture of how lustre is using memory.

The inodes are per filesystem, so we do have a complete picture there.
dentries and some other structures are shared, but e.g. on a client
lustre is frequently the only active FS in use, and as such it's easy.

 It can also log the alloc/free calls and post-process them to find
 leaks easily, or to find pieces of code that allocate too much memory
 and are not using dedicated slabs.  This also works if you encounter
 a system with a lot of allocated memory: enable free logging, and
 then unmount the filesystem.  The logs will show which structures are
 being freed (assuming they are not leaked completely) and point you
 to whatever is not being shrunk properly.
 
 This isn't perfect, but it does cover most of the ext4 call sites in my
 kernel.  It would work better for a module, I'd imagine:
 
 cd /sys/kernel/debug/tracing/events/kmem
 echo -n 'call_site < 0x81e0af00 && call_site >= 0x81229cf0' > kmalloc/filter
 echo 1 > kmalloc/enable
 cat /sys/kernel/debug/tracing/trace_pipe

That's a neat trick.

 It will essentially log all the kmalloc() calls from the ext4 code.  I
 got the call site locations from grepping System.map.  It would be
 _really_ nice if we were able to do something like:
 
   echo 'call_site =~ *ext4*'

So basically module address plus len from /proc/modules should do for
modular features?  Should be easy to script.

Is there any neat way to enable this after a module is loaded, but
before it gets any control, so that there's a full track of all
allocations and deallocations?  (Obviously that's only going to be
used for debugging.)

 Yeah, it is hard to find out who is responsible for leaking pages or
 kmalloc()s, especially after the fact.  But, we seem to limp along just
 fine.  If lustre is that bad of a memory leaker that it *NEEDS* this
 feature, we have bigger problems on our hands. :)

It's one of those things that you don't need often, but that does come
in very handy once the need arises ;)
It's also nice that it warns you on module unload that "hey, you left
this much stuff behind, bad boy!".
But we can live without it, that's true too.

Bye,
Oleg



Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage

2014-01-20 Thread Marek Szyprowski

Hello,

On 2014-01-17 15:33, Greg Kroah-Hartman wrote:

On Fri, Jan 17, 2014 at 09:46:56AM +0100, Marek Szyprowski wrote:
 GFP_ATOMIC is not a single gfp flag, but a macro which expands to other
 flags and the LACK of the __GFP_WAIT flag.  To check if the caller
 wanted to perform an atomic allocation, the code must test for the
 presence of the __GFP_WAIT flag.

 Signed-off-by: Marek Szyprowski m.szyprow...@samsung.com
 ---
  .../lustre/include/linux/libcfs/libcfs_private.h   | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
 index d0d942c..dddccca1 100644
 --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
 +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
 @@ -120,7 +120,7 @@ do {						\
  do {							\
  	LASSERT(!in_interrupt() ||			\
  		((size) <= LIBCFS_VMALLOC_SIZE &&	\
 -		 ((mask) & GFP_ATOMIC)) != 0);		\
 +		 ((mask) & __GFP_WAIT) == 0));		\
  } while (0)
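
A minimal sketch of the distinction the patch relies on, assuming the
2014-era gfp.h where GFP_ATOMIC expands to __GFP_HIGH and any
allocation that may block carries __GFP_WAIT (the function name here
is illustrative):

#include <linux/gfp.h>

static inline bool alloc_may_sleep(gfp_t mask)
{
	/* GFP_KERNEL contains __GFP_WAIT; GFP_ATOMIC does not.  Testing
	 * (mask & GFP_ATOMIC) only tests __GFP_HIGH, which is why the
	 * old assert was wrong. */
	return (mask & __GFP_WAIT) != 0;
}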

What a horrible assert, can't we just remove this entirely?
in_interrupt() usually should never be checked, if so, the code is doing
something wrong.  And __GFP flags shouldn't be used on their own.


Well, I prepared this patch when I was checking kernel sources for
incorrect GFP_ATOMIC usage.  I agree that drivers should use generic
memory allocation methods instead of inventing their own stuff.  Feel
free to ignore my patch in favor of removing this custom allocator
entirely.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland



Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage

2014-01-17 Thread Greg Kroah-Hartman
On Fri, Jan 17, 2014 at 09:46:56AM +0100, Marek Szyprowski wrote:
 GFP_ATOMIC is not a single gfp flag, but a macro which expands to other
 flags and the LACK of the __GFP_WAIT flag.  To check if the caller
 wanted to perform an atomic allocation, the code must test for the
 presence of the __GFP_WAIT flag.
 
 Signed-off-by: Marek Szyprowski m.szyprow...@samsung.com
 ---
  .../lustre/include/linux/libcfs/libcfs_private.h   | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
 index d0d942c..dddccca1 100644
 --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
 +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
 @@ -120,7 +120,7 @@ do {						\
  do {							\
  	LASSERT(!in_interrupt() ||			\
  		((size) <= LIBCFS_VMALLOC_SIZE &&	\
 -		 ((mask) & GFP_ATOMIC)) != 0);		\
 +		 ((mask) & __GFP_WAIT) == 0));		\
  } while (0)

What a horrible assert, can't we just remove this entirely?
in_interrupt() usually should never be checked, if so, the code is doing
something wrong.  And __GFP flags shouldn't be used on their own.

thanks,

greg k-h


Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage

2014-01-17 Thread Dan Carpenter
We will want to get rid of lustre's custom allocator before this gets
out of staging.

But one feature that the lustre allocator has which is pretty neat is
that it lets you debug how much memory the filesystem is using.  Is
there a standard way to find this information?

regards,
dan carpenter



Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage

2014-01-17 Thread Greg Kroah-Hartman
On Fri, Jan 17, 2014 at 05:51:28PM +0300, Dan Carpenter wrote:
 We will want to get rid of lustre's custom allocator before this gets
 out of staging.
 
 But one feature that the lustre allocator has which is pretty neat is
 that it lets you debug how much memory the filesystem is using.  Is
 there a standard way to find this information?

Create your own mempool/slab/whatever_it's_called and look in the
debugfs or proc files for the allocator usage, depending on the memory
allocator the kernel is using.

That's how the rest of the kernel does it, no reason lustre should be
any different.

thanks,

greg k-h