Re: [Lustre-discuss] Lustre Client - Memory Issue

Dmitry Zogin Mon, 30 Aug 2010 18:52:50 -0700

Actually, there was a bug fixed in 1.8.4 where obdo structures could be allocated and freed outside of the OBDO_ALLOC/OBDO_FREE macros. That could lead to slab fragmentation and a pseudo-leak.

The patch is in the attachment 30664 for bz 21980

Dmitry



Andreas Dilger wrote:

On 2010-08-26, at 18:42, Jagga Soorma wrote:

I am still running into this issue on some nodes:

client109: ll_obdo_cache          0 152914489    208   19    1 : tunables  120   60    8 : slabdata      0 8048131      0
client102: ll_obdo_cache          0 308526883    208   19    1 : tunables  120   60    8 : slabdata      0 16238257      0

How can I calculate how much memory this is holding on to?


If you do "head -1 /proc/slabinfo" it reports the column descriptions.

The "slabdata" section reports numslabs=16238257, and pagesperslab=1, so 
this is 16238257 pages of memory, or about 64GB of RAM (with 4KB pages) on client102.  Ouch.
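
The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not a Lustre or kernel tool: the helper name slab_bytes is made up for this example, and the 4096-byte page size is an assumption (the usual x86_64 default). It reads pagesperslab from the field just before "tunables" and numslabs from the second field after "slabdata".

```python
# Minimal sketch: estimate the memory held by one slab cache from a single
# /proc/slabinfo line. PAGE_SIZE is an assumption (4 KiB, the x86_64 default);
# slab_bytes is a hypothetical helper name, not part of any existing tool.
PAGE_SIZE = 4096

CLIENT109 = ("client109: ll_obdo_cache          0 152914489    208   19    1 "
             ": tunables  120   60    8 : slabdata      0 8048131      0")
CLIENT102 = ("client102: ll_obdo_cache          0 308526883    208   19    1 "
             ": tunables  120   60    8 : slabdata      0 16238257      0")


def slab_bytes(line, page_size=PAGE_SIZE):
    """Return num_slabs * pagesperslab * page_size for one slabinfo line."""
    tokens = line.split()
    # "... <objperslab> <pagesperslab> : tunables ..." -> two tokens before "tunables"
    pages_per_slab = int(tokens[tokens.index("tunables") - 2])
    # "slabdata <active_slabs> <num_slabs> <sharedavail>" -> second token after "slabdata"
    num_slabs = int(tokens[tokens.index("slabdata") + 2])
    return num_slabs * pages_per_slab * page_size


if __name__ == "__main__":
    for line in (CLIENT109, CLIENT102):
        host = line.split(":")[0]
        print(host, "%.1f GiB" % (slab_bytes(line) / 2**30))
```

The same function can be applied to every line of /proc/slabinfo after the two header lines to find which caches hold the most pages.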

My system shows a lot of memory being used, but none of the jobs are using that much memory. Also, these clients are running an SMP SLES 11 kernel, but I can't find any /sys/kernel/slab directory.
Linux client102 2.6.27.29-0.1-default #1 SMP 2009-08-15 17:53:59 +0200 x86_64 x86_64 x86_64 GNU/Linux

What makes you say that this does not look like a lustre memory leak?  I 
thought all the ll_* objects in slabinfo are lustre related?


It's true that the ll_obdo_cache objects are allocated by Lustre, but the above 
data shows 0 of those objects in use, so the kernel _should_ be freeing the 
unused slab objects.  This particular data type (obdo) is only ever in use 
temporarily during system calls on the client, and should never be allocated 
for a long time.

For some reason the kernel is not freeing the empty slab pages.  That is the 
responsibility of the kernel, and not Lustre.

 To me it looks like lustre is holding on to this memory but I don't know much 
about lustre internals.

Also, memused on these systems is:

client102: 2353666940
client109: 2421645924


This shows that Lustre is actively using about 2.4GB of memory allocations.  It 
is not tracking the 64GB of memory in the obdo_cache slab, because it has freed 
that memory (even though the kernel has not freed those pages).

Any help would be greatly appreciated.


The only suggestion I have is that if you unmount Lustre and unload the modules 
(lustre_rmmod) it will free up this memory.  Otherwise, searching for problems 
with the slab cache on this kernel may turn up something.

On Wed, May 19, 2010 at 8:08 AM, Dmitry Zogin <[email protected]> wrote:
Hello Jagga,

I checked the data, and indeed this does not look like a Lustre memory leak, 
but rather like slab fragmentation, which suggests there might be a kernel issue 
here. From the slabinfo (I only keep the first three columns here):


name            <active_objs> <num_objs> <objsize>
ll_obdo_cache               0  452282156       208

means that there are no active objects, but the memory pages are not released from the slab allocator back to the free pool (the num_objs value is huge). That looks like slab fragmentation; you can find a fuller description at http://kerneltrap.org/Linux/Slab_Defragmentation
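
As a rough illustration of that reading, the sketch below computes how many bytes of allocated but inactive objects a row like the one above represents. The function name unused_object_bytes is hypothetical, and the result counts object bytes only, so the pages actually pinned by the slab allocator will be somewhat larger.

```python
# Minimal sketch: bytes of allocated-but-inactive objects in one slab cache,
# computed from the first columns of a /proc/slabinfo row
# (name, active_objs, num_objs, objsize). Hypothetical helper, for illustration.
def unused_object_bytes(row):
    """Return (num_objs - active_objs) * objsize for one slabinfo row."""
    name, active_objs, num_objs, objsize = row.split()[:4]
    return (int(num_objs) - int(active_objs)) * int(objsize)


if __name__ == "__main__":
    row = "ll_obdo_cache          0 452282156    208"
    print("%.1f GB of unused objects" % (unused_object_bytes(row) / 1e9))
```

A cache with zero active objects and a large value here is the pattern described in this message.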


Looking through your emails, I wonder whether this only happens on clients that have 
SLES11 installed. As the RAM size is around 192GB, I assume they are NUMA 
systems?
If so, SLES11 has defrag_ratio tunables in /sys/kernel/slab/xxx
From the source of get_any_partial():

#ifdef CONFIG_NUMA

        /*
         * The defrag ratio allows a configuration of the tradeoffs between
         * inter node defragmentation and node local allocations. A lower
         * defrag_ratio increases the tendency to do local allocations
         * instead of attempting to obtain partial slabs from other nodes.
         *
         * If the defrag_ratio is set to 0 then kmalloc() always
         * returns node local objects. If the ratio is higher then kmalloc()
         * may return off node objects because partial slabs are obtained
         * from other nodes and filled up.
         *
         * If /sys/kernel/slab/xx/defrag_ratio is set to 100 (which makes
         * defrag_ratio = 1000) then every (well almost) allocation will
         * first attempt to defrag slab caches on other nodes. This means
         * scanning over all nodes to look for partial slabs which may be
         * expensive if we do it every time we are trying to find a slab
         * with available objects.
         */

Could you please verify that your clients have the defrag_ratio tunable and try 
various values?
It looks like a value of 100 should be the best, unless there is a bug, in which 
case maybe even 0 gets the desired result?
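
One way to check for the tunable is sketched below. It is a hedged example only: the function name read_defrag_ratios is invented here, the file name defrag_ratio is taken from this message, and the alternative name remote_node_defrag_ratio is an assumption about what other kernel versions may use, so verify it on the target system. On a kernel without a /sys/kernel/slab directory the function simply returns an empty result.

```python
# Minimal sketch: list per-cache defrag ratio tunables under a slab sysfs
# directory. The candidate file names are assumptions: "defrag_ratio" comes
# from this thread; "remote_node_defrag_ratio" may be used by other kernels.
from pathlib import Path

CANDIDATE_NAMES = ("defrag_ratio", "remote_node_defrag_ratio")


def read_defrag_ratios(base="/sys/kernel/slab", names=CANDIDATE_NAMES):
    """Return {tunable path: integer value} for every cache that has one."""
    found = {}
    root = Path(base)
    if not root.is_dir():
        # No such directory on this kernel: nothing to report.
        return found
    for cache_dir in sorted(root.iterdir()):
        for name in names:
            tunable = cache_dir / name
            if tunable.is_file():
                found[str(tunable)] = int(tunable.read_text().strip())
    return found


if __name__ == "__main__":
    ratios = read_defrag_ratios()
    if not ratios:
        print("no defrag ratio tunables found")
    for path, value in ratios.items():
        print(path, value)
```

Changing a value would normally mean writing a number to the corresponding file as root; that step is left out here because it alters kernel behaviour.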

Best regards,
Dmitry


Jagga Soorma wrote:

Hi Johann,

I am actually using 1.8.1 and not 1.8.2:

# rpm -qa | grep -i lustre
lustre-client-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
lustre-client-modules-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default

My kernel version on the SLES 11 clients is:
# uname -r
2.6.27.29-0.1-default

My kernel version on the RHEL 5.3 mds/oss servers is:
# uname -r
2.6.18-128.7.1.el5_lustre.1.8.1.1

Please let me know if you need any further information.  I am still trying to 
get the user to help me run his app so that I can run the leak finder script to 
capture more information.

Regards,
-Simran

On Tue, Apr 27, 2010 at 7:20 AM, Johann Lombardi <[email protected]> wrote:
Hi,

On Tue, Apr 20, 2010 at 09:08:25AM -0700, Jagga Soorma wrote:

Thanks for your response.  I will try to run the leak-finder script and
hopefully it will point us in the right direction.  This only seems to be
happening on some of my clients:

Could you please tell us what kernel you use on the client side?

   client104: ll_obdo_cache          0 433506280    208   19    1 : tunables  120   60    8 : slabdata      0 22816120      0
   client116: ll_obdo_cache          0 457366746    208   19    1 : tunables  120   60    8 : slabdata      0 24071934      0
   client113: ll_obdo_cache          0 456778867    208   19    1 : tunables  120   60    8 : slabdata      0 24040993      0
   client106: ll_obdo_cache          0 456372267    208   19    1 : tunables  120   60    8 : slabdata      0 24019593      0
   client115: ll_obdo_cache          0 449929310    208   19    1 : tunables  120   60    8 : slabdata      0 23680490      0
   client101: ll_obdo_cache          0 454318101    208   19    1 : tunables  120   60    8 : slabdata      0 23911479      0
   --

   Hopefully this should help.  Not sure which application might be causing
   the leaks.  Currently R is the only app that users seem to be using
   heavily on these clients.  Will let you know what I find.

Tommi Tervo has filed a bugzilla ticket for this issue, see
https://bugzilla.lustre.org/show_bug.cgi?id=22701

Could you please add a comment to this ticket to describe the
behavior of the application "R" (fork many threads, write to
many files, use direct i/o, ...)?

Cheers,
Johann


_______________________________________________
Lustre-discuss mailing list

[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss



Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
