On 2010-08-26, at 18:42, Jagga Soorma wrote:
I am still running into this issue on some nodes:
client109: ll_obdo_cache 0 152914489 208 19 1 : tunables 120 60 8 : slabdata 0 8048131 0
client102: ll_obdo_cache 0 308526883 208 19 1 : tunables 120 60 8 : slabdata 0 16238257 0
How can I calculate how much memory this is holding on to?
If you do "head -1 /proc/slabinfo" it reports the column descriptions.
The "slabdata" section reports numslabs=16238257, and pagesperslab=1, so
this is 16238257 pages of memory, or about 64GB of RAM on client102. Ouch.
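As a rough sketch of that arithmetic (not an official tool), the same
calculation can be done in Python. It assumes 4 KiB pages, which is the usual
x86_64 default but is not stated in this thread, and it assumes the
"client102: " prefix has already been stripped from the line. The function
name slab_memory_bytes is made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual x86_64 default


def slab_memory_bytes(slabinfo_line, page_size=PAGE_SIZE):
    """Return the bytes held by one cache, given its /proc/slabinfo line."""
    # Column layout, as reported by "head -1 /proc/slabinfo" style headers:
    #   name active_objs num_objs objsize objperslab pagesperslab
    #   : tunables limit batchcount sharedfactor
    #   : slabdata active_slabs num_slabs sharedavail
    head, _tunables, slabdata = slabinfo_line.split(":")
    pages_per_slab = int(head.split()[5])
    num_slabs = int(slabdata.split()[2])
    return num_slabs * pages_per_slab * page_size


# The client102 line quoted earlier in this thread.
line = ("ll_obdo_cache 0 308526883 208 19 1 : tunables 120 60 8 "
        ": slabdata 0 16238257 0")
held = slab_memory_bytes(line)
print(f"{held} bytes, or {held / 2**30:.1f} GiB")
```

With 4 KiB pages this comes out at roughly 62 GiB for client102, in line with
the "about 64GB" estimate above.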
My system shows a lot of memory that is being used up but none of the jobs are using that much memory. Also, these clients are running a smp sles 11 kernel but I can't find any /sys/kernel/slab directory.
Linux client102 2.6.27.29-0.1-default #1 SMP 2009-08-15 17:53:59 +0200 x86_64 x86_64 x86_64 GNU/Linux
What makes you say that this does not look like a lustre memory leak? I
thought all the ll_* objects in slabinfo are lustre related?
It's true that the ll_obdo_cache objects are allocated by Lustre, but the above
data shows 0 of those objects in use, so the kernel _should_ be freeing the
unused slab objects. This particular data type (obdo) is only ever in use
temporarily during system calls on the client, and should never be allocated
for a long time.
For some reason the kernel is not freeing the empty slab pages. That is the
responsibility of the kernel, and not Lustre.
To me it looks like lustre is holding on to this memory but I don't know much
about lustre internals.
Also, memused on these systems is:
client102: 2353666940
client109: 2421645924
This shows that Lustre is actively using about 2.4GB of memory allocations. It
is not tracking the 64GB of memory in the obdo_cache slab, because it has freed
that memory (even though the kernel has not freed those pages).
Any help would be greatly appreciated.
The only suggestion I have is that if you unmount Lustre and unload the modules
(lustre_rmmod) it will free up this memory. Otherwise, searching for problems
with the slab cache on this kernel may turn up something.
On Wed, May 19, 2010 at 8:08 AM, Dmitry Zogin <[email protected]> wrote:
Hello Jagga,
I checked the data, and indeed this does not look like a lustre memory leak,
but rather like slab fragmentation, which suggests there might be a kernel
issue here. From the slabinfo (I only keep the first four columns here):
name           <active_objs> <num_objs> <objsize>
ll_obdo_cache  0             452282156  208
This means that there are no active objects, but the memory pages are not
released back from the slab allocator to the free pool (the num_objs value is
huge). That looks like slab fragmentation. You can find a fuller description at
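To check for this pattern across all caches on a node, something like the
following Python sketch could be used. It flags caches that report zero active
objects but still hold many slabs. The function name idle_caches and the
min_slabs threshold are made up for illustration, and the "example_cache" line
in the sample is invented; only the ll_obdo_cache line comes from this thread.

```python
def idle_caches(slabinfo_text, min_slabs=1000):
    """Yield (name, num_objs, num_slabs) for caches that have no active
    objects but still hold at least min_slabs slabs."""
    for line in slabinfo_text.splitlines():
        # Skip the version line and the "# name ..." column header.
        if line.startswith(("#", "slabinfo")) or line.count(":") != 2:
            continue
        head, _tunables, slabdata = line.split(":")
        fields = head.split()
        name, active_objs, num_objs = fields[0], int(fields[1]), int(fields[2])
        num_slabs = int(slabdata.split()[2])
        if active_objs == 0 and num_slabs >= min_slabs:
            yield name, num_objs, num_slabs


# Sample input; on a real node this would be open("/proc/slabinfo").read().
sample = """slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
ll_obdo_cache 0 308526883 208 19 1 : tunables 120 60 8 : slabdata 0 16238257 0
example_cache 1000 1200 192 20 1 : tunables 120 60 8 : slabdata 60 60 0
"""

for name, num_objs, num_slabs in idle_caches(sample):
    print(f"{name}: 0 active of {num_objs} objects, {num_slabs} slabs retained")
```

Only ll_obdo_cache is reported for this sample, since the other cache still
has active objects.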
http://kerneltrap.org/Linux/Slab_Defragmentation
Checking your mails, I wonder if this only happens on clients which have
SLES11 installed? As the RAM size is around 192GB, I assume they are NUMA
systems?
If so, SLES11 has defrag_ratio tunables in /sys/kernel/slab/xxx.
From the source of get_any_partial():
#ifdef CONFIG_NUMA
/*
 * The defrag ratio allows a configuration of the tradeoffs between
 * inter node defragmentation and node local allocations. A lower
 * defrag_ratio increases the tendency to do local allocations
 * instead of attempting to obtain partial slabs from other nodes.
 *
 * If the defrag_ratio is set to 0 then kmalloc() always
 * returns node local objects. If the ratio is higher then kmalloc()
 * may return off node objects because partial slabs are obtained
 * from other nodes and filled up.
 *
 * If /sys/kernel/slab/xx/defrag_ratio is set to 100 (which makes
 * defrag_ratio = 1000) then every (well almost) allocation will
 * first attempt to defrag slab caches on other nodes. This means
 * scanning over all nodes to look for partial slabs which may be
 * expensive if we do it every time we are trying to find a slab
 * with available objects.
 */
Could you please verify that your clients have the defrag_ratio tunable and
try various values?
It looks like a value of 100 should be the best, unless there is a bug, in
which case maybe even 0 gets the desired result?
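A quick way to see whether the tunable exists, and what it is currently set
to, might look like the Python sketch below. It is only a sketch under
assumptions: the attribute name "defrag_ratio" is taken from the comment
quoted above, while mainline SLUB kernels, as far as I know, call it
"remote_node_defrag_ratio", so both are tried. The function name
read_defrag_ratios is made up for illustration. If the kernel was not built
with SLUB there is no /sys/kernel/slab directory at all and the result is
simply empty.

```python
from pathlib import Path

# Assumption: "defrag_ratio" is the name used in the quoted kernel comment;
# "remote_node_defrag_ratio" is the name used by mainline SLUB kernels.
CANDIDATE_NAMES = ("defrag_ratio", "remote_node_defrag_ratio")


def read_defrag_ratios(slab_root="/sys/kernel/slab"):
    """Return {cache name: (attribute file name, current value)} for every
    cache directory under slab_root that exposes one of the tunables."""
    root = Path(slab_root)
    found = {}
    if not root.is_dir():
        return found  # no SLUB sysfs tree on this kernel
    for cache_dir in sorted(root.iterdir()):
        for attr in CANDIDATE_NAMES:
            path = cache_dir / attr
            if not path.is_file():
                continue
            try:
                found[cache_dir.name] = (attr, int(path.read_text().strip()))
            except (OSError, ValueError):
                pass  # unreadable or non-numeric content; skip this cache
            break
    return found
```

Calling read_defrag_ratios().get("ll_obdo_cache") on a client would then show
the current setting, if any. Changing it would mean writing a number between
0 and 100 into the same file as root.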
Best regards,
Dmitry
Jagga Soorma wrote:
Hi Johann,
I am actually using 1.8.1 and not 1.8.2:
# rpm -qa | grep -i lustre
lustre-client-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
lustre-client-modules-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
My kernel version on the SLES 11 clients is:
# uname -r
2.6.27.29-0.1-default
My kernel version on the RHEL 5.3 mds/oss servers is:
# uname -r
2.6.18-128.7.1.el5_lustre.1.8.1.1
Please let me know if you need any further information. I am still trying to
get the user to help me run his app so that I can run the leak finder script to
capture more information.
Regards,
-Simran
On Tue, Apr 27, 2010 at 7:20 AM, Johann Lombardi <[email protected]> wrote:
Hi,
On Tue, Apr 20, 2010 at 09:08:25AM -0700, Jagga Soorma wrote:
Thanks for your response. I will try to run the leak-finder script and
hopefully it will point us in the right direction. This only seems to be
happening on some of my clients:
Could you please tell us what kernel you use on the client side?
client104: ll_obdo_cache 0 433506280 208 19 1 : tunables 120 60 8 : slabdata 0 22816120 0
client116: ll_obdo_cache 0 457366746 208 19 1 : tunables 120 60 8 : slabdata 0 24071934 0
client113: ll_obdo_cache 0 456778867 208 19 1 : tunables 120 60 8 : slabdata 0 24040993 0
client106: ll_obdo_cache 0 456372267 208 19 1 : tunables 120 60 8 : slabdata 0 24019593 0
client115: ll_obdo_cache 0 449929310 208 19 1 : tunables 120 60 8 : slabdata 0 23680490 0
client101: ll_obdo_cache 0 454318101 208 19 1 : tunables 120 60 8 : slabdata 0 23911479 0
--
Hopefully this should help. Not sure which application might be causing
the leaks. Currently R is the only app that users seem to be using
heavily on these clients. Will let you know what I find.
Tommi Tervo has filed a bugzilla ticket for this issue, see
https://bugzilla.lustre.org/show_bug.cgi?id=22701
Could you please add a comment to this ticket to describe the
behavior of the application "R" (fork many threads, write to
many files, use direct i/o, ...)?
Cheers,
Johann
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.