Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

Oleksandr Natalenko Wed, 06 Jan 2016 00:28:51 -0800

OK, here is the Valgrind log of patched Ganesha (I took a recent version of your patchset, 8685abfc6d) with Entries_HWMARK set to 500.


https://gist.github.com/5397c152a259b9600af0

I see no huge runtime leaks now. However, I've repeated this test with another replicated volume and got the following Ganesha error:

===

ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >= nlookup' failed.

===

06.01.2016 08:40, Soumya Koduri wrote:

On 01/06/2016 03:53 AM, Oleksandr Natalenko wrote:
OK, I've repeated the same traversal test with the patched GlusterFS API, and
here is the new Valgrind log:

https://gist.github.com/17ecb16a11c9aed957f5
Fuse mount doesn't use gfapi helper. Does your above GlusterFS API
application call glfs_fini() during exit? glfs_fini() is responsible
for freeing the memory consumed by gfAPI applications.

Could you repeat the test with nfs-ganesha (which for sure calls
glfs_fini() and purges inodes if it exceeds its inode cache limit), if
possible?

Thanks,
Soumya
Still leaks.

On Tuesday, 5 January 2016, 22:52:25 EET Soumya Koduri wrote:
On 01/05/2016 05:56 PM, Oleksandr Natalenko wrote:
Unfortunately, neither patch made any difference for me.
I've patched 3.7.6 with both patches, recompiled and installed the patched GlusterFS package on the client side, and mounted a volume with ~2M files.
Then I performed the usual tree traversal with a simple "find".

The memory RES value went from ~130M at mount time to ~1.5G
after traversing the volume for ~40 mins. The Valgrind log still shows lots
of leaks. Here it is:

https://gist.github.com/56906ca6e657c4ffa4a1
Looks like you did a FUSE mount. The patches which I pasted
below apply to gfapi/nfs-ganesha applications.
Also, to resolve the nfs-ganesha issue which I mentioned below (in case the Entries_HWMARK option gets changed), I have posted the fix below:
        https://review.gerrithub.io/#/c/258687

Thanks,
Soumya
Ideas?

05.01.2016 12:31, Soumya Koduri wrote:
I tried to debug the inode*-related leaks and saw some improvements
after applying the patches below when I ran the same test (but with a
smaller load). Could you please apply those patches and confirm the
same?

a) http://review.gluster.org/13125
This will fix the leaks of inodes and their ctx during unexport and at program exit. Please check the Valgrind output after applying the
patch. It should not list any inode-related memory as lost.

b) http://review.gluster.org/13096

The reason the change in Entries_HWMARK (in your earlier mail) didn't
have much effect is that the inode_nlookup count doesn't become zero for those handles/inodes being closed by ganesha. Hence those inodes
get added to the inode lru list instead of the purge list, and the lru list
gets forcefully purged only when the number of gfapi inode table
entries reaches its limit (which is 137012).
This patch fixes those 'nlookup' counts. Please apply this patch, reduce 'Entries_HWMARK' to a much lower value, and check whether it decreases
the memory consumed by the ganesha process while it is active.

CACHEINODE {

         Entries_HWMark = 500;

}
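
The lru/purge behaviour described above can be sketched with a toy model. This is a minimal Python illustration, not GlusterFS or gfapi code: the class `ToyInodeTable`, its methods and the `forget` parameter are hypothetical names, and the model assumes only what the text states (an inode whose nlookup count stays above zero lands on the lru list and is released only once the table limit is exceeded). The `assert` is loosely modelled on the `inode->nlookup >= nlookup` check quoted earlier in this thread.

```python
from collections import OrderedDict


class ToyInodeTable:
    """Toy model of the lru/purge split; NOT the real GlusterFS inode table."""

    def __init__(self, lru_limit):
        self.lru_limit = lru_limit  # hypothetical cap on cached inodes
        self.nlookup = {}           # inode id -> outstanding lookup count
        self.lru = OrderedDict()    # closed inodes that still have nlookup > 0
        self.purged = 0             # number of inodes whose memory was released

    def lookup(self, ino):
        # Every lookup by the application bumps the inode's nlookup count.
        self.nlookup[ino] = self.nlookup.get(ino, 0) + 1

    def close(self, ino, forget):
        # Forgetting more lookups than were recorded is a bookkeeping bug.
        assert self.nlookup[ino] >= forget, "nlookup underflow"
        self.nlookup[ino] -= forget
        if self.nlookup[ino] == 0:
            # Count reached zero: the inode can be purged right away.
            del self.nlookup[ino]
            self.lru.pop(ino, None)
            self.purged += 1
            return
        # Count still positive: the inode is only parked on the lru list.
        self.lru[ino] = True
        self.lru.move_to_end(ino)
        while len(self.lru) > self.lru_limit:
            # Forced purge of the oldest entry once the limit is exceeded.
            victim, _ = self.lru.popitem(last=False)
            del self.nlookup[victim]
            self.purged += 1


def traverse(table, n_files, forget_all):
    """Look up and close n_files inodes; return how many stay cached."""
    for ino in range(n_files):
        table.lookup(ino)
        table.close(ino, forget=1 if forget_all else 0)
    return len(table.lru)


if __name__ == "__main__":
    print(traverse(ToyInodeTable(137012), 1000, forget_all=False))
    print(traverse(ToyInodeTable(137012), 1000, forget_all=True))
```

In this model, closing without forgetting keeps every inode cached until the limit is hit, whereas forgetting the lookups on close lets each inode be purged immediately. That is the effect the nlookup fix is meant to have.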


Note: I see an issue with nfs-ganesha during exit when the option
'Entries_HWMARK' gets changed. This is not related to any of the above
patches (or rather Gluster) and I am currently debugging it.

Thanks,
Soumya

On 12/25/2015 11:34 PM, Oleksandr Natalenko wrote:
1. test with Cache_Size = 256 and Entries_HWMark = 4096

Before find . -type f:

root      3120  0.6 11.0 879120 208408 ?       Ssl  17:39   0:00
/usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f
/etc/ganesha/ganesha.conf -N NIV_EVENT

After:

root      3120 11.4 24.3 1170076 458168 ?      Ssl  17:39  13:39
/usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f
/etc/ganesha/ganesha.conf -N NIV_EVENT

~250M leak.

2. test with default values (after ganesha restart)

Before:

root     24937  1.3 10.4 875016 197808 ?       Ssl  19:39   0:00
/usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f
/etc/ganesha/ganesha.conf -N NIV_EVENT

After:

root     24937  3.5 18.9 1022544 356340 ?      Ssl  19:39   0:40
/usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f
/etc/ganesha/ganesha.conf -N NIV_EVENT

~159M leak.

No reasonable correlation detected. The second test finished much
faster than the first (I guess the server-side GlusterFS cache or the
server kernel page cache is the cause).

There are ~1.8M files on this test volume.
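
The leak estimates above come from comparing the RSS column of the two `ps` snapshots in each test. A minimal Python sketch of that arithmetic follows. The helper names are made up for illustration, and it assumes the standard `ps aux` column order, where RSS is the sixth field and is reported in KiB.

```python
def rss_kib(ps_line):
    """Return the RSS column (sixth field of a `ps aux` row) in KiB."""
    return int(ps_line.split()[5])


def growth_mib(before, after):
    """RSS growth between two `ps aux` rows, in MiB."""
    return (rss_kib(after) - rss_kib(before)) / 1024


# Rows copied from the two tests above (command part shortened).
TEST1_BEFORE = "root      3120  0.6 11.0 879120 208408 ?       Ssl  17:39   0:00 /usr/bin/ganesha.nfsd"
TEST1_AFTER = "root      3120 11.4 24.3 1170076 458168 ?      Ssl  17:39  13:39 /usr/bin/ganesha.nfsd"
TEST2_BEFORE = "root     24937  1.3 10.4 875016 197808 ?       Ssl  19:39   0:00 /usr/bin/ganesha.nfsd"
TEST2_AFTER = "root     24937  3.5 18.9 1022544 356340 ?      Ssl  19:39   0:40 /usr/bin/ganesha.nfsd"

if __name__ == "__main__":
    print(f"test 1: +{growth_mib(TEST1_BEFORE, TEST1_AFTER):.1f} MiB")
    print(f"test 2: +{growth_mib(TEST2_BEFORE, TEST2_AFTER):.1f} MiB")
```

The raw differences are 249760 KiB and 158532 KiB, which match the ~250M and ~159M figures above when read in units of 1000 KiB (about 244 MiB and 155 MiB).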

On Friday, 25 December 2015, 20:28:13 EET Soumya Koduri wrote:
On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote:
Another addition: it seems to be a GlusterFS API library memory leak, because NFS-Ganesha also consumes a huge amount of memory while doing an ordinary "find . -type f" via NFSv4.2 on a remote client. Here is the memory
usage:

===
root      5416 34.2 78.5 2047176 1480552 ?     Ssl  12:02 117:54
/usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f
/etc/ganesha/ganesha.conf -N NIV_EVENT
===

1.4G is too much for simple stat() :(.

Ideas?
nfs-ganesha also has a cache layer which can scale to millions of entries depending on the number of files/directories being looked up. However, there are parameters to tune it. So either try stat with fewer entries, or add the block below to the nfs-ganesha.conf file, set low limits, and check the
difference. That may help us narrow down how much memory is actually
consumed by core nfs-ganesha and gfAPI.

CACHEINODE {
     Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size
     Entries_HWMark(uint32, range 1 to UINT32_MAX, default 100000); # Max no. of entries in the cache.
}

Thanks,
Soumya
24.12.2015 16:32, Oleksandr Natalenko wrote:
This is still an issue with 3.7.6. Any suggestions?

24.09.2015 10:14, Oleksandr Natalenko wrote:
In our GlusterFS deployment we've encountered something like a memory
leak in the GlusterFS FUSE client.
We use a replicated (×2) GlusterFS volume to store mail (exim+dovecot, maildir format). Here are the inode stats for both bricks and the mountpoint:
===
Brick 1 (Server 1):
Filesystem                            Inodes     IUsed      IFree IUse% Mounted on
/dev/mapper/vg_vd1_misc-lv08_mail  578768144  10954918  567813226    2% /bricks/r6sdLV08_vd1_mail

Brick 2 (Server 2):
Filesystem                            Inodes     IUsed      IFree IUse% Mounted on
/dev/mapper/vg_vd0_misc-lv07_mail  578767984  10954913  567813071    2% /bricks/r6sdLV07_vd0_mail

Mountpoint (Server 3):
Filesystem             Inodes     IUsed      IFree IUse% Mounted on
glusterfs.xxx:mail  578767760  10954915  567812845    2% /var/spool/mail/virtual
===

glusterfs.xxx domain has two A records for both Server 1 and
Server 2.

Here is volume info:

===
Volume Name: mail
Type: Replicate
Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail
Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail
Options Reconfigured:
nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24
features.cache-invalidation-timeout: 10
performance.stat-prefetch: off
performance.quick-read: on
performance.read-ahead: off
performance.flush-behind: on
performance.write-behind: on
performance.io-thread-count: 4
performance.cache-max-file-size: 1048576
performance.cache-size: 67108864
performance.readdir-ahead: off
===
Soon after mounting and starting exim/dovecot, the glusterfs client
process begins to consume a huge amount of RAM:

===
user@server3 ~$ ps aux | grep glusterfs | grep mail
root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05
/usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable
--volfile-server=glusterfs.xxx --volfile-id=mail
/var/spool/mail/virtual
===

That is, ~15 GiB of RAM.
Also, we've tried to use the mountpoint within a separate KVM VM with 2
or 3 GiB of RAM, and soon after starting the mail daemons the OOM killer
was triggered for the glusterfs client process.
Mounting the same share via NFS works just fine. Also, we have much less
iowait and loadavg on the client side with NFS.
Also, we've tried to change the IO thread count and cache size in order to limit memory usage, with no luck. As you can see, the total cache size
is 4×64==256 MiB (compare to 15 GiB).
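
The comparison above can be checked with a few lines of Python. This is only a sketch of the arithmetic. It takes the values from the volume options and the `ps` row in this message, and it follows the poster's assumption that the cache size multiplies by the IO thread count, which is not confirmed here. The function names are hypothetical.

```python
MIB = 1024 ** 2

IO_THREAD_COUNT = 4           # performance.io-thread-count
CACHE_SIZE_BYTES = 67108864   # performance.cache-size
CLIENT_RSS_KIB = 14908868     # RSS column of the glusterfs client in ps


def expected_cache_mib(threads, cache_bytes):
    """Cache budget in MiB, ASSUMING one cache of cache_bytes per IO thread."""
    return threads * cache_bytes / MIB


def rss_gib(rss_kib):
    """Convert an RSS value in KiB to GiB."""
    return rss_kib / MIB  # 1 GiB is 1024 * 1024 KiB


if __name__ == "__main__":
    budget = expected_cache_mib(IO_THREAD_COUNT, CACHE_SIZE_BYTES)
    actual = rss_gib(CLIENT_RSS_KIB)
    print(f"expected cache: {budget:.0f} MiB")
    print(f"observed RSS:   {actual:.2f} GiB")
    print(f"ratio:          {actual * 1024 / budget:.1f}x")
```

Even under that generous assumption the configured caches cover only a small fraction of the resident memory, which is why the cache settings alone cannot explain the usage.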

Enabling or disabling stat-prefetch, read-ahead and readdir-ahead
didn't help either.

Here are volume memory stats:

===
Memory status for volume : mail
----------------------------------------------
Brick : server1.xxx:/bricks/r6sdLV08_vd1_mail/mail
Mallinfo
--------
Arena    : 36859904
Ordblks  : 10357
Smblks   : 519
Hblks    : 21
Hblkhd   : 30515200
Usmblks  : 0
Fsmblks  : 53440
Uordblks : 18604144
Fordblks : 18255760
Keepcost : 114112

Mempool Stats
-------------
Name                                     HotCount ColdCount PaddedSizeof AllocCount MaxAlloc   Misses Max-StdAlloc
----                                     -------- --------- ------------ ---------- -------- -------- ------------
mail-server:fd_t                                0      1024          108   30773120      137        0            0
mail-server:dentry_t                        16110       274           84  235676148    16384  1106499         1152
mail-server:inode_t                         16363        21          156  237216876    16384  1876651         1169
mail-trash:fd_t                                 0      1024          108          0        0        0            0
mail-trash:dentry_t                             0     32768           84          0        0        0            0
mail-trash:inode_t                              4     32764          156          4        4        0            0
mail-trash:trash_local_t                        0        64         8628          0        0        0            0
mail-changetimerecorder:gf_ctr_local_t          0        64        16540          0        0        0            0
mail-changelog:rpcsvc_request_t                 0         8         2828          0        0        0            0
mail-changelog:changelog_local_t                0        64          116          0        0        0            0
mail-bitrot-stub:br_stub_local_t                0       512           84      79204        4        0            0
mail-locks:pl_local_t                           0        32          148    6812757        4        0            0
mail-upcall:upcall_local_t                      0       512          108          0        0        0            0
mail-marker:marker_local_t                      0       128          332      64980        3        0            0
mail-quota:quota_local_t                        0        64          476          0        0        0            0
mail-server:rpcsvc_request_t                    0       512         2828   45462533       34        0            0
glusterfs:struct saved_frame                    0         8          124          2        2        0            0
glusterfs:struct rpc_req                        0         8          588          2        2        0            0
glusterfs:rpcsvc_request_t                      1         7         2828          2        1        0            0
glusterfs:log_buf_t                             5       251          140       3452        6        0            0
glusterfs:data_t                              242     16141           52  480115498      664        0            0
glusterfs:data_pair_t                         230     16153           68  179483528      275        0            0
glusterfs:dict_t                               23      4073          140  303751675      627        0            0
glusterfs:call_stub_t                           0      1024         3764   45290655       34        0            0
glusterfs:call_stack_t                          1      1023         1708   43598469       34        0            0
glusterfs:call_frame_t                          1      4095          172  336219655      184        0            0
----------------------------------------------
Brick : server2.xxx:/bricks/r6sdLV07_vd0_mail/mail
Mallinfo
--------
Arena    : 38174720
Ordblks  : 9041
Smblks   : 507
Hblks    : 21
Hblkhd   : 30515200
Usmblks  : 0
Fsmblks  : 51712
Uordblks : 19415008
Fordblks : 18759712
Keepcost : 114848

Mempool Stats
-------------
Name                                     HotCount ColdCount PaddedSizeof AllocCount MaxAlloc   Misses Max-StdAlloc
----                                     -------- --------- ------------ ---------- -------- -------- ------------
mail-server:fd_t                                0      1024          108    2373075      133        0            0
mail-server:dentry_t                        14114      2270           84    3513654    16384     2300          267
mail-server:inode_t                         16374        10          156    6766642    16384   194635         1279
mail-trash:fd_t                                 0      1024          108          0        0        0            0
mail-trash:dentry_t                             0     32768           84          0        0        0            0
mail-trash:inode_t                              4     32764          156          4        4        0            0
mail-trash:trash_local_t                        0        64         8628          0        0        0            0
mail-changetimerecorder:gf_ctr_local_t          0        64        16540          0        0        0            0
mail-changelog:rpcsvc_request_t                 0         8         2828          0        0        0            0
mail-changelog:changelog_local_t                0        64          116          0        0        0            0
mail-bitrot-stub:br_stub_local_t                0       512           84      71354        4        0            0
mail-locks:pl_local_t                           0        32          148    8135032        4        0            0
mail-upcall:upcall_local_t                      0       512          108          0        0        0            0
mail-marker:marker_local_t                      0       128          332      65005        3        0            0
mail-quota:quota_local_t                        0        64          476          0        0        0            0
mail-server:rpcsvc_request_t                    0       512         2828   12882393       30        0            0
glusterfs:struct saved_frame                    0         8          124          2        2        0            0
glusterfs:struct rpc_req                        0         8          588          2        2        0            0
glusterfs:rpcsvc_request_t                      1         7         2828          2        1        0            0
glusterfs:log_buf_t                             5       251          140       3443        6        0            0
glusterfs:data_t                              242     16141           52  138743429      290        0            0
glusterfs:data_pair_t                         230     16153           68  126649864      270        0            0
glusterfs:dict_t                               23      4073          140   20356289       63        0            0
glusterfs:call_stub_t                           0      1024         3764   13678560       31        0            0
glusterfs:call_stack_t                          1      1023         1708   11011561       30        0            0
glusterfs:call_frame_t                          1      4095          172  125764190      193        0            0
----------------------------------------------
===

So, my questions are:
1) what should one do to limit GlusterFS FUSE client memory usage?
2) what should one do to prevent high client loadavg caused by high
iowait from multiple concurrent volume users?
Server/client OS is CentOS 7.1, GlusterFS server version is 3.7.3,
GlusterFS client version is 3.7.4.

Any additional info needed?
_______________________________________________
Gluster-users mailing list
gluster-us...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
