Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
OK, given GlusterFS v3.7.6 with the following patches: === Kaleb S KEITHLEY (1): fuse: use-after-free fix in fuse-bridge, revisited Pranith Kumar K (1): mount/fuse: Fix use-after-free crash Soumya Koduri (3): gfapi: Fix inode nlookup counts inode: Retire the inodes from the lru list in inode_table_destroy upcall: free the xdr* allocations === I've repeated "rsync" test under Valgrind, and here is Valgrind output: https://gist.github.com/f8e0151a6878cacc9b1a I see DHT-related leaks. On понеділок, 25 січня 2016 р. 02:46:32 EET Oleksandr Natalenko wrote: > Also, I've repeated the same "find" test again, but with glusterfs process > launched under valgrind. And here is valgrind output: > > https://gist.github.com/097afb01ebb2c5e9e78d > > On неділя, 24 січня 2016 р. 09:33:00 EET Mathieu Chateau wrote: > > Thanks for all your tests and times, it looks promising :) > > > > > > Cordialement, > > Mathieu CHATEAU > > http://www.lotp.fr > > > > 2016-01-23 22:30 GMT+01:00 Oleksandr Natalenko : > > > OK, now I'm re-performing tests with rsync + GlusterFS v3.7.6 + the > > > following > > > patches: > > > > > > === > > > > > > Kaleb S KEITHLEY (1): > > > fuse: use-after-free fix in fuse-bridge, revisited > > > > > > Pranith Kumar K (1): > > > mount/fuse: Fix use-after-free crash > > > > > > Soumya Koduri (3): > > > gfapi: Fix inode nlookup counts > > > inode: Retire the inodes from the lru list in inode_table_destroy > > > upcall: free the xdr* allocations > > > > > > === > > > > > > I run rsync from one GlusterFS volume to another. While memory started > > > from > > > under 100 MiBs, it stalled at around 600 MiBs for source volume and does > > > not > > > grow further. As for target volume it is ~730 MiBs, and that is why I'm > > > going > > > to do several rsync rounds to see if it grows more (with no patches bare > > > 3.7.6 > > > could consume more than 20 GiBs). > > > > > > No "kernel notifier loop terminated" message so far for both volumes. > > > > > > Will report more in several days. I hope current patches will be > > > incorporated > > > into 3.7.7. > > > > > > On пʼятниця, 22 січня 2016 р. 12:53:36 EET Kaleb S. KEITHLEY wrote: > > > > On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote: > > > > > On пʼятниця, 22 січня 2016 р. 12:32:01 EET Kaleb S. KEITHLEY wrote: > > > > >> I presume by this you mean you're not seeing the "kernel notifier > > > > >> loop > > > > >> terminated" error in your logs. > > > > > > > > > > Correct, but only with simple traversing. Have to test under rsync. > > > > > > > > Without the patch I'd get "kernel notifier loop terminated" within a > > > > few > > > > minutes of starting I/O. With the patch I haven't seen it in 24 hours > > > > of beating on it. > > > > > > > > >> Hmmm. My system is not leaking. Last 24 hours the RSZ and VSZ are > > > > > > > >> stable: > > > http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longe > > > v > > > > > > > >> ity /client.out > > > > > > > > > > What ops do you perform on mounted volume? Read, write, stat? Is > > > > > that > > > > > 3.7.6 + patches? > > > > > > > > I'm running an internally developed I/O load generator written by a > > > > guy > > > > on our perf team. > > > > > > > > it does, create, write, read, rename, stat, delete, and more. > > ___ > Gluster-users mailing list > gluster-us...@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Here are the results of "rsync" test. I've got 2 volumes — source and target — performing multiple files rsyncing from one volume to another. Source volume: === root 22259 3.5 1.5 1204200 771004 ? Ssl Jan23 109:42 /usr/sbin/ glusterfs --volfile-server=glusterfs.example.com --volfile-id=source /mnt/net/ glusterfs/source === One may see that memory consumption of source volume is not that high as with "find" test. Here is source volume client statedump: https://gist.github.com/ ef5b798859219e739aeb Here is source volume info: https://gist.github.com/3d2f32e7346df9333004 Target volume: === root 22200 23.8 6.9 3983676 3456252 ? Ssl Jan23 734:57 /usr/sbin/ glusterfs --volfile-server=glusterfs.example.com --volfile-id=target /mnt/net/ glusterfs/target === Here is target volume info: https://gist.github.com/c9de01168071575b109e Target volume RAM consumption is very high (more than 3 GiBs). Here is client statedump too: https://gist.github.com/31e43110eaa4da663435 I see huge DHT-related memory usage, e.g.: === [cluster/distribute.asterisk_records-dht - usage-type gf_common_mt_mem_pool memusage] size=725575592 num_allocs=7552486 max_size=725575836 max_num_allocs=7552489 total_allocs=90843958 [cluster/distribute.asterisk_records-dht - usage-type gf_common_mt_char memusage] size=586404954 num_allocs=7572836 max_size=586405157 max_num_allocs=7572839 total_allocs=80463096 === Ideas? On понеділок, 25 січня 2016 р. 02:46:32 EET Oleksandr Natalenko wrote: > Also, I've repeated the same "find" test again, but with glusterfs process > launched under valgrind. And here is valgrind output: > > https://gist.github.com/097afb01ebb2c5e9e78d > > On неділя, 24 січня 2016 р. 09:33:00 EET Mathieu Chateau wrote: > > Thanks for all your tests and times, it looks promising :) > > > > > > Cordialement, > > Mathieu CHATEAU > > http://www.lotp.fr > > > > 2016-01-23 22:30 GMT+01:00 Oleksandr Natalenko : > > > OK, now I'm re-performing tests with rsync + GlusterFS v3.7.6 + the > > > following > > > patches: > > > > > > === > > > > > > Kaleb S KEITHLEY (1): > > > fuse: use-after-free fix in fuse-bridge, revisited > > > > > > Pranith Kumar K (1): > > > mount/fuse: Fix use-after-free crash > > > > > > Soumya Koduri (3): > > > gfapi: Fix inode nlookup counts > > > inode: Retire the inodes from the lru list in inode_table_destroy > > > upcall: free the xdr* allocations > > > > > > === > > > > > > I run rsync from one GlusterFS volume to another. While memory started > > > from > > > under 100 MiBs, it stalled at around 600 MiBs for source volume and does > > > not > > > grow further. As for target volume it is ~730 MiBs, and that is why I'm > > > going > > > to do several rsync rounds to see if it grows more (with no patches bare > > > 3.7.6 > > > could consume more than 20 GiBs). > > > > > > No "kernel notifier loop terminated" message so far for both volumes. > > > > > > Will report more in several days. I hope current patches will be > > > incorporated > > > into 3.7.7. > > > > > > On пʼятниця, 22 січня 2016 р. 12:53:36 EET Kaleb S. KEITHLEY wrote: > > > > On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote: > > > > > On пʼятниця, 22 січня 2016 р. 12:32:01 EET Kaleb S. KEITHLEY wrote: > > > > >> I presume by this you mean you're not seeing the "kernel notifier > > > > >> loop > > > > >> terminated" error in your logs. > > > > > > > > > > Correct, but only with simple traversing. Have to test under rsync. 
> > > > > > > > Without the patch I'd get "kernel notifier loop terminated" within a > > > > few > > > > minutes of starting I/O. With the patch I haven't seen it in 24 hours > > > > of beating on it. > > > > > > > > >> Hmmm. My system is not leaking. Last 24 hours the RSZ and VSZ are > > > > > > > >> stable: > > > http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longe > > > v > > > > > > > >> ity /client.out > > > > > > > > > > What ops do you perform on mounted volume? Read, write, stat? Is > > > > > that > > > > > 3.7.6 + patches? > > > > > > > > I'm running an internally developed I/O load generator written by a > > > > guy > > > > on our perf team. > > > > > > > > it does, create, write, read, rename, stat, delete, and more. > > ___ > Gluster-users mailing list > gluster-us...@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
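For anyone else digging through these statedumps: the memusage sections follow a fixed layout, a "[<translator> - usage-type <type> memusage]" header followed by size= and num_allocs= lines, so ranking the biggest consumers can be scripted. A minimal sketch using standard tools (the dump filename placeholders are generic; this is not an official Gluster tool):

===
# rank the memusage sections of a statedump, biggest first (top 20)
awk '/memusage\]$/      { header = $0 }
     /^size=/ && header { size = $0; sub(/^size=/, "", size);
                          print size "\t" header; header = "" }' \
    glusterdump.<pid>.dump.<timestamp> | sort -rn | head -20
===

Run against the target-volume dump above, this should put the two cluster/distribute sections near the top, matching the numbers quoted.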
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Also, I've repeated the same "find" test again, but with glusterfs process launched under valgrind. And here is valgrind output: https://gist.github.com/097afb01ebb2c5e9e78d On неділя, 24 січня 2016 р. 09:33:00 EET Mathieu Chateau wrote: > Thanks for all your tests and times, it looks promising :) > > > Cordialement, > Mathieu CHATEAU > http://www.lotp.fr > > 2016-01-23 22:30 GMT+01:00 Oleksandr Natalenko : > > OK, now I'm re-performing tests with rsync + GlusterFS v3.7.6 + the > > following > > patches: > > > > === > > > > Kaleb S KEITHLEY (1): > > fuse: use-after-free fix in fuse-bridge, revisited > > > > Pranith Kumar K (1): > > mount/fuse: Fix use-after-free crash > > > > Soumya Koduri (3): > > gfapi: Fix inode nlookup counts > > inode: Retire the inodes from the lru list in inode_table_destroy > > upcall: free the xdr* allocations > > > > === > > > > I run rsync from one GlusterFS volume to another. While memory started > > from > > under 100 MiBs, it stalled at around 600 MiBs for source volume and does > > not > > grow further. As for target volume it is ~730 MiBs, and that is why I'm > > going > > to do several rsync rounds to see if it grows more (with no patches bare > > 3.7.6 > > could consume more than 20 GiBs). > > > > No "kernel notifier loop terminated" message so far for both volumes. > > > > Will report more in several days. I hope current patches will be > > incorporated > > into 3.7.7. > > > > On пʼятниця, 22 січня 2016 р. 12:53:36 EET Kaleb S. KEITHLEY wrote: > > > On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote: > > > > On пʼятниця, 22 січня 2016 р. 12:32:01 EET Kaleb S. KEITHLEY wrote: > > > >> I presume by this you mean you're not seeing the "kernel notifier > > > >> loop > > > >> terminated" error in your logs. > > > > > > > > Correct, but only with simple traversing. Have to test under rsync. > > > > > > Without the patch I'd get "kernel notifier loop terminated" within a few > > > minutes of starting I/O. With the patch I haven't seen it in 24 hours > > > of beating on it. > > > > > > >> Hmmm. My system is not leaking. Last 24 hours the RSZ and VSZ are > > > > > >> stable: > > http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longev > > > > > >> ity /client.out > > > > > > > > What ops do you perform on mounted volume? Read, write, stat? Is that > > > > 3.7.6 + patches? > > > > > > I'm running an internally developed I/O load generator written by a guy > > > on our perf team. > > > > > > it does, create, write, read, rename, stat, delete, and more. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
BTW, am I the only one who sees in max_size=4294965480 almost 2^32? Could that be integer overflow? On неділя, 24 січня 2016 р. 13:23:55 EET Oleksandr Natalenko wrote: > The leak definitely remains. I did "find /mnt/volume -type d" over GlusterFS > volume, with mentioned patches applied and without "kernel notifier loop > terminated" message, but "glusterfs" process consumed ~4GiB of RAM after > "find" finished. > > Here is statedump: > > https://gist.github.com/10cde83c63f1b4f1dd7a > > I see the following: > > === > [mount/fuse.fuse - usage-type gf_fuse_mt_iov_base memusage] > size=4235109959 > num_allocs=2 > max_size=4294965480 > max_num_allocs=3 > total_allocs=4533524 > === > > ~4GiB, right? > > Pranith, Kaleb? > > On неділя, 24 січня 2016 р. 09:33:00 EET Mathieu Chateau wrote: > > Thanks for all your tests and times, it looks promising :) > > > > > > Cordialement, > > Mathieu CHATEAU > > http://www.lotp.fr > > > > 2016-01-23 22:30 GMT+01:00 Oleksandr Natalenko : > > > OK, now I'm re-performing tests with rsync + GlusterFS v3.7.6 + the > > > following > > > patches: > > > > > > === > > > > > > Kaleb S KEITHLEY (1): > > > fuse: use-after-free fix in fuse-bridge, revisited > > > > > > Pranith Kumar K (1): > > > mount/fuse: Fix use-after-free crash > > > > > > Soumya Koduri (3): > > > gfapi: Fix inode nlookup counts > > > inode: Retire the inodes from the lru list in inode_table_destroy > > > upcall: free the xdr* allocations > > > > > > === > > > > > > I run rsync from one GlusterFS volume to another. While memory started > > > from > > > under 100 MiBs, it stalled at around 600 MiBs for source volume and does > > > not > > > grow further. As for target volume it is ~730 MiBs, and that is why I'm > > > going > > > to do several rsync rounds to see if it grows more (with no patches bare > > > 3.7.6 > > > could consume more than 20 GiBs). > > > > > > No "kernel notifier loop terminated" message so far for both volumes. > > > > > > Will report more in several days. I hope current patches will be > > > incorporated > > > into 3.7.7. > > > > > > On пʼятниця, 22 січня 2016 р. 12:53:36 EET Kaleb S. KEITHLEY wrote: > > > > On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote: > > > > > On пʼятниця, 22 січня 2016 р. 12:32:01 EET Kaleb S. KEITHLEY wrote: > > > > >> I presume by this you mean you're not seeing the "kernel notifier > > > > >> loop > > > > >> terminated" error in your logs. > > > > > > > > > > Correct, but only with simple traversing. Have to test under rsync. > > > > > > > > Without the patch I'd get "kernel notifier loop terminated" within a > > > > few > > > > minutes of starting I/O. With the patch I haven't seen it in 24 hours > > > > of beating on it. > > > > > > > > >> Hmmm. My system is not leaking. Last 24 hours the RSZ and VSZ are > > > > > > > >> stable: > > > http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longe > > > v > > > > > > > >> ity /client.out > > > > > > > > > > What ops do you perform on mounted volume? Read, write, stat? Is > > > > > that > > > > > 3.7.6 + patches? > > > > > > > > I'm running an internally developed I/O load generator written by a > > > > guy > > > > on our perf team. > > > > > > > > it does, create, write, read, rename, stat, delete, and more. > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
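For what it's worth, that max_size value sits only 1816 bytes below 2^32:

===
$ echo $((2**32 - 4294965480))
1816
===

So it is effectively pinned at UINT32_MAX, which is what you would expect if the real peak went past 4 GiB and the accounting field is (or gets truncated to) 32 bits. Whether that is actually the case would need to be confirmed against the mem-accounting code; the statedump alone cannot tell.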
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
The leak definitely remains. I did "find /mnt/volume -type d" over GlusterFS volume, with mentioned patches applied and without "kernel notifier loop terminated" message, but "glusterfs" process consumed ~4GiB of RAM after "find" finished. Here is statedump: https://gist.github.com/10cde83c63f1b4f1dd7a I see the following: === [mount/fuse.fuse - usage-type gf_fuse_mt_iov_base memusage] size=4235109959 num_allocs=2 max_size=4294965480 max_num_allocs=3 total_allocs=4533524 === ~4GiB, right? Pranith, Kaleb? On неділя, 24 січня 2016 р. 09:33:00 EET Mathieu Chateau wrote: > Thanks for all your tests and times, it looks promising :) > > > Cordialement, > Mathieu CHATEAU > http://www.lotp.fr > > 2016-01-23 22:30 GMT+01:00 Oleksandr Natalenko : > > OK, now I'm re-performing tests with rsync + GlusterFS v3.7.6 + the > > following > > patches: > > > > === > > > > Kaleb S KEITHLEY (1): > > fuse: use-after-free fix in fuse-bridge, revisited > > > > Pranith Kumar K (1): > > mount/fuse: Fix use-after-free crash > > > > Soumya Koduri (3): > > gfapi: Fix inode nlookup counts > > inode: Retire the inodes from the lru list in inode_table_destroy > > upcall: free the xdr* allocations > > > > === > > > > I run rsync from one GlusterFS volume to another. While memory started > > from > > under 100 MiBs, it stalled at around 600 MiBs for source volume and does > > not > > grow further. As for target volume it is ~730 MiBs, and that is why I'm > > going > > to do several rsync rounds to see if it grows more (with no patches bare > > 3.7.6 > > could consume more than 20 GiBs). > > > > No "kernel notifier loop terminated" message so far for both volumes. > > > > Will report more in several days. I hope current patches will be > > incorporated > > into 3.7.7. > > > > On пʼятниця, 22 січня 2016 р. 12:53:36 EET Kaleb S. KEITHLEY wrote: > > > On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote: > > > > On пʼятниця, 22 січня 2016 р. 12:32:01 EET Kaleb S. KEITHLEY wrote: > > > >> I presume by this you mean you're not seeing the "kernel notifier > > > >> loop > > > >> terminated" error in your logs. > > > > > > > > Correct, but only with simple traversing. Have to test under rsync. > > > > > > Without the patch I'd get "kernel notifier loop terminated" within a few > > > minutes of starting I/O. With the patch I haven't seen it in 24 hours > > > of beating on it. > > > > > > >> Hmmm. My system is not leaking. Last 24 hours the RSZ and VSZ are > > > > > >> stable: > > http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longev > > > > > >> ity /client.out > > > > > > > > What ops do you perform on mounted volume? Read, write, stat? Is that > > > > 3.7.6 + patches? > > > > > > I'm running an internally developed I/O load generator written by a guy > > > on our perf team. > > > > > > it does, create, write, read, rename, stat, delete, and more. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
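(size= in these sections is in bytes, so the quick check behind the "~4GiB, right?" above is:)

===
$ echo "scale=2; 4235109959 / 2^30" | bc
3.94
===

About 3.9 GiB held by only num_allocs=2 outstanding gf_fuse_mt_iov_base allocations, at least as the dump reports it.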
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Thanks for all your tests and times, it looks promising :) Cordialement, Mathieu CHATEAU http://www.lotp.fr 2016-01-23 22:30 GMT+01:00 Oleksandr Natalenko : > OK, now I'm re-performing tests with rsync + GlusterFS v3.7.6 + the > following > patches: > > === > Kaleb S KEITHLEY (1): > fuse: use-after-free fix in fuse-bridge, revisited > > Pranith Kumar K (1): > mount/fuse: Fix use-after-free crash > > Soumya Koduri (3): > gfapi: Fix inode nlookup counts > inode: Retire the inodes from the lru list in inode_table_destroy > upcall: free the xdr* allocations > === > > I run rsync from one GlusterFS volume to another. While memory started from > under 100 MiBs, it stalled at around 600 MiBs for source volume and does > not > grow further. As for target volume it is ~730 MiBs, and that is why I'm > going > to do several rsync rounds to see if it grows more (with no patches bare > 3.7.6 > could consume more than 20 GiBs). > > No "kernel notifier loop terminated" message so far for both volumes. > > Will report more in several days. I hope current patches will be > incorporated > into 3.7.7. > > On пʼятниця, 22 січня 2016 р. 12:53:36 EET Kaleb S. KEITHLEY wrote: > > On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote: > > > On пʼятниця, 22 січня 2016 р. 12:32:01 EET Kaleb S. KEITHLEY wrote: > > >> I presume by this you mean you're not seeing the "kernel notifier loop > > >> terminated" error in your logs. > > > > > > Correct, but only with simple traversing. Have to test under rsync. > > > > Without the patch I'd get "kernel notifier loop terminated" within a few > > minutes of starting I/O. With the patch I haven't seen it in 24 hours > > of beating on it. > > > > >> Hmmm. My system is not leaking. Last 24 hours the RSZ and VSZ are > > >> stable: > > >> > http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longev > > >> ity /client.out > > > > > > What ops do you perform on mounted volume? Read, write, stat? Is that > > > 3.7.6 + patches? > > > > I'm running an internally developed I/O load generator written by a guy > > on our perf team. > > > > it does, create, write, read, rename, stat, delete, and more. > > > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
OK, now I'm re-performing tests with rsync + GlusterFS v3.7.6 + the following patches: === Kaleb S KEITHLEY (1): fuse: use-after-free fix in fuse-bridge, revisited Pranith Kumar K (1): mount/fuse: Fix use-after-free crash Soumya Koduri (3): gfapi: Fix inode nlookup counts inode: Retire the inodes from the lru list in inode_table_destroy upcall: free the xdr* allocations === I run rsync from one GlusterFS volume to another. While memory started from under 100 MiBs, it stalled at around 600 MiBs for source volume and does not grow further. As for target volume it is ~730 MiBs, and that is why I'm going to do several rsync rounds to see if it grows more (with no patches bare 3.7.6 could consume more than 20 GiBs). No "kernel notifier loop terminated" message so far for both volumes. Will report more in several days. I hope current patches will be incorporated into 3.7.7. On пʼятниця, 22 січня 2016 р. 12:53:36 EET Kaleb S. KEITHLEY wrote: > On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote: > > On пʼятниця, 22 січня 2016 р. 12:32:01 EET Kaleb S. KEITHLEY wrote: > >> I presume by this you mean you're not seeing the "kernel notifier loop > >> terminated" error in your logs. > > > > Correct, but only with simple traversing. Have to test under rsync. > > Without the patch I'd get "kernel notifier loop terminated" within a few > minutes of starting I/O. With the patch I haven't seen it in 24 hours > of beating on it. > > >> Hmmm. My system is not leaking. Last 24 hours the RSZ and VSZ are > >> stable: > >> http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longev > >> ity /client.out > > > > What ops do you perform on mounted volume? Read, write, stat? Is that > > 3.7.6 + patches? > > I'm running an internally developed I/O load generator written by a guy > on our perf team. > > it does, create, write, read, rename, stat, delete, and more. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
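For reference, the test loop described here amounts to something like the following (mount points taken from elsewhere in this thread; the exact rsync flags are a guess, not the original poster's):

===
# a few rsync rounds, sampling the FUSE clients' memory between passes
for round in 1 2 3; do
    rsync -a /mnt/net/glusterfs/source/ /mnt/net/glusterfs/target/
    ps -o pid=,rss=,vsz=,args= -C glusterfs
done
===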
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/22/2016 01:20 PM, Joe Julian wrote: > > > On 01/22/16 09:53, Kaleb S. KEITHLEY wrote: >> On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote: >>> On пʼятниця, 22 січня 2016 р. 12:32:01 EET Kaleb S. KEITHLEY wrote: I presume by this you mean you're not seeing the "kernel notifier loop terminated" error in your logs. >>> Correct, but only with simple traversing. Have to test under rsync. >> Without the patch I'd get "kernel notifier loop terminated" within a few >> minutes of starting I/O. With the patch I haven't seen it in 24 hours >> of beating on it. >> Hmmm. My system is not leaking. Last 24 hours the RSZ and VSZ are stable: http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longevity /client.out >>> What ops do you perform on mounted volume? Read, write, stat? Is that >>> 3.7.6 + >>> patches? >> I'm running an internally developed I/O load generator written by a guy >> on our perf team. >> >> it does, create, write, read, rename, stat, delete, and more. >> > Github link? I looked for one before posting. I don't think he has shared it. -- Kaleb ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/22/16 09:53, Kaleb S. KEITHLEY wrote: On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote: On пʼятниця, 22 січня 2016 р. 12:32:01 EET Kaleb S. KEITHLEY wrote: I presume by this you mean you're not seeing the "kernel notifier loop terminated" error in your logs. Correct, but only with simple traversing. Have to test under rsync. Without the patch I'd get "kernel notifier loop terminated" within a few minutes of starting I/O. With the patch I haven't seen it in 24 hours of beating on it. Hmmm. My system is not leaking. Last 24 hours the RSZ and VSZ are stable: http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longevity /client.out What ops do you perform on mounted volume? Read, write, stat? Is that 3.7.6 + patches? I'm running an internally developed I/O load generator written by a guy on our perf team. it does, create, write, read, rename, stat, delete, and more. Github link? ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote: > On пʼятниця, 22 січня 2016 р. 12:32:01 EET Kaleb S. KEITHLEY wrote: >> I presume by this you mean you're not seeing the "kernel notifier loop >> terminated" error in your logs. > > Correct, but only with simple traversing. Have to test under rsync. Without the patch I'd get "kernel notifier loop terminated" within a few minutes of starting I/O. With the patch I haven't seen it in 24 hours of beating on it. > >> Hmmm. My system is not leaking. Last 24 hours the RSZ and VSZ are >> stable: >> http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longevity >> /client.out > > What ops do you perform on mounted volume? Read, write, stat? Is that 3.7.6 + > patches? I'm running an internally developed I/O load generator written by a guy on our perf team. it does, create, write, read, rename, stat, delete, and more. -- Kaleb ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On пʼятниця, 22 січня 2016 р. 12:32:01 EET Kaleb S. KEITHLEY wrote: > I presume by this you mean you're not seeing the "kernel notifier loop > terminated" error in your logs. Correct, but only with simple traversing. Have to test under rsync. > Hmmm. My system is not leaking. Last 24 hours the RSZ and VSZ are > stable: > http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longevity > /client.out What ops do you perform on mounted volume? Read, write, stat? Is that 3.7.6 + patches? ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/22/2016 12:15 PM, Oleksandr Natalenko wrote: > OK, compiles and runs well now, I presume by this you mean you're not seeing the "kernel notifier loop terminated" error in your logs. > but still leaks. Hmmm. My system is not leaking. Last 24 hours the RSZ and VSZ are stable: http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longevity/client.out -- Kaleb ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
OK, compiles and runs well now, but still leaks. Will try to load the volume with rsync. On четвер, 21 січня 2016 р. 20:40:45 EET Kaleb KEITHLEY wrote: > On 01/21/2016 06:59 PM, Oleksandr Natalenko wrote: > > I see extra GF_FREE (node); added with two patches: > > > > === > > $ git diff HEAD~2 | gist > > https://gist.github.com/9524fa2054cc48278ea8 > > === > > > > Is that intentionally? I guess I face double-free issue. > > I presume you're referring to the release-3.7 branch. > > Yup, bad edit. Long day. That's why we review. ;-) > > Please try the latest. > > Thanks, > > -- > > Kaleb ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/21/2016 06:59 PM, Oleksandr Natalenko wrote: > I see extra GF_FREE (node); added with two patches: > > === > $ git diff HEAD~2 | gist > https://gist.github.com/9524fa2054cc48278ea8 > === > > Is that intentionally? I guess I face double-free issue. > I presume you're referring to the release-3.7 branch. Yup, bad edit. Long day. That's why we review. ;-) Please try the latest. Thanks, -- Kaleb ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
I see extra GF_FREE (node); added with two patches: === $ git diff HEAD~2 | gist https://gist.github.com/9524fa2054cc48278ea8 === Is that intentionally? I guess I face double-free issue. On четвер, 21 січня 2016 р. 17:29:53 EET Kaleb KEITHLEY wrote: > On 01/20/2016 04:08 AM, Oleksandr Natalenko wrote: > > Yes, there are couple of messages like this in my logs too (I guess one > > message per each remount): > > > > === > > [2016-01-18 23:42:08.742447] I [fuse-bridge.c:3875:notify_kernel_loop] 0- > > glusterfs-fuse: kernel notifier loop terminated > > === > > Bug reports and fixes for master and release-3.7 branches are: > > master) > https://bugzilla.redhat.com/show_bug.cgi?id=1288857 > http://review.gluster.org/12886 > > release-3.7) > https://bugzilla.redhat.com/show_bug.cgi?id=1288922 > http://review.gluster.org/12887 > > The release-3.7 fix will be in glusterfs-3.7.7 when it's released. > > I think with even with the above fixes applied there are still some > issues remaining. I have submitted additional/revised fixes on top of > the above fixes at: > > master: http://review.gluster.org/13274 > release-3.7: http://review.gluster.org/13275 > > I invite you to review the patches in gerrit (review.gluster.org). > > Regards, > > -- > > Kaleb ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
With the proposed patches I get the following assertion while copying files to GlusterFS volume:

===
glusterfs: mem-pool.c:305: __gf_free: Assertion `0xCAFEBABE == header->magic' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffe9ffb700 (LWP 12635)]
0x76f215f8 in raise () from /usr/lib/libc.so.6
(gdb) bt
#0  0x76f215f8 in raise () from /usr/lib/libc.so.6
#1  0x76f22a7a in abort () from /usr/lib/libc.so.6
#2  0x76f1a417 in __assert_fail_base () from /usr/lib/libc.so.6
#3  0x76f1a4c2 in __assert_fail () from /usr/lib/libc.so.6
#4  0x77b6046b in __gf_free (free_ptr=0x7fffdc0b8f00) at mem-pool.c:305
#5  0x75144eb9 in notify_kernel_loop (data=0x63df90) at fuse-bridge.c:3893
#6  0x772994a4 in start_thread () from /usr/lib/libpthread.so.0
#7  0x76fd713d in clone () from /usr/lib/libc.so.6
===

On четвер, 21 січня 2016 р. 17:29:53 EET Kaleb KEITHLEY wrote:
> On 01/20/2016 04:08 AM, Oleksandr Natalenko wrote:
> > Yes, there are couple of messages like this in my logs too (I guess one
> > message per each remount):
> >
> > ===
> > [2016-01-18 23:42:08.742447] I [fuse-bridge.c:3875:notify_kernel_loop] 0-glusterfs-fuse: kernel notifier loop terminated
> > ===
>
> Bug reports and fixes for master and release-3.7 branches are:
>
> master)
> https://bugzilla.redhat.com/show_bug.cgi?id=1288857
> http://review.gluster.org/12886
>
> release-3.7)
> https://bugzilla.redhat.com/show_bug.cgi?id=1288922
> http://review.gluster.org/12887
>
> The release-3.7 fix will be in glusterfs-3.7.7 when it's released.
>
> I think with even with the above fixes applied there are still some
> issues remaining. I have submitted additional/revised fixes on top of
> the above fixes at:
>
> master: http://review.gluster.org/13274
> release-3.7: http://review.gluster.org/13275
>
> I invite you to review the patches in gerrit (review.gluster.org).
>
> Regards,
>
> --
>
> Kaleb

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
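That assertion fits the double-free suspicion from the previous message: __gf_free() expects an accounting header with a magic value in front of every allocation, and a second free of the same pointer finds that header already invalidated. A rough illustration of the mechanism (names and layout are simplified; this is not the real mem-pool.c):

===
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define HEADER_MAGIC 0xCAFEBABE        /* value taken from the assertion above */

struct mem_header {
        uint32_t magic;
        size_t   size;
        /* ... more accounting fields ... */
};

static void sketch_gf_free(void *ptr)
{
        struct mem_header *header;

        if (ptr == NULL)
                return;
        header = (struct mem_header *)ptr - 1;
        /* On a second free of the same pointer the header was already
         * poisoned by the first call, so this check aborts the process. */
        assert(HEADER_MAGIC == header->magic);
        header->magic = 0xDEADBEEF;    /* poison so a double free is caught */
        free(header);
}
===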
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/20/2016 04:08 AM, Oleksandr Natalenko wrote: > Yes, there are couple of messages like this in my logs too (I guess one > message per each remount): > > === > [2016-01-18 23:42:08.742447] I [fuse-bridge.c:3875:notify_kernel_loop] 0- > glusterfs-fuse: kernel notifier loop terminated > === > Bug reports and fixes for master and release-3.7 branches are: master) https://bugzilla.redhat.com/show_bug.cgi?id=1288857 http://review.gluster.org/12886 release-3.7) https://bugzilla.redhat.com/show_bug.cgi?id=1288922 http://review.gluster.org/12887 The release-3.7 fix will be in glusterfs-3.7.7 when it's released. I think with even with the above fixes applied there are still some issues remaining. I have submitted additional/revised fixes on top of the above fixes at: master: http://review.gluster.org/13274 release-3.7: http://review.gluster.org/13275 I invite you to review the patches in gerrit (review.gluster.org). Regards, -- Kaleb ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
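For anyone who wants to test these locally rather than just review them in the web UI, the usual Gerrit workflow is roughly the following; the fetch URL and the trailing patch-set number follow the standard Gerrit pattern and may differ, so take the exact ref from the change page:

===
# example for the release-3.7 change (13275); adjust change/patch-set numbers
git fetch http://review.gluster.org/glusterfs refs/changes/75/13275/1
git cherry-pick FETCH_HEAD
===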
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
I perform the tests using 1) rsync (massive copy of millions of files); 2) find (simple tree traversing). To check if memory leak happens, I use find tool. I've performed two traversing (w/ and w/o fopen-keep-cache=off) with remount between them, but I didn't encounter "kernel notifier loop terminated" message during both traversing as well as before unmounting volume. Nevertheless, memory still leaks (at least up to 3 GiB in each case), so I believe invalidation requests are not the case. I've also checked logs for the volume where I do rsync, and the message "kernel notifier loop terminated" happens somewhere in the middle of rsyncing, not before unmounting. But the memory starts leaking on rsync start as well, not just after "kernel notifier loop terminated" message. So, I believe, "kernel notifier loop terminated" is not the case again. Also, I've tried to implement quick and dirty GlusterFS FUSE client using API (see https://github.com/pfactum/xglfs), and with latest patches from this thread (http://review.gluster.org/#/c/13096/, http://review.gluster.org/#/c/ 13125/ and http://review.gluster.org/#/c/13232/) my FUSE client does not leak on tree traversing. So, I believe, this should be related to GlusterFS FUSE implementation. How could I debug memory leak better? On четвер, 21 січня 2016 р. 10:32:32 EET Xavier Hernandez wrote: > If this message appears way before the volume is unmounted, can you try > to start the volume manually using this command and repeat the tests ? > > glusterfs --fopen-keep-cache=off --volfile-server= > --volfile-id=/ > > This will prevent invalidation requests to be sent to the kernel, so > there shouldn't be any memory leak even if the worker thread exits > prematurely. > > If that solves the problem, we could try to determine the cause of the > premature exit and solve it. > > Xavi > > On 20/01/16 10:08, Oleksandr Natalenko wrote: > > Yes, there are couple of messages like this in my logs too (I guess one > > message per each remount): > > > > === > > [2016-01-18 23:42:08.742447] I [fuse-bridge.c:3875:notify_kernel_loop] 0- > > glusterfs-fuse: kernel notifier loop terminated > > === > > > > On середа, 20 січня 2016 р. 09:51:23 EET Xavier Hernandez wrote: > >> I'm seeing a similar problem with 3.7.6. > >> > >> This latest statedump contains a lot of gf_fuse_mt_invalidate_node_t > >> objects in fuse. Looking at the code I see they are used to send > >> invalidations to kernel fuse, however this is done in a separate thread > >> that writes a log message when it exits. On the system I'm seeing the > >> memory leak, I can see that message in the log files: > >> > >> [2016-01-18 23:04:55.384873] I [fuse-bridge.c:3875:notify_kernel_loop] > >> 0-glusterfs-fuse: kernel notifier loop terminated > >> > >> But the volume is still working at this moment, so any future inode > >> invalidations will leak memory because it was this thread that should > >> release it. > >> > >> Can you check if you also see this message in the mount log ? > >> > >> It seems that this thread terminates if write returns any error > >> different than ENOENT. I'm not sure if there could be any other error > >> that can cause this. > >> > >> Xavi > >> > >> On 20/01/16 00:13, Oleksandr Natalenko wrote: > >>> Here is another RAM usage stats and statedump of GlusterFS mount > >>> approaching to just another OOM: > >>> > >>> === > >>> root 32495 1.4 88.3 4943868 1697316 ? 
Ssl Jan13 129:18 > >>> /usr/sbin/ > >>> glusterfs --volfile-server=server.example.com --volfile-id=volume > >>> /mnt/volume === > >>> > >>> https://gist.github.com/86198201c79e927b46bd > >>> > >>> 1.6G of RAM just for almost idle mount (we occasionally store Asterisk > >>> recordings there). Triple OOM for 69 days of uptime. > >>> > >>> Any thoughts? > >>> > >>> On середа, 13 січня 2016 р. 16:26:59 EET Soumya Koduri wrote: > kill -USR1 > >>> > >>> ___ > >>> Gluster-devel mailing list > >>> Gluster-devel@gluster.org > >>> http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
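For comparison, a bare-bones libgfapi client that just lists the volume root looks roughly like the following. This is an illustrative sketch in the spirit of the xglfs experiment mentioned above, not code taken from it; the volume and server names reuse examples from elsewhere in the thread, and error handling is mostly omitted:

===
/* gfapi-ls.c: list the root of a Gluster volume via libgfapi. */
#include <stdio.h>
#include <dirent.h>
#include <glusterfs/api/glfs.h>

int main(void)
{
        glfs_t        *fs;
        glfs_fd_t     *fd;
        struct dirent  entry;
        struct dirent *result = NULL;

        fs = glfs_new("somevolume");                 /* volume name */
        glfs_set_volfile_server(fs, "tcp", "server.example.com", 24007);
        glfs_set_logging(fs, "/tmp/gfapi.log", 7);
        if (glfs_init(fs) != 0)
                return 1;

        fd = glfs_opendir(fs, "/");
        while (glfs_readdir_r(fd, &entry, &result) == 0 && result != NULL)
                printf("%s\n", entry.d_name);
        glfs_closedir(fd);

        glfs_fini(fs);   /* cleanup on exit (the FUSE client reportedly skips this) */
        return 0;
}
===

Build with something like "gcc gfapi-ls.c $(pkg-config --cflags --libs glusterfs-api)". Because the process calls glfs_fini() on the way out, the inode table gets a chance to be torn down before Valgrind takes stock, which is presumably part of why the API-based client looks cleaner here than the FUSE bridge.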
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
If this message appears well before the volume is unmounted, can you try to start the volume manually using this command and repeat the tests?

===
glusterfs --fopen-keep-cache=off --volfile-server=<server> --volfile-id=<volume> <mount point>
===

This will prevent invalidation requests from being sent to the kernel, so there shouldn't be any memory leak even if the worker thread exits prematurely.

If that solves the problem, we could try to determine the cause of the premature exit and solve it.

Xavi

On 20/01/16 10:08, Oleksandr Natalenko wrote:

Yes, there are a couple of messages like this in my logs too (I guess one message per each remount):

===
[2016-01-18 23:42:08.742447] I [fuse-bridge.c:3875:notify_kernel_loop] 0-glusterfs-fuse: kernel notifier loop terminated
===

On середа, 20 січня 2016 р. 09:51:23 EET Xavier Hernandez wrote:

I'm seeing a similar problem with 3.7.6.

This latest statedump contains a lot of gf_fuse_mt_invalidate_node_t objects in fuse. Looking at the code I see they are used to send invalidations to kernel fuse; however, this is done in a separate thread that writes a log message when it exits. On the system where I'm seeing the memory leak, I can see that message in the log files:

[2016-01-18 23:04:55.384873] I [fuse-bridge.c:3875:notify_kernel_loop] 0-glusterfs-fuse: kernel notifier loop terminated

But the volume is still working at this moment, so any future inode invalidations will leak memory because it was this thread that should release it.

Can you check if you also see this message in the mount log?

It seems that this thread terminates if write returns any error other than ENOENT. I'm not sure if there could be any other error that can cause this.

Xavi

On 20/01/16 00:13, Oleksandr Natalenko wrote:

Here are more RAM usage stats and a statedump of the GlusterFS mount approaching yet another OOM:

===
root 32495 1.4 88.3 4943868 1697316 ? Ssl Jan13 129:18 /usr/sbin/glusterfs --volfile-server=server.example.com --volfile-id=volume /mnt/volume
===

https://gist.github.com/86198201c79e927b46bd

1.6 GiB of RAM just for an almost idle mount (we occasionally store Asterisk recordings there). Triple OOM for 69 days of uptime.

Any thoughts?

On середа, 13 січня 2016 р. 16:26:59 EET Soumya Koduri wrote:

kill -USR1 <pid>

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
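Filled in with the names used in the Valgrind run elsewhere in this thread, the suggested invocation would look like, for example:

===
glusterfs --fopen-keep-cache=off \
    --volfile-server=server.example.com \
    --volfile-id=somevolume /mnt/somevolume
===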
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Yes, there are couple of messages like this in my logs too (I guess one message per each remount): === [2016-01-18 23:42:08.742447] I [fuse-bridge.c:3875:notify_kernel_loop] 0- glusterfs-fuse: kernel notifier loop terminated === On середа, 20 січня 2016 р. 09:51:23 EET Xavier Hernandez wrote: > I'm seeing a similar problem with 3.7.6. > > This latest statedump contains a lot of gf_fuse_mt_invalidate_node_t > objects in fuse. Looking at the code I see they are used to send > invalidations to kernel fuse, however this is done in a separate thread > that writes a log message when it exits. On the system I'm seeing the > memory leak, I can see that message in the log files: > > [2016-01-18 23:04:55.384873] I [fuse-bridge.c:3875:notify_kernel_loop] > 0-glusterfs-fuse: kernel notifier loop terminated > > But the volume is still working at this moment, so any future inode > invalidations will leak memory because it was this thread that should > release it. > > Can you check if you also see this message in the mount log ? > > It seems that this thread terminates if write returns any error > different than ENOENT. I'm not sure if there could be any other error > that can cause this. > > Xavi > > On 20/01/16 00:13, Oleksandr Natalenko wrote: > > Here is another RAM usage stats and statedump of GlusterFS mount > > approaching to just another OOM: > > > > === > > root 32495 1.4 88.3 4943868 1697316 ? Ssl Jan13 129:18 > > /usr/sbin/ > > glusterfs --volfile-server=server.example.com --volfile-id=volume > > /mnt/volume === > > > > https://gist.github.com/86198201c79e927b46bd > > > > 1.6G of RAM just for almost idle mount (we occasionally store Asterisk > > recordings there). Triple OOM for 69 days of uptime. > > > > Any thoughts? > > > > On середа, 13 січня 2016 р. 16:26:59 EET Soumya Koduri wrote: > >> kill -USR1 > > > > ___ > > Gluster-devel mailing list > > Gluster-devel@gluster.org > > http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
I'm seeing a similar problem with 3.7.6. This latest statedump contains a lot of gf_fuse_mt_invalidate_node_t objects in fuse. Looking at the code I see they are used to send invalidations to kernel fuse, however this is done in a separate thread that writes a log message when it exits. On the system I'm seeing the memory leak, I can see that message in the log files: [2016-01-18 23:04:55.384873] I [fuse-bridge.c:3875:notify_kernel_loop] 0-glusterfs-fuse: kernel notifier loop terminated But the volume is still working at this moment, so any future inode invalidations will leak memory because it was this thread that should release it. Can you check if you also see this message in the mount log ? It seems that this thread terminates if write returns any error different than ENOENT. I'm not sure if there could be any other error that can cause this. Xavi On 20/01/16 00:13, Oleksandr Natalenko wrote: Here is another RAM usage stats and statedump of GlusterFS mount approaching to just another OOM: === root 32495 1.4 88.3 4943868 1697316 ? Ssl Jan13 129:18 /usr/sbin/ glusterfs --volfile-server=server.example.com --volfile-id=volume /mnt/volume === https://gist.github.com/86198201c79e927b46bd 1.6G of RAM just for almost idle mount (we occasionally store Asterisk recordings there). Triple OOM for 69 days of uptime. Any thoughts? On середа, 13 січня 2016 р. 16:26:59 EET Soumya Koduri wrote: kill -USR1 ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
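For context, the failure mode described here boils down to a worker loop of the following shape. This is an illustrative sketch only; the structure and helper names are invented and it is not the actual fuse-bridge.c source:

===
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

struct invalidate_node { char *buf; size_t len; };

extern int fuse_fd;                                    /* /dev/fuse             */
extern struct invalidate_node *dequeue_invalidate_request(void);
extern void sketch_free(void *ptr);                    /* stands in for GF_FREE */
extern void log_info(const char *msg);                 /* stands in for gf_log  */

static void *notify_kernel_loop_sketch(void *data)
{
        struct invalidate_node *node;
        ssize_t rv;

        (void)data;
        for (;;) {
                node = dequeue_invalidate_request();   /* blocks for work */

                rv = write(fuse_fd, node->buf, node->len);
                sketch_free(node);

                if (rv == -1 && errno != ENOENT)
                        break;                         /* thread exits here */
        }

        /* Producers keep queueing invalidation nodes after this point, but
         * with the worker gone nothing writes or frees them, so the queue
         * only grows: the gf_fuse_mt_invalidate_node_t build-up above. */
        log_info("kernel notifier loop terminated");
        return NULL;
}
===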
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
And another statedump of FUSE mount client consuming more than 7 GiB of RAM: https://gist.github.com/136d7c49193c798b3ade DHT-related leak? On середа, 13 січня 2016 р. 16:26:59 EET Soumya Koduri wrote: > On 01/13/2016 04:08 PM, Soumya Koduri wrote: > > On 01/12/2016 12:46 PM, Oleksandr Natalenko wrote: > >> Just in case, here is Valgrind output on FUSE client with 3.7.6 + > >> API-related patches we discussed before: > >> > >> https://gist.github.com/cd6605ca19734c1496a4 > > > > Thanks for sharing the results. I made changes to fix one leak reported > > there wrt ' client_cbk_cache_invalidation' - > > > > - http://review.gluster.org/#/c/13232/ > > > > The other inode* related memory reported as lost is mainly (maybe) > > because fuse client process doesn't cleanup its memory (doesn't use > > fini()) while exiting the process. Hence majority of those allocations > > are listed as lost. But most of the inodes should have got purged when > > we drop vfs cache. Did you do drop vfs cache before exiting the process? > > > > I shall add some log statements and check that part > > Also please take statedump of the fuse mount process (after dropping vfs > cache) when you see high memory usage by issuing the following command - > 'kill -USR1 ' > > The statedump will be copied to 'glusterdump..dump.tim > estamp` file in /var/run/gluster or /usr/local/var/run/gluster. > Please refer to [1] for more information. > > Thanks, > Soumya > [1] http://review.gluster.org/#/c/8288/1/doc/debugging/statedump.md > > > Thanks, > > Soumya > > > >> 12.01.2016 08:24, Soumya Koduri написав: > >>> For fuse client, I tried vfs drop_caches as suggested by Vijay in an > >>> earlier mail. Though all the inodes get purged, I still doesn't see > >>> much difference in the memory footprint drop. Need to investigate what > >>> else is consuming so much memory here. > > > > ___ > > Gluster-users mailing list > > gluster-us...@gluster.org > > http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Here are more RAM usage stats and a statedump of the GlusterFS mount approaching yet another OOM:

===
root 32495 1.4 88.3 4943868 1697316 ? Ssl Jan13 129:18 /usr/sbin/glusterfs --volfile-server=server.example.com --volfile-id=volume /mnt/volume
===

https://gist.github.com/86198201c79e927b46bd

1.6 GiB of RAM just for an almost idle mount (we occasionally store Asterisk recordings there). Triple OOM for 69 days of uptime.

Any thoughts?

On середа, 13 січня 2016 р. 16:26:59 EET Soumya Koduri wrote:
> kill -USR1 <pid>

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
I've applied client_cbk_cache_invalidation leak patch, and here are the results. Launch: === valgrind --leak-check=full --show-leak-kinds=all --log-file="valgrind_fuse.log" /usr/bin/glusterfs -N --volfile-server=server.example.com --volfile-id=somevolume /mnt/somevolume find /mnt/somevolume -type d === During the traversing, memory RSS value for glusterfs process went from 79M to 644M. Then I performed dropping VFS cache (as I did in previous tests), but RSS value was not affected. Then I did statedump: https://gist.github.com/11c7b11fc99ab123e6e2 Then I unmounted the volume and got Valgrind log: https://gist.github.com/99d2e3c5cb4ed50b091c Leaks reported by Valgrind do not conform by their size to overall runtime memory consumption, so I believe with the latest patch some cleanup is being performed better on exit (unmount), but in runtime there are still some issues. 13.01.2016 12:56, Soumya Koduri написав: On 01/13/2016 04:08 PM, Soumya Koduri wrote: On 01/12/2016 12:46 PM, Oleksandr Natalenko wrote: Just in case, here is Valgrind output on FUSE client with 3.7.6 + API-related patches we discussed before: https://gist.github.com/cd6605ca19734c1496a4 Thanks for sharing the results. I made changes to fix one leak reported there wrt ' client_cbk_cache_invalidation' - - http://review.gluster.org/#/c/13232/ The other inode* related memory reported as lost is mainly (maybe) because fuse client process doesn't cleanup its memory (doesn't use fini()) while exiting the process. Hence majority of those allocations are listed as lost. But most of the inodes should have got purged when we drop vfs cache. Did you do drop vfs cache before exiting the process? I shall add some log statements and check that part Also please take statedump of the fuse mount process (after dropping vfs cache) when you see high memory usage by issuing the following command - 'kill -USR1 ' The statedump will be copied to 'glusterdump..dump.tim estamp` file in /var/run/gluster or /usr/local/var/run/gluster. Please refer to [1] for more information. Thanks, Soumya [1] http://review.gluster.org/#/c/8288/1/doc/debugging/statedump.md Thanks, Soumya 12.01.2016 08:24, Soumya Koduri написав: For fuse client, I tried vfs drop_caches as suggested by Vijay in an earlier mail. Though all the inodes get purged, I still doesn't see much difference in the memory footprint drop. Need to investigate what else is consuming so much memory here. ___ Gluster-users mailing list gluster-us...@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/13/2016 04:08 PM, Soumya Koduri wrote: On 01/12/2016 12:46 PM, Oleksandr Natalenko wrote: Just in case, here is Valgrind output on FUSE client with 3.7.6 + API-related patches we discussed before: https://gist.github.com/cd6605ca19734c1496a4 Thanks for sharing the results. I made changes to fix one leak reported there wrt ' client_cbk_cache_invalidation' - - http://review.gluster.org/#/c/13232/ The other inode* related memory reported as lost is mainly (maybe) because fuse client process doesn't cleanup its memory (doesn't use fini()) while exiting the process. Hence majority of those allocations are listed as lost. But most of the inodes should have got purged when we drop vfs cache. Did you do drop vfs cache before exiting the process? I shall add some log statements and check that part Also please take statedump of the fuse mount process (after dropping vfs cache) when you see high memory usage by issuing the following command - 'kill -USR1 ' The statedump will be copied to 'glusterdump..dump.tim estamp` file in /var/run/gluster or /usr/local/var/run/gluster. Please refer to [1] for more information. Thanks, Soumya [1] http://review.gluster.org/#/c/8288/1/doc/debugging/statedump.md Thanks, Soumya 12.01.2016 08:24, Soumya Koduri написав: For fuse client, I tried vfs drop_caches as suggested by Vijay in an earlier mail. Though all the inodes get purged, I still doesn't see much difference in the memory footprint drop. Need to investigate what else is consuming so much memory here. ___ Gluster-users mailing list gluster-us...@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/12/2016 12:17 PM, Mathieu Chateau wrote: I tried like suggested: echo 3 > /proc/sys/vm/drop_caches sync It lower a bit usage: before: Images intégrées 2 after: Images intégrées 1 Thanks Mathieu. There is a drop in memory usage after dropping vfs cache but doesn't seem significant. Not sure at this point what else may be consuming most of the memory. Maybe after dropping the vfs cache, you could as well take valgrind results and see if there are huge chunks reported as lost from inode_new*. I shall too look into it further and update. -Soumya Cordialement, Mathieu CHATEAU http://www.lotp.fr 2016-01-12 7:34 GMT+01:00 Mathieu Chateau mailto:mathieu.chat...@lotp.fr>>: Hello, I also experience high memory usage on my gluster clients. Sample : Images intégrées 1 Can I help in testing/debugging ? Cordialement, Mathieu CHATEAU http://www.lotp.fr 2016-01-12 7:24 GMT+01:00 Soumya Koduri mailto:skod...@redhat.com>>: On 01/11/2016 05:11 PM, Oleksandr Natalenko wrote: Brief test shows that Ganesha stopped leaking and crashing, so it seems to be good for me. Thanks for checking. Nevertheless, back to my original question: what about FUSE client? It is still leaking despite all the fixes applied. Should it be considered another issue? For fuse client, I tried vfs drop_caches as suggested by Vijay in an earlier mail. Though all the inodes get purged, I still doesn't see much difference in the memory footprint drop. Need to investigate what else is consuming so much memory here. Thanks, Soumya 11.01.2016 12:26, Soumya Koduri написав: I have made changes to fix the lookup leak in a different way (as discussed with Pranith) and uploaded them in the latest patch set #4 - http://review.gluster.org/#/c/13096/ Please check if it resolves the mem leak and hopefully doesn't result in any assertion :) Thanks, Soumya On 01/08/2016 05:04 PM, Soumya Koduri wrote: I could reproduce while testing deep directories with in the mount point. I root caus'ed the issue & had discussion with Pranith to understand the purpose and recommended way of taking nlookup on inodes. I shall make changes to my existing fix and post the patch soon. Thanks for your patience! -Soumya On 01/07/2016 07:34 PM, Oleksandr Natalenko wrote: OK, I've patched GlusterFS v3.7.6 with 43570a01 and 5cffb56b (the most recent revisions) and NFS-Ganesha v2.3.0 with 8685abfc (most recent revision too). On traversing GlusterFS volume with many files in one folder via NFS mount I get an assertion: === ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >= nlookup' failed. === I used GDB on NFS-Ganesha process to get appropriate stacktraces: 1. short stacktrace of failed thread: https://gist.github.com/7f63bb99c530d26ded18 2. full stacktrace of failed thread: https://gist.github.com/d9bc7bc8f6a0bbff9e86 3. short stacktrace of all threads: https://gist.github.com/f31da7725306854c719f 4. full stacktrace of all threads: https://gist.github.com/65cbc562b01211ea5612 GlusterFS volume configuration: https://gist.github.com/30f0129d16e25d4a5a52 ganesha.conf: https://gist.github.com/9b5e59b8d6d8cb84c85d How I mount NFS share: === mount -t nfs4 127.0.0.1:/mail_boxes /mnt/tmp -o defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100 === On четвер, 7 січня 2016 р. 12:06:42 EET Soumya Koduri wrote: Entries_HWMark = 500; ___ Glus
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/12/2016 12:46 PM, Oleksandr Natalenko wrote: Just in case, here is Valgrind output on FUSE client with 3.7.6 + API-related patches we discussed before: https://gist.github.com/cd6605ca19734c1496a4 Thanks for sharing the results. I made changes to fix one leak reported there wrt ' client_cbk_cache_invalidation' - - http://review.gluster.org/#/c/13232/ The other inode* related memory reported as lost is mainly (maybe) because fuse client process doesn't cleanup its memory (doesn't use fini()) while exiting the process. Hence majority of those allocations are listed as lost. But most of the inodes should have got purged when we drop vfs cache. Did you do drop vfs cache before exiting the process? I shall add some log statements and check that part Thanks, Soumya 12.01.2016 08:24, Soumya Koduri написав: For fuse client, I tried vfs drop_caches as suggested by Vijay in an earlier mail. Though all the inodes get purged, I still doesn't see much difference in the memory footprint drop. Need to investigate what else is consuming so much memory here. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
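In practice, the drop-the-cache-before-exit sequence being asked about here is something like (mount point illustrative):

===
sync
echo 3 > /proc/sys/vm/drop_caches   # kernel drops dentries/inodes and sends FORGETs to fuse
umount /mnt/somevolume              # now the client exits and the valgrind report is written
===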
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
I tried like suggested: echo 3 > /proc/sys/vm/drop_caches sync It lower a bit usage: before: [image: Images intégrées 2] after: [image: Images intégrées 1] Cordialement, Mathieu CHATEAU http://www.lotp.fr 2016-01-12 7:34 GMT+01:00 Mathieu Chateau : > Hello, > > I also experience high memory usage on my gluster clients. Sample : > [image: Images intégrées 1] > > Can I help in testing/debugging ? > > > > Cordialement, > Mathieu CHATEAU > http://www.lotp.fr > > 2016-01-12 7:24 GMT+01:00 Soumya Koduri : > >> >> >> On 01/11/2016 05:11 PM, Oleksandr Natalenko wrote: >> >>> Brief test shows that Ganesha stopped leaking and crashing, so it seems >>> to be good for me. >>> >>> Thanks for checking. >> >> Nevertheless, back to my original question: what about FUSE client? It >>> is still leaking despite all the fixes applied. Should it be considered >>> another issue? >>> >> >> For fuse client, I tried vfs drop_caches as suggested by Vijay in an >> earlier mail. Though all the inodes get purged, I still doesn't see much >> difference in the memory footprint drop. Need to investigate what else is >> consuming so much memory here. >> >> Thanks, >> Soumya >> >> >> >>> 11.01.2016 12:26, Soumya Koduri написав: >>> I have made changes to fix the lookup leak in a different way (as discussed with Pranith) and uploaded them in the latest patch set #4 - http://review.gluster.org/#/c/13096/ Please check if it resolves the mem leak and hopefully doesn't result in any assertion :) Thanks, Soumya On 01/08/2016 05:04 PM, Soumya Koduri wrote: > I could reproduce while testing deep directories with in the mount > point. I root caus'ed the issue & had discussion with Pranith to > understand the purpose and recommended way of taking nlookup on inodes. > > I shall make changes to my existing fix and post the patch soon. > Thanks for your patience! > > -Soumya > > On 01/07/2016 07:34 PM, Oleksandr Natalenko wrote: > >> OK, I've patched GlusterFS v3.7.6 with 43570a01 and 5cffb56b (the most >> recent >> revisions) and NFS-Ganesha v2.3.0 with 8685abfc (most recent revision >> too). >> >> On traversing GlusterFS volume with many files in one folder via NFS >> mount I >> get an assertion: >> >> === >> ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >> >= >> nlookup' failed. >> === >> >> I used GDB on NFS-Ganesha process to get appropriate stacktraces: >> >> 1. short stacktrace of failed thread: >> >> https://gist.github.com/7f63bb99c530d26ded18 >> >> 2. full stacktrace of failed thread: >> >> https://gist.github.com/d9bc7bc8f6a0bbff9e86 >> >> 3. short stacktrace of all threads: >> >> https://gist.github.com/f31da7725306854c719f >> >> 4. full stacktrace of all threads: >> >> https://gist.github.com/65cbc562b01211ea5612 >> >> GlusterFS volume configuration: >> >> https://gist.github.com/30f0129d16e25d4a5a52 >> >> ganesha.conf: >> >> https://gist.github.com/9b5e59b8d6d8cb84c85d >> >> How I mount NFS share: >> >> === >> mount -t nfs4 127.0.0.1:/mail_boxes /mnt/tmp -o >> defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100 >> === >> >> On четвер, 7 січня 2016 р. 12:06:42 EET Soumya Koduri wrote: >> >>> Entries_HWMark = 500; >>> >> >> >> ___ > Gluster-users mailing list > gluster-us...@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-users > ___ >> Gluster-users mailing list >> gluster-us...@gluster.org >> http://www.gluster.org/mailman/listinfo/gluster-users >> > > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Hello, I also experience high memory usage on my gluster clients. Sample : [image: Images intégrées 1] Can I help in testing/debugging ? Cordialement, Mathieu CHATEAU http://www.lotp.fr 2016-01-12 7:24 GMT+01:00 Soumya Koduri : > > > On 01/11/2016 05:11 PM, Oleksandr Natalenko wrote: > >> Brief test shows that Ganesha stopped leaking and crashing, so it seems >> to be good for me. >> >> Thanks for checking. > > Nevertheless, back to my original question: what about FUSE client? It >> is still leaking despite all the fixes applied. Should it be considered >> another issue? >> > > For fuse client, I tried vfs drop_caches as suggested by Vijay in an > earlier mail. Though all the inodes get purged, I still doesn't see much > difference in the memory footprint drop. Need to investigate what else is > consuming so much memory here. > > Thanks, > Soumya > > > >> 11.01.2016 12:26, Soumya Koduri написав: >> >>> I have made changes to fix the lookup leak in a different way (as >>> discussed with Pranith) and uploaded them in the latest patch set #4 >>> - http://review.gluster.org/#/c/13096/ >>> >>> Please check if it resolves the mem leak and hopefully doesn't result >>> in any assertion :) >>> >>> Thanks, >>> Soumya >>> >>> On 01/08/2016 05:04 PM, Soumya Koduri wrote: >>> I could reproduce while testing deep directories with in the mount point. I root caus'ed the issue & had discussion with Pranith to understand the purpose and recommended way of taking nlookup on inodes. I shall make changes to my existing fix and post the patch soon. Thanks for your patience! -Soumya On 01/07/2016 07:34 PM, Oleksandr Natalenko wrote: > OK, I've patched GlusterFS v3.7.6 with 43570a01 and 5cffb56b (the most > recent > revisions) and NFS-Ganesha v2.3.0 with 8685abfc (most recent revision > too). > > On traversing GlusterFS volume with many files in one folder via NFS > mount I > get an assertion: > > === > ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >= > nlookup' failed. > === > > I used GDB on NFS-Ganesha process to get appropriate stacktraces: > > 1. short stacktrace of failed thread: > > https://gist.github.com/7f63bb99c530d26ded18 > > 2. full stacktrace of failed thread: > > https://gist.github.com/d9bc7bc8f6a0bbff9e86 > > 3. short stacktrace of all threads: > > https://gist.github.com/f31da7725306854c719f > > 4. full stacktrace of all threads: > > https://gist.github.com/65cbc562b01211ea5612 > > GlusterFS volume configuration: > > https://gist.github.com/30f0129d16e25d4a5a52 > > ganesha.conf: > > https://gist.github.com/9b5e59b8d6d8cb84c85d > > How I mount NFS share: > > === > mount -t nfs4 127.0.0.1:/mail_boxes /mnt/tmp -o > defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100 > === > > On четвер, 7 січня 2016 р. 12:06:42 EET Soumya Koduri wrote: > >> Entries_HWMark = 500; >> > > > ___ Gluster-users mailing list gluster-us...@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users >>> ___ > Gluster-users mailing list > gluster-us...@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-users > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Just in case, here is Valgrind output for the FUSE client with 3.7.6 + the API-related patches we discussed before: https://gist.github.com/cd6605ca19734c1496a4

12.01.2016 08:24, Soumya Koduri wrote:
> For the fuse client, I tried vfs drop_caches as suggested by Vijay in an earlier mail. Though all the inodes get purged, I still don't see much difference in the memory footprint drop. Need to investigate what else is consuming so much memory here.

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/11/2016 05:11 PM, Oleksandr Natalenko wrote: Brief test shows that Ganesha stopped leaking and crashing, so it seems to be good for me. Thanks for checking. Nevertheless, back to my original question: what about FUSE client? It is still leaking despite all the fixes applied. Should it be considered another issue? For fuse client, I tried vfs drop_caches as suggested by Vijay in an earlier mail. Though all the inodes get purged, I still doesn't see much difference in the memory footprint drop. Need to investigate what else is consuming so much memory here. Thanks, Soumya 11.01.2016 12:26, Soumya Koduri написав: I have made changes to fix the lookup leak in a different way (as discussed with Pranith) and uploaded them in the latest patch set #4 - http://review.gluster.org/#/c/13096/ Please check if it resolves the mem leak and hopefully doesn't result in any assertion :) Thanks, Soumya On 01/08/2016 05:04 PM, Soumya Koduri wrote: I could reproduce while testing deep directories with in the mount point. I root caus'ed the issue & had discussion with Pranith to understand the purpose and recommended way of taking nlookup on inodes. I shall make changes to my existing fix and post the patch soon. Thanks for your patience! -Soumya On 01/07/2016 07:34 PM, Oleksandr Natalenko wrote: OK, I've patched GlusterFS v3.7.6 with 43570a01 and 5cffb56b (the most recent revisions) and NFS-Ganesha v2.3.0 with 8685abfc (most recent revision too). On traversing GlusterFS volume with many files in one folder via NFS mount I get an assertion: === ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >= nlookup' failed. === I used GDB on NFS-Ganesha process to get appropriate stacktraces: 1. short stacktrace of failed thread: https://gist.github.com/7f63bb99c530d26ded18 2. full stacktrace of failed thread: https://gist.github.com/d9bc7bc8f6a0bbff9e86 3. short stacktrace of all threads: https://gist.github.com/f31da7725306854c719f 4. full stacktrace of all threads: https://gist.github.com/65cbc562b01211ea5612 GlusterFS volume configuration: https://gist.github.com/30f0129d16e25d4a5a52 ganesha.conf: https://gist.github.com/9b5e59b8d6d8cb84c85d How I mount NFS share: === mount -t nfs4 127.0.0.1:/mail_boxes /mnt/tmp -o defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100 === On четвер, 7 січня 2016 р. 12:06:42 EET Soumya Koduri wrote: Entries_HWMark = 500; ___ Gluster-users mailing list gluster-us...@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Brief test shows that Ganesha stopped leaking and crashing, so it seems to be good for me. Nevertheless, back to my original question: what about FUSE client? It is still leaking despite all the fixes applied. Should it be considered another issue? 11.01.2016 12:26, Soumya Koduri написав: I have made changes to fix the lookup leak in a different way (as discussed with Pranith) and uploaded them in the latest patch set #4 - http://review.gluster.org/#/c/13096/ Please check if it resolves the mem leak and hopefully doesn't result in any assertion :) Thanks, Soumya On 01/08/2016 05:04 PM, Soumya Koduri wrote: I could reproduce while testing deep directories with in the mount point. I root caus'ed the issue & had discussion with Pranith to understand the purpose and recommended way of taking nlookup on inodes. I shall make changes to my existing fix and post the patch soon. Thanks for your patience! -Soumya On 01/07/2016 07:34 PM, Oleksandr Natalenko wrote: OK, I've patched GlusterFS v3.7.6 with 43570a01 and 5cffb56b (the most recent revisions) and NFS-Ganesha v2.3.0 with 8685abfc (most recent revision too). On traversing GlusterFS volume with many files in one folder via NFS mount I get an assertion: === ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >= nlookup' failed. === I used GDB on NFS-Ganesha process to get appropriate stacktraces: 1. short stacktrace of failed thread: https://gist.github.com/7f63bb99c530d26ded18 2. full stacktrace of failed thread: https://gist.github.com/d9bc7bc8f6a0bbff9e86 3. short stacktrace of all threads: https://gist.github.com/f31da7725306854c719f 4. full stacktrace of all threads: https://gist.github.com/65cbc562b01211ea5612 GlusterFS volume configuration: https://gist.github.com/30f0129d16e25d4a5a52 ganesha.conf: https://gist.github.com/9b5e59b8d6d8cb84c85d How I mount NFS share: === mount -t nfs4 127.0.0.1:/mail_boxes /mnt/tmp -o defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100 === On четвер, 7 січня 2016 р. 12:06:42 EET Soumya Koduri wrote: Entries_HWMark = 500; ___ Gluster-users mailing list gluster-us...@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
I have made changes to fix the lookup leak in a different way (as discussed with Pranith) and uploaded them in the latest patch set #4 - http://review.gluster.org/#/c/13096/ Please check if it resolves the mem leak and hopefully doesn't result in any assertion :) Thanks, Soumya On 01/08/2016 05:04 PM, Soumya Koduri wrote: I could reproduce while testing deep directories with in the mount point. I root caus'ed the issue & had discussion with Pranith to understand the purpose and recommended way of taking nlookup on inodes. I shall make changes to my existing fix and post the patch soon. Thanks for your patience! -Soumya On 01/07/2016 07:34 PM, Oleksandr Natalenko wrote: OK, I've patched GlusterFS v3.7.6 with 43570a01 and 5cffb56b (the most recent revisions) and NFS-Ganesha v2.3.0 with 8685abfc (most recent revision too). On traversing GlusterFS volume with many files in one folder via NFS mount I get an assertion: === ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >= nlookup' failed. === I used GDB on NFS-Ganesha process to get appropriate stacktraces: 1. short stacktrace of failed thread: https://gist.github.com/7f63bb99c530d26ded18 2. full stacktrace of failed thread: https://gist.github.com/d9bc7bc8f6a0bbff9e86 3. short stacktrace of all threads: https://gist.github.com/f31da7725306854c719f 4. full stacktrace of all threads: https://gist.github.com/65cbc562b01211ea5612 GlusterFS volume configuration: https://gist.github.com/30f0129d16e25d4a5a52 ganesha.conf: https://gist.github.com/9b5e59b8d6d8cb84c85d How I mount NFS share: === mount -t nfs4 127.0.0.1:/mail_boxes /mnt/tmp -o defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100 === On четвер, 7 січня 2016 р. 12:06:42 EET Soumya Koduri wrote: Entries_HWMark = 500; ___ Gluster-users mailing list gluster-us...@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
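For anyone wanting to try the same change, patch sets on review.gluster.org can typically be pulled via Gerrit's change refs. A sketch, assuming the standard Gerrit ref layout for change 13096, patch set 4 (the repository URL is an assumption):

===
git clone https://review.gluster.org/glusterfs
cd glusterfs
git fetch origin refs/changes/96/13096/4
git cherry-pick FETCH_HEAD
===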
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
I could reproduce while testing deep directories with in the mount point. I root caus'ed the issue & had discussion with Pranith to understand the purpose and recommended way of taking nlookup on inodes. I shall make changes to my existing fix and post the patch soon. Thanks for your patience! -Soumya On 01/07/2016 07:34 PM, Oleksandr Natalenko wrote: OK, I've patched GlusterFS v3.7.6 with 43570a01 and 5cffb56b (the most recent revisions) and NFS-Ganesha v2.3.0 with 8685abfc (most recent revision too). On traversing GlusterFS volume with many files in one folder via NFS mount I get an assertion: === ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >= nlookup' failed. === I used GDB on NFS-Ganesha process to get appropriate stacktraces: 1. short stacktrace of failed thread: https://gist.github.com/7f63bb99c530d26ded18 2. full stacktrace of failed thread: https://gist.github.com/d9bc7bc8f6a0bbff9e86 3. short stacktrace of all threads: https://gist.github.com/f31da7725306854c719f 4. full stacktrace of all threads: https://gist.github.com/65cbc562b01211ea5612 GlusterFS volume configuration: https://gist.github.com/30f0129d16e25d4a5a52 ganesha.conf: https://gist.github.com/9b5e59b8d6d8cb84c85d How I mount NFS share: === mount -t nfs4 127.0.0.1:/mail_boxes /mnt/tmp -o defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100 === On четвер, 7 січня 2016 р. 12:06:42 EET Soumya Koduri wrote: Entries_HWMark = 500; ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
OK, I've patched GlusterFS v3.7.6 with 43570a01 and 5cffb56b (the most recent revisions) and NFS-Ganesha v2.3.0 with 8685abfc (also the most recent revision).

While traversing a GlusterFS volume with many files in one folder via an NFS mount, I get an assertion:

===
ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >= nlookup' failed.
===

I used GDB on the NFS-Ganesha process to get the appropriate stacktraces:

1. short stacktrace of the failed thread: https://gist.github.com/7f63bb99c530d26ded18
2. full stacktrace of the failed thread: https://gist.github.com/d9bc7bc8f6a0bbff9e86
3. short stacktrace of all threads: https://gist.github.com/f31da7725306854c719f
4. full stacktrace of all threads: https://gist.github.com/65cbc562b01211ea5612

GlusterFS volume configuration: https://gist.github.com/30f0129d16e25d4a5a52

ganesha.conf: https://gist.github.com/9b5e59b8d6d8cb84c85d

How I mount the NFS share:

===
mount -t nfs4 127.0.0.1:/mail_boxes /mnt/tmp -o defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100
===

On Thursday, 7 January 2016, 12:06:42 EET Soumya Koduri wrote:
> Entries_HWMark = 500;

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
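For reference, stacktraces like the ones linked above can be captured by attaching gdb to the running ganesha.nfsd process. A sketch (assumes gdb and the relevant debuginfo packages are installed):

===
gdb -p "$(pidof ganesha.nfsd)" -batch \
    -ex "thread apply all bt" \
    -ex "thread apply all bt full" > ganesha-backtraces.txt 2>&1
===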
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/06/2016 01:58 PM, Oleksandr Natalenko wrote: OK, here is valgrind log of patched Ganesha (I took recent version of your patchset, 8685abfc6d) with Entries_HWMARK set to 500. https://gist.github.com/5397c152a259b9600af0 See no huge runtime leaks now. Glad to hear this :) However, I've repeated this test with another volume in replica and got the following Ganesha error: === ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >= nlookup' failed. === I repeated the tests on replica volume as well. But haven't hit any assert. Could you confirm if you have taken the latest gluster patch set #3 ? - http://review.gluster.org/#/c/13096/3 If you are hitting the issue even then, please provide the core if possible. Thanks, Soumya 06.01.2016 08:40, Soumya Koduri написав: On 01/06/2016 03:53 AM, Oleksandr Natalenko wrote: OK, I've repeated the same traversing test with patched GlusterFS API, and here is new Valgrind log: https://gist.github.com/17ecb16a11c9aed957f5 Fuse mount doesn't use gfapi helper. Does your above GlusterFS API application call glfs_fini() during exit? glfs_fini() is responsible for freeing the memory consumed by gfAPI applications. Could you repeat the test with nfs-ganesha (which for sure calls glfs_fini() and purges inodes if exceeds its inode cache limit) if possible. Thanks, Soumya Still leaks. On вівторок, 5 січня 2016 р. 22:52:25 EET Soumya Koduri wrote: On 01/05/2016 05:56 PM, Oleksandr Natalenko wrote: Unfortunately, both patches didn't make any difference for me. I've patched 3.7.6 with both patches, recompiled and installed patched GlusterFS package on client side and mounted volume with ~2M of files. The I performed usual tree traverse with simple "find". Memory RES value went from ~130M at the moment of mounting to ~1.5G after traversing the volume for ~40 mins. Valgrind log still shows lots of leaks. Here it is: https://gist.github.com/56906ca6e657c4ffa4a1 Looks like you had done fuse mount. The patches which I have pasted below apply to gfapi/nfs-ganesha applications. Also, to resolve the nfs-ganesha issue which I had mentioned below (in case if Entries_HWMARK option gets changed), I have posted below fix - https://review.gerrithub.io/#/c/258687 Thanks, Soumya Ideas? 05.01.2016 12:31, Soumya Koduri написав: I tried to debug the inode* related leaks and seen some improvements after applying the below patches when ran the same test (but will smaller load). Could you please apply those patches & confirm the same? a) http://review.gluster.org/13125 This will fix the inodes & their ctx related leaks during unexport and the program exit. Please check the valgrind output after applying the patch. It should not list any inodes related memory as lost. b) http://review.gluster.org/13096 The reason the change in Entries_HWMARK (in your earlier mail) dint have much effect is that the inode_nlookup count doesn't become zero for those handles/inodes being closed by ganesha. Hence those inodes shall get added to inode lru list instead of purge list which shall get forcefully purged only when the number of gfapi inode table entries reaches its limit (which is 137012). This patch fixes those 'nlookup' counts. Please apply this patch and reduce 'Entries_HWMARK' to much lower value and check if it decreases the in-memory being consumed by ganesha process while being active. CACHEINODE { Entries_HWMark = 500; } Note: I see an issue with nfs-ganesha during exit when the option 'Entries_HWMARK' gets changed. 
This is not related to any of the above patches (or rather Gluster) and I am currently debugging it. Thanks, Soumya On 12/25/2015 11:34 PM, Oleksandr Natalenko wrote: 1. test with Cache_Size = 256 and Entries_HWMark = 4096 Before find . -type f: root 3120 0.6 11.0 879120 208408 ? Ssl 17:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 3120 11.4 24.3 1170076 458168 ? Ssl 17:39 13:39 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~250M leak. 2. test with default values (after ganesha restart) Before: root 24937 1.3 10.4 875016 197808 ? Ssl 19:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 24937 3.5 18.9 1022544 356340 ? Ssl 19:39 0:40 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~159M leak. No reasonable correlation detected. Second test was finished much faster than first (I guess, server-side GlusterFS cache or server kernel page cache is the cause). There are ~1.8M files on this test volume. On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: Another addition: it seems to be GlusterFS API library memory leak because NFS-Ganesha also consumes huge amount of memory while doi
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
OK, here is valgrind log of patched Ganesha (I took recent version of your patchset, 8685abfc6d) with Entries_HWMARK set to 500. https://gist.github.com/5397c152a259b9600af0 See no huge runtime leaks now. However, I've repeated this test with another volume in replica and got the following Ganesha error: === ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >= nlookup' failed. === 06.01.2016 08:40, Soumya Koduri написав: On 01/06/2016 03:53 AM, Oleksandr Natalenko wrote: OK, I've repeated the same traversing test with patched GlusterFS API, and here is new Valgrind log: https://gist.github.com/17ecb16a11c9aed957f5 Fuse mount doesn't use gfapi helper. Does your above GlusterFS API application call glfs_fini() during exit? glfs_fini() is responsible for freeing the memory consumed by gfAPI applications. Could you repeat the test with nfs-ganesha (which for sure calls glfs_fini() and purges inodes if exceeds its inode cache limit) if possible. Thanks, Soumya Still leaks. On вівторок, 5 січня 2016 р. 22:52:25 EET Soumya Koduri wrote: On 01/05/2016 05:56 PM, Oleksandr Natalenko wrote: Unfortunately, both patches didn't make any difference for me. I've patched 3.7.6 with both patches, recompiled and installed patched GlusterFS package on client side and mounted volume with ~2M of files. The I performed usual tree traverse with simple "find". Memory RES value went from ~130M at the moment of mounting to ~1.5G after traversing the volume for ~40 mins. Valgrind log still shows lots of leaks. Here it is: https://gist.github.com/56906ca6e657c4ffa4a1 Looks like you had done fuse mount. The patches which I have pasted below apply to gfapi/nfs-ganesha applications. Also, to resolve the nfs-ganesha issue which I had mentioned below (in case if Entries_HWMARK option gets changed), I have posted below fix - https://review.gerrithub.io/#/c/258687 Thanks, Soumya Ideas? 05.01.2016 12:31, Soumya Koduri написав: I tried to debug the inode* related leaks and seen some improvements after applying the below patches when ran the same test (but will smaller load). Could you please apply those patches & confirm the same? a) http://review.gluster.org/13125 This will fix the inodes & their ctx related leaks during unexport and the program exit. Please check the valgrind output after applying the patch. It should not list any inodes related memory as lost. b) http://review.gluster.org/13096 The reason the change in Entries_HWMARK (in your earlier mail) dint have much effect is that the inode_nlookup count doesn't become zero for those handles/inodes being closed by ganesha. Hence those inodes shall get added to inode lru list instead of purge list which shall get forcefully purged only when the number of gfapi inode table entries reaches its limit (which is 137012). This patch fixes those 'nlookup' counts. Please apply this patch and reduce 'Entries_HWMARK' to much lower value and check if it decreases the in-memory being consumed by ganesha process while being active. CACHEINODE { Entries_HWMark = 500; } Note: I see an issue with nfs-ganesha during exit when the option 'Entries_HWMARK' gets changed. This is not related to any of the above patches (or rather Gluster) and I am currently debugging it. Thanks, Soumya On 12/25/2015 11:34 PM, Oleksandr Natalenko wrote: 1. test with Cache_Size = 256 and Entries_HWMark = 4096 Before find . -type f: root 3120 0.6 11.0 879120 208408 ? 
Ssl 17:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 3120 11.4 24.3 1170076 458168 ? Ssl 17:39 13:39 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~250M leak. 2. test with default values (after ganesha restart) Before: root 24937 1.3 10.4 875016 197808 ? Ssl 19:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 24937 3.5 18.9 1022544 356340 ? Ssl 19:39 0:40 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~159M leak. No reasonable correlation detected. Second test was finished much faster than first (I guess, server-side GlusterFS cache or server kernel page cache is the cause). There are ~1.8M files on this test volume. On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: Another addition: it seems to be GlusterFS API library memory leak because NFS-Ganesha also consumes huge amount of memory while doing ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory usage: === root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT === 1.4G is too much for simple stat() :(. Ideas? nfs-ganesha also has cache layer
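For reference, before/after numbers like the ones quoted above can be taken with a simple snapshot of the ganesha.nfsd process around the traversal. A sketch (the /mnt/tmp mount point from earlier in the thread is assumed; adjust as needed):

===
ps -o rss=,vsz= -p "$(pidof ganesha.nfsd)"   # before
find /mnt/tmp -type f > /dev/null            # traversal over the NFS mount
ps -o rss=,vsz= -p "$(pidof ganesha.nfsd)"   # after
===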
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/06/2016 03:53 AM, Oleksandr Natalenko wrote: OK, I've repeated the same traversing test with patched GlusterFS API, and here is new Valgrind log: https://gist.github.com/17ecb16a11c9aed957f5 Fuse mount doesn't use gfapi helper. Does your above GlusterFS API application call glfs_fini() during exit? glfs_fini() is responsible for freeing the memory consumed by gfAPI applications. Could you repeat the test with nfs-ganesha (which for sure calls glfs_fini() and purges inodes if exceeds its inode cache limit) if possible. Thanks, Soumya Still leaks. On вівторок, 5 січня 2016 р. 22:52:25 EET Soumya Koduri wrote: On 01/05/2016 05:56 PM, Oleksandr Natalenko wrote: Unfortunately, both patches didn't make any difference for me. I've patched 3.7.6 with both patches, recompiled and installed patched GlusterFS package on client side and mounted volume with ~2M of files. The I performed usual tree traverse with simple "find". Memory RES value went from ~130M at the moment of mounting to ~1.5G after traversing the volume for ~40 mins. Valgrind log still shows lots of leaks. Here it is: https://gist.github.com/56906ca6e657c4ffa4a1 Looks like you had done fuse mount. The patches which I have pasted below apply to gfapi/nfs-ganesha applications. Also, to resolve the nfs-ganesha issue which I had mentioned below (in case if Entries_HWMARK option gets changed), I have posted below fix - https://review.gerrithub.io/#/c/258687 Thanks, Soumya Ideas? 05.01.2016 12:31, Soumya Koduri написав: I tried to debug the inode* related leaks and seen some improvements after applying the below patches when ran the same test (but will smaller load). Could you please apply those patches & confirm the same? a) http://review.gluster.org/13125 This will fix the inodes & their ctx related leaks during unexport and the program exit. Please check the valgrind output after applying the patch. It should not list any inodes related memory as lost. b) http://review.gluster.org/13096 The reason the change in Entries_HWMARK (in your earlier mail) dint have much effect is that the inode_nlookup count doesn't become zero for those handles/inodes being closed by ganesha. Hence those inodes shall get added to inode lru list instead of purge list which shall get forcefully purged only when the number of gfapi inode table entries reaches its limit (which is 137012). This patch fixes those 'nlookup' counts. Please apply this patch and reduce 'Entries_HWMARK' to much lower value and check if it decreases the in-memory being consumed by ganesha process while being active. CACHEINODE { Entries_HWMark = 500; } Note: I see an issue with nfs-ganesha during exit when the option 'Entries_HWMARK' gets changed. This is not related to any of the above patches (or rather Gluster) and I am currently debugging it. Thanks, Soumya On 12/25/2015 11:34 PM, Oleksandr Natalenko wrote: 1. test with Cache_Size = 256 and Entries_HWMark = 4096 Before find . -type f: root 3120 0.6 11.0 879120 208408 ? Ssl 17:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 3120 11.4 24.3 1170076 458168 ? Ssl 17:39 13:39 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~250M leak. 2. test with default values (after ganesha restart) Before: root 24937 1.3 10.4 875016 197808 ? Ssl 19:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 24937 3.5 18.9 1022544 356340 ? 
Ssl 19:39 0:40 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~159M leak. No reasonable correlation detected. Second test was finished much faster than first (I guess, server-side GlusterFS cache or server kernel page cache is the cause). There are ~1.8M files on this test volume. On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: Another addition: it seems to be GlusterFS API library memory leak because NFS-Ganesha also consumes huge amount of memory while doing ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory usage: === root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT === 1.4G is too much for simple stat() :(. Ideas? nfs-ganesha also has cache layer which can scale to millions of entries depending on the number of files/directories being looked upon. However there are parameters to tune it. So either try stat with few entries or add below block in nfs-ganesha.conf file, set low limits and check the difference. That may help us narrow down how much memory actually consumed by core nfs-ganesha and gfAPI. CACHEINODE { Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size Entries_HWMark(uint32
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
OK, I've repeated the same traversing test with patched GlusterFS API, and here is new Valgrind log: https://gist.github.com/17ecb16a11c9aed957f5 Still leaks. On вівторок, 5 січня 2016 р. 22:52:25 EET Soumya Koduri wrote: > On 01/05/2016 05:56 PM, Oleksandr Natalenko wrote: > > Unfortunately, both patches didn't make any difference for me. > > > > I've patched 3.7.6 with both patches, recompiled and installed patched > > GlusterFS package on client side and mounted volume with ~2M of files. > > The I performed usual tree traverse with simple "find". > > > > Memory RES value went from ~130M at the moment of mounting to ~1.5G > > after traversing the volume for ~40 mins. Valgrind log still shows lots > > of leaks. Here it is: > > > > https://gist.github.com/56906ca6e657c4ffa4a1 > > Looks like you had done fuse mount. The patches which I have pasted > below apply to gfapi/nfs-ganesha applications. > > Also, to resolve the nfs-ganesha issue which I had mentioned below (in > case if Entries_HWMARK option gets changed), I have posted below fix - > https://review.gerrithub.io/#/c/258687 > > Thanks, > Soumya > > > Ideas? > > > > 05.01.2016 12:31, Soumya Koduri написав: > >> I tried to debug the inode* related leaks and seen some improvements > >> after applying the below patches when ran the same test (but will > >> smaller load). Could you please apply those patches & confirm the > >> same? > >> > >> a) http://review.gluster.org/13125 > >> > >> This will fix the inodes & their ctx related leaks during unexport and > >> the program exit. Please check the valgrind output after applying the > >> patch. It should not list any inodes related memory as lost. > >> > >> b) http://review.gluster.org/13096 > >> > >> The reason the change in Entries_HWMARK (in your earlier mail) dint > >> have much effect is that the inode_nlookup count doesn't become zero > >> for those handles/inodes being closed by ganesha. Hence those inodes > >> shall get added to inode lru list instead of purge list which shall > >> get forcefully purged only when the number of gfapi inode table > >> entries reaches its limit (which is 137012). > >> > >> This patch fixes those 'nlookup' counts. Please apply this patch and > >> reduce 'Entries_HWMARK' to much lower value and check if it decreases > >> the in-memory being consumed by ganesha process while being active. > >> > >> CACHEINODE { > >> > >> Entries_HWMark = 500; > >> > >> } > >> > >> > >> Note: I see an issue with nfs-ganesha during exit when the option > >> 'Entries_HWMARK' gets changed. This is not related to any of the above > >> patches (or rather Gluster) and I am currently debugging it. > >> > >> Thanks, > >> Soumya > >> > >> On 12/25/2015 11:34 PM, Oleksandr Natalenko wrote: > >>> 1. test with Cache_Size = 256 and Entries_HWMark = 4096 > >>> > >>> Before find . -type f: > >>> > >>> root 3120 0.6 11.0 879120 208408 ? Ssl 17:39 0:00 > >>> /usr/bin/ > >>> ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N > >>> NIV_EVENT > >>> > >>> After: > >>> > >>> root 3120 11.4 24.3 1170076 458168 ? Ssl 17:39 13:39 > >>> /usr/bin/ > >>> ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N > >>> NIV_EVENT > >>> > >>> ~250M leak. > >>> > >>> 2. test with default values (after ganesha restart) > >>> > >>> Before: > >>> > >>> root 24937 1.3 10.4 875016 197808 ? Ssl 19:39 0:00 > >>> /usr/bin/ > >>> ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N > >>> NIV_EVENT > >>> > >>> After: > >>> > >>> root 24937 3.5 18.9 1022544 356340 ? 
Ssl 19:39 0:40 > >>> /usr/bin/ > >>> ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N > >>> NIV_EVENT > >>> > >>> ~159M leak. > >>> > >>> No reasonable correlation detected. Second test was finished much > >>> faster than > >>> first (I guess, server-side GlusterFS cache or server kernel page > >>> cache is the > >>> cause). > >>> > >>> There are ~1.8M files on this test volume. > >>> > >>> On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: > On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: > > Another addition: it seems to be GlusterFS API library memory leak > > because NFS-Ganesha also consumes huge amount of memory while doing > > ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory > > usage: > > > > === > > root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 > > /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f > > /etc/ganesha/ganesha.conf -N NIV_EVENT > > === > > > > 1.4G is too much for simple stat() :(. > > > > Ideas? > > nfs-ganesha also has cache layer which can scale to millions of entries > depending on the number of files/directories being looked upon. However > there are parameters to tune it. So either try stat with few entries or > add below block in n
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Correct, I used FUSE mount. Shouldn't gfapi be used by FUSE mount helper (/ usr/bin/glusterfs)? On вівторок, 5 січня 2016 р. 22:52:25 EET Soumya Koduri wrote: > On 01/05/2016 05:56 PM, Oleksandr Natalenko wrote: > > Unfortunately, both patches didn't make any difference for me. > > > > I've patched 3.7.6 with both patches, recompiled and installed patched > > GlusterFS package on client side and mounted volume with ~2M of files. > > The I performed usual tree traverse with simple "find". > > > > Memory RES value went from ~130M at the moment of mounting to ~1.5G > > after traversing the volume for ~40 mins. Valgrind log still shows lots > > of leaks. Here it is: > > > > https://gist.github.com/56906ca6e657c4ffa4a1 > > Looks like you had done fuse mount. The patches which I have pasted > below apply to gfapi/nfs-ganesha applications. > > Also, to resolve the nfs-ganesha issue which I had mentioned below (in > case if Entries_HWMARK option gets changed), I have posted below fix - > https://review.gerrithub.io/#/c/258687 > > Thanks, > Soumya > > > Ideas? > > > > 05.01.2016 12:31, Soumya Koduri написав: > >> I tried to debug the inode* related leaks and seen some improvements > >> after applying the below patches when ran the same test (but will > >> smaller load). Could you please apply those patches & confirm the > >> same? > >> > >> a) http://review.gluster.org/13125 > >> > >> This will fix the inodes & their ctx related leaks during unexport and > >> the program exit. Please check the valgrind output after applying the > >> patch. It should not list any inodes related memory as lost. > >> > >> b) http://review.gluster.org/13096 > >> > >> The reason the change in Entries_HWMARK (in your earlier mail) dint > >> have much effect is that the inode_nlookup count doesn't become zero > >> for those handles/inodes being closed by ganesha. Hence those inodes > >> shall get added to inode lru list instead of purge list which shall > >> get forcefully purged only when the number of gfapi inode table > >> entries reaches its limit (which is 137012). > >> > >> This patch fixes those 'nlookup' counts. Please apply this patch and > >> reduce 'Entries_HWMARK' to much lower value and check if it decreases > >> the in-memory being consumed by ganesha process while being active. > >> > >> CACHEINODE { > >> > >> Entries_HWMark = 500; > >> > >> } > >> > >> > >> Note: I see an issue with nfs-ganesha during exit when the option > >> 'Entries_HWMARK' gets changed. This is not related to any of the above > >> patches (or rather Gluster) and I am currently debugging it. > >> > >> Thanks, > >> Soumya > >> > >> On 12/25/2015 11:34 PM, Oleksandr Natalenko wrote: > >>> 1. test with Cache_Size = 256 and Entries_HWMark = 4096 > >>> > >>> Before find . -type f: > >>> > >>> root 3120 0.6 11.0 879120 208408 ? Ssl 17:39 0:00 > >>> /usr/bin/ > >>> ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N > >>> NIV_EVENT > >>> > >>> After: > >>> > >>> root 3120 11.4 24.3 1170076 458168 ? Ssl 17:39 13:39 > >>> /usr/bin/ > >>> ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N > >>> NIV_EVENT > >>> > >>> ~250M leak. > >>> > >>> 2. test with default values (after ganesha restart) > >>> > >>> Before: > >>> > >>> root 24937 1.3 10.4 875016 197808 ? Ssl 19:39 0:00 > >>> /usr/bin/ > >>> ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N > >>> NIV_EVENT > >>> > >>> After: > >>> > >>> root 24937 3.5 18.9 1022544 356340 ? 
Ssl 19:39 0:40 > >>> /usr/bin/ > >>> ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N > >>> NIV_EVENT > >>> > >>> ~159M leak. > >>> > >>> No reasonable correlation detected. Second test was finished much > >>> faster than > >>> first (I guess, server-side GlusterFS cache or server kernel page > >>> cache is the > >>> cause). > >>> > >>> There are ~1.8M files on this test volume. > >>> > >>> On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: > On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: > > Another addition: it seems to be GlusterFS API library memory leak > > because NFS-Ganesha also consumes huge amount of memory while doing > > ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory > > usage: > > > > === > > root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 > > /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f > > /etc/ganesha/ganesha.conf -N NIV_EVENT > > === > > > > 1.4G is too much for simple stat() :(. > > > > Ideas? > > nfs-ganesha also has cache layer which can scale to millions of entries > depending on the number of files/directories being looked upon. However > there are parameters to tune it. So either try stat with few entries or > add below block in nfs-ganesha.conf file, set low limits and check the > differen
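For reference, one quick way to check whether the FUSE mount helper links against libgfapi at all is to inspect its shared-library dependencies. A sketch (the binary path is taken from the question above; on some distributions it lives under /usr/sbin instead):

===
ldd /usr/bin/glusterfs | grep -i gfapi || echo "libgfapi is not linked"
===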
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/05/2016 05:56 PM, Oleksandr Natalenko wrote: Unfortunately, both patches didn't make any difference for me. I've patched 3.7.6 with both patches, recompiled and installed patched GlusterFS package on client side and mounted volume with ~2M of files. The I performed usual tree traverse with simple "find". Memory RES value went from ~130M at the moment of mounting to ~1.5G after traversing the volume for ~40 mins. Valgrind log still shows lots of leaks. Here it is: https://gist.github.com/56906ca6e657c4ffa4a1 Looks like you had done fuse mount. The patches which I have pasted below apply to gfapi/nfs-ganesha applications. Also, to resolve the nfs-ganesha issue which I had mentioned below (in case if Entries_HWMARK option gets changed), I have posted below fix - https://review.gerrithub.io/#/c/258687 Thanks, Soumya Ideas? 05.01.2016 12:31, Soumya Koduri написав: I tried to debug the inode* related leaks and seen some improvements after applying the below patches when ran the same test (but will smaller load). Could you please apply those patches & confirm the same? a) http://review.gluster.org/13125 This will fix the inodes & their ctx related leaks during unexport and the program exit. Please check the valgrind output after applying the patch. It should not list any inodes related memory as lost. b) http://review.gluster.org/13096 The reason the change in Entries_HWMARK (in your earlier mail) dint have much effect is that the inode_nlookup count doesn't become zero for those handles/inodes being closed by ganesha. Hence those inodes shall get added to inode lru list instead of purge list which shall get forcefully purged only when the number of gfapi inode table entries reaches its limit (which is 137012). This patch fixes those 'nlookup' counts. Please apply this patch and reduce 'Entries_HWMARK' to much lower value and check if it decreases the in-memory being consumed by ganesha process while being active. CACHEINODE { Entries_HWMark = 500; } Note: I see an issue with nfs-ganesha during exit when the option 'Entries_HWMARK' gets changed. This is not related to any of the above patches (or rather Gluster) and I am currently debugging it. Thanks, Soumya On 12/25/2015 11:34 PM, Oleksandr Natalenko wrote: 1. test with Cache_Size = 256 and Entries_HWMark = 4096 Before find . -type f: root 3120 0.6 11.0 879120 208408 ? Ssl 17:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 3120 11.4 24.3 1170076 458168 ? Ssl 17:39 13:39 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~250M leak. 2. test with default values (after ganesha restart) Before: root 24937 1.3 10.4 875016 197808 ? Ssl 19:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 24937 3.5 18.9 1022544 356340 ? Ssl 19:39 0:40 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~159M leak. No reasonable correlation detected. Second test was finished much faster than first (I guess, server-side GlusterFS cache or server kernel page cache is the cause). There are ~1.8M files on this test volume. On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: Another addition: it seems to be GlusterFS API library memory leak because NFS-Ganesha also consumes huge amount of memory while doing ordinary "find . -type f" via NFSv4.2 on remote client. 
Here is memory usage: === root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT === 1.4G is too much for simple stat() :(. Ideas? nfs-ganesha also has cache layer which can scale to millions of entries depending on the number of files/directories being looked upon. However there are parameters to tune it. So either try stat with few entries or add below block in nfs-ganesha.conf file, set low limits and check the difference. That may help us narrow down how much memory actually consumed by core nfs-ganesha and gfAPI. CACHEINODE { Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max no. of entries in the cache. } Thanks, Soumya 24.12.2015 16:32, Oleksandr Natalenko написав: Still actual issue for 3.7.6. Any suggestions? 24.09.2015 10:14, Oleksandr Natalenko написав: In our GlusterFS deployment we've encountered something like memory leak in GlusterFS FUSE client. We use replicated (×2) GlusterFS volume to store mail (exim+dovecot, maildir format). Here is inode stats for both bricks and mountpoint: === Brick 1 (Server 1): Filesystem Inodes IUsed IFree IUse% Mounted on /dev/mapper/vg_vd1_misc-lv08_mail
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Unfortunately, both patches didn't make any difference for me. I've patched 3.7.6 with both patches, recompiled and installed patched GlusterFS package on client side and mounted volume with ~2M of files. The I performed usual tree traverse with simple "find". Memory RES value went from ~130M at the moment of mounting to ~1.5G after traversing the volume for ~40 mins. Valgrind log still shows lots of leaks. Here it is: https://gist.github.com/56906ca6e657c4ffa4a1 Ideas? 05.01.2016 12:31, Soumya Koduri написав: I tried to debug the inode* related leaks and seen some improvements after applying the below patches when ran the same test (but will smaller load). Could you please apply those patches & confirm the same? a) http://review.gluster.org/13125 This will fix the inodes & their ctx related leaks during unexport and the program exit. Please check the valgrind output after applying the patch. It should not list any inodes related memory as lost. b) http://review.gluster.org/13096 The reason the change in Entries_HWMARK (in your earlier mail) dint have much effect is that the inode_nlookup count doesn't become zero for those handles/inodes being closed by ganesha. Hence those inodes shall get added to inode lru list instead of purge list which shall get forcefully purged only when the number of gfapi inode table entries reaches its limit (which is 137012). This patch fixes those 'nlookup' counts. Please apply this patch and reduce 'Entries_HWMARK' to much lower value and check if it decreases the in-memory being consumed by ganesha process while being active. CACHEINODE { Entries_HWMark = 500; } Note: I see an issue with nfs-ganesha during exit when the option 'Entries_HWMARK' gets changed. This is not related to any of the above patches (or rather Gluster) and I am currently debugging it. Thanks, Soumya On 12/25/2015 11:34 PM, Oleksandr Natalenko wrote: 1. test with Cache_Size = 256 and Entries_HWMark = 4096 Before find . -type f: root 3120 0.6 11.0 879120 208408 ? Ssl 17:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 3120 11.4 24.3 1170076 458168 ? Ssl 17:39 13:39 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~250M leak. 2. test with default values (after ganesha restart) Before: root 24937 1.3 10.4 875016 197808 ? Ssl 19:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 24937 3.5 18.9 1022544 356340 ? Ssl 19:39 0:40 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~159M leak. No reasonable correlation detected. Second test was finished much faster than first (I guess, server-side GlusterFS cache or server kernel page cache is the cause). There are ~1.8M files on this test volume. On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: Another addition: it seems to be GlusterFS API library memory leak because NFS-Ganesha also consumes huge amount of memory while doing ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory usage: === root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT === 1.4G is too much for simple stat() :(. Ideas? nfs-ganesha also has cache layer which can scale to millions of entries depending on the number of files/directories being looked upon. However there are parameters to tune it. 
So either try stat with few entries or add below block in nfs-ganesha.conf file, set low limits and check the difference. That may help us narrow down how much memory actually consumed by core nfs-ganesha and gfAPI. CACHEINODE { Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max no. of entries in the cache. } Thanks, Soumya 24.12.2015 16:32, Oleksandr Natalenko написав: Still actual issue for 3.7.6. Any suggestions? 24.09.2015 10:14, Oleksandr Natalenko написав: In our GlusterFS deployment we've encountered something like memory leak in GlusterFS FUSE client. We use replicated (×2) GlusterFS volume to store mail (exim+dovecot, maildir format). Here is inode stats for both bricks and mountpoint: === Brick 1 (Server 1): Filesystem Inodes IUsed IFree IUse% Mounted on /dev/mapper/vg_vd1_misc-lv08_mail 578768144 10954918 5678132262% /bricks/r6sdLV08_vd1_mail Brick 2 (Server 2): Filesystem Inodes IUsed IFree IUse% Mounted on /dev/mapper/vg_vd0_misc-lv07_mail 578767984 10954913 5678130712% /bricks/r6sdLV07_vd0_mail Mountpoint (Server 3)
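For reference, the reproduction described at the top of this message amounts to mounting the volume, traversing it, and watching the FUSE client's resident memory grow. A sketch (server, volume, and mount point names are placeholders):

===
mount -t glusterfs server.example.com:/somevolume /mnt/volume
find /mnt/volume -type f > /dev/null &
FIND_PID=$!
# sample the FUSE client's RSS/VSZ once a minute while find runs
while kill -0 "$FIND_PID" 2>/dev/null; do
    ps -o rss=,vsz=,cmd= -p "$(pgrep -o -f 'glusterfs.*somevolume')"
    sleep 60
done
===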
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
I tried to debug the inode* related leaks and seen some improvements after applying the below patches when ran the same test (but will smaller load). Could you please apply those patches & confirm the same? a) http://review.gluster.org/13125 This will fix the inodes & their ctx related leaks during unexport and the program exit. Please check the valgrind output after applying the patch. It should not list any inodes related memory as lost. b) http://review.gluster.org/13096 The reason the change in Entries_HWMARK (in your earlier mail) dint have much effect is that the inode_nlookup count doesn't become zero for those handles/inodes being closed by ganesha. Hence those inodes shall get added to inode lru list instead of purge list which shall get forcefully purged only when the number of gfapi inode table entries reaches its limit (which is 137012). This patch fixes those 'nlookup' counts. Please apply this patch and reduce 'Entries_HWMARK' to much lower value and check if it decreases the in-memory being consumed by ganesha process while being active. CACHEINODE { Entries_HWMark = 500; } Note: I see an issue with nfs-ganesha during exit when the option 'Entries_HWMARK' gets changed. This is not related to any of the above patches (or rather Gluster) and I am currently debugging it. Thanks, Soumya On 12/25/2015 11:34 PM, Oleksandr Natalenko wrote: 1. test with Cache_Size = 256 and Entries_HWMark = 4096 Before find . -type f: root 3120 0.6 11.0 879120 208408 ? Ssl 17:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 3120 11.4 24.3 1170076 458168 ? Ssl 17:39 13:39 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~250M leak. 2. test with default values (after ganesha restart) Before: root 24937 1.3 10.4 875016 197808 ? Ssl 19:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 24937 3.5 18.9 1022544 356340 ? Ssl 19:39 0:40 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~159M leak. No reasonable correlation detected. Second test was finished much faster than first (I guess, server-side GlusterFS cache or server kernel page cache is the cause). There are ~1.8M files on this test volume. On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: Another addition: it seems to be GlusterFS API library memory leak because NFS-Ganesha also consumes huge amount of memory while doing ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory usage: === root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT === 1.4G is too much for simple stat() :(. Ideas? nfs-ganesha also has cache layer which can scale to millions of entries depending on the number of files/directories being looked upon. However there are parameters to tune it. So either try stat with few entries or add below block in nfs-ganesha.conf file, set low limits and check the difference. That may help us narrow down how much memory actually consumed by core nfs-ganesha and gfAPI. CACHEINODE { Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max no. of entries in the cache. } Thanks, Soumya 24.12.2015 16:32, Oleksandr Natalenko написав: Still actual issue for 3.7.6. Any suggestions? 
24.09.2015 10:14, Oleksandr Natalenko написав: In our GlusterFS deployment we've encountered something like memory leak in GlusterFS FUSE client. We use replicated (×2) GlusterFS volume to store mail (exim+dovecot, maildir format). Here is inode stats for both bricks and mountpoint: === Brick 1 (Server 1): Filesystem InodesIUsed IFree IUse% Mounted on /dev/mapper/vg_vd1_misc-lv08_mail 578768144 10954918 5678132262% /bricks/r6sdLV08_vd1_mail Brick 2 (Server 2): Filesystem InodesIUsed IFree IUse% Mounted on /dev/mapper/vg_vd0_misc-lv07_mail 578767984 10954913 5678130712% /bricks/r6sdLV07_vd0_mail Mountpoint (Server 3): Filesystem InodesIUsed IFree IUse% Mounted on glusterfs.xxx:mail 578767760 10954915 567812845 2% /var/spool/mail/virtual === glusterfs.xxx domain has two A records for both Server 1 and Server 2. Here is volume info: === Volume Name: mail Type: Replicate Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2 Status: Started Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Here is another Valgrind log of a similar scenario, but with drop_caches before umount:

https://gist.github.com/06997ecc8c7bce83aec1

Also, I've tried to drop caches on a production VM with a GlusterFS volume mounted and leaking memory for several weeks, with absolutely no effect:

===
root 945 0.1 48.2 1273900 739244 ? Ssl 2015 58:54 /usr/sbin/glusterfs --volfile-server=server.example.com --volfile-id=volume /mnt/volume
===

The numbers above stayed the same before the drop as well as 5 minutes after it.

On Sunday, 3 January 2016, 13:35:51 EET Vijay Bellur wrote:
> /proc/sys/vm/drop_caches

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 01/03/2016 09:23 AM, Oleksandr Natalenko wrote:
> Another Valgrind run. I did the following:
>
> ===
> valgrind --leak-check=full --show-leak-kinds=all --log-file="valgrind_fuse.log" /usr/bin/glusterfs -N --volfile-server=some.server.com --volfile-id=somevolume /mnt/volume
> ===
>
> then cd to /mnt/volume and find . -type f. After traversing part of the hierarchy I stopped find and did umount /mnt/volume.
>
> Here is the valgrind_fuse.log file: https://gist.github.com/7e2679e1e72e48f75a2b

Can you please try the same by dropping caches before umount?

echo 3 > /proc/sys/vm/drop_caches

Gluster relies on vfs sending forgets and releases to clean up the inodes and the contexts in the inodes maintained by various translators.

Thanks, Vijay

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
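For reference, the suggested sequence (traverse, drop caches so the kernel sends FORGET/RELEASE, then unmount so the Valgrind report reflects what has already been released) could look roughly like this; server, volume, and mount point names are placeholders:

===
# terminal 1: run the FUSE client in the foreground under valgrind
valgrind --leak-check=full --show-leak-kinds=all --log-file=valgrind_fuse.log \
    /usr/bin/glusterfs -N --volfile-server=some.server.com \
    --volfile-id=somevolume /mnt/volume

# terminal 2: traverse, drop caches, then unmount
find /mnt/volume -type f > /dev/null
sync
echo 3 > /proc/sys/vm/drop_caches
umount /mnt/volume        # valgrind writes its report when the client exits
===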
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Another Valgrind run. I did the following: === valgrind --leak-check=full --show-leak-kinds=all --log- file="valgrind_fuse.log" /usr/bin/glusterfs -N --volfile- server=some.server.com --volfile-id=somevolume /mnt/volume === then cd to /mnt/volume and find . -type f. After traversing some part of hierarchy I've stopped find and did umount /mnt/volume. Here is valgrind_fuse.log file: https://gist.github.com/7e2679e1e72e48f75a2b On четвер, 31 грудня 2015 р. 14:09:03 EET Soumya Koduri wrote: > On 12/28/2015 02:32 PM, Soumya Koduri wrote: > > - Original Message - > > > >> From: "Pranith Kumar Karampuri" > >> To: "Oleksandr Natalenko" , "Soumya Koduri" > >> Cc: gluster-us...@gluster.org, > >> gluster-devel@gluster.org > >> Sent: Monday, December 28, 2015 9:32:07 AM > >> Subject: Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS > >> FUSE client>> > >> On 12/26/2015 04:45 AM, Oleksandr Natalenko wrote: > >>> Also, here is valgrind output with our custom tool, that does GlusterFS > >>> volume > >>> traversing (with simple stats) just like find tool. In this case > >>> NFS-Ganesha > >>> is not used. > >>> > >>> https://gist.github.com/e4602a50d3c98f7a2766 > >> > >> hi Oleksandr, > >> > >> I went through the code. Both NFS Ganesha and the custom tool use > >> > >> gfapi and the leak is stemming from that. I am not very familiar with > >> this part of code but there seems to be one inode_unref() that is > >> missing in failure path of resolution. Not sure if that is corresponding > >> to the leaks. > >> > >> Soumya, > >> > >> Could this be the issue? review.gluster.org seems to be down. So > >> > >> couldn't send the patch. Please ping me on IRC. > >> diff --git a/api/src/glfs-resolve.c b/api/src/glfs-resolve.c > >> index b5efcba..52b538b 100644 > >> --- a/api/src/glfs-resolve.c > >> +++ b/api/src/glfs-resolve.c > >> @@ -467,9 +467,11 @@ priv_glfs_resolve_at (struct glfs *fs, xlator_t > >> *subvol, inode_t *at, > >> > >> } > >> > >> } > >> > >> - if (parent && next_component) > >> + if (parent && next_component) { > >> + inode_unref (parent); > >> + parent = NULL; > >> > >> /* resolution failed mid-way */ > >> goto out; > >> > >> +} > >> > >> /* At this point, all components up to the last parent > >> directory > >> > >> have been resolved successfully (@parent). Resolution of > >> > >> basename > > > > yes. This could be one of the reasons. There are few leaks with respect to > > inode references in gfAPI. See below. > > > > > > On GlusterFS side, looks like majority of the leaks are related to inodes > > and their contexts. Possible reasons which I can think of are: > > > > 1) When there is a graph switch, old inode table and their entries are not > > purged (this is a known issue). There was an effort put to fix this > > issue. But I think it had other side-effects and hence not been applied. > > Maybe we should revive those changes again. > > > > 2) With regard to above, old entries can be purged in case if any request > > comes with the reference to old inode (as part of 'glfs_resolve_inode'), > > provided their reference counts are properly decremented. But this is not > > happening at the moment in gfapi. > > > > 3) Applications should hold and release their reference as needed and > > required. 
There are certain fixes needed in this area as well (including > > the fix provided by Pranith above).> > > From code-inspection, have made changes to fix few leaks of case (2) & > > (3) with respect to gfAPI.> > > http://review.gluster.org/#/c/13096 (yet to test the changes) > > > > I haven't yet narrowed down any suspects pertaining to only NFS-Ganesha. > > Will re-check and update. > I tried similar tests but with smaller set of files. I could see the > inode_ctx leak even without graph switches involved. I suspect that > could be because valgrind checks for memory leaks during the exit of the > program. We call 'glfs_fini()' to cleanup the memory being used b
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 12/28/2015 02:32 PM, Soumya Koduri wrote: - Original Message - From: "Pranith Kumar Karampuri" To: "Oleksandr Natalenko" , "Soumya Koduri" Cc: gluster-us...@gluster.org, gluster-devel@gluster.org Sent: Monday, December 28, 2015 9:32:07 AM Subject: Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client On 12/26/2015 04:45 AM, Oleksandr Natalenko wrote: Also, here is valgrind output with our custom tool, that does GlusterFS volume traversing (with simple stats) just like find tool. In this case NFS-Ganesha is not used. https://gist.github.com/e4602a50d3c98f7a2766 hi Oleksandr, I went through the code. Both NFS Ganesha and the custom tool use gfapi and the leak is stemming from that. I am not very familiar with this part of code but there seems to be one inode_unref() that is missing in failure path of resolution. Not sure if that is corresponding to the leaks. Soumya, Could this be the issue? review.gluster.org seems to be down. So couldn't send the patch. Please ping me on IRC. diff --git a/api/src/glfs-resolve.c b/api/src/glfs-resolve.c index b5efcba..52b538b 100644 --- a/api/src/glfs-resolve.c +++ b/api/src/glfs-resolve.c @@ -467,9 +467,11 @@ priv_glfs_resolve_at (struct glfs *fs, xlator_t *subvol, inode_t *at, } } - if (parent && next_component) + if (parent && next_component) { + inode_unref (parent); + parent = NULL; /* resolution failed mid-way */ goto out; +} /* At this point, all components up to the last parent directory have been resolved successfully (@parent). Resolution of basename yes. This could be one of the reasons. There are few leaks with respect to inode references in gfAPI. See below. On GlusterFS side, looks like majority of the leaks are related to inodes and their contexts. Possible reasons which I can think of are: 1) When there is a graph switch, old inode table and their entries are not purged (this is a known issue). There was an effort put to fix this issue. But I think it had other side-effects and hence not been applied. Maybe we should revive those changes again. 2) With regard to above, old entries can be purged in case if any request comes with the reference to old inode (as part of 'glfs_resolve_inode'), provided their reference counts are properly decremented. But this is not happening at the moment in gfapi. 3) Applications should hold and release their reference as needed and required. There are certain fixes needed in this area as well (including the fix provided by Pranith above). From code-inspection, have made changes to fix few leaks of case (2) & (3) with respect to gfAPI. http://review.gluster.org/#/c/13096 (yet to test the changes) I haven't yet narrowed down any suspects pertaining to only NFS-Ganesha. Will re-check and update. I tried similar tests but with smaller set of files. I could see the inode_ctx leak even without graph switches involved. I suspect that could be because valgrind checks for memory leaks during the exit of the program. We call 'glfs_fini()' to cleanup the memory being used by gfapi during exit. Those inode_ctx leaks are result of some inodes being left during inode_table cleanup. I have submitted below patch to address this issue. http://review.gluster.org/13125 However this shall help only if there are volume un-exports being involved or program being exited. It still doesn't address the actual RAM being consumed by the application when active. Thanks, Soumya Thanks, Soumya Pranith One may see GlusterFS-related leaks here as well. On пʼятниця, 25 грудня 2015 р. 
20:28:13 EET Soumya Koduri wrote: On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: Another addition: it seems to be GlusterFS API library memory leak because NFS-Ganesha also consumes huge amount of memory while doing ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory usage: === root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT === 1.4G is too much for simple stat() :(. Ideas? nfs-ganesha also has cache layer which can scale to millions of entries depending on the number of files/directories being looked upon. However there are parameters to tune it. So either try stat with few entries or add below block in nfs-ganesha.conf file, set low limits and check the difference. That may help us narrow down how much memory actually consumed by core nfs-ganesha and gfAPI. CACHEINODE { Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max no. of entries in the cache. } Thanks, Soumya 24.12.2015 16:32, Oleksandr
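A sketch of the low-limit cache block being suggested, assuming the usual "key = value;" ganesha.conf syntax; the 256/4096 values are the ones Oleksandr tests later in the thread:
===
# Hypothetical nfs-ganesha.conf fragment with deliberately low cache limits,
# to see how much of the memory growth the cache itself accounts for.
CACHEINODE {
    Cache_Size = 256;        # default is 32633 per the description above
    Entries_HWMark = 4096;   # maximum number of cached entries; value is illustrative
}
===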
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
- Original Message - > From: "Pranith Kumar Karampuri" > To: "Oleksandr Natalenko" , "Soumya Koduri" > > Cc: gluster-us...@gluster.org, gluster-devel@gluster.org > Sent: Monday, December 28, 2015 9:32:07 AM > Subject: Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE > client > > > > On 12/26/2015 04:45 AM, Oleksandr Natalenko wrote: > > Also, here is valgrind output with our custom tool, that does GlusterFS > > volume > > traversing (with simple stats) just like find tool. In this case > > NFS-Ganesha > > is not used. > > > > https://gist.github.com/e4602a50d3c98f7a2766 > hi Oleksandr, >I went through the code. Both NFS Ganesha and the custom tool use > gfapi and the leak is stemming from that. I am not very familiar with > this part of code but there seems to be one inode_unref() that is > missing in failure path of resolution. Not sure if that is corresponding > to the leaks. > > Soumya, > Could this be the issue? review.gluster.org seems to be down. So > couldn't send the patch. Please ping me on IRC. > diff --git a/api/src/glfs-resolve.c b/api/src/glfs-resolve.c > index b5efcba..52b538b 100644 > --- a/api/src/glfs-resolve.c > +++ b/api/src/glfs-resolve.c > @@ -467,9 +467,11 @@ priv_glfs_resolve_at (struct glfs *fs, xlator_t > *subvol, inode_t *at, > } > } > > - if (parent && next_component) > + if (parent && next_component) { > + inode_unref (parent); > + parent = NULL; > /* resolution failed mid-way */ > goto out; > +} > > /* At this point, all components up to the last parent directory > have been resolved successfully (@parent). Resolution of > basename > yes. This could be one of the reasons. There are few leaks with respect to inode references in gfAPI. See below. On GlusterFS side, looks like majority of the leaks are related to inodes and their contexts. Possible reasons which I can think of are: 1) When there is a graph switch, old inode table and their entries are not purged (this is a known issue). There was an effort put to fix this issue. But I think it had other side-effects and hence not been applied. Maybe we should revive those changes again. 2) With regard to above, old entries can be purged in case if any request comes with the reference to old inode (as part of 'glfs_resolve_inode'), provided their reference counts are properly decremented. But this is not happening at the moment in gfapi. 3) Applications should hold and release their reference as needed and required. There are certain fixes needed in this area as well (including the fix provided by Pranith above). From code-inspection, have made changes to fix few leaks of case (2) & (3) with respect to gfAPI. http://review.gluster.org/#/c/13096 (yet to test the changes) I haven't yet narrowed down any suspects pertaining to only NFS-Ganesha. Will re-check and update. Thanks, Soumya > Pranith > > > > One may see GlusterFS-related leaks here as well. > > > > On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: > >> On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: > >>> Another addition: it seems to be GlusterFS API library memory leak > >>> because NFS-Ganesha also consumes huge amount of memory while doing > >>> ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory > >>> usage: > >>> > >>> === > >>> root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 > >>> /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f > >>> /etc/ganesha/ganesha.conf -N NIV_EVENT > >>> === > >>> > >>> 1.4G is too much for simple stat() :(. > >>> > >>> Ideas? 
> >> nfs-ganesha also has cache layer which can scale to millions of entries > >> depending on the number of files/directories being looked upon. However > >> there are parameters to tune it. So either try stat with few entries or > >> add below block in nfs-ganesha.conf file, set low limits and check the > >> difference. That may help us narrow down how much memory actually > >> consumed by core nfs-ganesha and gfAPI. > >> > >> CACHEINODE { > >>Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size > >>Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max no. > >> of entries in the cache. > >> } > >> > >> Thanks, > >> Soum
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 12/26/2015 04:45 AM, Oleksandr Natalenko wrote: Also, here is valgrind output with our custom tool, that does GlusterFS volume traversing (with simple stats) just like find tool. In this case NFS-Ganesha is not used. https://gist.github.com/e4602a50d3c98f7a2766 hi Oleksandr, I went through the code. Both NFS Ganesha and the custom tool use gfapi and the leak is stemming from that. I am not very familiar with this part of code but there seems to be one inode_unref() that is missing in failure path of resolution. Not sure if that is corresponding to the leaks. Soumya, Could this be the issue? review.gluster.org seems to be down. So couldn't send the patch. Please ping me on IRC. diff --git a/api/src/glfs-resolve.c b/api/src/glfs-resolve.c index b5efcba..52b538b 100644 --- a/api/src/glfs-resolve.c +++ b/api/src/glfs-resolve.c @@ -467,9 +467,11 @@ priv_glfs_resolve_at (struct glfs *fs, xlator_t *subvol, inode_t *at, } } - if (parent && next_component) + if (parent && next_component) { + inode_unref (parent); + parent = NULL; /* resolution failed mid-way */ goto out; +} /* At this point, all components up to the last parent directory have been resolved successfully (@parent). Resolution of basename Pranith One may see GlusterFS-related leaks here as well. On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: Another addition: it seems to be GlusterFS API library memory leak because NFS-Ganesha also consumes huge amount of memory while doing ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory usage: === root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT === 1.4G is too much for simple stat() :(. Ideas? nfs-ganesha also has cache layer which can scale to millions of entries depending on the number of files/directories being looked upon. However there are parameters to tune it. So either try stat with few entries or add below block in nfs-ganesha.conf file, set low limits and check the difference. That may help us narrow down how much memory actually consumed by core nfs-ganesha and gfAPI. CACHEINODE { Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max no. of entries in the cache. } Thanks, Soumya 24.12.2015 16:32, Oleksandr Natalenko написав: Still actual issue for 3.7.6. Any suggestions? 24.09.2015 10:14, Oleksandr Natalenko написав: In our GlusterFS deployment we've encountered something like memory leak in GlusterFS FUSE client. We use replicated (×2) GlusterFS volume to store mail (exim+dovecot, maildir format). Here is inode stats for both bricks and mountpoint: === Brick 1 (Server 1): Filesystem InodesIUsed IFree IUse% Mounted on /dev/mapper/vg_vd1_misc-lv08_mail 578768144 10954918 5678132262% /bricks/r6sdLV08_vd1_mail Brick 2 (Server 2): Filesystem InodesIUsed IFree IUse% Mounted on /dev/mapper/vg_vd0_misc-lv07_mail 578767984 10954913 5678130712% /bricks/r6sdLV07_vd0_mail Mountpoint (Server 3): Filesystem InodesIUsed IFree IUse% Mounted on glusterfs.xxx:mail 578767760 10954915 567812845 2% /var/spool/mail/virtual === glusterfs.xxx domain has two A records for both Server 1 and Server 2. 
Here is volume info: === Volume Name: mail Type: Replicate Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2 Status: Started Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail Options Reconfigured: nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24 features.cache-invalidation-timeout: 10 performance.stat-prefetch: off performance.quick-read: on performance.read-ahead: off performance.flush-behind: on performance.write-behind: on performance.io-thread-count: 4 performance.cache-max-file-size: 1048576 performance.cache-size: 67108864 performance.readdir-ahead: off === Soon enough after mounting and exim/dovecot start, glusterfs client process begins to consume huge amount of RAM: === user@server3 ~$ ps aux | grep glusterfs | grep mail root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05 /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable --volfile-server=glusterfs.xxx --volfile-id=mail /var/spool/mail/virtual === That is, ~15 GiB of RAM. Also we've tried to use mountpoint withing separate KVM VM with 2 or 3 GiB of RAM, and soon after starting mail daemons got OOM killer for glusterfs client process. Mounting same share via
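Since review.gluster.org was down, a hypothetical way to try the one-line inode_unref() fix above locally; the patch file name is made up, and the build steps are the standard glusterfs autotools ones:
===
# Save Pranith's diff as glfs-resolve-unref.patch in a glusterfs source tree,
# then apply it and rebuild.
cd glusterfs
git apply --check glfs-resolve-unref.patch   # dry run first
git apply glfs-resolve-unref.patch
./autogen.sh && ./configure && make -j"$(nproc)"
===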
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Thanks for sharing the results. Shall look at the leaks and update. -Soumya On 12/26/2015 04:45 AM, Oleksandr Natalenko wrote: Also, here is valgrind output with our custom tool, that does GlusterFS volume traversing (with simple stats) just like find tool. In this case NFS-Ganesha is not used. https://gist.github.com/e4602a50d3c98f7a2766 One may see GlusterFS-related leaks here as well. On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: Another addition: it seems to be GlusterFS API library memory leak because NFS-Ganesha also consumes huge amount of memory while doing ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory usage: === root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT === 1.4G is too much for simple stat() :(. Ideas? nfs-ganesha also has cache layer which can scale to millions of entries depending on the number of files/directories being looked upon. However there are parameters to tune it. So either try stat with few entries or add below block in nfs-ganesha.conf file, set low limits and check the difference. That may help us narrow down how much memory actually consumed by core nfs-ganesha and gfAPI. CACHEINODE { Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max no. of entries in the cache. } Thanks, Soumya 24.12.2015 16:32, Oleksandr Natalenko написав: Still actual issue for 3.7.6. Any suggestions? 24.09.2015 10:14, Oleksandr Natalenko написав: In our GlusterFS deployment we've encountered something like memory leak in GlusterFS FUSE client. We use replicated (×2) GlusterFS volume to store mail (exim+dovecot, maildir format). Here is inode stats for both bricks and mountpoint: === Brick 1 (Server 1): Filesystem InodesIUsed IFree IUse% Mounted on /dev/mapper/vg_vd1_misc-lv08_mail 578768144 10954918 5678132262% /bricks/r6sdLV08_vd1_mail Brick 2 (Server 2): Filesystem InodesIUsed IFree IUse% Mounted on /dev/mapper/vg_vd0_misc-lv07_mail 578767984 10954913 5678130712% /bricks/r6sdLV07_vd0_mail Mountpoint (Server 3): Filesystem InodesIUsed IFree IUse% Mounted on glusterfs.xxx:mail 578767760 10954915 567812845 2% /var/spool/mail/virtual === glusterfs.xxx domain has two A records for both Server 1 and Server 2. Here is volume info: === Volume Name: mail Type: Replicate Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2 Status: Started Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail Options Reconfigured: nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24 features.cache-invalidation-timeout: 10 performance.stat-prefetch: off performance.quick-read: on performance.read-ahead: off performance.flush-behind: on performance.write-behind: on performance.io-thread-count: 4 performance.cache-max-file-size: 1048576 performance.cache-size: 67108864 performance.readdir-ahead: off === Soon enough after mounting and exim/dovecot start, glusterfs client process begins to consume huge amount of RAM: === user@server3 ~$ ps aux | grep glusterfs | grep mail root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05 /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable --volfile-server=glusterfs.xxx --volfile-id=mail /var/spool/mail/virtual === That is, ~15 GiB of RAM. 
Also we've tried to use mountpoint withing separate KVM VM with 2 or 3 GiB of RAM, and soon after starting mail daemons got OOM killer for glusterfs client process. Mounting same share via NFS works just fine. Also, we have much less iowait and loadavg on client side with NFS. Also, we've tried to change IO threads count and cache size in order to limit memory usage with no luck. As you can see, total cache size is 4×64==256 MiB (compare to 15 GiB). Enabling-disabling stat-prefetch, read-ahead and readdir-ahead didn't help as well. Here are volume memory stats: === Memory status for volume : mail -- Brick : server1.xxx:/bricks/r6sdLV08_vd1_mail/mail Mallinfo Arena: 36859904 Ordblks : 10357 Smblks : 519 Hblks: 21 Hblkhd : 30515200 Usmblks : 0 Fsmblks : 53440 Uordblks : 18604144 Fordblks : 18255760 Keepcost : 114112 Mempool Stats - NameHotCount ColdCount PaddedSizeof AllocCount MaxAlloc Misses Max-StdAlloc - -- mail-server:fd_t 0 1024 108 30773120 137
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Also, here is valgrind output with our custom tool, that does GlusterFS volume traversing (with simple stats) just like find tool. In this case NFS-Ganesha is not used. https://gist.github.com/e4602a50d3c98f7a2766 One may see GlusterFS-related leaks here as well. On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: > On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: > > Another addition: it seems to be GlusterFS API library memory leak > > because NFS-Ganesha also consumes huge amount of memory while doing > > ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory > > usage: > > > > === > > root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 > > /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f > > /etc/ganesha/ganesha.conf -N NIV_EVENT > > === > > > > 1.4G is too much for simple stat() :(. > > > > Ideas? > > nfs-ganesha also has cache layer which can scale to millions of entries > depending on the number of files/directories being looked upon. However > there are parameters to tune it. So either try stat with few entries or > add below block in nfs-ganesha.conf file, set low limits and check the > difference. That may help us narrow down how much memory actually > consumed by core nfs-ganesha and gfAPI. > > CACHEINODE { > Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size > Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max no. > of entries in the cache. > } > > Thanks, > Soumya > > > 24.12.2015 16:32, Oleksandr Natalenko написав: > >> Still actual issue for 3.7.6. Any suggestions? > >> > >> 24.09.2015 10:14, Oleksandr Natalenko написав: > >>> In our GlusterFS deployment we've encountered something like memory > >>> leak in GlusterFS FUSE client. > >>> > >>> We use replicated (×2) GlusterFS volume to store mail (exim+dovecot, > >>> maildir format). Here is inode stats for both bricks and mountpoint: > >>> > >>> === > >>> Brick 1 (Server 1): > >>> > >>> Filesystem InodesIUsed > >>> > >>> IFree IUse% Mounted on > >>> > >>> /dev/mapper/vg_vd1_misc-lv08_mail 578768144 10954918 > >>> > >>> 5678132262% /bricks/r6sdLV08_vd1_mail > >>> > >>> Brick 2 (Server 2): > >>> > >>> Filesystem InodesIUsed > >>> > >>> IFree IUse% Mounted on > >>> > >>> /dev/mapper/vg_vd0_misc-lv07_mail 578767984 10954913 > >>> > >>> 5678130712% /bricks/r6sdLV07_vd0_mail > >>> > >>> Mountpoint (Server 3): > >>> > >>> Filesystem InodesIUsed IFree > >>> IUse% Mounted on > >>> glusterfs.xxx:mail 578767760 10954915 567812845 > >>> 2% /var/spool/mail/virtual > >>> === > >>> > >>> glusterfs.xxx domain has two A records for both Server 1 and Server 2. 
> >>> > >>> Here is volume info: > >>> > >>> === > >>> Volume Name: mail > >>> Type: Replicate > >>> Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2 > >>> Status: Started > >>> Number of Bricks: 1 x 2 = 2 > >>> Transport-type: tcp > >>> Bricks: > >>> Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail > >>> Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail > >>> Options Reconfigured: > >>> nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24 > >>> features.cache-invalidation-timeout: 10 > >>> performance.stat-prefetch: off > >>> performance.quick-read: on > >>> performance.read-ahead: off > >>> performance.flush-behind: on > >>> performance.write-behind: on > >>> performance.io-thread-count: 4 > >>> performance.cache-max-file-size: 1048576 > >>> performance.cache-size: 67108864 > >>> performance.readdir-ahead: off > >>> === > >>> > >>> Soon enough after mounting and exim/dovecot start, glusterfs client > >>> process begins to consume huge amount of RAM: > >>> > >>> === > >>> user@server3 ~$ ps aux | grep glusterfs | grep mail > >>> root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05 > >>> /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable > >>> --volfile-server=glusterfs.xxx --volfile-id=mail > >>> /var/spool/mail/virtual > >>> === > >>> > >>> That is, ~15 GiB of RAM. > >>> > >>> Also we've tried to use mountpoint withing separate KVM VM with 2 or 3 > >>> GiB of RAM, and soon after starting mail daemons got OOM killer for > >>> glusterfs client process. > >>> > >>> Mounting same share via NFS works just fine. Also, we have much less > >>> iowait and loadavg on client side with NFS. > >>> > >>> Also, we've tried to change IO threads count and cache size in order > >>> to limit memory usage with no luck. As you can see, total cache size > >>> is 4×64==256 MiB (compare to 15 GiB). > >>> > >>> Enabling-disabling stat-prefetch, read-ahead and readdir-ahead didn't > >>> help as well. > >>> > >>> Here are volume memory stats: > >>> > >>> === > >>> Memory status for volume : mail > >>> -- > >>> Brick : server1.xxx:/bric
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
OK, I've rebuild GlusterFS v3.7.6 with debug enabled as well as NFS-Ganesha with debug enabled as well (and libc allocator). Here is my test steps: 1. launch nfs-ganesha: valgrind --leak-check=full --show-leak-kinds=all --log-file="valgrind.log" / opt/nfs-ganesha/bin/ganesha.nfsd -F -L ./ganesha.log -f ./ganesha.conf -N NIV_EVENT 2. mount NFS share: mount -t nfs4 127.0.0.1:/share share -o defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100 3. cd to share and run find . for some time 4. CTRL+C find, unmount share. 5. CTRL+C NFS-Ganesha. Here is full valgrind output: https://gist.github.com/eebd9f94ababd8130d49 One may see the probability of massive leaks at the end of valgrind output related to both GlusterFS and NFS-Ganesha code. On пʼятниця, 25 грудня 2015 р. 23:29:07 EET Soumya Koduri wrote: > On 12/25/2015 08:56 PM, Oleksandr Natalenko wrote: > > What units Cache_Size is measured in? Bytes? > > Its actually (Cache_Size * sizeof_ptr) bytes. If possible, could you > please run ganesha process under valgrind? Will help in detecting leaks. > > Thanks, > Soumya > > > 25.12.2015 16:58, Soumya Koduri написав: > >> On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: > >>> Another addition: it seems to be GlusterFS API library memory leak > >>> because NFS-Ganesha also consumes huge amount of memory while doing > >>> ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory > >>> usage: > >>> > >>> === > >>> root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 > >>> /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f > >>> /etc/ganesha/ganesha.conf -N NIV_EVENT > >>> === > >>> > >>> 1.4G is too much for simple stat() :(. > >>> > >>> Ideas? > >> > >> nfs-ganesha also has cache layer which can scale to millions of > >> entries depending on the number of files/directories being looked > >> upon. However there are parameters to tune it. So either try stat with > >> few entries or add below block in nfs-ganesha.conf file, set low > >> limits and check the difference. That may help us narrow down how much > >> memory actually consumed by core nfs-ganesha and gfAPI. > >> > >> CACHEINODE { > >> > >> Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache > >> > >> size > >> > >> Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max > >> > >> no. of entries in the cache. > >> } > >> > >> Thanks, > >> Soumya > >> > >>> 24.12.2015 16:32, Oleksandr Natalenko написав: > Still actual issue for 3.7.6. Any suggestions? > > 24.09.2015 10:14, Oleksandr Natalenko написав: > > In our GlusterFS deployment we've encountered something like memory > > leak in GlusterFS FUSE client. > > > > We use replicated (×2) GlusterFS volume to store mail (exim+dovecot, > > maildir format). Here is inode stats for both bricks and mountpoint: > > > > === > > Brick 1 (Server 1): > > > > Filesystem Inodes IUsed > > > > IFree IUse% Mounted on > > > > /dev/mapper/vg_vd1_misc-lv08_mail 578768144 10954918 > > > > 5678132262% /bricks/r6sdLV08_vd1_mail > > > > Brick 2 (Server 2): > > > > Filesystem Inodes IUsed > > > > IFree IUse% Mounted on > > > > /dev/mapper/vg_vd0_misc-lv07_mail 578767984 10954913 > > > > 5678130712% /bricks/r6sdLV07_vd0_mail > > > > Mountpoint (Server 3): > > > > Filesystem InodesIUsed IFree > > IUse% Mounted on > > glusterfs.xxx:mail 578767760 10954915 567812845 > > 2% /var/spool/mail/virtual > > === > > > > glusterfs.xxx domain has two A records for both Server 1 and Server 2. 
> > > > Here is volume info: > > > > === > > Volume Name: mail > > Type: Replicate > > Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2 > > Status: Started > > Number of Bricks: 1 x 2 = 2 > > Transport-type: tcp > > Bricks: > > Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail > > Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail > > Options Reconfigured: > > nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24 > > features.cache-invalidation-timeout: 10 > > performance.stat-prefetch: off > > performance.quick-read: on > > performance.read-ahead: off > > performance.flush-behind: on > > performance.write-behind: on > > performance.io-thread-count: 4 > > performance.cache-max-file-size: 1048576 > > performance.cache-size: 67108864 > > performance.readdir-ahead: off > > === > > > > Soon enough after mounting and exim/dovecot start, glusterfs client > > process begins to consume huge amount of RAM: > > > > === > > user@server3 ~$ ps aux | grep glusterfs | grep mail > > root
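The five test steps above, collapsed into one rough script; the mountpoint and the ten-minute traversal are illustrative, the commands themselves are the ones from the original message:
===
# Run nfs-ganesha under valgrind, beat on the mount for a while, then shut
# everything down so valgrind can print its leak report into valgrind.log.
valgrind --leak-check=full --show-leak-kinds=all --log-file="valgrind.log" \
    /opt/nfs-ganesha/bin/ganesha.nfsd -F -L ./ganesha.log -f ./ganesha.conf -N NIV_EVENT &
GANESHA_PID=$!
sleep 10   # give ganesha time to come up
mount -t nfs4 127.0.0.1:/share /mnt/share \
    -o defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100
( cd /mnt/share && timeout 600 find . > /dev/null )   # traverse for a while, then stop
umount /mnt/share
kill -INT "$GANESHA_PID"   # same effect as the CTRL+C in step 5
wait "$GANESHA_PID"
===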
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
1. test with Cache_Size = 256 and Entries_HWMark = 4096 Before find . -type f: root 3120 0.6 11.0 879120 208408 ? Ssl 17:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 3120 11.4 24.3 1170076 458168 ? Ssl 17:39 13:39 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~250M leak. 2. test with default values (after ganesha restart) Before: root 24937 1.3 10.4 875016 197808 ? Ssl 19:39 0:00 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT After: root 24937 3.5 18.9 1022544 356340 ? Ssl 19:39 0:40 /usr/bin/ ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT ~159M leak. No reasonable correlation detected. Second test was finished much faster than first (I guess, server-side GlusterFS cache or server kernel page cache is the cause). There are ~1.8M files on this test volume. On пʼятниця, 25 грудня 2015 р. 20:28:13 EET Soumya Koduri wrote: > On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: > > Another addition: it seems to be GlusterFS API library memory leak > > because NFS-Ganesha also consumes huge amount of memory while doing > > ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory > > usage: > > > > === > > root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 > > /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f > > /etc/ganesha/ganesha.conf -N NIV_EVENT > > === > > > > 1.4G is too much for simple stat() :(. > > > > Ideas? > > nfs-ganesha also has cache layer which can scale to millions of entries > depending on the number of files/directories being looked upon. However > there are parameters to tune it. So either try stat with few entries or > add below block in nfs-ganesha.conf file, set low limits and check the > difference. That may help us narrow down how much memory actually > consumed by core nfs-ganesha and gfAPI. > > CACHEINODE { > Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size > Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max no. > of entries in the cache. > } > > Thanks, > Soumya > > > 24.12.2015 16:32, Oleksandr Natalenko написав: > >> Still actual issue for 3.7.6. Any suggestions? > >> > >> 24.09.2015 10:14, Oleksandr Natalenko написав: > >>> In our GlusterFS deployment we've encountered something like memory > >>> leak in GlusterFS FUSE client. > >>> > >>> We use replicated (×2) GlusterFS volume to store mail (exim+dovecot, > >>> maildir format). Here is inode stats for both bricks and mountpoint: > >>> > >>> === > >>> Brick 1 (Server 1): > >>> > >>> Filesystem InodesIUsed > >>> > >>> IFree IUse% Mounted on > >>> > >>> /dev/mapper/vg_vd1_misc-lv08_mail 578768144 10954918 > >>> > >>> 5678132262% /bricks/r6sdLV08_vd1_mail > >>> > >>> Brick 2 (Server 2): > >>> > >>> Filesystem InodesIUsed > >>> > >>> IFree IUse% Mounted on > >>> > >>> /dev/mapper/vg_vd0_misc-lv07_mail 578767984 10954913 > >>> > >>> 5678130712% /bricks/r6sdLV07_vd0_mail > >>> > >>> Mountpoint (Server 3): > >>> > >>> Filesystem InodesIUsed IFree > >>> IUse% Mounted on > >>> glusterfs.xxx:mail 578767760 10954915 567812845 > >>> 2% /var/spool/mail/virtual > >>> === > >>> > >>> glusterfs.xxx domain has two A records for both Server 1 and Server 2. 
> >>> > >>> Here is volume info: > >>> > >>> === > >>> Volume Name: mail > >>> Type: Replicate > >>> Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2 > >>> Status: Started > >>> Number of Bricks: 1 x 2 = 2 > >>> Transport-type: tcp > >>> Bricks: > >>> Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail > >>> Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail > >>> Options Reconfigured: > >>> nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24 > >>> features.cache-invalidation-timeout: 10 > >>> performance.stat-prefetch: off > >>> performance.quick-read: on > >>> performance.read-ahead: off > >>> performance.flush-behind: on > >>> performance.write-behind: on > >>> performance.io-thread-count: 4 > >>> performance.cache-max-file-size: 1048576 > >>> performance.cache-size: 67108864 > >>> performance.readdir-ahead: off > >>> === > >>> > >>> Soon enough after mounting and exim/dovecot start, glusterfs client > >>> process begins to consume huge amount of RAM: > >>> > >>> === > >>> user@server3 ~$ ps aux | grep glusterfs | grep mail > >>> root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05 > >>> /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable > >>> --volfile-server=glusterfs.xxx --volfile-id=mail > >>> /var/spool/mail/virtual > >>> === > >>> > >>> That is, ~15 GiB of RAM. > >>> > >>> Also we've tried to use mountpoint withing sepa
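A small helper like the following can capture the before/after numbers quoted above without eyeballing ps by hand (process name as used throughout the thread):
===
# Sample the ganesha.nfsd resident set size (KiB) once a minute while the test runs.
PID=$(pidof ganesha.nfsd)
while kill -0 "$PID" 2>/dev/null; do
    printf '%s RSS=%s KiB\n' "$(date +%T)" "$(ps -o rss= -p "$PID")"
    sleep 60
done
===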
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 12/25/2015 08:56 PM, Oleksandr Natalenko wrote: What units Cache_Size is measured in? Bytes? Its actually (Cache_Size * sizeof_ptr) bytes. If possible, could you please run ganesha process under valgrind? Will help in detecting leaks. Thanks, Soumya 25.12.2015 16:58, Soumya Koduri написав: On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: Another addition: it seems to be GlusterFS API library memory leak because NFS-Ganesha also consumes huge amount of memory while doing ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory usage: === root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT === 1.4G is too much for simple stat() :(. Ideas? nfs-ganesha also has cache layer which can scale to millions of entries depending on the number of files/directories being looked upon. However there are parameters to tune it. So either try stat with few entries or add below block in nfs-ganesha.conf file, set low limits and check the difference. That may help us narrow down how much memory actually consumed by core nfs-ganesha and gfAPI. CACHEINODE { Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max no. of entries in the cache. } Thanks, Soumya 24.12.2015 16:32, Oleksandr Natalenko написав: Still actual issue for 3.7.6. Any suggestions? 24.09.2015 10:14, Oleksandr Natalenko написав: In our GlusterFS deployment we've encountered something like memory leak in GlusterFS FUSE client. We use replicated (×2) GlusterFS volume to store mail (exim+dovecot, maildir format). Here is inode stats for both bricks and mountpoint: === Brick 1 (Server 1): Filesystem Inodes IUsed IFree IUse% Mounted on /dev/mapper/vg_vd1_misc-lv08_mail 578768144 10954918 5678132262% /bricks/r6sdLV08_vd1_mail Brick 2 (Server 2): Filesystem Inodes IUsed IFree IUse% Mounted on /dev/mapper/vg_vd0_misc-lv07_mail 578767984 10954913 5678130712% /bricks/r6sdLV07_vd0_mail Mountpoint (Server 3): Filesystem InodesIUsed IFree IUse% Mounted on glusterfs.xxx:mail 578767760 10954915 567812845 2% /var/spool/mail/virtual === glusterfs.xxx domain has two A records for both Server 1 and Server 2. Here is volume info: === Volume Name: mail Type: Replicate Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2 Status: Started Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail Options Reconfigured: nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24 features.cache-invalidation-timeout: 10 performance.stat-prefetch: off performance.quick-read: on performance.read-ahead: off performance.flush-behind: on performance.write-behind: on performance.io-thread-count: 4 performance.cache-max-file-size: 1048576 performance.cache-size: 67108864 performance.readdir-ahead: off === Soon enough after mounting and exim/dovecot start, glusterfs client process begins to consume huge amount of RAM: === user@server3 ~$ ps aux | grep glusterfs | grep mail root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05 /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable --volfile-server=glusterfs.xxx --volfile-id=mail /var/spool/mail/virtual === That is, ~15 GiB of RAM. Also we've tried to use mountpoint withing separate KVM VM with 2 or 3 GiB of RAM, and soon after starting mail daemons got OOM killer for glusterfs client process. Mounting same share via NFS works just fine. 
Also, we have much less iowait and loadavg on client side with NFS. Also, we've tried to change IO threads count and cache size in order to limit memory usage with no luck. As you can see, total cache size is 4×64==256 MiB (compare to 15 GiB). Enabling-disabling stat-prefetch, read-ahead and readdir-ahead didn't help as well. Here are volume memory stats: === Memory status for volume : mail -- Brick : server1.xxx:/bricks/r6sdLV08_vd1_mail/mail Mallinfo Arena: 36859904 Ordblks : 10357 Smblks : 519 Hblks: 21 Hblkhd : 30515200 Usmblks : 0 Fsmblks : 53440 Uordblks : 18604144 Fordblks : 18255760 Keepcost : 114112 Mempool Stats - NameHotCount ColdCount PaddedSizeof AllocCount MaxAlloc Misses Max-StdAlloc - -- mail-server:fd_t 0 1024 108 30773120 13700 mail-server:dentry_t 16110 274 84 23567614816384 1106499 1152 mail-server:inode_t1636321
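A quick sanity check on those units: if Cache_Size really works out to (value × pointer size) bytes, the default table is tiny compared with the RSS reported earlier, so the growth has to come from the entries (and gfAPI inodes) it references rather than the table itself:
===
echo $((32633 * 8))         # 261064 bytes, ~255 KiB for the default Cache_Size on a 64-bit host
echo $((1480552 / 1024))    # ~1445 MiB: the ganesha.nfsd RSS from the ps output above
===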
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
What units Cache_Size is measured in? Bytes? 25.12.2015 16:58, Soumya Koduri написав: On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: Another addition: it seems to be GlusterFS API library memory leak because NFS-Ganesha also consumes huge amount of memory while doing ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory usage: === root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT === 1.4G is too much for simple stat() :(. Ideas? nfs-ganesha also has cache layer which can scale to millions of entries depending on the number of files/directories being looked upon. However there are parameters to tune it. So either try stat with few entries or add below block in nfs-ganesha.conf file, set low limits and check the difference. That may help us narrow down how much memory actually consumed by core nfs-ganesha and gfAPI. CACHEINODE { Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max no. of entries in the cache. } Thanks, Soumya 24.12.2015 16:32, Oleksandr Natalenko написав: Still actual issue for 3.7.6. Any suggestions? 24.09.2015 10:14, Oleksandr Natalenko написав: In our GlusterFS deployment we've encountered something like memory leak in GlusterFS FUSE client. We use replicated (×2) GlusterFS volume to store mail (exim+dovecot, maildir format). Here is inode stats for both bricks and mountpoint: === Brick 1 (Server 1): Filesystem Inodes IUsed IFree IUse% Mounted on /dev/mapper/vg_vd1_misc-lv08_mail 578768144 10954918 5678132262% /bricks/r6sdLV08_vd1_mail Brick 2 (Server 2): Filesystem Inodes IUsed IFree IUse% Mounted on /dev/mapper/vg_vd0_misc-lv07_mail 578767984 10954913 5678130712% /bricks/r6sdLV07_vd0_mail Mountpoint (Server 3): Filesystem InodesIUsed IFree IUse% Mounted on glusterfs.xxx:mail 578767760 10954915 567812845 2% /var/spool/mail/virtual === glusterfs.xxx domain has two A records for both Server 1 and Server 2. Here is volume info: === Volume Name: mail Type: Replicate Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2 Status: Started Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail Options Reconfigured: nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24 features.cache-invalidation-timeout: 10 performance.stat-prefetch: off performance.quick-read: on performance.read-ahead: off performance.flush-behind: on performance.write-behind: on performance.io-thread-count: 4 performance.cache-max-file-size: 1048576 performance.cache-size: 67108864 performance.readdir-ahead: off === Soon enough after mounting and exim/dovecot start, glusterfs client process begins to consume huge amount of RAM: === user@server3 ~$ ps aux | grep glusterfs | grep mail root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05 /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable --volfile-server=glusterfs.xxx --volfile-id=mail /var/spool/mail/virtual === That is, ~15 GiB of RAM. Also we've tried to use mountpoint withing separate KVM VM with 2 or 3 GiB of RAM, and soon after starting mail daemons got OOM killer for glusterfs client process. Mounting same share via NFS works just fine. Also, we have much less iowait and loadavg on client side with NFS. Also, we've tried to change IO threads count and cache size in order to limit memory usage with no luck. 
As you can see, total cache size is 4×64==256 MiB (compare to 15 GiB). Enabling-disabling stat-prefetch, read-ahead and readdir-ahead didn't help as well. Here are volume memory stats: === Memory status for volume : mail -- Brick : server1.xxx:/bricks/r6sdLV08_vd1_mail/mail Mallinfo Arena: 36859904 Ordblks : 10357 Smblks : 519 Hblks: 21 Hblkhd : 30515200 Usmblks : 0 Fsmblks : 53440 Uordblks : 18604144 Fordblks : 18255760 Keepcost : 114112 Mempool Stats - NameHotCount ColdCount PaddedSizeof AllocCount MaxAlloc Misses Max-StdAlloc - -- mail-server:fd_t 0 1024 108 30773120 13700 mail-server:dentry_t 16110 274 84 23567614816384 1106499 1152 mail-server:inode_t1636321 156 23721687616384 1876651 1169 mail-trash:fd_t0 1024 108 0000 mail-trash:dentry_t0
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote: Another addition: it seems to be GlusterFS API library memory leak because NFS-Ganesha also consumes huge amount of memory while doing ordinary "find . -type f" via NFSv4.2 on remote client. Here is memory usage: === root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT === 1.4G is too much for simple stat() :(. Ideas? nfs-ganesha also has cache layer which can scale to millions of entries depending on the number of files/directories being looked upon. However there are parameters to tune it. So either try stat with few entries or add below block in nfs-ganesha.conf file, set low limits and check the difference. That may help us narrow down how much memory actually consumed by core nfs-ganesha and gfAPI. CACHEINODE { Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size Entries_HWMark(uint32, range 1 to UINT32_MAX, default 10); #Max no. of entries in the cache. } Thanks, Soumya 24.12.2015 16:32, Oleksandr Natalenko написав: Still actual issue for 3.7.6. Any suggestions? 24.09.2015 10:14, Oleksandr Natalenko написав: In our GlusterFS deployment we've encountered something like memory leak in GlusterFS FUSE client. We use replicated (×2) GlusterFS volume to store mail (exim+dovecot, maildir format). Here is inode stats for both bricks and mountpoint: === Brick 1 (Server 1): Filesystem InodesIUsed IFree IUse% Mounted on /dev/mapper/vg_vd1_misc-lv08_mail 578768144 10954918 5678132262% /bricks/r6sdLV08_vd1_mail Brick 2 (Server 2): Filesystem InodesIUsed IFree IUse% Mounted on /dev/mapper/vg_vd0_misc-lv07_mail 578767984 10954913 5678130712% /bricks/r6sdLV07_vd0_mail Mountpoint (Server 3): Filesystem InodesIUsed IFree IUse% Mounted on glusterfs.xxx:mail 578767760 10954915 567812845 2% /var/spool/mail/virtual === glusterfs.xxx domain has two A records for both Server 1 and Server 2. Here is volume info: === Volume Name: mail Type: Replicate Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2 Status: Started Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail Options Reconfigured: nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24 features.cache-invalidation-timeout: 10 performance.stat-prefetch: off performance.quick-read: on performance.read-ahead: off performance.flush-behind: on performance.write-behind: on performance.io-thread-count: 4 performance.cache-max-file-size: 1048576 performance.cache-size: 67108864 performance.readdir-ahead: off === Soon enough after mounting and exim/dovecot start, glusterfs client process begins to consume huge amount of RAM: === user@server3 ~$ ps aux | grep glusterfs | grep mail root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05 /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable --volfile-server=glusterfs.xxx --volfile-id=mail /var/spool/mail/virtual === That is, ~15 GiB of RAM. Also we've tried to use mountpoint withing separate KVM VM with 2 or 3 GiB of RAM, and soon after starting mail daemons got OOM killer for glusterfs client process. Mounting same share via NFS works just fine. Also, we have much less iowait and loadavg on client side with NFS. Also, we've tried to change IO threads count and cache size in order to limit memory usage with no luck. As you can see, total cache size is 4×64==256 MiB (compare to 15 GiB). 
Enabling-disabling stat-prefetch, read-ahead and readdir-ahead didn't help as well. Here are volume memory stats: === Memory status for volume : mail -- Brick : server1.xxx:/bricks/r6sdLV08_vd1_mail/mail Mallinfo Arena: 36859904 Ordblks : 10357 Smblks : 519 Hblks: 21 Hblkhd : 30515200 Usmblks : 0 Fsmblks : 53440 Uordblks : 18604144 Fordblks : 18255760 Keepcost : 114112 Mempool Stats - NameHotCount ColdCount PaddedSizeof AllocCount MaxAlloc Misses Max-StdAlloc - -- mail-server:fd_t 0 1024 108 30773120 13700 mail-server:dentry_t 16110 274 84 23567614816384 1106499 1152 mail-server:inode_t1636321 156 23721687616384 1876651 1169 mail-trash:fd_t0 1024 108 0000 mail-trash:dentry_t0 32768 84 0000 mail-trash:inode_t 4
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Here are two consecutive statedumps of brick in question memory usage [1] [2]. glusterfs client process went from ~630 MB to ~1350 MB of memory usage in less than one hour. Volume options: === cluster.lookup-optimize: on cluster.readdir-optimize: on client.event-threads: 4 network.inode-lru-limit: 4096 server.event-threads: 8 performance.client-io-threads: on storage.linux-aio: on performance.write-behind-window-size: 4194304 performance.stat-prefetch: on performance.quick-read: on performance.read-ahead: on performance.flush-behind: on performance.write-behind: on performance.io-thread-count: 4 performance.cache-max-file-size: 1048576 performance.cache-size: 33554432 performance.readdir-ahead: on === I observe such a behavior on similar volumes where millions of files are stored. The volume in question holds ~11M of small files (mail storage). So, memleak persists. Had to switch to NFS temporarily :(. Any idea? [1] https://gist.github.com/46697b70ffe193fa797e [2] https://gist.github.com/3a968ca909bfdeb31cca 28.09.2015 14:31, Raghavendra Bhat написав: Hi Oleksandr, You are right. The description should have said it as the limit on the number of inodes in the lru list of the inode cache. I have sent a patch for that. http://review.gluster.org/#/c/12242/ [3] Regards, Raghavendra Bhat On Thu, Sep 24, 2015 at 1:44 PM, Oleksandr Natalenko wrote: I've checked statedump of volume in question and haven't found lots of iobuf as mentioned in that bugreport. However, I've noticed that there are lots of LRU records like this: === [conn.1.bound_xl./bricks/r6sdLV07_vd0_mail/mail.lru.1] gfid=c4b29310-a19d-451b-8dd1-b3ac2d86b595 nlookup=1 fd-count=0 ref=0 ia_type=1 === In fact, there are 16383 of them. I've checked "gluster volume set help" in order to find something LRU-related and have found this: === Option: network.inode-lru-limit Default Value: 16384 Description: Specifies the maximum megabytes of memory to be used in the inode cache. === Is there error in description stating "maximum megabytes of memory"? Shouldn't it mean "maximum amount of LRU records"? If no, is that true, that inode cache could grow up to 16 GiB for client, and one must lower network.inode-lru-limit value? Another thought: we've enabled write-behind, and the default write-behind-window-size value is 1 MiB. So, one may conclude that with lots of small files written, write-behind buffer could grow up to inode-lru-limit×write-behind-window-size=16 GiB? Who could explain that to me? 24.09.2015 10:42, Gabi C write: oh, my bad... coulb be this one? https://bugzilla.redhat.com/show_bug.cgi?id=1126831 [1] [2] Anyway, on ovirt+gluster w I experienced similar behavior... ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel [2] Links: -- [1] https://bugzilla.redhat.com/show_bug.cgi?id=1126831 [2] http://www.gluster.org/mailman/listinfo/gluster-devel [3] http://review.gluster.org/#/c/12242/ ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
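For reference, a sketch of how two consecutive dumps like [1] and [2] can be taken; dump files normally land under /var/run/gluster/:
===
# Brick-side statedump for the volume in question.
gluster volume statedump mail
# FUSE-client statedump: SIGUSR1 makes the glusterfs client dump its state.
# pidof may return several pids if more than one volume is mounted.
kill -USR1 $(pidof glusterfs)
===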
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Hi Oleksandr, You are right. The description should have said it as the limit on the number of inodes in the lru list of the inode cache. I have sent a patch for that. http://review.gluster.org/#/c/12242/ Regards, Raghavendra Bhat On Thu, Sep 24, 2015 at 1:44 PM, Oleksandr Natalenko < oleksa...@natalenko.name> wrote: > I've checked statedump of volume in question and haven't found lots of > iobuf as mentioned in that bugreport. > > However, I've noticed that there are lots of LRU records like this: > > === > [conn.1.bound_xl./bricks/r6sdLV07_vd0_mail/mail.lru.1] > gfid=c4b29310-a19d-451b-8dd1-b3ac2d86b595 > nlookup=1 > fd-count=0 > ref=0 > ia_type=1 > === > > In fact, there are 16383 of them. I've checked "gluster volume set help" > in order to find something LRU-related and have found this: > > === > Option: network.inode-lru-limit > Default Value: 16384 > Description: Specifies the maximum megabytes of memory to be used in the > inode cache. > === > > Is there error in description stating "maximum megabytes of memory"? > Shouldn't it mean "maximum amount of LRU records"? If no, is that true, > that inode cache could grow up to 16 GiB for client, and one must lower > network.inode-lru-limit value? > > Another thought: we've enabled write-behind, and the default > write-behind-window-size value is 1 MiB. So, one may conclude that with > lots of small files written, write-behind buffer could grow up to > inode-lru-limit×write-behind-window-size=16 GiB? Who could explain that to > me? > > 24.09.2015 10:42, Gabi C write: > >> oh, my bad... >> coulb be this one? >> >> https://bugzilla.redhat.com/show_bug.cgi?id=1126831 [2] >> Anyway, on ovirt+gluster w I experienced similar behavior... >> > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
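Until the description is fixed, lowering the limit on the affected volume and re-checking the client RSS is a cheap experiment; the volume options shown earlier in the thread suggest this was done with a value of 4096:
===
gluster volume set mail network.inode-lru-limit 4096
gluster volume info mail | grep inode-lru-limit   # confirm the option took effect
===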
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
oh, my bad... could be this one? https://bugzilla.redhat.com/show_bug.cgi?id=1126831 Anyway, on ovirt+gluster I experienced similar behavior... On Thu, Sep 24, 2015 at 10:32 AM, Oleksandr Natalenko < oleksa...@natalenko.name> wrote: > We use bare GlusterFS installation with no oVirt involved. > > 24.09.2015 10:29, Gabi C wrote: >> google vdsm memory leak..it's been discussed on list last year and >> earlier this one... >> > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
google vdsm memory leak..it's been discussed on list last year and earlier this one... On Thu, Sep 24, 2015 at 10:14 AM, Oleksandr Natalenko < oleksa...@natalenko.name> wrote: > In our GlusterFS deployment we've encountered something like memory leak > in GlusterFS FUSE client. > > We use replicated (×2) GlusterFS volume to store mail (exim+dovecot, > maildir format). Here is inode stats for both bricks and mountpoint: > > === > Brick 1 (Server 1): > > Filesystem InodesIUsed > IFree IUse% Mounted on > /dev/mapper/vg_vd1_misc-lv08_mail 578768144 10954918 > 5678132262% /bricks/r6sdLV08_vd1_mail > > Brick 2 (Server 2): > > Filesystem InodesIUsed > IFree IUse% Mounted on > /dev/mapper/vg_vd0_misc-lv07_mail 578767984 10954913 > 5678130712% /bricks/r6sdLV07_vd0_mail > > Mountpoint (Server 3): > > Filesystem InodesIUsed IFree IUse% > Mounted on > glusterfs.xxx:mail 578767760 10954915 5678128452% > /var/spool/mail/virtual > === > > glusterfs.xxx domain has two A records for both Server 1 and Server 2. > > Here is volume info: > > === > Volume Name: mail > Type: Replicate > Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2 > Status: Started > Number of Bricks: 1 x 2 = 2 > Transport-type: tcp > Bricks: > Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail > Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail > Options Reconfigured: > nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24 > features.cache-invalidation-timeout: 10 > performance.stat-prefetch: off > performance.quick-read: on > performance.read-ahead: off > performance.flush-behind: on > performance.write-behind: on > performance.io-thread-count: 4 > performance.cache-max-file-size: 1048576 > performance.cache-size: 67108864 > performance.readdir-ahead: off > === > > Soon enough after mounting and exim/dovecot start, glusterfs client > process begins to consume huge amount of RAM: > > === > user@server3 ~$ ps aux | grep glusterfs | grep mail > root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05 > /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable > --volfile-server=glusterfs.xxx --volfile-id=mail /var/spool/mail/virtual > === > > That is, ~15 GiB of RAM. > > Also we've tried to use mountpoint withing separate KVM VM with 2 or 3 GiB > of RAM, and soon after starting mail daemons got OOM killer for glusterfs > client process. > > Mounting same share via NFS works just fine. Also, we have much less > iowait and loadavg on client side with NFS. > > Also, we've tried to change IO threads count and cache size in order to > limit memory usage with no luck. As you can see, total cache size is > 4×64==256 MiB (compare to 15 GiB). > > Enabling-disabling stat-prefetch, read-ahead and readdir-ahead didn't help > as well. 
> > Here are volume memory stats: > > === > Memory status for volume : mail > -- > Brick : server1.xxx:/bricks/r6sdLV08_vd1_mail/mail > Mallinfo > > Arena: 36859904 > Ordblks : 10357 > Smblks : 519 > Hblks: 21 > Hblkhd : 30515200 > Usmblks : 0 > Fsmblks : 53440 > Uordblks : 18604144 > Fordblks : 18255760 > Keepcost : 114112 > > Mempool Stats > - > NameHotCount ColdCount PaddedSizeof AllocCount > MaxAlloc Misses Max-StdAlloc > - -- > > mail-server:fd_t 0 1024 108 > 30773120 13700 > mail-server:dentry_t 16110 274 84 > 23567614816384 1106499 1152 > mail-server:inode_t1636321 156 > 23721687616384 1876651 1169 > mail-trash:fd_t0 1024 108 > 0000 > mail-trash:dentry_t0 32768 84 > 0000 > mail-trash:inode_t 4 32764 156 > 4400 > mail-trash:trash_local_t 064 8628 > 0000 > mail-changetimerecorder:gf_ctr_local_t 06416540 > 0000 > mail-changelog:rpcsvc_request_t 0 8 2828 > 0000 > mail-changelog:changelog_local_t 064 116 > 0000 > mail-bitrot-stub:br_stub_local_t 0 512 84 > 79204400 > mail-locks:pl_local_t 032 148 > 6812757400 > mail-upcall:upcall_local_t 0 512 108 > 0000 > mail-marker:marker_local_t 0 128
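The "Memory status for volume" output above appears to come from the mem sub-command of volume status; a sketch of collecting it again for comparison, volume and brick names taken from the report:
===
gluster volume status mail mem
# or for a single brick:
gluster volume status mail server1.xxx:/bricks/r6sdLV08_vd1_mail/mail mem
===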
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
I've checked statedump of volume in question and haven't found lots of iobuf as mentioned in that bugreport. However, I've noticed that there are lots of LRU records like this: === [conn.1.bound_xl./bricks/r6sdLV07_vd0_mail/mail.lru.1] gfid=c4b29310-a19d-451b-8dd1-b3ac2d86b595 nlookup=1 fd-count=0 ref=0 ia_type=1 === In fact, there are 16383 of them. I've checked "gluster volume set help" in order to find something LRU-related and have found this: === Option: network.inode-lru-limit Default Value: 16384 Description: Specifies the maximum megabytes of memory to be used in the inode cache. === Is there error in description stating "maximum megabytes of memory"? Shouldn't it mean "maximum amount of LRU records"? If no, is that true, that inode cache could grow up to 16 GiB for client, and one must lower network.inode-lru-limit value? Another thought: we've enabled write-behind, and the default write-behind-window-size value is 1 MiB. So, one may conclude that with lots of small files written, write-behind buffer could grow up to inode-lru-limit×write-behind-window-size=16 GiB? Who could explain that to me? 24.09.2015 10:42, Gabi C write: oh, my bad... coulb be this one? https://bugzilla.redhat.com/show_bug.cgi?id=1126831 [2] Anyway, on ovirt+gluster w I experienced similar behavior... ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
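The arithmetic behind that worry, plus a quick way to count the lru records in a statedump (the dump path is illustrative):
===
# 16384 lru entries x 1 MiB write-behind window = 16 GiB worst case.
echo "$((16384 * 1048576 / 1073741824)) GiB"
# Count [...lru.N] records in a statedump, like the 16383 entries seen above.
grep -c '\.lru\.' /var/run/gluster/brick.dump   # path is illustrative
===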
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
We use a bare GlusterFS installation with no oVirt involved. 24.09.2015 10:29, Gabi C wrote: google vdsm memory leak..it's been discussed on list last year and earlier this one... ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel