Re: [Gluster-users] Fuse memleaks, all versions → OOM-killer

2016-08-29 Thread Yannick Perret

Hello,

back after the holidays. I haven't seen any new replies after this last mail; I 
hope I didn't miss any mails (too many mails to parse…).


BTW it seems that my problem is very similar to this open bug: 
https://bugzilla.redhat.com/show_bug.cgi?id=1369364
-> memory usage keeps increasing for (here) read ops until all 
mem/swap is exhausted, using the FUSE client.


Regards,
--
Y.

On 02/08/2016 at 19:15, Yannick Perret wrote:
In order to prevent too much swap usage I disabled swap on this machine 
(swapoff -a).

Memory usage was still growing.
After that I started another program that consumes memory (in order to 
speed things up) and the OOM-killer was triggered.


Here is the syslog:
[1246854.291996] Out of memory: Kill process 931 (glusterfs) score 742 
or sacrifice child
[1246854.292102] Killed process 931 (glusterfs) total-vm:3527624kB, 
anon-rss:3100328kB, file-rss:0kB


Last VSZ/RSS was: 3527624 / 3097096
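
(For reference: VSZ/RSS figures like the ones quoted in this thread can be 
sampled with something like the commands below. This is only a sketch — the 
exact commands used are not shown in the mails — and it assumes a single 
glusterfs client process on the machine.)

  ps -C glusterfs -o pid=,vsz=,rss=        # VSZ/RSS in kB, as reported by ps
  dmesg -T | grep -i -A 2 'out of memory'  # locate OOM-killer records like the ones above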


Here is the rest of the OOM-killer data:
[1246854.291847] active_anon:600785 inactive_anon:377188 isolated_anon:0
 active_file:97 inactive_file:137 isolated_file:0
 unevictable:0 dirty:0 writeback:1 unstable:0
 free:21740 slab_reclaimable:3309 slab_unreclaimable:3728
 mapped:255 shmem:4267 pagetables:3286 bounce:0
 free_cma:0
[1246854.291851] Node 0 DMA free:15876kB min:264kB low:328kB 
high:396kB active_anon:0kB inactive_anon:0kB active_file:0kB 
inactive_file:0kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB 
dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB 
slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB 
bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 
all_unreclaimable? yes

[1246854.291858] lowmem_reserve[]: 0 2980 3948 3948
[1246854.291861] Node 0 DMA32 free:54616kB min:50828kB low:63532kB 
high:76240kB active_anon:1940432kB inactive_anon:1020924kB 
active_file:248kB inactive_file:260kB unevictable:0kB 
isolated(anon):0kB isolated(file):0kB present:3129280kB 
managed:3054836kB mlocked:0kB dirty:0kB writeback:0kB mapped:760kB 
shmem:14616kB slab_reclaimable:9660kB slab_unreclaimable:8244kB 
kernel_stack:1456kB pagetables:10056kB unstable:0kB bounce:0kB 
free_cma:0kB writeback_tmp:0kB pages_scanned:803 all_unreclaimable? yes

[1246854.291865] lowmem_reserve[]: 0 0 967 967
[1246854.291867] Node 0 Normal free:16468kB min:16488kB low:20608kB 
high:24732kB active_anon:462708kB inactive_anon:487828kB 
active_file:140kB inactive_file:288kB unevictable:0kB 
isolated(anon):0kB isolated(file):0kB present:1048576kB 
managed:990356kB mlocked:0kB dirty:0kB writeback:4kB mapped:260kB 
shmem:2452kB slab_reclaimable:3576kB slab_unreclaimable:6668kB 
kernel_stack:560kB pagetables:3088kB unstable:0kB bounce:0kB 
free_cma:0kB writeback_tmp:0kB pages_scanned:975 all_unreclaimable? yes

[1246854.291872] lowmem_reserve[]: 0 0 0 0
[1246854.291874] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 2*32kB (U) 3*64kB 
(U) 0*128kB 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB 
(EM) = 15876kB
[1246854.291882] Node 0 DMA32: 1218*4kB (UEM) 848*8kB (UE) 621*16kB 
(UE) 314*32kB (UEM) 189*64kB (UEM) 49*128kB (UEM) 2*256kB (E) 0*512kB 
0*1024kB 0*2048kB 1*4096kB (R) = 54616kB
[1246854.291891] Node 0 Normal: 3117*4kB (UE) 0*8kB 0*16kB 3*32kB (R) 
1*64kB (R) 2*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R) 1*2048kB (R) 
0*4096kB = 16468kB
[1246854.291900] Node 0 hugepages_total=0 hugepages_free=0 
hugepages_surp=0 hugepages_size=2048kB

[1246854.291902] 4533 total pagecache pages
[1246854.291903] 0 pages in swap cache
[1246854.291905] Swap cache stats: add 343501, delete 343501, find 
7730690/7732743

[1246854.291906] Free swap  = 0kB
[1246854.291907] Total swap = 0kB
[1246854.291908] 1048462 pages RAM
[1246854.291909] 0 pages HighMem/MovableOnly
[1246854.291909] 14555 pages reserved
[1246854.291910] 0 pages hwpoisoned

Regards,
--
Y.



On 02/08/2016 at 17:00, Yannick Perret wrote:

So here are the dumps, gzip'ed.

What I did (sketched as a script after this list):
1. mounting the volume, removing all its content, unmounting it
2. mounting the volume
3. performing a cp -Rp /usr/* /root/MNT
4. performing a rm -rf /root/MNT/*
5. taking a dump (glusterdump.p1.dump)
6. re-doing 3, 4 and 5 (glusterdump.p2.dump)
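
The same cycle as a minimal shell sketch (assuming the volume and mount point 
used elsewhere in this thread, ldap1.my.domain:SHARE on /root/MNT, a single 
glusterfs process on the client, and the default statedump directory, 
usually /var/run/gluster):

  mount -t glusterfs ldap1.my.domain:SHARE /root/MNT
  rm -rf /root/MNT/*                   # start from an empty volume
  cp -Rp /usr/* /root/MNT/             # generate FUSE activity
  rm -rf /root/MNT/*
  kill -USR1 "$(pgrep -x glusterfs)"   # ask the client to write a statedump
  ls -lt /var/run/gluster/ | head      # dumps usually land here (glusterdump.<pid>.dump.<timestamp>)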

VSZ/RSS are respectively:
- 381896 / 35688 just after mount
- 644040 / 309240 after 1st cp -Rp
- 644040 / 310128 after 1st rm -rf
- 709576 / 310128 after 1st kill -USR1
- 840648 / 421964 after 2nd cp -Rp
- 840648 / 44 after 2nd rm -rf

I created a small script that performs these actions in an infinite loop:
while /bin/true
do
  cp -Rp /usr/* /root/MNT/
  ps -C glusterfs -o vsz=,rss=   # get VSZ/RSS of glusterfs process (one way to do it)
  rm -rf /root/MNT/*
  ps -C glusterfs -o vsz=,rss=   # get VSZ/RSS of glusterfs process
done

Here are the values so far (VSZ then RSS, in kB):
971720 533988
1037256 645500
1037256 645840
1168328 757348
1168328 757620
1299400 869128
1299400 869328
1364936 980712
1364936 980944
1496008 1092384
1496008 1092404
1627080 1203796
1627080 1203996
1692616 1315572
1692616 1315504
1823688 1426812
1823688 1427340
1954760 1538716
1954760 1538772

Re: [Gluster-users] Fuse memleaks, all versions → OOM-killer

2016-08-02 Thread Yannick Perret
In order to prevent too much swap usage I disabled swap on this machine 
(swapoff -a).

Memory usage was still growing.
After that I started another program that consumes memory (in order to 
speed things up) and the OOM-killer was triggered.


Here is the syslog:
[1246854.291996] Out of memory: Kill process 931 (glusterfs) score 742 
or sacrifice child
[1246854.292102] Killed process 931 (glusterfs) total-vm:3527624kB, 
anon-rss:3100328kB, file-rss:0kB


Last VSZ/RSS was: 3527624 / 3097096


Here is the rest of the OOM-killer data:
[1246854.291847] active_anon:600785 inactive_anon:377188 isolated_anon:0
 active_file:97 inactive_file:137 isolated_file:0
 unevictable:0 dirty:0 writeback:1 unstable:0
 free:21740 slab_reclaimable:3309 slab_unreclaimable:3728
 mapped:255 shmem:4267 pagetables:3286 bounce:0
 free_cma:0
[1246854.291851] Node 0 DMA free:15876kB min:264kB low:328kB high:396kB 
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB 
managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB 
slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB 
pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB 
pages_scanned:0 all_unreclaimable? yes

[1246854.291858] lowmem_reserve[]: 0 2980 3948 3948
[1246854.291861] Node 0 DMA32 free:54616kB min:50828kB low:63532kB 
high:76240kB active_anon:1940432kB inactive_anon:1020924kB 
active_file:248kB inactive_file:260kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:3129280kB managed:3054836kB mlocked:0kB 
dirty:0kB writeback:0kB mapped:760kB shmem:14616kB 
slab_reclaimable:9660kB slab_unreclaimable:8244kB kernel_stack:1456kB 
pagetables:10056kB unstable:0kB bounce:0kB free_cma:0kB 
writeback_tmp:0kB pages_scanned:803 all_unreclaimable? yes

[1246854.291865] lowmem_reserve[]: 0 0 967 967
[1246854.291867] Node 0 Normal free:16468kB min:16488kB low:20608kB 
high:24732kB active_anon:462708kB inactive_anon:487828kB 
active_file:140kB inactive_file:288kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:1048576kB managed:990356kB mlocked:0kB 
dirty:0kB writeback:4kB mapped:260kB shmem:2452kB 
slab_reclaimable:3576kB slab_unreclaimable:6668kB kernel_stack:560kB 
pagetables:3088kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB 
pages_scanned:975 all_unreclaimable? yes

[1246854.291872] lowmem_reserve[]: 0 0 0 0
[1246854.291874] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 2*32kB (U) 3*64kB 
(U) 0*128kB 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (EM) 
= 15876kB
[1246854.291882] Node 0 DMA32: 1218*4kB (UEM) 848*8kB (UE) 621*16kB (UE) 
314*32kB (UEM) 189*64kB (UEM) 49*128kB (UEM) 2*256kB (E) 0*512kB 
0*1024kB 0*2048kB 1*4096kB (R) = 54616kB
[1246854.291891] Node 0 Normal: 3117*4kB (UE) 0*8kB 0*16kB 3*32kB (R) 
1*64kB (R) 2*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R) 1*2048kB (R) 
0*4096kB = 16468kB
[1246854.291900] Node 0 hugepages_total=0 hugepages_free=0 
hugepages_surp=0 hugepages_size=2048kB

[1246854.291902] 4533 total pagecache pages
[1246854.291903] 0 pages in swap cache
[1246854.291905] Swap cache stats: add 343501, delete 343501, find 
7730690/7732743

[1246854.291906] Free swap  = 0kB
[1246854.291907] Total swap = 0kB
[1246854.291908] 1048462 pages RAM
[1246854.291909] 0 pages HighMem/MovableOnly
[1246854.291909] 14555 pages reserved
[1246854.291910] 0 pages hwpoisoned

Regards,
--
Y.



On 02/08/2016 at 17:00, Yannick Perret wrote:

So here are the dumps, gzip'ed.

What I did:
1. mounting the volume, removing all its content, unmounting it
2. mounting the volume
3. performing a cp -Rp /usr/* /root/MNT
4. performing a rm -rf /root/MNT/*
5. taking a dump (glusterdump.p1.dump)
6. re-doing 3, 4 and 5 (glusterdump.p2.dump)

VSZ/RSS are respectively:
- 381896 / 35688 just after mount
- 644040 / 309240 after 1st cp -Rp
- 644040 / 310128 after 1st rm -rf
- 709576 / 310128 after 1st kill -USR1
- 840648 / 421964 after 2nd cp -Rp
- 840648 / 44 after 2nd rm -rf

I created a small script that performs these actions in an infinite loop:
while /bin/true
do
  cp -Rp /usr/* /root/MNT/
  ps -C glusterfs -o vsz=,rss=   # get VSZ/RSS of glusterfs process (one way to do it)
  rm -rf /root/MNT/*
  ps -C glusterfs -o vsz=,rss=   # get VSZ/RSS of glusterfs process
done

Here are the values so far (VSZ then RSS, in kB):
971720 533988
1037256 645500
1037256 645840
1168328 757348
1168328 757620
1299400 869128
1299400 869328
1364936 980712
1364936 980944
1496008 1092384
1496008 1092404
1627080 1203796
1627080 1203996
1692616 1315572
1692616 1315504
1823688 1426812
1823688 1427340
1954760 1538716
1954760 1538772
2085832 1647676
2085832 1647708
2151368 1750392
2151368 1750708
2282440 1853864
2282440 1853764
2413512 1952668
2413512 1952704
2479048 2056500
2479048 2056712

So at this point the glusterfs process is using close to 2 GB of resident 
memory, just from repeating exactly the same actions: 'cp -Rp /usr/* 
/root/MNT' + 'rm -rf /root/MNT/*'.


Swap usage is starting to increase a little, and I haven't seen any 
memory being released at this time.

Re: [Gluster-users] Fuse memleaks, all versions

2016-08-02 Thread Yannick Perret

So here are the dumps, gzip'ed.

What I did:
1. mounting the volume, removing all its content, unmounting it
2. mounting the volume
3. performing a cp -Rp /usr/* /root/MNT
4. performing a rm -rf /root/MNT/*
5. taking a dump (glusterdump.p1.dump)
6. re-doing 3, 4 and 5 (glusterdump.p2.dump)

VSZ/RSS are respectively:
- 381896 / 35688 just after mount
- 644040 / 309240 after 1st cp -Rp
- 644040 / 310128 after 1st rm -rf
- 709576 / 310128 after 1st kill -USR1
- 840648 / 421964 after 2nd cp -Rp
- 840648 / 44 after 2nd rm -rf

I created a small script that performs these actions in an infinite loop:
while /bin/true
do
  cp -Rp /usr/* /root/MNT/
  ps -C glusterfs -o vsz=,rss=   # get VSZ/RSS of glusterfs process (one way to do it)
  rm -rf /root/MNT/*
  ps -C glusterfs -o vsz=,rss=   # get VSZ/RSS of glusterfs process
done

Here are the values so far (VSZ then RSS, in kB):
971720 533988
1037256 645500
1037256 645840
1168328 757348
1168328 757620
1299400 869128
1299400 869328
1364936 980712
1364936 980944
1496008 1092384
1496008 1092404
1627080 1203796
1627080 1203996
1692616 1315572
1692616 1315504
1823688 1426812
1823688 1427340
1954760 1538716
1954760 1538772
2085832 1647676
2085832 1647708
2151368 1750392
2151368 1750708
2282440 1853864
2282440 1853764
2413512 1952668
2413512 1952704
2479048 2056500
2479048 2056712

So at this point the glusterfs process is using close to 2 GB of resident 
memory, just from repeating exactly the same actions: 'cp -Rp /usr/* 
/root/MNT' + 'rm -rf /root/MNT/*'.


Swap usage is starting to increase a little, and I haven't seen any memory 
being released at this time.
I can understand that the kernel may not release the removed files (after rm 
-rf) immediately, but the first 'rm' occurred at ~12:00 today and it is 
~17:00 here, so I can't understand why so much memory is still used.
I would expect the memory to grow during 'cp -Rp', then shrink after 
'rm', but it stays the same. And even if it stays the same, I would expect it 
not to grow further while cp-ing again.


I'm leaving the cp/rm loop running to see what happens. Feel free to ask 
for other data if it may help.


Please note that I'll be on holiday for 3 weeks starting at the end of this 
week, so I will mostly not be able to perform tests during that time 
(the network connection is too poor where I'm going).


Regards,
--
Y.

On 02/08/2016 at 05:11, Pranith Kumar Karampuri wrote:



On Mon, Aug 1, 2016 at 3:40 PM, Yannick Perret 
<yannick.per...@liris.cnrs.fr> wrote:


On 29/07/2016 at 18:39, Pranith Kumar Karampuri wrote:



On Fri, Jul 29, 2016 at 2:26 PM, Yannick Perret
<yannick.per...@liris.cnrs.fr> wrote:

Ok, last try:
after investigating more versions I found that the FUSE client
leaks memory on all of them.
I tested:
- 3.6.7 client on Debian 7 32bit and on Debian 8 64bit (with
3.6.7 servers on Debian 8 64bit)
- 3.6.9 client on Debian 7 32bit and on Debian 8 64bit (with
3.6.7 servers on Debian 8 64bit)
- 3.7.13 client on Debian 8 64bit (with 3.8.1 servers on
Debian 8 64bit)
- 3.8.1 client on Debian 8 64bit (with 3.8.1 servers on
Debian 8 64bit)
In all cases they were compiled from sources, apart from 3.8.1,
where .deb packages were used (due to a configure runtime error).
For 3.7 it was compiled with --disable-tiering. I also tried
to compile with --disable-fusermount (no change).

In all of these cases the memory (resident & virtual) of the
glusterfs process on the client grows with each activity and
never reaches a maximum (and never decreases).
"Activity" for these tests is cp -Rp and ls -lR.
The client I let grow the longest reached ~4 GB of RAM. On
smaller machines it ends with the OOM killer killing the
glusterfs process, or with glusterfs dying due to an
allocation error.

In 3.6 memory seems to grow continuously, whereas in 3.8.1 it
grows in "steps" (430400 kB → 629144 (~1 min) → 762324 (~1 min)
→ 827860…).

All tests performed on a single test volume used only by my
test client. Volume in a basic x2 replica. The only
parameters I changed on this volume (without any effect) are
diagnostics.client-log-level set to ERROR and
network.inode-lru-limit set to 1024.


Could you attach statedumps of your runs?
The following link has steps to capture
this(https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/
). We basically need to see what memory types are
increasing. If you could help find the issue, we can send the
fixes for your workload. There is a 3.8.2 release in around 10
days I think. We can probably target this issue for that?

Here are statedumps.
Steps:
1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/ (here VSZ
and RSS are 381896 35828)
2. take a dump with kill -USR1 <pid> (file
glusterdump.n1.dump.1470042769)
3. perform a 'ls -lR /root/MNT | wc -l' (btw the result of wc -l is
518396 :)) and a 'cp -Rp /usr/* /root/MNT/boo' (VSZ/RSS are
1301536/711992 at the end of these operations)

Re: [Gluster-users] Fuse memleaks, all versions

2016-08-02 Thread Yannick Perret

On 02/08/2016 at 05:11, Pranith Kumar Karampuri wrote:



On Mon, Aug 1, 2016 at 3:40 PM, Yannick Perret 
<yannick.per...@liris.cnrs.fr> wrote:


On 29/07/2016 at 18:39, Pranith Kumar Karampuri wrote:



On Fri, Jul 29, 2016 at 2:26 PM, Yannick Perret
<yannick.per...@liris.cnrs.fr> wrote:

Ok, last try:
after investigating more versions I found that the FUSE client
leaks memory on all of them.
I tested:
- 3.6.7 client on Debian 7 32bit and on Debian 8 64bit (with
3.6.7 servers on Debian 8 64bit)
- 3.6.9 client on Debian 7 32bit and on Debian 8 64bit (with
3.6.7 servers on Debian 8 64bit)
- 3.7.13 client on Debian 8 64bit (with 3.8.1 servers on
Debian 8 64bit)
- 3.8.1 client on Debian 8 64bit (with 3.8.1 servers on
Debian 8 64bit)
In all cases they were compiled from sources, apart from 3.8.1,
where .deb packages were used (due to a configure runtime error).
For 3.7 it was compiled with --disable-tiering. I also tried
to compile with --disable-fusermount (no change).

In all of these cases the memory (resident & virtual) of the
glusterfs process on the client grows with each activity and
never reaches a maximum (and never decreases).
"Activity" for these tests is cp -Rp and ls -lR.
The client I let grow the longest reached ~4 GB of RAM. On
smaller machines it ends with the OOM killer killing the
glusterfs process, or with glusterfs dying due to an
allocation error.

In 3.6 memory seems to grow continuously, whereas in 3.8.1 it
grows in "steps" (430400 kB → 629144 (~1 min) → 762324 (~1 min)
→ 827860…).

All tests performed on a single test volume used only by my
test client. Volume in a basic x2 replica. The only
parameters I changed on this volume (without any effect) are
diagnostics.client-log-level set to ERROR and
network.inode-lru-limit set to 1024.


Could you attach statedumps of your runs?
The following link has steps to capture
this(https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/
). We basically need to see what memory types are
increasing. If you could help find the issue, we can send the
fixes for your workload. There is a 3.8.2 release in around 10
days I think. We can probably target this issue for that?

Here are statedumps.
Steps:
1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/ (here VSZ
and RSS are 381896 35828)
2. take a dump with kill -USR1 <pid> (file
glusterdump.n1.dump.1470042769)
3. perform a 'ls -lR /root/MNT | wc -l' (btw the result of wc -l is
518396 :)) and a 'cp -Rp /usr/* /root/MNT/boo' (VSZ/RSS are
1301536/711992 at the end of these operations)
4. take a dump with kill -USR1 <pid> (file
glusterdump.n2.dump.1470043929)
5. do 'cp -Rp * /root/MNT/toto/', i.e. into another directory
(VSZ/RSS are 1432608/909968 at the end of this operation)
6. take a dump with kill -USR1 <pid> (file
glusterdump.n3.dump.1470045181)


Hey,
  Thanks a lot for providing this information. Looking at these 
steps, I don't see any problem with the increase in memory. Both the ls -lR 
and cp -Rp commands you did in step 3 will add new inodes in 
memory, which increases memory usage. What happens is that as long as the 
kernel thinks these inodes need to be in memory, gluster keeps them in 
memory. Once the kernel no longer thinks an inode is necessary, it sends 
'inode-forgets'. At that point the memory starts reducing. So it kind 
of depends on the memory pressure the kernel is under. But you said it 
led to OOM-killers on smaller machines, which means there could be 
some leaks. Could you modify the steps as follows to confirm whether 
there are leaks? Please do this test on those smaller machines that 
led to OOM-killers.



Thanks for your feedback. I will send these statedumps today.
--
Y.


Steps:
1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/ (here VSZ and 
RSS are 381896 35828)
2. perform a 'ls -lR /root/MNT | wc -l' (btw result of wc -l is 518396 
:)) and a 'cp -Rp /usr/* /root/MNT/boo' (VSZ/RSS are 1301536/711992 at 
end of these operations)
3. do 'cp -Rp * /root/MNT/toto/', i.e. into another directory (VSZ/RSS 
are 1432608/909968 at the end of this operation)

4. Delete all the files and directories you created in steps 2, 3 above
5. Take a statedump with kill -USR1 <pid>
6. Repeat steps 2-5

Attach these two statedumps. I think the statedumps will be even more 
effective if the mount does not have any data when you start the 
experiment.


HTH


Dump files are gzip'ed because they are very large.
Dump files are here (too big for email):
http://wikisend.com/download/623430/glusterdump.n1.dump.1470042769.gz
http://wikisend.com/download/771220/glusterdump.n2.dump.1470043929.gz
http://wikisend.com/download/428752/glusterdump.n3.dump.1470045181.gz
(I'll keep the files in case someone wants them in another format.)

Re: [Gluster-users] Fuse memleaks, all versions

2016-08-01 Thread Pranith Kumar Karampuri
On Mon, Aug 1, 2016 at 3:40 PM, Yannick Perret  wrote:

> On 29/07/2016 at 18:39, Pranith Kumar Karampuri wrote:
>
>
>
> On Fri, Jul 29, 2016 at 2:26 PM, Yannick Perret <
> yannick.per...@liris.cnrs.fr> wrote:
>
>> Ok, last try:
>> after investigating more versions I found that the FUSE client leaks memory
>> on all of them.
>> I tested:
>> - 3.6.7 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7
>> servers on Debian 8 64bit)
>> - 3.6.9 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7
>> servers on Debian 8 64bit)
>> - 3.7.13 client on Debian 8 64bit (with 3.8.1 servers on Debian 8 64bit)
>> - 3.8.1 client on Debian 8 64bit (with 3.8.1 servers on Debian 8 64bit)
>> In all cases they were compiled from sources, apart from 3.8.1, where .deb
>> packages were used (due to a configure runtime error).
>> For 3.7 it was compiled with --disable-tiering. I also tried to compile
>> with --disable-fusermount (no change).
>>
>> In all of these cases the memory (resident & virtual) of the glusterfs
>> process on the client grows with each activity and never reaches a maximum
>> (and never decreases).
>> "Activity" for these tests is cp -Rp and ls -lR.
>> The client I let grow the longest reached ~4 GB of RAM. On smaller machines
>> it ends with the OOM killer killing the glusterfs process, or with glusterfs
>> dying due to an allocation error.
>>
>> In 3.6 memory seems to grow continuously, whereas in 3.8.1 it grows in "steps"
>> (430400 kB → 629144 (~1 min) → 762324 (~1 min) → 827860…).
>>
>> All tests performed on a single test volume used only by my test client.
>> Volume in a basic x2 replica. The only parameters I changed on this volume
>> (without any effect) are diagnostics.client-log-level set to ERROR and
>> network.inode-lru-limit set to 1024.
>>
>
> Could you attach statedumps of your runs?
> The following link has steps to capture this(
> https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/ ). We
> basically need to see what memory types are increasing. If you
> could help find the issue, we can send the fixes for your workload. There
> is a 3.8.2 release in around 10 days I think. We can probably target this
> issue for that?
>
> Here are statedumps.
> Steps:
> 1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/ (here VSZ and RSS
> are 381896 35828)
> 2. take a dump with kill -USR1 <pid> (file
> glusterdump.n1.dump.1470042769)
> 3. perform a 'ls -lR /root/MNT | wc -l' (btw the result of wc -l is 518396 :))
> and a 'cp -Rp /usr/* /root/MNT/boo' (VSZ/RSS are 1301536/711992 at the end of
> these operations)
> 4. take a dump with kill -USR1 <pid> (file
> glusterdump.n2.dump.1470043929)
> 5. do 'cp -Rp * /root/MNT/toto/', i.e. into another directory (VSZ/RSS are
> 1432608/909968 at the end of this operation)
> 6. take a dump with kill -USR1 <pid> (file
> glusterdump.n3.dump.1470045181)
>

Hey,
  Thanks a lot for providing this information. Looking at these steps,
I don't see any problem with the increase in memory. Both the ls -lR and cp -Rp
commands you did in step 3 will add new inodes in memory, which increases
memory usage. What happens is that as long as the kernel thinks these inodes
need to be in memory, gluster keeps them in memory. Once the kernel no longer
thinks an inode is necessary, it sends 'inode-forgets'. At that point the memory
starts reducing. So it kind of depends on the memory pressure the kernel is
under. But you said it led to OOM-killers on smaller machines, which means
there could be some leaks. Could you modify the steps as follows to confirm
whether there are leaks? Please do this test on those smaller machines
that led to OOM-killers.
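
One rough way to separate "inodes the kernel just hasn't forgotten yet" from a
real leak (a sketch, to be run as root on the client) is to force the kernel to
drop its dentry/inode caches and watch whether the client's RSS comes down once
the resulting forgets are processed:

  ps -C glusterfs -o vsz=,rss=              # before
  sync; echo 2 > /proc/sys/vm/drop_caches   # drop dentries and inodes
  sleep 30                                  # give the client time to process the forgets
  ps -C glusterfs -o vsz=,rss=              # memory that does not come back is a leak candidate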

Steps:
1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/ (here VSZ and RSS
are 381896 35828)
2. perform a 'ls -lR /root/MNT | wc -l' (btw result of wc -l is 518396 :))
and a 'cp -Rp /usr/* /root/MNT/boo' (VSZ/RSS are 1301536/711992 at end of
these operations)
3. do 'cp -Rp * /root/MNT/toto/', i.e. into another directory (VSZ/RSS are
1432608/909968 at the end of this operation)
4. Delete all the files and directories you created in steps 2, 3 above
5. Take a statedump with kill -USR1 <pid>
6. Repeat steps 2-5

Attach these two statedumps. I think the statedumps will be even more
effective if the mount does not have any data when you start the experiment.

HTH


>
> Dump files are gzip'ed because they are very large.
> Dump files are here (too big for email):
> http://wikisend.com/download/623430/glusterdump.n1.dump.1470042769.gz
> http://wikisend.com/download/771220/glusterdump.n2.dump.1470043929.gz
> http://wikisend.com/download/428752/glusterdump.n3.dump.1470045181.gz
> (I'll keep the files in case someone wants them in another format.)
>
> Client and servers are installed from .deb files
> (glusterfs-client_3.8.1-1_amd64.deb and glusterfs-common_3.8.1-1_amd64.deb
> on client side).
> They are all Debian 8 64bit. The servers are test machines that serve only one
> volume to this sole client. The volume is a simple replica-2 volume. For testing
> I just changed the network.inode-lru-limit value to 1024. Mount point /root/MNT
> is only used for these tests.

Re: [Gluster-users] Fuse memleaks, all versions

2016-08-01 Thread Yannick Perret

On 29/07/2016 at 18:39, Pranith Kumar Karampuri wrote:



On Fri, Jul 29, 2016 at 2:26 PM, Yannick Perret 
<yannick.per...@liris.cnrs.fr> wrote:


Ok, last try:
after investigating more versions I found that the FUSE client leaks
memory on all of them.
I tested:
- 3.6.7 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7
servers on Debian 8 64bit)
- 3.6.9 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7
servers on Debian 8 64bit)
- 3.7.13 client on Debian 8 64bit (with 3.8.1 servers on Debian 8
64bit)
- 3.8.1 client on Debian 8 64bit (with 3.8.1 servers on Debian 8
64bit)
In all cases they were compiled from sources, apart from 3.8.1, where
.deb packages were used (due to a configure runtime error).
For 3.7 it was compiled with --disable-tiering. I also tried to
compile with --disable-fusermount (no change).

In all of these cases the memory (resident & virtual) of the glusterfs
process on the client grows with each activity and never reaches a
maximum (and never decreases).
"Activity" for these tests is cp -Rp and ls -lR.
The client I let grow the longest reached ~4 GB of RAM. On smaller
machines it ends with the OOM killer killing the glusterfs process, or
with glusterfs dying due to an allocation error.

In 3.6 memory seems to grow continuously, whereas in 3.8.1 it grows in
"steps" (430400 kB → 629144 (~1 min) → 762324 (~1 min) → 827860…).

All tests performed on a single test volume used only by my test
client. Volume in a basic x2 replica. The only parameters I
changed on this volume (without any effect) are
diagnostics.client-log-level set to ERROR and
network.inode-lru-limit set to 1024.


Could you attach statedumps of your runs?
The following link has steps to capture 
this(https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/ 
). We basically need to see what memory types are
increasing. If you could help find the issue, we can send the fixes 
for your workload. There is a 3.8.2 release in around 10 days I think. 
We can probably target this issue for that?

Here are statedumps.
Steps:
1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/ (here VSZ and RSS 
are 381896 35828)
2. take a dump with kill -USR1 <pid> (file 
glusterdump.n1.dump.1470042769)
3. perform a 'ls -lR /root/MNT | wc -l' (btw the result of wc -l is 518396 
:)) and a 'cp -Rp /usr/* /root/MNT/boo' (VSZ/RSS are 1301536/711992 at 
the end of these operations)
4. take a dump with kill -USR1 <pid> (file 
glusterdump.n2.dump.1470043929)
5. do 'cp -Rp * /root/MNT/toto/', i.e. into another directory (VSZ/RSS are 
1432608/909968 at the end of this operation)
6. take a dump with kill -USR1 <pid> (file 
glusterdump.n3.dump.1470045181)


Dump files are gzip'ed because they are very large.
Dump files are here (too big for email):
http://wikisend.com/download/623430/glusterdump.n1.dump.1470042769.gz
http://wikisend.com/download/771220/glusterdump.n2.dump.1470043929.gz
http://wikisend.com/download/428752/glusterdump.n3.dump.1470045181.gz
(I'll keep the files in case someone wants them in another format.)

Client and servers are installed from .deb files 
(glusterfs-client_3.8.1-1_amd64.deb and 
glusterfs-common_3.8.1-1_amd64.deb on client side).
They are all Debian 8 64bit. The servers are test machines that serve only 
one volume to this sole client. The volume is a simple replica-2 volume. For 
testing I just changed the network.inode-lru-limit value to 1024. Mount point 
/root/MNT is only used for these tests.


--
Y.





Re: [Gluster-users] Fuse memleaks, all versions

2016-07-29 Thread Yannick Perret

On 29/07/2016 20:27, Pranith Kumar Karampuri wrote:



On Fri, Jul 29, 2016 at 10:09 PM, Pranith Kumar Karampuri 
<pkara...@redhat.com> wrote:




On Fri, Jul 29, 2016 at 2:26 PM, Yannick Perret
<yannick.per...@liris.cnrs.fr> wrote:

Ok, last try:
after investigating more versions I found that the FUSE client
leaks memory on all of them.
I tested:
- 3.6.7 client on Debian 7 32bit and on Debian 8 64bit (with
3.6.7 servers on Debian 8 64bit)
- 3.6.9 client on Debian 7 32bit and on Debian 8 64bit (with
3.6.7 servers on Debian 8 64bit)
- 3.7.13 client on Debian 8 64bit (with 3.8.1 servers on
Debian 8 64bit)
- 3.8.1 client on Debian 8 64bit (with 3.8.1 servers on
Debian 8 64bit)
In all cases they were compiled from sources, apart from 3.8.1,
where .deb packages were used (due to a configure runtime error).
For 3.7 it was compiled with --disable-tiering. I also tried
to compile with --disable-fusermount (no change).

In all of these cases the memory (resident & virtual) of the
glusterfs process on the client grows with each activity and never
reaches a maximum (and never decreases).
"Activity" for these tests is cp -Rp and ls -lR.
The client I let grow the longest reached ~4 GB of RAM. On
smaller machines it ends with the OOM killer killing the
glusterfs process, or with glusterfs dying due to an allocation error.

In 3.6 memory seems to grow continuously, whereas in 3.8.1 it grows
in "steps" (430400 kB → 629144 (~1 min) → 762324 (~1 min) →
827860…).

All tests performed on a single test volume used only by my
test client. Volume in a basic x2 replica. The only parameters
I changed on this volume (without any effect) are
diagnostics.client-log-level set to ERROR and
network.inode-lru-limit set to 1024.


Could you attach statedumps of your runs?
The following link has steps to capture
this(https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/
). We basically need to see what memory types are
increasing. If you could help find the issue, we can send the
fixes for your workload. There is a 3.8.2 release in around 10
days I think. We can probably target this issue for that?


hi,
 We found a problem here: 
https://bugzilla.redhat.com/show_bug.cgi?id=1361681#c0. Based on 
git-blame this bug has existed since August 2012, maybe even longer. I 
am wondering if you are running into this. Would you be willing to 
help test the fix if we provide it? I don't think a lot of others ran 
into this problem, I guess.

Yes, I saw that this seems to be a long-running bug.
I'm surprised that it doesn't hit more people, because I'm really 
using a very simple and basic configuration (replica-2 servers + FUSE 
clients, which is a basic tutorial setup in the glusterfs docs). Maybe few 
people use the FUSE client, or maybe only in a mount-use-umount manner.


I will send reports as explained in your previous mail.
I have 2 servers and 1 client that are test machines, so I can do what I 
want on them. I can also apply patches, as I build the servers/client from 
sources (and the memory leak is easy and fast to check: with 
intensive activity I can go from ~140 MB to >2 GB in less than 2 hours).
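
For the record, that growth can be logged over time with something as simple as 
the loop below (a sketch; the interval and log file are arbitrary, and it 
assumes a single glusterfs client process):

  while sleep 60; do
      echo "$(date '+%F %T') $(ps -C glusterfs -o vsz=,rss= | head -1)"
  done >> /var/tmp/glusterfs-mem.log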


Note: I had a problem with the 3.8.1 sources → running ./configure complains:
configure: WARNING: cache variable ac_cv_build contains a newline
configure: WARNING: cache variable ac_cv_host contains a newline
and calling 'make' tells me:
Makefile:90: *** missing separator (did you mean TAB instead of 8 
spaces?). Stop.
That's why I used the .deb packages from the glusterfs downloads instead of 
the sources for this version.


--
Y.





This clearly prevents us from using glusterfs on our clients. Is there
any way to prevent this from happening? I have switched back to NFS
mounts for now, but that is not what we're looking for.

Regards,
--
Y.







-- 
Pranith





--
Pranith





Re: [Gluster-users] Fuse memleaks, all versions

2016-07-29 Thread Pranith Kumar Karampuri
On Fri, Jul 29, 2016 at 10:09 PM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

>
>
> On Fri, Jul 29, 2016 at 2:26 PM, Yannick Perret <
> yannick.per...@liris.cnrs.fr> wrote:
>
>> Ok, last try:
>> after investigating more versions I found that the FUSE client leaks memory
>> on all of them.
>> I tested:
>> - 3.6.7 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7
>> servers on Debian 8 64bit)
>> - 3.6.9 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7
>> servers on Debian 8 64bit)
>> - 3.7.13 client on Debian 8 64bit (with 3.8.1 servers on Debian 8 64bit)
>> - 3.8.1 client on Debian 8 64bit (with 3.8.1 servers on Debian 8 64bit)
>> In all cases they were compiled from sources, apart from 3.8.1, where .deb
>> packages were used (due to a configure runtime error).
>> For 3.7 it was compiled with --disable-tiering. I also tried to compile
>> with --disable-fusermount (no change).
>>
>> In all of these cases the memory (resident & virtual) of the glusterfs
>> process on the client grows with each activity and never reaches a maximum
>> (and never decreases).
>> "Activity" for these tests is cp -Rp and ls -lR.
>> The client I let grow the longest reached ~4 GB of RAM. On smaller machines
>> it ends with the OOM killer killing the glusterfs process, or with glusterfs
>> dying due to an allocation error.
>>
>> In 3.6 memory seems to grow continuously, whereas in 3.8.1 it grows in "steps"
>> (430400 kB → 629144 (~1 min) → 762324 (~1 min) → 827860…).
>>
>> All tests performed on a single test volume used only by my test client.
>> Volume in a basic x2 replica. The only parameters I changed on this volume
>> (without any effect) are diagnostics.client-log-level set to ERROR and
>> network.inode-lru-limit set to 1024.
>>
>
> Could you attach statedumps of your runs?
> The following link has steps to capture this(
> https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/ ). We
> basically need to see what memory types are increasing. If you
> could help find the issue, we can send the fixes for your workload. There
> is a 3.8.2 release in around 10 days I think. We can probably target this
> issue for that?
>

hi,
 We found a problem here:
https://bugzilla.redhat.com/show_bug.cgi?id=1361681#c0. Based on git-blame
this bug has existed since August 2012, maybe even longer. I am wondering
if you are running into this. Would you be willing to help test the fix if
we provide it? I don't think a lot of others ran into this problem, I guess.


>
>
>>
>> This clearly prevents us from using glusterfs on our clients. Is there any way
>> to prevent this from happening? I have switched back to NFS mounts for now, but
>> that is not what we're looking for.
>>
>> Regards,
>> --
>> Y.
>>
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>
>
>
> --
> Pranith
>



-- 
Pranith

Re: [Gluster-users] Fuse memleaks, all versions

2016-07-29 Thread Pranith Kumar Karampuri
On Fri, Jul 29, 2016 at 2:26 PM, Yannick Perret <
yannick.per...@liris.cnrs.fr> wrote:

> Ok, last try:
> after investigating more versions I found that the FUSE client leaks memory on
> all of them.
> I tested:
> - 3.6.7 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7
> servers on Debian 8 64bit)
> - 3.6.9 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7
> servers on Debian 8 64bit)
> - 3.7.13 client on Debian 8 64bit (with 3.8.1 servers on Debian 8 64bit)
> - 3.8.1 client on Debian 8 64bit (with 3.8.1 servers on Debian 8 64bit)
> In all cases they were compiled from sources, apart from 3.8.1, where .deb
> packages were used (due to a configure runtime error).
> For 3.7 it was compiled with --disable-tiering. I also tried to compile
> with --disable-fusermount (no change).
>
> In all of these cases the memory (resident & virtual) of the glusterfs process
> on the client grows with each activity and never reaches a maximum (and never
> decreases).
> "Activity" for these tests is cp -Rp and ls -lR.
> The client I let grow the longest reached ~4 GB of RAM. On smaller machines
> it ends with the OOM killer killing the glusterfs process, or with glusterfs
> dying due to an allocation error.
>
> In 3.6 memory seems to grow continuously, whereas in 3.8.1 it grows in "steps"
> (430400 kB → 629144 (~1 min) → 762324 (~1 min) → 827860…).
>
> All tests performed on a single test volume used only by my test client.
> Volume in a basic x2 replica. The only parameters I changed on this volume
> (without any effect) are diagnostics.client-log-level set to ERROR and
> network.inode-lru-limit set to 1024.
>

Could you attach statedumps of your runs?
The following link has steps to capture this(
https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/ ). We
basically need to see what memory types are increasing. If you
could help find the issue, we can send the fixes for your workload. There
is a 3.8.2 release in around 10 days I think. We can probably target this
issue for that?
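
To see which memory types grow between two dumps, something like the sketch
below can help. It assumes the per-type memusage records described in the
linked documentation (a "[... usage-type ... memusage]" section header followed
by a "size=" line); dump1/dump2 are placeholders for two statedump files taken
at different times.

  extract() {
      # print "<usage-type section> size=<bytes>" for every memory-accounting record
      awk '/usage-type/ { sec = $0 } /^size=/ { print sec, $0 }' "$1" | sort
  }
  diff <(extract dump1) <(extract dump2) | grep '^[<>]'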


>
> This clearly prevents us from using glusterfs on our clients. Is there any way
> to prevent this from happening? I have switched back to NFS mounts for now, but
> that is not what we're looking for.
>
> Regards,
> --
> Y.
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith

[Gluster-users] Fuse memleaks, all versions

2016-07-29 Thread Yannick Perret

Ok, last try:
after investigating more versions I found that the FUSE client leaks memory 
on all of them.

I tested:
- 3.6.7 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7 
servers on Debian 8 64bit)
- 3.6.9 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7 
servers on Debian 8 64bit)
- 3.7.13 client on Debian 8 64bit (with 3.8.1 servers on Debian 8 64bit)
- 3.8.1 client on Debian 8 64bit (with 3.8.1 servers on Debian 8 64bit)
In all cases they were compiled from sources, apart from 3.8.1, where .deb 
packages were used (due to a configure runtime error).
For 3.7 it was compiled with --disable-tiering. I also tried to compile 
with --disable-fusermount (no change).


In all of these cases the memory (resident & virtual) of the glusterfs 
process on the client grows with each activity and never reaches a maximum 
(and never decreases).

"Activity" for these tests is cp -Rp and ls -lR.
The client I let grow the longest reached ~4 GB of RAM. On smaller 
machines it ends with the OOM killer killing the glusterfs process, or with 
glusterfs dying due to an allocation error.


In 3.6 memory seems to grow continuously, whereas in 3.8.1 it grows in 
"steps" (430400 kB → 629144 (~1 min) → 762324 (~1 min) → 827860…).


All tests performed on a single test volume used only by my test client. 
Volume in a basic x2 replica. The only parameters I changed on this 
volume (without any effect) are diagnostics.client-log-level set to 
ERROR and network.inode-lru-limit set to 1024.
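
(Those two options would have been changed with the usual volume-set commands, 
e.g. — a sketch, with the volume name taken from the mount commands elsewhere 
in this thread:)

  gluster volume set SHARE diagnostics.client-log-level ERROR
  gluster volume set SHARE network.inode-lru-limit 1024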



This clearly prevents us from using glusterfs on our clients. Is there any way 
to prevent this from happening? I have switched back to NFS mounts for now, 
but that is not what we're looking for.


Regards,
--
Y.



