Re: [Gluster-users] Fuse memleaks, all versions

2016-08-01 Thread Pranith Kumar Karampuri
On Mon, Aug 1, 2016 at 3:40 PM, Yannick Perret  wrote:

> On 29/07/2016 at 18:39, Pranith Kumar Karampuri wrote:
>
>
>
> On Fri, Jul 29, 2016 at 2:26 PM, Yannick Perret <
> yannick.per...@liris.cnrs.fr> wrote:
>
>> Ok, last try:
>> after investigating more versions I found that FUSE client leaks memory
>> on all of them.
>> I tested:
>> - 3.6.7 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7
>> servers on Debian 8 64bit)
>> - 3.6.9 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7
>> servers on Debian 8 64bit)
>> - 3.7.13 client on Debian 8 64bit (with 3.8.1 servers on Debian 8 64bit)
>> - 3.8.1 client on Debian 8 64bit (with 3.8.1 servers on Debian 8 64bit)
>> In all cases compiled from sources, apart from 3.8.1 where .deb were used
>> (due to a configure runtime error).
>> For 3.7 it was compiled with --disable-tiering. I also tried to compile
>> with --disable-fusermount (no change).
>>
>> In all of these cases the memory (resident & virtual) of the glusterfs
>> process on the client grows with each activity and never reaches a maximum
>> (and never shrinks).
>> "Activity" for these tests is cp -Rp and ls -lR.
>> The client I let grow the longest reached ~4GB of RAM. On smaller machines
>> it ends with the OOM killer killing the glusterfs process, or with
>> glusterfs dying due to an allocation error.
>>
>> In 3.6 memory seems to grow continuously, whereas in 3.8.1 it grows in
>> "steps" (430400 KB → 629144 (~1min) → 762324 (~1min) → 827860…).
>>
>> All tests were performed on a single test volume used only by my test client.
>> The volume is a basic x2 replica. The only parameters I changed on this volume
>> (without any effect) are diagnostics.client-log-level set to ERROR and
>> network.inode-lru-limit set to 1024.
>>
>
> Could you attach statedumps of your runs?
> The following link has steps to capture this:
> https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/ . We
> basically need to see which memory types are increasing. If you
> could help find the issue, we can send the fixes for your workload. There
> is a 3.8.2 release in around 10 days, I think. We can probably target this
> issue for that?
>
> Here are statedumps.
> Steps:
> 1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/ (here VSZ and RSS
> are 381896 35828)
> 2. take a dump with kill -USR1  (file
> glusterdump.n1.dump.1470042769)
> 3. perform a 'ls -lR /root/MNT | wc -l' (btw result of wc -l is 518396 :))
> and a 'cp -Rp /usr/* /root/MNT/boo' (VSZ/RSS are 1301536/711992 at end of
> these operations)
> 4. take a dump with kill -USR1  (file
> glusterdump.n2.dump.1470043929)
> 5. do 'cp -Rp * /root/MNT/toto/', i.e. in another directory (VSZ/RSS are
> 1432608/909968 at end of this operation)
> 6. take a dump with kill -USR1  (file
> glusterdump.n3.dump.)
>

Hey,
  Thanks a lot for providing this information. Looking at these steps,
I don't see any problem with the increase in memory. Both the ls -lR and
cp -Rp commands you ran in step 3 add new inodes in memory, which increases
memory use. As long as the kernel thinks these inodes need to be in memory,
gluster keeps them in memory. Once the kernel no longer considers an inode
necessary, it sends an 'inode-forget', and at that point the memory starts
coming down. So it depends somewhat on the memory pressure the kernel is
under. But you said it led to OOM kills on smaller machines, which means
there could be some leaks. Could you modify the steps as follows to confirm
there are leaks? Please run this test on those smaller machines that hit
the OOM killer.
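
One way to check the kernel side of this is to force it to drop its cached
dentries and inodes, which makes it send forgets to the FUSE client; a sketch,
run as root on the client:

sync
echo 2 > /proc/sys/vm/drop_caches   # 2 = reclaim dentries and inodes

If the client's memory does not come down after that, it points more strongly
at a leak.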

Steps:
1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/ (here VSZ and RSS
are 381896 35828)
2. perform a 'ls -lR /root/MNT | wc -l' (btw result of wc -l is 518396 :))
and a 'cp -Rp /usr/* /root/MNT/boo' (VSZ/RSS are 1301536/711992 at end of
these operations)
3. do 'cp -Rp * /root/MNT/toto/', i.e. in another directory (VSZ/RSS are
1432608/909968 at end of this operation)
4. Delete all the files and directories you created in steps 2 and 3 above
5. Take a statedump with kill -USR1 
6. Repeat steps 2-5 (a scripted version of the whole loop is sketched below)
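
A rough script for the loop above; the volume, mount point and paths are taken
from your earlier mail, and the pgrep pattern for finding the fuse client PID
is an assumption:

#!/bin/sh
mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/
PID=$(pgrep -f 'glusterfs.*MNT')           # assumed to match the fuse client
for pass in 1 2; do
    ls -lR /root/MNT | wc -l                # read activity
    cp -Rp /usr/* /root/MNT/boo             # write activity (step 2)
    cp -Rp /usr/* /root/MNT/toto            # second copy (step 3)
    rm -rf /root/MNT/boo /root/MNT/toto     # step 4: delete everything created
    kill -USR1 "$PID"                       # step 5: statedump, by default under /var/run/gluster
    sleep 5
done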

Attach these two statedumps. I think the statedumps will be even more
effective if the mount does not have any data when you start the experiment.
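
For a quick first pass over a dump, something like this ranks the biggest
memory types; it assumes, as in current statedumps, that each
'usage-type ... memusage' section header is immediately followed by its
size= line:

grep -E '^\[.*usage-type|^size=' glusterdump.n1.dump.1470042769 \
    | paste - - | sort -t= -k2 -rn | head -20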

HTH


>
> Dump files are gzip'ed because they are very large.
> Dump files are here (too big for email):
> http://wikisend.com/download/623430/glusterdump.n1.dump.1470042769.gz
> http://wikisend.com/download/771220/glusterdump.n2.dump.1470043929.gz
> http://wikisend.com/download/428752/glusterdump.n3.dump.1470045181.gz
> (I am keeping the files in case someone wants them in another format)
>
> Client and servers are installed from .deb files
> (glusterfs-client_3.8.1-1_amd64.deb and glusterfs-common_3.8.1-1_amd64.deb
> on client side).
> They are all Debian 8 64bit. Servers are test machines that serve only one
> volume to this sole client. The volume is a simple x2 replica. I just
> changed, for testing, the network.inode-lru-limit value to 1024. Mount point
> /root/MNT is only used for these tests.

[Gluster-users] gluster 3.7.13 with shards won't heal with io-thread-count 16

2016-08-01 Thread Lenovo Lastname
 I was testing this on VMware Workstation with three v3.7.13 nodes, 3GB RAM
and 2 vCPUs each,
Volume Name: v1
Type: Replicate
Volume ID: 52451d84-4176-4ec1-96e8-7e60d02a37f5
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.3.71:/gfs/b1/v1
Brick2: 192.168.3.72:/gfs/b1/v1
Brick3: 192.168.3.73:/gfs/b1/v1
Options Reconfigured:
network.ping-timeout: 10
performance.cache-refresh-timeout: 1
cluster.server-quorum-type: server
performance.quick-read: off
performance.stat-prefetch: off
features.shard-block-size: 16MB
features.shard: on
performance.readdir-ahead: on
performance.cache-size: 128MB
performance.write-behind-window-size: 4MB
performance.io-cache: off
performance.write-behind: on
performance.flush-behind: on
performance.io-thread-count: 16
nfs.rpc-auth-allow: 192.168.3.65
cluster.server-quorum-ratio: 51%
But since I had one running on my production setup (9GB RAM, 6 vCPUs, 3G NIC
bond) with no errors, though of course with different settings like
performance.cache-size: 1GB
performance.io-thread-count: 32
features.shard-block-size: 64MB
performance.write-behind-window-size: 16MB
I figured out that performance.io-thread-count: 16 was the problem; once I set
it to 32 like on my prod, the healing completed right away (the commands are
sketched below).
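
For reference, the change itself is just the following (volume name taken from
the output above); 'heal info' lets you watch the pending-heal list drain:

gluster volume set v1 performance.io-thread-count 32
gluster volume heal v1 info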

Anything more I need to keep in mind? lol, it's really freaking crazy to run
this right away without more testing...
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-01 Thread Mahdi Adnan
Many thanks,
here are the results:

(gdb) p cur_block
$15 = 4088
(gdb) p last_block
$16 = 4088
(gdb) p local->first_block
$17 = 4087
(gdb) p odirect
$18 = _gf_false
(gdb) p fd->flags
$19 = 2
(gdb) p local->call_count
$20 = 2

If you need more core dumps, I have several files I can upload.

-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 18:39:27 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Sorry, I didn't make myself clear. The reason I asked YOU to do it is because
I tried it on my system and I'm not getting the backtrace (it's all question
marks).

Attach the core to gdb.
At the gdb prompt, go to frame 2 by typing
(gdb) f 2

There, for each of the variables I asked you to get the values of, type p
followed by the variable name.
For instance, to get the value of the variable 'odirect', do this:

(gdb) p odirect

and gdb will print its value for you in response.

-Krutika

On Mon, Aug 1, 2016 at 4:55 PM, Mahdi Adnan  wrote:



Hi,
How do I get the values of the variables below? I can't get them from gdb.



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 15:51:38 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Could you also print and share the values of the following variables from the 
backtrace please:

i. cur_block
ii. last_block
iii. local->first_block
iv. odirect
v. fd->flags
vi. local->call_count

-Krutika

On Sat, Jul 30, 2016 at 5:04 PM, Mahdi Adnan  wrote:



Hi,
I would really appreciate it if someone could help me fix my NFS crash; it's
happening a lot and causing lots of issues for my VMs. The problem is that
every few hours the native NFS server crashes and the volume becomes
unavailable from the affected node unless I restart glusterd. The volume is
used by VMware ESXi as a datastore for its VMs, with the following options;

OS: CentOS 7.2
Gluster: 3.7.13

Volume Name: vlm01
Type: Distributed-Replicate
Volume ID: eacd8248-dca3-4530-9aed-7714a5a114f2
Status: Started
Number of Bricks: 7 x 3 = 21
Transport-type: tcp
Bricks:
Brick1: gfs01:/bricks/b01/vlm01
Brick2: gfs02:/bricks/b01/vlm01
Brick3: gfs03:/bricks/b01/vlm01
Brick4: gfs01:/bricks/b02/vlm01
Brick5: gfs02:/bricks/b02/vlm01
Brick6: gfs03:/bricks/b02/vlm01
Brick7: gfs01:/bricks/b03/vlm01
Brick8: gfs02:/bricks/b03/vlm01
Brick9: gfs03:/bricks/b03/vlm01
Brick10: gfs01:/bricks/b04/vlm01
Brick11: gfs02:/bricks/b04/vlm01
Brick12: gfs03:/bricks/b04/vlm01
Brick13: gfs01:/bricks/b05/vlm01
Brick14: gfs02:/bricks/b05/vlm01
Brick15: gfs03:/bricks/b05/vlm01
Brick16: gfs01:/bricks/b06/vlm01
Brick17: gfs02:/bricks/b06/vlm01
Brick18: gfs03:/bricks/b06/vlm01
Brick19: gfs01:/bricks/b07/vlm01
Brick20: gfs02:/bricks/b07/vlm01
Brick21: gfs03:/bricks/b07/vlm01
Options Reconfigured:
performance.readdir-ahead: off
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
performance.strict-write-ordering: on
performance.write-behind: off
cluster.data-self-heal-algorithm: full
cluster.self-heal-window-size: 128
features.shard-block-size: 16MB
features.shard: on
auth.allow: 192.168.221.50,192.168.221.51,192.168.221.52,192.168.221.56,192.168.208.130,192.168.208.131,192.168.208.132,192.168.208.89,192.168.208.85,192.168.208.208.86
network.ping-timeout: 10

latest bt;

(gdb) bt
#0  0x7f196acab210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f196be7bcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164,
this=0x7f195802ac10) at shard.c:3716
#3  0x7f195deb1a53 in shard_common_inode_write_post_lookup_shards_handler
(frame=<optimized out>, this=<optimized out>) at shard.c:3769
#4  0x7f195deaaff5 in shard_common_lookup_shards_cbk (frame=0x7f19699f1164,
cookie=<optimized out>, this=0x7f195802ac10, op_ret=0,
op_errno=<optimized out>, inode=<optimized out>, buf=0x7f194970bc40,
xdata=0x7f196c15451c, postparent=0x7f194970bcb0) at shard.c:1601
#5  0x7f195e10a141 in dht_lookup_cbk (frame=0x7f196998e7d4,
cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0,
inode=0x7f195c532b18, stbuf=0x7f194970bc40, xattr=0x7f196c15451c,
postparent=0x7f194970bcb0) at dht-common.c:2174
#6  0x7f195e3931f3 in afr_lookup_done (frame=frame@entry=0x7f196997f8a4,
this=this@entry=0x7f1958022a20) at afr-common.c:1825
#7  0x7f195e393b84 in afr_lookup_metadata_heal_check
(frame=frame@entry=0x7f196997f8a4, this=0x7f1958022a20,
this@entry=0xe3a929e0b67fa500) at afr-common.c:2068
#8  0x7f195e39434f in afr_lookup_entry_heal (frame=frame@entry=0x7f196997f8a4,
this=0xe3a929e0b67fa500, this@entry=0x7f1958022a20) at afr-common.c:2157
#9  0x7f195e39467d in afr_lookup_cbk (frame=0x7f196997f8a4,
cookie=<optimized out>, this=0x7f1958022a20, op_ret=<optimized out>,
op_errno=<optimized out>, inode=<optimized out>, buf=0x7f195effa940,
xdata=0x7f196c1853b0, postparent=0x7f195effa9b0) at afr-common.c:2205
Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-01 Thread Krutika Dhananjay
Sorry, I didn't make myself clear. The reason I asked YOU to do it is
because I tried it on my system and I'm not getting the backtrace (it's all
question marks).
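
(For what it's worth, an all-question-marks backtrace usually means the debug
symbols for the exact glusterfs build are not installed; on CentOS 7 something
like the following normally pulls them in, though package names may vary:)

yum install -y yum-utils
debuginfo-install -y glusterfs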

Attach the core to gdb.
At the gdb prompt, go to frame 2 by typing
(gdb) f 2

There, for each of the variables I asked you to get the values of, type p
followed by the variable name.
For instance, to get the value of the variable 'odirect', do this:

(gdb) p odirect

and gdb will print its value for you in response.
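
(If it is easier, the same thing can be done non-interactively; a sketch,
assuming the core was produced by /usr/sbin/glusterfs and with the core file
path as a placeholder:)

gdb /usr/sbin/glusterfs /path/to/core -batch -ex 'frame 2' \
    -ex 'print cur_block' -ex 'print last_block' \
    -ex 'print local->first_block' -ex 'print odirect' \
    -ex 'print fd->flags' -ex 'print local->call_count'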

-Krutika

On Mon, Aug 1, 2016 at 4:55 PM, Mahdi Adnan  wrote:

> Hi,
>
> How do I get the values of the variables below? I can't get them
> from gdb.
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Mon, 1 Aug 2016 15:51:38 +0530
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
>
> Could you also print and share the values of the following variables from
> the backtrace please:
>
> i. cur_block
> ii. last_block
> iii. local->first_block
> iv. odirect
> v. fd->flags
> vi. local->call_count
>
> -Krutika
>
> On Sat, Jul 30, 2016 at 5:04 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
> I would really appreciate it if someone could help me fix my NFS crash; it's
> happening a lot and causing lots of issues for my VMs.
> The problem is that every few hours the native NFS server crashes and the
> volume becomes unavailable from the affected node unless I restart glusterd.
> The volume is used by VMware ESXi as a datastore for its VMs, with the
> following options;
>
>
> OS: CentOS 7.2
> Gluster: 3.7.13
>
> Volume Name: vlm01
> Type: Distributed-Replicate
> Volume ID: eacd8248-dca3-4530-9aed-7714a5a114f2
> Status: Started
> Number of Bricks: 7 x 3 = 21
> Transport-type: tcp
> Bricks:
> Brick1: gfs01:/bricks/b01/vlm01
> Brick2: gfs02:/bricks/b01/vlm01
> Brick3: gfs03:/bricks/b01/vlm01
> Brick4: gfs01:/bricks/b02/vlm01
> Brick5: gfs02:/bricks/b02/vlm01
> Brick6: gfs03:/bricks/b02/vlm01
> Brick7: gfs01:/bricks/b03/vlm01
> Brick8: gfs02:/bricks/b03/vlm01
> Brick9: gfs03:/bricks/b03/vlm01
> Brick10: gfs01:/bricks/b04/vlm01
> Brick11: gfs02:/bricks/b04/vlm01
> Brick12: gfs03:/bricks/b04/vlm01
> Brick13: gfs01:/bricks/b05/vlm01
> Brick14: gfs02:/bricks/b05/vlm01
> Brick15: gfs03:/bricks/b05/vlm01
> Brick16: gfs01:/bricks/b06/vlm01
> Brick17: gfs02:/bricks/b06/vlm01
> Brick18: gfs03:/bricks/b06/vlm01
> Brick19: gfs01:/bricks/b07/vlm01
> Brick20: gfs02:/bricks/b07/vlm01
> Brick21: gfs03:/bricks/b07/vlm01
> Options Reconfigured:
> performance.readdir-ahead: off
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> performance.strict-write-ordering: on
> performance.write-behind: off
> cluster.data-self-heal-algorithm: full
> cluster.self-heal-window-size: 128
> features.shard-block-size: 16MB
> features.shard: on
> auth.allow:
> 192.168.221.50,192.168.221.51,192.168.221.52,192.168.221.56,192.168.208.130,192.168.208.131,192.168.208.132,192.168.208.89,192.168.208.85,192.168.208.208.86
> network.ping-timeout: 10
>
>
> latest bt;
>
>
> (gdb) bt
> #0  0x7f196acab210 in pthread_spin_lock () from /lib64/libpthread.so.0
> #1  0x7f196be7bcd5 in fd_anonymous (inode=0x0) at fd.c:804
> #2  0x7f195deb1787 in shard_common_inode_write_do
> (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
> #3  0x7f195deb1a53 in
> shard_common_inode_write_post_lookup_shards_handler (frame=<optimized out>,
> this=<optimized out>) at shard.c:3769
> #4  0x7f195deaaff5 in shard_common_lookup_shards_cbk
> (frame=0x7f19699f1164, cookie=<optimized out>, this=0x7f195802ac10,
> op_ret=0, op_errno=<optimized out>, inode=<optimized out>,
> buf=0x7f194970bc40, xdata=0x7f196c15451c, postparent=0x7f194970bcb0)
> at shard.c:1601
> #5  0x7f195e10a141 in dht_lookup_cbk (frame=0x7f196998e7d4,
> cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0,
> inode=0x7f195c532b18, stbuf=0x7f194970bc40, xattr=0x7f196c15451c,
> postparent=0x7f194970bcb0) at dht-common.c:2174
> #6  0x7f195e3931f3 in afr_lookup_done (frame=frame@entry=0x7f196997f8a4,
> this=this@entry=0x7f1958022a20) at afr-common.c:1825
> #7  0x7f195e393b84 in afr_lookup_metadata_heal_check
> (frame=frame@entry=0x7f196997f8a4, this=0x7f1958022a20,
> this@entry=0xe3a929e0b67fa500) at afr-common.c:2068
> #8  0x7f195e39434f in afr_lookup_entry_heal
> (frame=frame@entry=0x7f196997f8a4, this=0xe3a929e0b67fa500,
> this@entry=0x7f1958022a20) at afr-common.c:2157
> #9  0x7f195e39467d in afr_lookup_cbk (frame=0x7f196997f8a4,
> cookie=<optimized out>, this=0x7f1958022a20, op_ret=<optimized out>,
> op_errno=<optimized out>, inode=<optimized out>, buf=0x7f195effa940,
> xdata=0x7f196c1853b0, postparent=0x7f195effa9b0) at afr-common.c:2205
> #10 0x7f195e5e2e42 in client3_3_lookup_cbk (req=<optimized out>,
> iov=<optimized out>, count=<optimized out>, myframe=0x7f19652c)
> at client-rpc-fops.c:2981
> #11 

[Gluster-users] Gfapi memleaks, all versions

2016-08-01 Thread Piotr Rybicki

Hello

There is a memleak (as reported by valgrind) in all gluster versions,
even in 3.8.1 (although the leak is smaller there).


gluster: 3.8.1
valgrind: 3.11.0

simple C code:

#include <glusterfs/api/glfs.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>

int main (int argc, char** argv) {
glfs_t *fs = NULL;
//glfs_fd_t *fd = NULL;
//  int ret;
//  char *filename = "filename";

fs = glfs_new ("pool");
if (!fs) {
  fprintf (stderr, "glfs_new: returned NULL\n");
  return 1;
}

//  ret = glfs_set_volfile_server (fs, "rdma", "172.17.157.221", 24007);
//  ret = glfs_set_logging (fs, "/dev/stderr", 7);
//  ret = glfs_init (fs);

//ret = glfs_init (fs);
//fprintf (stderr, "glfs_init: returned %d\n", ret);

//  fd = glfs_creat (fs, filename, O_RDWR, 0644);
//  fprintf (stderr, "%s: (%p) %s\n", filename, fd, strerror(errno));

//  ret = glfs_write (fd, "hello gluster\n", 15, 0);
//fprintf (stderr, "glfs_write: returned %d\n", ret);

//  glfs_close (fd);
glfs_fini (fs);

return 0;
}


compiled by: gcc hellogluster-basic.c -lgfapi
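
If the gfapi headers and library are not on the default search path, the
pkg-config file shipped with glusterfs should work too (the .pc name below
assumes the stock packaging):

gcc hellogluster-basic.c $(pkg-config --cflags --libs glusterfs-api)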

valgrind output:

# valgrind --leak-check=full --show-reachable=yes --show-leak-kinds=all 
./a.out

==31396== Memcheck, a memory error detector
==31396== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==31396== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==31396== Command: ./a.out
==31396==
==31396==
==31396== HEAP SUMMARY:
==31396== in use at exit: 12,598,702 bytes in 57 blocks
==31396==   total heap usage: 141 allocs, 84 frees, 25,119,643 bytes 
allocated

==31396==
==31396== 8 bytes in 1 blocks are still reachable in loss record 1 of 55
==31396==at 0x4C2C0D0: calloc (vg_replace_malloc.c:711)
==31396==by 0x5A8B0D6: __gf_default_calloc (mem-pool.h:118)
==31396==by 0x5A8B0D6: __glusterfs_this_location (globals.c:147)
==31396==by 0x4E3D9FC: glfs_new@@GFAPI_3.4.0 (glfs.c:724)
==31396==by 0x4007D6: main (in /root/gf-test2/a.out)
==31396==
==31396== 82 bytes in 1 blocks are definitely lost in loss record 2 of 55
==31396==at 0x4C2C0D0: calloc (vg_replace_malloc.c:711)
==31396==by 0x5A86539: __gf_calloc (mem-pool.c:117)
==31396==by 0x5A6232F: gf_strdup (mem-pool.h:185)
==31396==by 0x5A6232F: gf_log_init (logging.c:735)
==31396==by 0x4E3DDB5: glfs_set_logging@@GFAPI_3.4.0 (glfs.c:862)
==31396==by 0x4E3DA4C: glfs_new@@GFAPI_3.4.0 (glfs.c:737)
==31396==by 0x4007D6: main (in /root/gf-test2/a.out)
==31396==
==31396== 89 bytes in 1 blocks are possibly lost in loss record 3 of 55
==31396==at 0x4C29FE0: malloc (vg_replace_malloc.c:299)
==31396==by 0x5A8666D: __gf_malloc (mem-pool.c:142)
==31396==by 0x5A86991: gf_vasprintf (mem-pool.c:221)
==31396==by 0x5A86A83: gf_asprintf (mem-pool.c:240)
==31396==by 0x5A86CD2: mem_pool_new_fn (mem-pool.c:361)
==31396==by 0x4E3CBA9: glusterfs_ctx_defaults_init (glfs.c:127)
==31396==by 0x4E3DB16: glfs_init_global_ctx (glfs.c:680)
==31396==by 0x4E3DB16: glfs_new@@GFAPI_3.4.0 (glfs.c:725)
==31396==by 0x4007D6: main (in /root/gf-test2/a.out)
==31396==
==31396== 89 bytes in 1 blocks are possibly lost in loss record 4 of 55
==31396==at 0x4C29FE0: malloc (vg_replace_malloc.c:299)
==31396==by 0x5A8666D: __gf_malloc (mem-pool.c:142)
==31396==by 0x5A86991: gf_vasprintf (mem-pool.c:221)
==31396==by 0x5A86A83: gf_asprintf (mem-pool.c:240)
==31396==by 0x5A86CD2: mem_pool_new_fn (mem-pool.c:361)
==31396==by 0x4E3CBF5: glusterfs_ctx_defaults_init (glfs.c:136)
==31396==by 0x4E3DB16: glfs_init_global_ctx (glfs.c:680)
==31396==by 0x4E3DB16: glfs_new@@GFAPI_3.4.0 (glfs.c:725)
==31396==by 0x4007D6: main (in /root/gf-test2/a.out)
==31396==
==31396== 92 bytes in 1 blocks are possibly lost in loss record 5 of 55
==31396==at 0x4C29FE0: malloc (vg_replace_malloc.c:299)
==31396==by 0x5A8666D: __gf_malloc (mem-pool.c:142)
==31396==by 0x5A86991: gf_vasprintf (mem-pool.c:221)
==31396==by 0x5A86A83: gf_asprintf (mem-pool.c:240)
==31396==by 0x5A86CD2: mem_pool_new_fn (mem-pool.c:361)
==31396==by 0x4E3CC1B: glusterfs_ctx_defaults_init (glfs.c:140)
==31396==by 0x4E3DB16: glfs_init_global_ctx (glfs.c:680)
==31396==by 0x4E3DB16: glfs_new@@GFAPI_3.4.0 (glfs.c:725)
==31396==by 0x4007D6: main (in /root/gf-test2/a.out)
==31396==
==31396== 94 bytes in 1 blocks are possibly lost in loss record 6 of 55
==31396==at 0x4C29FE0: malloc (vg_replace_malloc.c:299)
==31396==by 0x5A8666D: __gf_malloc (mem-pool.c:142)
==31396==by 0x5A86991: gf_vasprintf (mem-pool.c:221)
==31396==by 0x5A86A83: gf_asprintf (mem-pool.c:240)
==31396==by 0x5A86CD2: mem_pool_new_fn (mem-pool.c:361)
==31396==by 0x4E3CB83: glusterfs_ctx_defaults_init (glfs.c:122)
==31396==by 0x4E3DB16: glfs_init_global_ctx (glfs.c:680)
==31396==by 0x4E3DB16: glfs_new@@GFAPI_3.4.0 (glfs.c:725)
==31396==by 0x4007D6: main (in 

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-01 Thread Mahdi Adnan
Hi,
How do I get the values of the variables below? I can't get them from gdb.



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 15:51:38 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Could you also print and share the values of the following variables from the 
backtrace please:

i. cur_block
ii. last_block
iii. local->first_block
iv. odirect
v. fd->flags
vi. local->call_count

-Krutika

On Sat, Jul 30, 2016 at 5:04 PM, Mahdi Adnan  wrote:



Hi,
I would really appreciate it if someone could help me fix my NFS crash; it's
happening a lot and causing lots of issues for my VMs. The problem is that
every few hours the native NFS server crashes and the volume becomes
unavailable from the affected node unless I restart glusterd. The volume is
used by VMware ESXi as a datastore for its VMs, with the following options;

OS: CentOS 7.2
Gluster: 3.7.13

Volume Name: vlm01
Type: Distributed-Replicate
Volume ID: eacd8248-dca3-4530-9aed-7714a5a114f2
Status: Started
Number of Bricks: 7 x 3 = 21
Transport-type: tcp
Bricks:
Brick1: gfs01:/bricks/b01/vlm01
Brick2: gfs02:/bricks/b01/vlm01
Brick3: gfs03:/bricks/b01/vlm01
Brick4: gfs01:/bricks/b02/vlm01
Brick5: gfs02:/bricks/b02/vlm01
Brick6: gfs03:/bricks/b02/vlm01
Brick7: gfs01:/bricks/b03/vlm01
Brick8: gfs02:/bricks/b03/vlm01
Brick9: gfs03:/bricks/b03/vlm01
Brick10: gfs01:/bricks/b04/vlm01
Brick11: gfs02:/bricks/b04/vlm01
Brick12: gfs03:/bricks/b04/vlm01
Brick13: gfs01:/bricks/b05/vlm01
Brick14: gfs02:/bricks/b05/vlm01
Brick15: gfs03:/bricks/b05/vlm01
Brick16: gfs01:/bricks/b06/vlm01
Brick17: gfs02:/bricks/b06/vlm01
Brick18: gfs03:/bricks/b06/vlm01
Brick19: gfs01:/bricks/b07/vlm01
Brick20: gfs02:/bricks/b07/vlm01
Brick21: gfs03:/bricks/b07/vlm01
Options Reconfigured:
performance.readdir-ahead: off
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
performance.strict-write-ordering: on
performance.write-behind: off
cluster.data-self-heal-algorithm: full
cluster.self-heal-window-size: 128
features.shard-block-size: 16MB
features.shard: on
auth.allow: 192.168.221.50,192.168.221.51,192.168.221.52,192.168.221.56,192.168.208.130,192.168.208.131,192.168.208.132,192.168.208.89,192.168.208.85,192.168.208.208.86
network.ping-timeout: 10

latest bt;

(gdb) bt
#0  0x7f196acab210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f196be7bcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164,
this=0x7f195802ac10) at shard.c:3716
#3  0x7f195deb1a53 in shard_common_inode_write_post_lookup_shards_handler
(frame=<optimized out>, this=<optimized out>) at shard.c:3769
#4  0x7f195deaaff5 in shard_common_lookup_shards_cbk (frame=0x7f19699f1164,
cookie=<optimized out>, this=0x7f195802ac10, op_ret=0,
op_errno=<optimized out>, inode=<optimized out>, buf=0x7f194970bc40,
xdata=0x7f196c15451c, postparent=0x7f194970bcb0) at shard.c:1601
#5  0x7f195e10a141 in dht_lookup_cbk (frame=0x7f196998e7d4,
cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0,
inode=0x7f195c532b18, stbuf=0x7f194970bc40, xattr=0x7f196c15451c,
postparent=0x7f194970bcb0) at dht-common.c:2174
#6  0x7f195e3931f3 in afr_lookup_done (frame=frame@entry=0x7f196997f8a4,
this=this@entry=0x7f1958022a20) at afr-common.c:1825
#7  0x7f195e393b84 in afr_lookup_metadata_heal_check
(frame=frame@entry=0x7f196997f8a4, this=0x7f1958022a20,
this@entry=0xe3a929e0b67fa500) at afr-common.c:2068
#8  0x7f195e39434f in afr_lookup_entry_heal (frame=frame@entry=0x7f196997f8a4,
this=0xe3a929e0b67fa500, this@entry=0x7f1958022a20) at afr-common.c:2157
#9  0x7f195e39467d in afr_lookup_cbk (frame=0x7f196997f8a4,
cookie=<optimized out>, this=0x7f1958022a20, op_ret=<optimized out>,
op_errno=<optimized out>, inode=<optimized out>, buf=0x7f195effa940,
xdata=0x7f196c1853b0, postparent=0x7f195effa9b0) at afr-common.c:2205
#10 0x7f195e5e2e42 in client3_3_lookup_cbk (req=<optimized out>,
iov=<optimized out>, count=<optimized out>, myframe=0x7f19652c)
at client-rpc-fops.c:2981
#11 0x7f196bc0ca30 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f19583adaf0,
pollin=pollin@entry=0x7f195907f930) at rpc-clnt.c:764
#12 0x7f196bc0ccef in rpc_clnt_notify (trans=<optimized out>,
mydata=0x7f19583adb20, event=<optimized out>, data=0x7f195907f930)
at rpc-clnt.c:925
#13 0x7f196bc087c3 in rpc_transport_notify (this=this@entry=0x7f19583bd770,
event=event@entry=RPC_TRANSPORT_MSG_RECEIVED,
data=data@entry=0x7f195907f930) at rpc-transport.c:546
#14 0x7f1960acf9a4 in socket_event_poll_in (this=this@entry=0x7f19583bd770)
at socket.c:2353
#15 0x7f1960ad25e4 in socket_event_handler (fd=fd@entry=25, idx=idx@entry=14,
data=0x7f19583bd770, poll_in=1, poll_out=0, poll_err=0) at socket.c:2466
#16 0x7f196beacf7a in event_dispatch_epoll_handler (event=0x7f195effae80,
event_pool=0x7f196dbf5f20) at event-epoll.c:575
#17 event_dispatch_epoll_worker (data=0x7f196dc41e10) at event-epoll.c:678

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-01 Thread Krutika Dhananjay
Could you also print and share the values of the following variables from
the backtrace please:

i. cur_block
ii. last_block
iii. local->first_block
iv. odirect
v. fd->flags
vi. local->call_count

-Krutika

On Sat, Jul 30, 2016 at 5:04 PM, Mahdi Adnan 
wrote:

> Hi,
>
> I would really appreciate it if someone could help me fix my NFS crash; it's
> happening a lot and causing lots of issues for my VMs.
> The problem is that every few hours the native NFS server crashes and the
> volume becomes unavailable from the affected node unless I restart glusterd.
> The volume is used by VMware ESXi as a datastore for its VMs, with the
> following options;
>
>
> OS: CentOS 7.2
> Gluster: 3.7.13
>
> Volume Name: vlm01
> Type: Distributed-Replicate
> Volume ID: eacd8248-dca3-4530-9aed-7714a5a114f2
> Status: Started
> Number of Bricks: 7 x 3 = 21
> Transport-type: tcp
> Bricks:
> Brick1: gfs01:/bricks/b01/vlm01
> Brick2: gfs02:/bricks/b01/vlm01
> Brick3: gfs03:/bricks/b01/vlm01
> Brick4: gfs01:/bricks/b02/vlm01
> Brick5: gfs02:/bricks/b02/vlm01
> Brick6: gfs03:/bricks/b02/vlm01
> Brick7: gfs01:/bricks/b03/vlm01
> Brick8: gfs02:/bricks/b03/vlm01
> Brick9: gfs03:/bricks/b03/vlm01
> Brick10: gfs01:/bricks/b04/vlm01
> Brick11: gfs02:/bricks/b04/vlm01
> Brick12: gfs03:/bricks/b04/vlm01
> Brick13: gfs01:/bricks/b05/vlm01
> Brick14: gfs02:/bricks/b05/vlm01
> Brick15: gfs03:/bricks/b05/vlm01
> Brick16: gfs01:/bricks/b06/vlm01
> Brick17: gfs02:/bricks/b06/vlm01
> Brick18: gfs03:/bricks/b06/vlm01
> Brick19: gfs01:/bricks/b07/vlm01
> Brick20: gfs02:/bricks/b07/vlm01
> Brick21: gfs03:/bricks/b07/vlm01
> Options Reconfigured:
> performance.readdir-ahead: off
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> performance.strict-write-ordering: on
> performance.write-behind: off
> cluster.data-self-heal-algorithm: full
> cluster.self-heal-window-size: 128
> features.shard-block-size: 16MB
> features.shard: on
> auth.allow:
> 192.168.221.50,192.168.221.51,192.168.221.52,192.168.221.56,192.168.208.130,192.168.208.131,192.168.208.132,192.168.208.89,192.168.208.85,192.168.208.208.86
> network.ping-timeout: 10
>
>
> latest bt;
>
>
> (gdb) bt
> #0  0x7f196acab210 in pthread_spin_lock () from /lib64/libpthread.so.0
> #1  0x7f196be7bcd5 in fd_anonymous (inode=0x0) at fd.c:804
> #2  0x7f195deb1787 in shard_common_inode_write_do
> (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
> #3  0x7f195deb1a53 in
> shard_common_inode_write_post_lookup_shards_handler (frame=<optimized out>,
> this=<optimized out>) at shard.c:3769
> #4  0x7f195deaaff5 in shard_common_lookup_shards_cbk
> (frame=0x7f19699f1164, cookie=<optimized out>, this=0x7f195802ac10,
> op_ret=0, op_errno=<optimized out>, inode=<optimized out>,
> buf=0x7f194970bc40, xdata=0x7f196c15451c, postparent=0x7f194970bcb0)
> at shard.c:1601
> #5  0x7f195e10a141 in dht_lookup_cbk (frame=0x7f196998e7d4,
> cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0,
> inode=0x7f195c532b18, stbuf=0x7f194970bc40, xattr=0x7f196c15451c,
> postparent=0x7f194970bcb0) at dht-common.c:2174
> #6  0x7f195e3931f3 in afr_lookup_done (frame=frame@entry=0x7f196997f8a4,
> this=this@entry=0x7f1958022a20) at afr-common.c:1825
> #7  0x7f195e393b84 in afr_lookup_metadata_heal_check
> (frame=frame@entry=0x7f196997f8a4, this=0x7f1958022a20,
> this@entry=0xe3a929e0b67fa500) at afr-common.c:2068
> #8  0x7f195e39434f in afr_lookup_entry_heal
> (frame=frame@entry=0x7f196997f8a4, this=0xe3a929e0b67fa500,
> this@entry=0x7f1958022a20) at afr-common.c:2157
> #9  0x7f195e39467d in afr_lookup_cbk (frame=0x7f196997f8a4,
> cookie=<optimized out>, this=0x7f1958022a20, op_ret=<optimized out>,
> op_errno=<optimized out>, inode=<optimized out>, buf=0x7f195effa940,
> xdata=0x7f196c1853b0, postparent=0x7f195effa9b0) at afr-common.c:2205
> #10 0x7f195e5e2e42 in client3_3_lookup_cbk (req=<optimized out>,
> iov=<optimized out>, count=<optimized out>, myframe=0x7f19652c)
> at client-rpc-fops.c:2981
> #11 0x7f196bc0ca30 in rpc_clnt_handle_reply
> (clnt=clnt@entry=0x7f19583adaf0,
> pollin=pollin@entry=0x7f195907f930) at rpc-clnt.c:764
> #12 0x7f196bc0ccef in rpc_clnt_notify (trans=<optimized out>,
> mydata=0x7f19583adb20, event=<optimized out>, data=0x7f195907f930) at
> rpc-clnt.c:925
> #13 0x7f196bc087c3 in rpc_transport_notify
> (this=this@entry=0x7f19583bd770,
> event=event@entry=RPC_TRANSPORT_MSG_RECEIVED,
> data=data@entry=0x7f195907f930)
> at rpc-transport.c:546
> #14 0x7f1960acf9a4 in socket_event_poll_in
> (this=this@entry=0x7f19583bd770)
> at socket.c:2353
> #15 0x7f1960ad25e4 in socket_event_handler (fd=fd@entry=25,
> idx=idx@entry=14, data=0x7f19583bd770, poll_in=1, poll_out=0, poll_err=0)
> at socket.c:2466
> #16 0x7f196beacf7a in event_dispatch_epoll_handler
> (event=0x7f195effae80, event_pool=0x7f196dbf5f20) at event-epoll.c:575
> #17 event_dispatch_epoll_worker (data=0x7f196dc41e10) at event-epoll.c:678
> #18 0x7f196aca6dc5 in start_thread () 

Re: [Gluster-users] Fuse memleaks, all versions

2016-08-01 Thread Yannick Perret

On 29/07/2016 at 18:39, Pranith Kumar Karampuri wrote:



On Fri, Jul 29, 2016 at 2:26 PM, Yannick Perret wrote:


Ok, last try:
after investigating more versions I found that FUSE client leaks
memory on all of them.
I tested:
- 3.6.7 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7
servers on Debian 8 64bit)
- 3.6.9 client on Debian 7 32bit and on Debian 8 64bit (with 3.6.7
servers on Debian 8 64bit)
- 3.7.13 client on Debian 8 64bit (with 3.8.1 servers on Debian 8
64bit)
- 3.8.1 client on Debian 8 64bit (with 3.8.1 servers on Debian 8
64bit)
In all cases compiled from sources, apart from 3.8.1 where .deb
were used (due to a configure runtime error).
For 3.7 it was compiled with --disable-tiering. I also tried to
compile with --disable-fusermount (no change).

In all of these cases the memory (resident & virtual) of the glusterfs
process on the client grows with each activity and never reaches a
maximum (and never shrinks).
"Activity" for these tests is cp -Rp and ls -lR.
The client I let grow the longest reached ~4GB of RAM. On smaller
machines it ends with the OOM killer killing the glusterfs process or
with glusterfs dying due to an allocation error.

In 3.6 memory seems to grow continuously, whereas in 3.8.1 it grows in
"steps" (430400 KB → 629144 (~1min) → 762324 (~1min) → 827860…).

All tests were performed on a single test volume used only by my test
client. The volume is a basic x2 replica. The only parameters I
changed on this volume (without any effect) are
diagnostics.client-log-level set to ERROR and
network.inode-lru-limit set to 1024.


Could you attach statedumps of your runs?
The following link has steps to capture this:
https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/
We basically need to see which memory types are
increasing. If you could help find the issue, we can send the fixes
for your workload. There is a 3.8.2 release in around 10 days, I think.
We can probably target this issue for that?

Here are statedumps.
Steps:
1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/ (here VSZ and RSS 
are 381896 35828)
2. take a dump with kill -USR1  (file 
glusterdump.n1.dump.1470042769)
3. perform a 'ls -lR /root/MNT | wc -l' (btw result of wc -l is 518396 
:)) and a 'cp -Rp /usr/* /root/MNT/boo' (VSZ/RSS are 1301536/711992 at 
end of these operations)
4. take a dump with kill -USR1  (file 
glusterdump.n2.dump.1470043929)
5. do 'cp -Rp * /root/MNT/toto/', i.e. in another directory (VSZ/RSS are
1432608/909968 at end of this operation)
6. take a dump with kill -USR1  (file 
glusterdump.n3.dump.)


Dump files are gzip'ed because they are very large.
Dump files are here (too big for email):
http://wikisend.com/download/623430/glusterdump.n1.dump.1470042769.gz
http://wikisend.com/download/771220/glusterdump.n2.dump.1470043929.gz
http://wikisend.com/download/428752/glusterdump.n3.dump.1470045181.gz
(I am keeping the files in case someone wants them in another format)

Client and servers are installed from .deb files 
(glusterfs-client_3.8.1-1_amd64.deb and 
glusterfs-common_3.8.1-1_amd64.deb on client side).
They are all Debian 8 64bit. Servers are test machines that serve only 
one volume to this sole client. The volume is a simple x2 replica. I just
changed, for testing, the network.inode-lru-limit value to 1024. Mount point
/root/MNT is only used for these tests.
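
For completeness, the VSZ/RSS figures above can be sampled during a run with a
one-liner like this (the pgrep pattern is an assumption about how the client
process shows up in the process list):

while sleep 60; do ps -o vsz=,rss= -p "$(pgrep -f 'glusterfs.*MNT')"; done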


--
Y.




smime.p7s
Description: S/MIME cryptographic signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users