Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-09 Thread Mahdi Adnan

Okay, so after migrating a few VMs to the new cluster, the native NFS did NOT 
crash again; it has been running for two days straight.
My workload does not involve high throughput, but high IOPS, averaging around 
100 IOPS for each brick.
I will try to recreate this workload on a VM and see if it crashes again.



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Tue, 9 Aug 2016 11:02:44 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Well, i'm not entirely sure it is a setup-related issue. If you have the steps 
to recreate the issue, along with the relevant
information about volume configuration, logs, core, version etc, then it would 
be good to track this issue through a bug report.

-Krutika

On Mon, Aug 8, 2016 at 8:56 PM, Mahdi Adnan  wrote:



Thank you very much for all the efforts.
I have deployed a new cluster with 3 servers and used nfs-ganesha instead of 
the native NFS; so far it's working fine. I also tried to reproduce this issue 
in a test environment, but I had no luck and it just worked as it should.
Do you think I should file a bug report, or maybe it's an issue with my setup 
only?



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 8 Aug 2016 16:33:19 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Hi,

Sorry I haven't had the chance to look into this issue last week. Do you mind 
raising a bug in upstream with all
the relevant information and I'll take a look sometime this week?

-Krutika

On Fri, Aug 5, 2016 at 11:58 AM, Mahdi Adnan  wrote:



Hi,
Yes, i got some messages regarding an existing file name in the renaming 
process while the VMs are online.
and here's the output;
(gdb) frame 2
#2  0x7f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
3716            anon_fd = fd_anonymous (local->inode_list[i]);
(gdb) p local->fop
$1 = GF_FOP_WRITE
(gdb)


-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Fri, 5 Aug 2016 10:48:36 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Also, could you print local->fop please?

-Krutika

On Fri, Aug 5, 2016 at 10:46 AM, Krutika Dhananjay  wrote:
Were the images being renamed (specifically to a pathname that already exists) 
while they were being written to?

-Krutika

On Thu, Aug 4, 2016 at 1:14 PM, Mahdi Adnan  wrote:



Hi,
Kindly check the following link for all 7 bricks logs;
https://db.tt/YP5qTGXk



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Thu, 4 Aug 2016 13:00:43 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Could you also attach the brick logs please?

-Krutika

On Thu, Aug 4, 2016 at 12:48 PM, Mahdi Adnan  wrote:



appreciate your help,
(gdb) frame 2
#2  0x7f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
3716            anon_fd = fd_anonymous (local->inode_list[i]);
(gdb) p local->inode_list[0]
$4 = (inode_t *) 0x7f195c532b18
(gdb) p local->inode_list[1]
$5 = (inode_t *) 0x0
(gdb)
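
The gdb output above shows `local->inode_list[1]` as NULL at the line where frame 2 passes `local->inode_list[i]` to `fd_anonymous`. As a minimal, self-contained sketch of that failure pattern (this is not the GlusterFS source; `demo_inode_t`, `demo_fd_anonymous` and `demo_write_shards` are hypothetical names made up for illustration), the C below shows how a NULL slot in a list of inode pointers could be detected and reported instead of being dereferenced:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; NOT the GlusterFS inode_t. */
typedef struct demo_inode {
    int lock;      /* placeholder for the lock a real implementation would take */
    int fd_count;  /* number of anonymous fds handed out so far */
} demo_inode_t;

/* Illustrative analogue of a function that locks the inode it is given.
 * It touches inode->lock, so a NULL inode must be rejected before that. */
int demo_fd_anonymous(demo_inode_t *inode)
{
    if (inode == NULL)
        return -1;        /* report the error instead of dereferencing NULL */
    inode->lock = 1;      /* "take" the lock */
    inode->fd_count++;
    inode->lock = 0;      /* "release" the lock */
    return 0;
}

/* Walk the list in order; return the index of the first NULL slot,
 * or -1 if every slot was usable. */
int demo_write_shards(demo_inode_t **list, size_t n)
{
    assert(list != NULL);
    for (size_t i = 0; i < n; i++) {
        if (demo_fd_anonymous(list[i]) != 0)
            return (int)i;
    }
    return -1;
}
```

With a two-entry list whose second slot is NULL (mirroring the `inode_list[0]` and `inode_list[1]` values printed above), `demo_write_shards` returns 1, the index of the NULL slot. Whether the real fix belongs in the caller or in the lookup path that fills the list is a question for the bug report, not something this sketch answers.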



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Thu, 4 Aug 2016 12:43:10 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

OK.
Could you also print the values of the following variables from the original 
core:
i. i
ii. local->inode_list[0]
iii. local->inode_list[1]

-Krutika

On Wed, Aug 3, 2016 at 9:01 PM, Mahdi Adnan  wrote:



Hi,
Unfortunately no, but i can setup a test bench and see if it gets the same 
results.


-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Wed, 3 Aug 2016 20:59:50 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Do you have a test case that consistently recreates this problem?

-Krutika

On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan  wrote:



Hi,
So I have updated to 3.7.14 and I still have the same issue with NFS.
Based on what I have provided so far from logs and dumps, do you think it's an 
NFS issue? Should I switch to nfs-ganesha?
The problem is, the current setup is used in a production environment, and 
switching the mount point of 50+ VMs from native NFS to nfs-ganesha is not 
going to be smooth or without downtime, so I would really appreciate your 
thoughts on this matter.


-- 



Respectfully

Mahdi A. Mahdi



From: mahdi.ad...@outlook.com
To: kdhan...@redhat.com
Date: Tue, 2 Aug 2016 08:44:16 +0300
CC: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash




Hi,
The NFS just crashed again, l

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-08 Thread Krutika Dhananjay

Well, i'm not entirely sure it is a setup-related issue. If you have the
steps to recreate the issue, along with the relevant
information about volume configuration, logs, core, version etc, then it
would be good to track this issue through a bug report.

-Krutika

On Mon, Aug 8, 2016 at 8:56 PM, Mahdi Adnan  wrote:

> Thank you very much for all the efforts.
> I have deployed a new cluster with 3 servers and used nfs-ganesha instead
> of the native nfs, so far it's working fine, also, i tried to reproduce
> this issue in a test environment but i had no luck and it just worked as it
> should be.
> Do you think i should file a bug report ? or maybe it's an issue with my
> setup only ?
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Mon, 8 Aug 2016 16:33:19 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> Hi,
>
> Sorry I haven't had the chance to look into this issue last week. Do you
> mind raising a bug in upstream with all
> the relevant information and I'll take a look sometime this week?
>
> -Krutika
>
> On Fri, Aug 5, 2016 at 11:58 AM, Mahdi Adnan 
> wrote:
>
> Hi,
>
> Yes, i got some messages regarding an existing file name in the renaming
> process while the VMs are online.
>
> and here's the output;
> (gdb) frame 2
> #2  0x7f195deb1787 in shard_common_inode_write_do
> (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
> 3716            anon_fd = fd_anonymous (local->inode_list[i]);
> (gdb) p local->fop
> $1 = GF_FOP_WRITE
> (gdb)
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Fri, 5 Aug 2016 10:48:36 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> Also, could you print local->fop please?
>
> -Krutika
>
> On Fri, Aug 5, 2016 at 10:46 AM, Krutika Dhananjay 
> wrote:
>
> Were the images being renamed (specifically to a pathname that already
> exists) while they were being written to?
>
> -Krutika
>
> On Thu, Aug 4, 2016 at 1:14 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
> Kindly check the following link for all 7 bricks logs;
>
> https://db.tt/YP5qTGXk
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Thu, 4 Aug 2016 13:00:43 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> Could you also attach the brick logs please?
>
> -Krutika
>
> On Thu, Aug 4, 2016 at 12:48 PM, Mahdi Adnan 
> wrote:
>
> appreciate your help,
>
> (gdb) frame 2
> #2  0x7f195deb1787 in shard_common_inode_write_do
> (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
> 3716            anon_fd = fd_anonymous (local->inode_list[i]);
> (gdb) p local->inode_list[0]
> $4 = (inode_t *) 0x7f195c532b18
> (gdb) p local->inode_list[1]
> $5 = (inode_t *) 0x0
> (gdb)
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Thu, 4 Aug 2016 12:43:10 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> OK.
> Could you also print the values of the following variables from the
> original core:
> i. i
> ii. local->inode_list[0]
> iii. local->inode_list[1]
>
> -Krutika
>
> On Wed, Aug 3, 2016 at 9:01 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
> Unfortunately no, but i can setup a test bench and see if it gets the same
> results.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Wed, 3 Aug 2016 20:59:50 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> Do you have a test case that consistently recreates this problem?
>
> -Krutika
>
> On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
>  So i have updated to 3.7.14 and i still have the same issue with NFS.
> based on what i have provided so far from logs and dumps do you think it's
> an NFS issue ? should i switch to nfs-ganesha ?
> the problem is, the current setup is used in a production environment, and
> switching the mo

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-08 Thread Mahdi Adnan

Thank you very much for all the efforts.
I have deployed a new cluster with 3 servers and used nfs-ganesha instead of 
the native NFS; so far it's working fine. I also tried to reproduce this issue 
in a test environment, but I had no luck and it just worked as it should.
Do you think I should file a bug report, or maybe it's an issue with my setup 
only?



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 8 Aug 2016 16:33:19 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Hi,

Sorry I haven't had the chance to look into this issue last week. Do you mind 
raising a bug in upstream with all
the relevant information and I'll take a look sometime this week?

-Krutika

On Fri, Aug 5, 2016 at 11:58 AM, Mahdi Adnan  wrote:



Hi,
Yes, i got some messages regarding an existing file name in the renaming 
process while the VMs are online.
and here's the output;
(gdb) frame 2
#2  0x7f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
3716            anon_fd = fd_anonymous (local->inode_list[i]);
(gdb) p local->fop
$1 = GF_FOP_WRITE
(gdb)


-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Fri, 5 Aug 2016 10:48:36 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Also, could you print local->fop please?

-Krutika

On Fri, Aug 5, 2016 at 10:46 AM, Krutika Dhananjay  wrote:
Were the images being renamed (specifically to a pathname that already exists) 
while they were being written to?

-Krutika

On Thu, Aug 4, 2016 at 1:14 PM, Mahdi Adnan  wrote:



Hi,
Kindly check the following link for all 7 bricks logs;
https://db.tt/YP5qTGXk



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Thu, 4 Aug 2016 13:00:43 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Could you also attach the brick logs please?

-Krutika

On Thu, Aug 4, 2016 at 12:48 PM, Mahdi Adnan  wrote:



appreciate your help,
(gdb) frame 2
#2  0x7f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
3716            anon_fd = fd_anonymous (local->inode_list[i]);
(gdb) p local->inode_list[0]
$4 = (inode_t *) 0x7f195c532b18
(gdb) p local->inode_list[1]
$5 = (inode_t *) 0x0
(gdb)



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Thu, 4 Aug 2016 12:43:10 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

OK.
Could you also print the values of the following variables from the original 
core:
i. i
ii. local->inode_list[0]
iii. local->inode_list[1]

-Krutika

On Wed, Aug 3, 2016 at 9:01 PM, Mahdi Adnan  wrote:



Hi,
Unfortunately no, but i can setup a test bench and see if it gets the same 
results.


-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Wed, 3 Aug 2016 20:59:50 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Do you have a test case that consistently recreates this problem?

-Krutika

On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan  wrote:



Hi,
So I have updated to 3.7.14 and I still have the same issue with NFS.
Based on what I have provided so far from logs and dumps, do you think it's an 
NFS issue? Should I switch to nfs-ganesha?
The problem is, the current setup is used in a production environment, and 
switching the mount point of 50+ VMs from native NFS to nfs-ganesha is not 
going to be smooth or without downtime, so I would really appreciate your 
thoughts on this matter.


-- 



Respectfully

Mahdi A. Mahdi



From: mahdi.ad...@outlook.com
To: kdhan...@redhat.com
Date: Tue, 2 Aug 2016 08:44:16 +0300
CC: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash




Hi,
The NFS just crashed again, latest bt;
(gdb) bt
#0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f0b64ca5787 in shard_common_inode_write_do (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
#3  0x7f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler (frame=, this=) at shard.c:3769
#4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10, op_ret=0, op_errno=, inode=, buf=0x7f0b51407640, xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
#5  0x7f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc, cookie=, this=, op_ret=0, op_errno=0, inode=0x7f0b5f1d1f58, stbuf=0x7f0b51407640, xattr=0x7f0b72f57648, postparent=0x7f0b514076b0) at dht-common.c:2174
#6  0x7f0b651871f3 in afr_loo

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-08 Thread Krutika Dhananjay

Hi,

Sorry I haven't had the chance to look into this issue last week. Do you
mind raising a bug in upstream with all
the relevant information and I'll take a look sometime this week?

-Krutika

On Fri, Aug 5, 2016 at 11:58 AM, Mahdi Adnan 
wrote:

> Hi,
>
> Yes, i got some messages regarding an existing file name in the renaming
> process while the VMs are online.
>
> and here's the output;
> (gdb) frame 2
> #2  0x7f195deb1787 in shard_common_inode_write_do
> (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
> 3716            anon_fd = fd_anonymous (local->inode_list[i]);
> (gdb) p local->fop
> $1 = GF_FOP_WRITE
> (gdb)
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> ------
> From: kdhan...@redhat.com
> Date: Fri, 5 Aug 2016 10:48:36 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> Also, could you print local->fop please?
>
> -Krutika
>
> On Fri, Aug 5, 2016 at 10:46 AM, Krutika Dhananjay 
> wrote:
>
> Were the images being renamed (specifically to a pathname that already
> exists) while they were being written to?
>
> -Krutika
>
> On Thu, Aug 4, 2016 at 1:14 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
> Kindly check the following link for all 7 bricks logs;
>
> https://db.tt/YP5qTGXk
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Thu, 4 Aug 2016 13:00:43 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> Could you also attach the brick logs please?
>
> -Krutika
>
> On Thu, Aug 4, 2016 at 12:48 PM, Mahdi Adnan 
> wrote:
>
> appreciate your help,
>
> (gdb) frame 2
> #2  0x7f195deb1787 in shard_common_inode_write_do
> (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
> 3716            anon_fd = fd_anonymous (local->inode_list[i]);
> (gdb) p local->inode_list[0]
> $4 = (inode_t *) 0x7f195c532b18
> (gdb) p local->inode_list[1]
> $5 = (inode_t *) 0x0
> (gdb)
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Thu, 4 Aug 2016 12:43:10 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> OK.
> Could you also print the values of the following variables from the
> original core:
> i. i
> ii. local->inode_list[0]
> iii. local->inode_list[1]
>
> -Krutika
>
> On Wed, Aug 3, 2016 at 9:01 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
> Unfortunately no, but i can setup a test bench and see if it gets the same
> results.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Wed, 3 Aug 2016 20:59:50 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> Do you have a test case that consistently recreates this problem?
>
> -Krutika
>
> On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
>  So i have updated to 3.7.14 and i still have the same issue with NFS.
> based on what i have provided so far from logs and dumps do you think it's
> an NFS issue ? should i switch to nfs-ganesha ?
> the problem is, the current setup is used in a production environment, and
> switching the mount point of  +50 VMs from native nfs to nfs-ganesha is not
> going to be smooth and without downtime, so i really appreciate your
> thoughts on this matter.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: mahdi.ad...@outlook.com
> To: kdhan...@redhat.com
> Date: Tue, 2 Aug 2016 08:44:16 +0300
>
> CC: gluster-users@gluster.org
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
>
> Hi,
>
> The NFS just crashed again, latest bt;
>
> (gdb) bt
> #0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
> #1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
> #2  0x7f0b64ca5787 in shard_common_inode_write_do
> (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
> #3  0x7f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler
> (frame=, this=) at shard.c:3769
> #4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk
> (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10,
> op_ret=0,
> op_errno=,

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-04 Thread Mahdi Adnan

Hi,
Yes, i got some messages regarding an existing file name in the renaming 
process while the VMs are online.
and here's the output;
(gdb) frame 2
#2  0x7f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
3716            anon_fd = fd_anonymous (local->inode_list[i]);
(gdb) p local->fop
$1 = GF_FOP_WRITE
(gdb)


-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Fri, 5 Aug 2016 10:48:36 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Also, could you print local->fop please?

-Krutika

On Fri, Aug 5, 2016 at 10:46 AM, Krutika Dhananjay  wrote:
Were the images being renamed (specifically to a pathname that already exists) 
while they were being written to?

-Krutika

On Thu, Aug 4, 2016 at 1:14 PM, Mahdi Adnan  wrote:



Hi,
Kindly check the following link for all 7 bricks logs;
https://db.tt/YP5qTGXk



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Thu, 4 Aug 2016 13:00:43 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Could you also attach the brick logs please?

-Krutika

On Thu, Aug 4, 2016 at 12:48 PM, Mahdi Adnan  wrote:



appreciate your help,
(gdb) frame 2
#2  0x7f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
3716            anon_fd = fd_anonymous (local->inode_list[i]);
(gdb) p local->inode_list[0]
$4 = (inode_t *) 0x7f195c532b18
(gdb) p local->inode_list[1]
$5 = (inode_t *) 0x0
(gdb)



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Thu, 4 Aug 2016 12:43:10 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

OK.
Could you also print the values of the following variables from the original 
core:
i. i
ii. local->inode_list[0]
iii. local->inode_list[1]

-Krutika

On Wed, Aug 3, 2016 at 9:01 PM, Mahdi Adnan  wrote:



Hi,
Unfortunately no, but i can setup a test bench and see if it gets the same 
results.


-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Wed, 3 Aug 2016 20:59:50 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Do you have a test case that consistently recreates this problem?

-Krutika

On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan  wrote:



Hi,
So I have updated to 3.7.14 and I still have the same issue with NFS.
Based on what I have provided so far from logs and dumps, do you think it's an 
NFS issue? Should I switch to nfs-ganesha?
The problem is, the current setup is used in a production environment, and 
switching the mount point of 50+ VMs from native NFS to nfs-ganesha is not 
going to be smooth or without downtime, so I would really appreciate your 
thoughts on this matter.


-- 



Respectfully

Mahdi A. Mahdi



From: mahdi.ad...@outlook.com
To: kdhan...@redhat.com
Date: Tue, 2 Aug 2016 08:44:16 +0300
CC: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash




Hi,
The NFS just crashed again, latest bt;
(gdb) bt
#0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f0b64ca5787 in shard_common_inode_write_do (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
#3  0x7f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler (frame=, this=) at shard.c:3769
#4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10, op_ret=0, op_errno=, inode=, buf=0x7f0b51407640, xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
#5  0x7f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc, cookie=, this=, op_ret=0, op_errno=0, inode=0x7f0b5f1d1f58, stbuf=0x7f0b51407640, xattr=0x7f0b72f57648, postparent=0x7f0b514076b0) at dht-common.c:2174
#6  0x7f0b651871f3 in afr_lookup_done (frame=frame@entry=0x7f0b7079a4c8, this=this@entry=0x7f0b60023ba0) at afr-common.c:1825
#7  0x7f0b65187b84 in afr_lookup_metadata_heal_check (frame=frame@entry=0x7f0b7079a4c8, this=0x7f0b60023ba0, this@entry=0xca0bd88259f5a800) at afr-common.c:2068
#8  0x7f0b6518834f in afr_lookup_entry_heal (frame=frame@entry=0x7f0b7079a4c8, this=0xca0bd88259f5a800, this@entry=0x7f0b60023ba0) at afr-common.c:2157
#9  0x7f0b6518867d in afr_lookup_cbk (frame=0x7f0b7079a4c8, cookie=, this=0x7f0b60023ba0, op_ret=, op_errno=, inode=, buf=0x7f0b564e9940, xdata=0x7f0b72f708c8, postparent=0x7f0b564e99b0) at afr-common.c:2205
#10 0x7f0b653d6e42 in client3_3_lookup_cbk (req=, iov=, count=, myframe=0x7f0b7076354c) at client-rpc-fops.c:2981
#11 0x7f0b72a00a30 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f0b603393c0, pollin=pollin@e

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-04 Thread Krutika Dhananjay

Also, could you print local->fop please?

-Krutika

On Fri, Aug 5, 2016 at 10:46 AM, Krutika Dhananjay 
wrote:

> Were the images being renamed (specifically to a pathname that already
> exists) while they were being written to?
>
> -Krutika
>
> On Thu, Aug 4, 2016 at 1:14 PM, Mahdi Adnan 
> wrote:
>
>> Hi,
>>
>> Kindly check the following link for all 7 bricks logs;
>>
>> https://db.tt/YP5qTGXk
>>
>>
>> --
>>
>> Respectfully
>> *Mahdi A. Mahdi*
>>
>>
>>
>> --------------
>> From: kdhan...@redhat.com
>> Date: Thu, 4 Aug 2016 13:00:43 +0530
>>
>> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
>> To: mahdi.ad...@outlook.com
>> CC: gluster-users@gluster.org
>>
>> Could you also attach the brick logs please?
>>
>> -Krutika
>>
>> On Thu, Aug 4, 2016 at 12:48 PM, Mahdi Adnan 
>> wrote:
>>
>> appreciate your help,
>>
>> (gdb) frame 2
>> #2  0x7f195deb1787 in shard_common_inode_write_do
>> (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
>> 3716            anon_fd = fd_anonymous (local->inode_list[i]);
>> (gdb) p local->inode_list[0]
>> $4 = (inode_t *) 0x7f195c532b18
>> (gdb) p local->inode_list[1]
>> $5 = (inode_t *) 0x0
>> (gdb)
>>
>>
>> --
>>
>> Respectfully
>> *Mahdi A. Mahdi*
>>
>>
>>
>> --
>> From: kdhan...@redhat.com
>> Date: Thu, 4 Aug 2016 12:43:10 +0530
>>
>> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
>> To: mahdi.ad...@outlook.com
>> CC: gluster-users@gluster.org
>>
>> OK.
>> Could you also print the values of the following variables from the
>> original core:
>> i. i
>> ii. local->inode_list[0]
>> iii. local->inode_list[1]
>>
>> -Krutika
>>
>> On Wed, Aug 3, 2016 at 9:01 PM, Mahdi Adnan 
>> wrote:
>>
>> Hi,
>>
>> Unfortunately no, but i can setup a test bench and see if it gets the
>> same results.
>>
>> --
>>
>> Respectfully
>> *Mahdi A. Mahdi*
>>
>>
>>
>> --
>> From: kdhan...@redhat.com
>> Date: Wed, 3 Aug 2016 20:59:50 +0530
>>
>> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
>> To: mahdi.ad...@outlook.com
>> CC: gluster-users@gluster.org
>>
>> Do you have a test case that consistently recreates this problem?
>>
>> -Krutika
>>
>> On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan 
>> wrote:
>>
>> Hi,
>>
>>  So i have updated to 3.7.14 and i still have the same issue with NFS.
>> based on what i have provided so far from logs and dumps do you think
>> it's an NFS issue ? should i switch to nfs-ganesha ?
>> the problem is, the current setup is used in a production environment,
>> and switching the mount point of  +50 VMs from native nfs to nfs-ganesha is
>> not going to be smooth and without downtime, so i really appreciate your
>> thoughts on this matter.
>>
>> --
>>
>> Respectfully
>> *Mahdi A. Mahdi*
>>
>>
>>
>> --
>> From: mahdi.ad...@outlook.com
>> To: kdhan...@redhat.com
>> Date: Tue, 2 Aug 2016 08:44:16 +0300
>>
>> CC: gluster-users@gluster.org
>> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
>>
>> Hi,
>>
>> The NFS just crashed again, latest bt;
>>
>> (gdb) bt
>> #0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
>> #1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
>> #2  0x7f0b64ca5787 in shard_common_inode_write_do
>> (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
>> #3  0x7f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler
>> (frame=, this=) at shard.c:3769
>> #4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk
>> (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10,
>> op_ret=0,
>> op_errno=, inode=, buf=0x7f0b51407640,
>> xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
>> #5  0x7f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc,
>> cookie=, this=, op_ret=0, op_errno=0,
>> inode=0x7f0b5f1d1f58,
>> stbuf=0x7f0b51407640, xattr=0x7f0b72f57648,
>> postparent=0x7f0b514076b0) at dht-common.c:2174
>> #6  0x7f0b651871f3 in afr_lookup_done (frame=frame@entry=0x7f0b7079a4c8,
>> this=this@entry=0x7f0b60023ba0) at afr-common.c:

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-04 Thread Krutika Dhananjay

Were the images being renamed (specifically to a pathname that already
exists) while they were being written to?

-Krutika

On Thu, Aug 4, 2016 at 1:14 PM, Mahdi Adnan  wrote:

> Hi,
>
> Kindly check the following link for all 7 bricks logs;
>
> https://db.tt/YP5qTGXk
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Thu, 4 Aug 2016 13:00:43 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> Could you also attach the brick logs please?
>
> -Krutika
>
> On Thu, Aug 4, 2016 at 12:48 PM, Mahdi Adnan 
> wrote:
>
> appreciate your help,
>
> (gdb) frame 2
> #2  0x7f195deb1787 in shard_common_inode_write_do
> (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
> 3716            anon_fd = fd_anonymous (local->inode_list[i]);
> (gdb) p local->inode_list[0]
> $4 = (inode_t *) 0x7f195c532b18
> (gdb) p local->inode_list[1]
> $5 = (inode_t *) 0x0
> (gdb)
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Thu, 4 Aug 2016 12:43:10 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> OK.
> Could you also print the values of the following variables from the
> original core:
> i. i
> ii. local->inode_list[0]
> iii. local->inode_list[1]
>
> -Krutika
>
> On Wed, Aug 3, 2016 at 9:01 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
> Unfortunately no, but i can setup a test bench and see if it gets the same
> results.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Wed, 3 Aug 2016 20:59:50 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> Do you have a test case that consistently recreates this problem?
>
> -Krutika
>
> On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
>  So i have updated to 3.7.14 and i still have the same issue with NFS.
> based on what i have provided so far from logs and dumps do you think it's
> an NFS issue ? should i switch to nfs-ganesha ?
> the problem is, the current setup is used in a production environment, and
> switching the mount point of  +50 VMs from native nfs to nfs-ganesha is not
> going to be smooth and without downtime, so i really appreciate your
> thoughts on this matter.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: mahdi.ad...@outlook.com
> To: kdhan...@redhat.com
> Date: Tue, 2 Aug 2016 08:44:16 +0300
>
> CC: gluster-users@gluster.org
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
>
> Hi,
>
> The NFS just crashed again, latest bt;
>
> (gdb) bt
> #0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
> #1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
> #2  0x7f0b64ca5787 in shard_common_inode_write_do
> (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
> #3  0x7f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler
> (frame=, this=) at shard.c:3769
> #4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk
> (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10,
> op_ret=0,
> op_errno=, inode=, buf=0x7f0b51407640,
> xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
> #5  0x7f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc,
> cookie=, this=, op_ret=0, op_errno=0,
> inode=0x7f0b5f1d1f58,
> stbuf=0x7f0b51407640, xattr=0x7f0b72f57648, postparent=0x7f0b514076b0)
> at dht-common.c:2174
> #6  0x7f0b651871f3 in afr_lookup_done (frame=frame@entry=0x7f0b7079a4c8,
> this=this@entry=0x7f0b60023ba0) at afr-common.c:1825
> #7  0x7f0b65187b84 in afr_lookup_metadata_heal_check (frame=frame@entry
> =0x7f0b7079a4c8, this=0x7f0b60023ba0, this@entry=0xca0bd88259f5a800)
> at afr-common.c:2068
> #8  0x7f0b6518834f in afr_lookup_entry_heal (frame=frame@entry
> =0x7f0b7079a4c8, this=0xca0bd88259f5a800, this@entry=0x7f0b60023ba0) at
> afr-common.c:2157
> #9  0x7f0b6518867d in afr_lookup_cbk (frame=0x7f0b7079a4c8,
> cookie=, this=0x7f0b60023ba0, op_ret=,
> op_errno=, inode=, buf=0x7f0b564e9940,
> xdata=0x7f0b72f708c8, postparent=0x7f0b564e99b0) at afr-common.c:2205
> #10 0x7f0b653d6e42 in client3_3_lookup_cbk (req=,
> iov=, count=, myframe=0x7f0b7076354c)
> at client-rpc-fops.c:29

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-04 Thread Mahdi Adnan

Hi,
Kindly check the following link for all 7 bricks logs;
https://db.tt/YP5qTGXk



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Thu, 4 Aug 2016 13:00:43 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Could you also attach the brick logs please?

-Krutika

On Thu, Aug 4, 2016 at 12:48 PM, Mahdi Adnan  wrote:



appreciate your help,
(gdb) frame 2
#2  0x7f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
3716            anon_fd = fd_anonymous (local->inode_list[i]);
(gdb) p local->inode_list[0]
$4 = (inode_t *) 0x7f195c532b18
(gdb) p local->inode_list[1]
$5 = (inode_t *) 0x0
(gdb)
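
The second entry of local->inode_list is NULL here, which is consistent with the
fd_anonymous (inode=0x0) frame in the backtraces posted earlier in this thread.
When several cores need checking, a saved backtrace can be scanned for
arguments printed as 0x0. The following is only a minimal sketch: the function
name null_pointer_args and the regular expression are illustrative assumptions,
not part of gdb or Gluster, and it only understands the simple frame layout
shown in this thread.

```python
import re

# Matches lines such as:
#   "#1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804"
#   "#17 event_dispatch_epoll_worker (data=0x7f0b60152d40) at event-epoll.c:678"
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\w+) \((.*?)\)")


def null_pointer_args(backtrace):
    """Return (frame_number, function, arg_name) for each argument shown as 0x0.

    Hypothetical helper: expects one frame per line, as printed by "bt".
    """
    hits = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue
        frame_no = int(match.group(1))
        func = match.group(2)
        for arg in match.group(3).split(","):
            arg = arg.strip()
            name = arg.split("=")[0]
            # Take the text after the last "=" so that forms like
            # "this=this@entry=0x0" are also recognised.
            value = arg.rsplit("=", 1)[-1]
            if value == "0x0":
                hits.append((frame_no, func, name))
    return hits


if __name__ == "__main__":
    sample = (
        "#0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0\n"
        "#1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804\n"
        "#2  0x7f0b64ca5787 in shard_common_inode_write_do "
        "(frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716\n"
    )
    print(null_pointer_args(sample))
```

This only flags where a NULL pointer was passed; it does not explain why the
inode was never resolved.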



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Thu, 4 Aug 2016 12:43:10 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

OK.
Could you also print the values of the following variables from the original 
core:
i. i
ii. local->inode_list[0]
iii. local->inode_list[1]

-Krutika

On Wed, Aug 3, 2016 at 9:01 PM, Mahdi Adnan  wrote:



Hi,
Unfortunately no, but i can setup a test bench and see if it gets the same 
results.


-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Wed, 3 Aug 2016 20:59:50 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Do you have a test case that consistently recreates this problem?

-Krutika

On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan  wrote:



Hi,
So I have updated to 3.7.14 and I still have the same issue with NFS. Based on 
what I have provided so far from logs and dumps, do you think it's an NFS 
issue? Should I switch to nfs-ganesha?
The problem is, the current setup is used in a production environment, and 
switching the mount point of 50+ VMs from native NFS to nfs-ganesha is not 
going to be smooth or without downtime, so I would really appreciate your 
thoughts on this matter.


-- 



Respectfully

Mahdi A. Mahdi



From: mahdi.ad...@outlook.com
To: kdhan...@redhat.com
Date: Tue, 2 Aug 2016 08:44:16 +0300
CC: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash




Hi,
The NFS just crashed again, latest bt;
(gdb) bt
#0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f0b64ca5787 in shard_common_inode_write_do (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
#3  0x7f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler (frame=, this=) at shard.c:3769
#4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10, op_ret=0, op_errno=, inode=, buf=0x7f0b51407640, xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
#5  0x7f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc, cookie=, this=, op_ret=0, op_errno=0, inode=0x7f0b5f1d1f58, stbuf=0x7f0b51407640, xattr=0x7f0b72f57648, postparent=0x7f0b514076b0) at dht-common.c:2174
#6  0x7f0b651871f3 in afr_lookup_done (frame=frame@entry=0x7f0b7079a4c8, this=this@entry=0x7f0b60023ba0) at afr-common.c:1825
#7  0x7f0b65187b84 in afr_lookup_metadata_heal_check (frame=frame@entry=0x7f0b7079a4c8, this=0x7f0b60023ba0, this@entry=0xca0bd88259f5a800) at afr-common.c:2068
#8  0x7f0b6518834f in afr_lookup_entry_heal (frame=frame@entry=0x7f0b7079a4c8, this=0xca0bd88259f5a800, this@entry=0x7f0b60023ba0) at afr-common.c:2157
#9  0x7f0b6518867d in afr_lookup_cbk (frame=0x7f0b7079a4c8, cookie=, this=0x7f0b60023ba0, op_ret=, op_errno=, inode=, buf=0x7f0b564e9940, xdata=0x7f0b72f708c8, postparent=0x7f0b564e99b0) at afr-common.c:2205
#10 0x7f0b653d6e42 in client3_3_lookup_cbk (req=, iov=, count=, myframe=0x7f0b7076354c) at client-rpc-fops.c:2981
#11 0x7f0b72a00a30 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f0b603393c0, pollin=pollin@entry=0x7f0b50c1c2d0) at rpc-clnt.c:764
#12 0x7f0b72a00cef in rpc_clnt_notify (trans=, mydata=0x7f0b603393f0, event=, data=0x7f0b50c1c2d0) at rpc-clnt.c:925
#13 0x7f0b729fc7c3 in rpc_transport_notify (this=this@entry=0x7f0b60349040, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f0b50c1c2d0) at rpc-transport.c:546
#14 0x7f0b678c39a4 in socket_event_poll_in (this=this@entry=0x7f0b60349040) at socket.c:2353
#15 0x7f0b678c65e4 in socket_event_handler (fd=fd@entry=29, idx=idx@entry=17, data=0x7f0b60349040, poll_in=1, poll_out=0, poll_err=0) at socket.c:2466
#16 0x7f0b72ca0f7a in event_dispatch_epoll_handler (event=0x7f0b564e9e80, event_pool=0x7f0b7349bf20) at event-epoll.c:575
#17 event_dispatch_epoll_worker (data=0x7f0b60152d40) at event-epoll.c:678
#18 0x7f0b71a9adc5 in start_thread () from /lib64/libpthread.so.0
#19 0x7f0b713dfced in clone () f

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-04 Thread Krutika Dhananjay

Could you also attach the brick logs please?

-Krutika

On Thu, Aug 4, 2016 at 12:48 PM, Mahdi Adnan 
wrote:

> appreciate your help,
>
> (gdb) frame 2
> #2  0x7f195deb1787 in shard_common_inode_write_do
> (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
> 3716anon_fd = fd_anonymous (local->inode_list[i]);
> (gdb) p local->inode_list[0]
> $4 = (inode_t *) 0x7f195c532b18
> (gdb) p local->inode_list[1]
> $5 = (inode_t *) 0x0
> (gdb)
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Thu, 4 Aug 2016 12:43:10 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> OK.
> Could you also print the values of the following variables from the
> original core:
> i. i
> ii. local->inode_list[0]
> iii. local->inode_list[1]
>
> -Krutika
>
> On Wed, Aug 3, 2016 at 9:01 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
> Unfortunately no, but i can setup a test bench and see if it gets the same
> results.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Wed, 3 Aug 2016 20:59:50 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> Do you have a test case that consistently recreates this problem?
>
> -Krutika
>
> On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
>  So i have updated to 3.7.14 and i still have the same issue with NFS.
> based on what i have provided so far from logs and dumps do you think it's
> an NFS issue ? should i switch to nfs-ganesha ?
> the problem is, the current setup is used in a production environment, and
> switching the mount point of  +50 VMs from native nfs to nfs-ganesha is not
> going to be smooth and without downtime, so i really appreciate your
> thoughts on this matter.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: mahdi.ad...@outlook.com
> To: kdhan...@redhat.com
> Date: Tue, 2 Aug 2016 08:44:16 +0300
>
> CC: gluster-users@gluster.org
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
>
> Hi,
>
> The NFS just crashed again, latest bt;
>
> (gdb) bt
> #0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
> #1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
> #2  0x7f0b64ca5787 in shard_common_inode_write_do
> (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
> #3  0x7f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler
> (frame=, this=) at shard.c:3769
> #4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk
> (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10,
> op_ret=0,
> op_errno=, inode=, buf=0x7f0b51407640,
> xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
> #5  0x7f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc,
> cookie=, this=, op_ret=0, op_errno=0,
> inode=0x7f0b5f1d1f58,
> stbuf=0x7f0b51407640, xattr=0x7f0b72f57648, postparent=0x7f0b514076b0)
> at dht-common.c:2174
> #6  0x7f0b651871f3 in afr_lookup_done (frame=frame@entry=0x7f0b7079a4c8,
> this=this@entry=0x7f0b60023ba0) at afr-common.c:1825
> #7  0x7f0b65187b84 in afr_lookup_metadata_heal_check (frame=frame@entry
> =0x7f0b7079a4c8, this=0x7f0b60023ba0, this@entry=0xca0bd88259f5a800)
> at afr-common.c:2068
> #8  0x7f0b6518834f in afr_lookup_entry_heal (frame=frame@entry
> =0x7f0b7079a4c8, this=0xca0bd88259f5a800, this@entry=0x7f0b60023ba0) at
> afr-common.c:2157
> #9  0x7f0b6518867d in afr_lookup_cbk (frame=0x7f0b7079a4c8,
> cookie=, this=0x7f0b60023ba0, op_ret=,
> op_errno=, inode=, buf=0x7f0b564e9940,
> xdata=0x7f0b72f708c8, postparent=0x7f0b564e99b0) at afr-common.c:2205
> #10 0x7f0b653d6e42 in client3_3_lookup_cbk (req=,
> iov=, count=, myframe=0x7f0b7076354c)
> at client-rpc-fops.c:2981
> #11 0x7f0b72a00a30 in rpc_clnt_handle_reply (clnt=clnt@entry
> =0x7f0b603393c0, pollin=pollin@entry=0x7f0b50c1c2d0) at rpc-clnt.c:764
> #12 0x7f0b72a00cef in rpc_clnt_notify (trans=,
> mydata=0x7f0b603393f0, event=, data=0x7f0b50c1c2d0) at
> rpc-clnt.c:925
> #13 0x7f0b729fc7c3 in rpc_transport_notify (this=this@entry
> =0x7f0b60349040, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED,
> data=data@entry=0x7f0b50c1c2d0)
> at rpc-transport.c:546
> #14 0x7f0b678c39a4 in socket_event_poll_in (this=this@entry
> =0x7f0b60349040) at socket.c:2353
> #15 0x7f0b678c65e4 in socket_event_handler

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-04 Thread Mahdi Adnan

appreciate your help,
(gdb) frame 2
#2  0x7f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
3716            anon_fd = fd_anonymous (local->inode_list[i]);
(gdb) p local->inode_list[0]
$4 = (inode_t *) 0x7f195c532b18
(gdb) p local->inode_list[1]
$5 = (inode_t *) 0x0
(gdb)



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Thu, 4 Aug 2016 12:43:10 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

OK.
Could you also print the values of the following variables from the original 
core:
i. i
ii. local->inode_list[0]
iii. local->inode_list[1]

-Krutika

On Wed, Aug 3, 2016 at 9:01 PM, Mahdi Adnan  wrote:



Hi,
Unfortunately no, but i can setup a test bench and see if it gets the same 
results.


-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Wed, 3 Aug 2016 20:59:50 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Do you have a test case that consistently recreates this problem?

-Krutika

On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan  wrote:



Hi,
So I have updated to 3.7.14 and I still have the same issue with NFS. Based on 
what I have provided so far from logs and dumps, do you think it's an NFS 
issue? Should I switch to nfs-ganesha?
The problem is, the current setup is used in a production environment, and 
switching the mount point of 50+ VMs from native NFS to nfs-ganesha is not 
going to be smooth or without downtime, so I would really appreciate your 
thoughts on this matter.


-- 



Respectfully

Mahdi A. Mahdi



From: mahdi.ad...@outlook.com
To: kdhan...@redhat.com
Date: Tue, 2 Aug 2016 08:44:16 +0300
CC: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash




Hi,
The NFS just crashed again, latest bt;
(gdb) bt
#0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f0b64ca5787 in shard_common_inode_write_do (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
#3  0x7f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler (frame=, this=) at shard.c:3769
#4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10, op_ret=0, op_errno=, inode=, buf=0x7f0b51407640, xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
#5  0x7f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc, cookie=, this=, op_ret=0, op_errno=0, inode=0x7f0b5f1d1f58, stbuf=0x7f0b51407640, xattr=0x7f0b72f57648, postparent=0x7f0b514076b0) at dht-common.c:2174
#6  0x7f0b651871f3 in afr_lookup_done (frame=frame@entry=0x7f0b7079a4c8, this=this@entry=0x7f0b60023ba0) at afr-common.c:1825
#7  0x7f0b65187b84 in afr_lookup_metadata_heal_check (frame=frame@entry=0x7f0b7079a4c8, this=0x7f0b60023ba0, this@entry=0xca0bd88259f5a800) at afr-common.c:2068
#8  0x7f0b6518834f in afr_lookup_entry_heal (frame=frame@entry=0x7f0b7079a4c8, this=0xca0bd88259f5a800, this@entry=0x7f0b60023ba0) at afr-common.c:2157
#9  0x7f0b6518867d in afr_lookup_cbk (frame=0x7f0b7079a4c8, cookie=, this=0x7f0b60023ba0, op_ret=, op_errno=, inode=, buf=0x7f0b564e9940, xdata=0x7f0b72f708c8, postparent=0x7f0b564e99b0) at afr-common.c:2205
#10 0x7f0b653d6e42 in client3_3_lookup_cbk (req=, iov=, count=, myframe=0x7f0b7076354c) at client-rpc-fops.c:2981
#11 0x7f0b72a00a30 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f0b603393c0, pollin=pollin@entry=0x7f0b50c1c2d0) at rpc-clnt.c:764
#12 0x7f0b72a00cef in rpc_clnt_notify (trans=, mydata=0x7f0b603393f0, event=, data=0x7f0b50c1c2d0) at rpc-clnt.c:925
#13 0x7f0b729fc7c3 in rpc_transport_notify (this=this@entry=0x7f0b60349040, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f0b50c1c2d0) at rpc-transport.c:546
#14 0x7f0b678c39a4 in socket_event_poll_in (this=this@entry=0x7f0b60349040) at socket.c:2353
#15 0x7f0b678c65e4 in socket_event_handler (fd=fd@entry=29, idx=idx@entry=17, data=0x7f0b60349040, poll_in=1, poll_out=0, poll_err=0) at socket.c:2466
#16 0x7f0b72ca0f7a in event_dispatch_epoll_handler (event=0x7f0b564e9e80, event_pool=0x7f0b7349bf20) at event-epoll.c:575
#17 event_dispatch_epoll_worker (data=0x7f0b60152d40) at event-epoll.c:678
#18 0x7f0b71a9adc5 in start_thread () from /lib64/libpthread.so.0
#19 0x7f0b713dfced in clone () from /lib64/libc.so.6


-- 



Respectfully

Mahdi A. Mahdi

From: mahdi.ad...@outlook.com
To: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 16:31:50 +0300
CC: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash




Many thanks,
here's the results;

(gdb) p cur_block
$15 = 4088
(gdb) p last_block
$16 = 4088
(gdb) p local->first_block
$17 = 4087
(gdb) p odirect
$18 = _gf_false
(gdb) p fd

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-04 Thread Krutika Dhananjay

OK.
Could you also print the values of the following variables from the
original core:
i. i
ii. local->inode_list[0]
iii. local->inode_list[1]

-Krutika

On Wed, Aug 3, 2016 at 9:01 PM, Mahdi Adnan  wrote:

> Hi,
>
> Unfortunately no, but i can setup a test bench and see if it gets the same
> results.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Wed, 3 Aug 2016 20:59:50 +0530
>
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> Do you have a test case that consistently recreates this problem?
>
> -Krutika
>
> On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
>  So i have updated to 3.7.14 and i still have the same issue with NFS.
> based on what i have provided so far from logs and dumps do you think it's
> an NFS issue ? should i switch to nfs-ganesha ?
> the problem is, the current setup is used in a production environment, and
> switching the mount point of  +50 VMs from native nfs to nfs-ganesha is not
> going to be smooth and without downtime, so i really appreciate your
> thoughts on this matter.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> ------
> From: mahdi.ad...@outlook.com
> To: kdhan...@redhat.com
> Date: Tue, 2 Aug 2016 08:44:16 +0300
>
> CC: gluster-users@gluster.org
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
>
> Hi,
>
> The NFS just crashed again, latest bt;
>
> (gdb) bt
> #0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
> #1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
> #2  0x7f0b64ca5787 in shard_common_inode_write_do
> (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
> #3  0x7f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler
> (frame=, this=) at shard.c:3769
> #4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk
> (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10,
> op_ret=0,
> op_errno=, inode=, buf=0x7f0b51407640,
> xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
> #5  0x7f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc,
> cookie=, this=, op_ret=0, op_errno=0,
> inode=0x7f0b5f1d1f58,
> stbuf=0x7f0b51407640, xattr=0x7f0b72f57648, postparent=0x7f0b514076b0)
> at dht-common.c:2174
> #6  0x7f0b651871f3 in afr_lookup_done (frame=frame@entry=0x7f0b7079a4c8,
> this=this@entry=0x7f0b60023ba0) at afr-common.c:1825
> #7  0x7f0b65187b84 in afr_lookup_metadata_heal_check (frame=frame@entry
> =0x7f0b7079a4c8, this=0x7f0b60023ba0, this@entry=0xca0bd88259f5a800)
> at afr-common.c:2068
> #8  0x7f0b6518834f in afr_lookup_entry_heal 
> (frame=frame@entry=0x7f0b7079a4c8,
> this=0xca0bd88259f5a800, this@entry=0x7f0b60023ba0) at afr-common.c:2157
> #9  0x7f0b6518867d in afr_lookup_cbk (frame=0x7f0b7079a4c8,
> cookie=, this=0x7f0b60023ba0, op_ret=,
> op_errno=, inode=, buf=0x7f0b564e9940,
> xdata=0x7f0b72f708c8, postparent=0x7f0b564e99b0) at afr-common.c:2205
> #10 0x7f0b653d6e42 in client3_3_lookup_cbk (req=,
> iov=, count=, myframe=0x7f0b7076354c)
> at client-rpc-fops.c:2981
> #11 0x7f0b72a00a30 in rpc_clnt_handle_reply 
> (clnt=clnt@entry=0x7f0b603393c0,
> pollin=pollin@entry=0x7f0b50c1c2d0) at rpc-clnt.c:764
> #12 0x7f0b72a00cef in rpc_clnt_notify (trans=,
> mydata=0x7f0b603393f0, event=, data=0x7f0b50c1c2d0) at
> rpc-clnt.c:925
> #13 0x7f0b729fc7c3 in rpc_transport_notify 
> (this=this@entry=0x7f0b60349040,
> event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=
> 0x7f0b50c1c2d0)
> at rpc-transport.c:546
> #14 0x7f0b678c39a4 in socket_event_poll_in 
> (this=this@entry=0x7f0b60349040)
> at socket.c:2353
> #15 0x7f0b678c65e4 in socket_event_handler (fd=fd@entry=29,
> idx=idx@entry=17, data=0x7f0b60349040, poll_in=1, poll_out=0, poll_err=0)
> at socket.c:2466
> #16 0x7f0b72ca0f7a in event_dispatch_epoll_handler
> (event=0x7f0b564e9e80, event_pool=0x7f0b7349bf20) at event-epoll.c:575
> #17 event_dispatch_epoll_worker (data=0x7f0b60152d40) at event-epoll.c:678
> #18 0x7f0b71a9adc5 in start_thread () from /lib64/libpthread.so.0
> #19 0x7f0b713dfced in clone () from /lib64/libc.so.6
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
> --
> From: mahdi.ad...@outlook.com
> To: kdhan...@redhat.com
> Date: Mon, 1 Aug 2016 16:31:50 +0300
> CC: gluster-users@gluster.org
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
>
> Many thanks,
>
> here's the results;
>
>
> (gdb) p cur_block
> $15 = 4088

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-03 Thread Mahdi Adnan

Yeah, 5 MB, because the VMs are serving monitoring software which doesn't do 
much, but I can easily hit 250+ MB of write speed in a benchmark.



-- 



Respectfully

Mahdi A. Mahdi



> From: gandalf.corvotempe...@gmail.com
> Date: Wed, 3 Aug 2016 22:44:16 +0200
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
> 
> 2016-08-03 22:33 GMT+02:00 Mahdi Adnan :
> > Yeah, only 3 for now running in 3 replica.
> > around 5MB (900 IOps) write and 3MB (250 IOps) read and the disks are 900GB
> > 10K SAS.
> 
> 5MB => five megabytes/s ?
> Less than an older and ancient 4x DVD reader ? Really ? Are you sure?
> 50VMs with five megabytes/s of reading speed?
> 
> One SAS 10k disk should be able to reach at least 100MB/s
> 
> Currently in my test cluster with 3 servers, replica 3, 1 brick per
> server, all 7200 SATA disks, 1GB bonded network, i'm able to write at
> about 50MB/s, ten times faster than you.
  ___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-03 Thread Gandalf Corvotempesta

2016-08-03 22:33 GMT+02:00 Mahdi Adnan :
> Yeah, only 3 for now running in 3 replica.
> around 5MB (900 IOps) write and 3MB (250 IOps) read and the disks are 900GB
> 10K SAS.

5MB => five megabytes/s ?
Less than an older and ancient 4x DVD reader ? Really ? Are you sure?
50VMs with five megabytes/s of reading speed?

One SAS 10k disk should be able to reach at least 100MB/s

Currently in my test cluster with 3 servers, replica 3, 1 brick per
server, all 7200 SATA disks, 1GB bonded network, i'm able to write at
about 50MB/s, ten times faster than you.
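
For context, the figures quoted above (5MB at 900 IOps for writes, 3MB at 250
IOps for reads) imply small requests rather than a slow disk. A rough sketch of
that arithmetic follows. It assumes the figures are per-second rates and that
1 MB = 1,000,000 bytes; neither is stated in the thread, and the function name
avg_request_bytes is made up for illustration.

```python
def avg_request_bytes(mb_per_s, iops, bytes_per_mb=1_000_000):
    """Average request size implied by a throughput and an IOPS figure.

    Assumption: both figures describe the same one-second interval.
    """
    return mb_per_s * bytes_per_mb / iops


if __name__ == "__main__":
    write_size = avg_request_bytes(5, 900)   # roughly 5.6 KB per write
    read_size = avg_request_bytes(3, 250)    # 12 KB per read
    print(f"average write: {write_size:.0f} bytes")
    print(f"average read:  {read_size:.0f} bytes")
```

With requests this small, throughput is bounded by how many operations the
disks and network can complete, so a low MB/s figure is not directly
comparable to a sequential benchmark result.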

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-03 Thread Mahdi Adnan

Yeah, only 3 for now, running in replica 3. Around 5MB (900 IOps) write and 3MB 
(250 IOps) read, and the disks are 900GB 10K SAS.



-- 



Respectfully

Mahdi A. Mahdi



> From: gandalf.corvotempe...@gmail.com
> Date: Wed, 3 Aug 2016 22:09:59 +0200
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
> 
> 2016-08-03 21:40 GMT+02:00 Mahdi Adnan :
> > Hi,
> >
> > Currently, we have three UCS C220 M4, dual Xeon CPU (48 cores), 32GB of RAM,
> > 8x900GB spindles, with Intel X520 dual 10G ports. We are planning to migrate
> > more VMs and increase the number of servers in the cluster as soon as we
> > figure what's going on with the NFS mount.
> 
> Only 3 servers? How many IOPS are you getting and how much bandwidth
> when reading/writing?
> 900GB SAS 15k?

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-03 Thread Gandalf Corvotempesta

2016-08-03 21:40 GMT+02:00 Mahdi Adnan :
> Hi,
>
> Currently, we have three UCS C220 M4, dual Xeon CPU (48 cores), 32GB of RAM,
> 8x900GB spindles, with Intel X520 dual 10G ports. We are planning to migrate
> more VMs and increase the number of servers in the cluster as soon as we
> figure what's going on with the NFS mount.

Only 3 servers? How many IOPS are you getting and how much bandwidth
when reading/writing?
900GB SAS 15k?

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-03 Thread Serkan Çoban

I had stability problems with CentOS 7.2 and Gluster 3.7.11. Nodes
were crashing without any clue. I could not find a solution and switched
to CentOS 6.8. All the problems were gone with 6.8. Maybe you can test with
CentOS 6.8?

On Wed, Aug 3, 2016 at 10:40 PM, Mahdi Adnan  wrote:
> Hi,
>
> Currently, we have three UCS C220 M4, dual Xeon CPU (48 cores), 32GB of RAM,
> 8x900GB spindles, with Intel X520 dual 10G ports. We are planning to migrate
> more VMs and increase the number of servers in the cluster as soon as we
> figure what's going on with the NFS mount.
>
>
> --
>
> Respectfully
> Mahdi A. Mahdi
>
>> From: gandalf.corvotempe...@gmail.com
>> Date: Wed, 3 Aug 2016 20:25:56 +0200
>> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
>> To: mahdi.ad...@outlook.com
>> CC: kdhan...@redhat.com; gluster-users@gluster.org
>>
>> 2016-08-03 17:02 GMT+02:00 Mahdi Adnan :
>> > the problem is, the current setup is used in a production environment,
>> > and
>> > switching the mount point of +50 VMs from native nfs to nfs-ganesha is
>> > not
>> > going to be smooth and without downtime, so i really appreciate your
>> > thoughts on this matter.
>>
>> A little bit OT:
>>
>> 50+ VMs? Could you please share your hardware and network infrastructure?
>> We are thinking about a gluster cluster for hosting about 80-100 VMs and
>> we are
>> looking for some production clusters to use as reference.
>

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-03 Thread Mahdi Adnan

Hi,
Currently, we have three UCS C220 M4, dual Xeon CPU (48 cores), 32GB of RAM, 
8x900GB spindles, with Intel X520 dual 10G ports. We are planning to migrate 
more VMs and increase the number of servers in the cluster as soon as we figure 
out what's going on with the NFS mount.



-- 



Respectfully

Mahdi A. Mahdi

> From: gandalf.corvotempe...@gmail.com
> Date: Wed, 3 Aug 2016 20:25:56 +0200
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: kdhan...@redhat.com; gluster-users@gluster.org
> 
> 2016-08-03 17:02 GMT+02:00 Mahdi Adnan :
> > the problem is, the current setup is used in a production environment, and
> > switching the mount point of  +50 VMs from native nfs to nfs-ganesha is not
> > going to be smooth and without downtime, so i really appreciate your
> > thoughts on this matter.
> 
> A little bit OT:
> 
> 50+ VMs? Could you please share your hardware and network infrastructure?
> We are thinking about a gluster cluster for hosting about 80-100 VMs and we 
> are
> looking for some production clusters to use as reference.

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-03 Thread Gandalf Corvotempesta

2016-08-03 17:02 GMT+02:00 Mahdi Adnan :
> the problem is, the current setup is used in a production environment, and
> switching the mount point of  +50 VMs from native nfs to nfs-ganesha is not
> going to be smooth and without downtime, so i really appreciate your
> thoughts on this matter.

A little bit OT:

50+ VMs? Could you please share your hardware and network infrastructure?
We are thinking about a gluster cluster for hosting about 80-100 VMs and we are
looking for some production clusters to use as reference.

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-03 Thread Mahdi Adnan

Hi,
Unfortunately no, but i can setup a test bench and see if it gets the same 
results.


-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Wed, 3 Aug 2016 20:59:50 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Do you have a test case that consistently recreates this problem?

-Krutika

On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan  wrote:



Hi,
So I have updated to 3.7.14 and I still have the same issue with NFS. Based on 
what I have provided so far from logs and dumps, do you think it's an NFS 
issue? Should I switch to nfs-ganesha?
The problem is, the current setup is used in a production environment, and 
switching the mount point of 50+ VMs from native NFS to nfs-ganesha is not 
going to be smooth or without downtime, so I would really appreciate your 
thoughts on this matter.


-- 



Respectfully

Mahdi A. Mahdi



From: mahdi.ad...@outlook.com
To: kdhan...@redhat.com
Date: Tue, 2 Aug 2016 08:44:16 +0300
CC: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash




Hi,
The NFS just crashed again, latest bt;
(gdb) bt
#0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f0b64ca5787 in shard_common_inode_write_do (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
#3  0x7f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler (frame=, this=) at shard.c:3769
#4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10, op_ret=0, op_errno=, inode=, buf=0x7f0b51407640, xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
#5  0x7f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc, cookie=, this=, op_ret=0, op_errno=0, inode=0x7f0b5f1d1f58, stbuf=0x7f0b51407640, xattr=0x7f0b72f57648, postparent=0x7f0b514076b0) at dht-common.c:2174
#6  0x7f0b651871f3 in afr_lookup_done (frame=frame@entry=0x7f0b7079a4c8, this=this@entry=0x7f0b60023ba0) at afr-common.c:1825
#7  0x7f0b65187b84 in afr_lookup_metadata_heal_check (frame=frame@entry=0x7f0b7079a4c8, this=0x7f0b60023ba0, this@entry=0xca0bd88259f5a800) at afr-common.c:2068
#8  0x7f0b6518834f in afr_lookup_entry_heal (frame=frame@entry=0x7f0b7079a4c8, this=0xca0bd88259f5a800, this@entry=0x7f0b60023ba0) at afr-common.c:2157
#9  0x7f0b6518867d in afr_lookup_cbk (frame=0x7f0b7079a4c8, cookie=, this=0x7f0b60023ba0, op_ret=, op_errno=, inode=, buf=0x7f0b564e9940, xdata=0x7f0b72f708c8, postparent=0x7f0b564e99b0) at afr-common.c:2205
#10 0x7f0b653d6e42 in client3_3_lookup_cbk (req=, iov=, count=, myframe=0x7f0b7076354c) at client-rpc-fops.c:2981
#11 0x7f0b72a00a30 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f0b603393c0, pollin=pollin@entry=0x7f0b50c1c2d0) at rpc-clnt.c:764
#12 0x7f0b72a00cef in rpc_clnt_notify (trans=, mydata=0x7f0b603393f0, event=, data=0x7f0b50c1c2d0) at rpc-clnt.c:925
#13 0x7f0b729fc7c3 in rpc_transport_notify (this=this@entry=0x7f0b60349040, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f0b50c1c2d0) at rpc-transport.c:546
#14 0x7f0b678c39a4 in socket_event_poll_in (this=this@entry=0x7f0b60349040) at socket.c:2353
#15 0x7f0b678c65e4 in socket_event_handler (fd=fd@entry=29, idx=idx@entry=17, data=0x7f0b60349040, poll_in=1, poll_out=0, poll_err=0) at socket.c:2466
#16 0x7f0b72ca0f7a in event_dispatch_epoll_handler (event=0x7f0b564e9e80, event_pool=0x7f0b7349bf20) at event-epoll.c:575
#17 event_dispatch_epoll_worker (data=0x7f0b60152d40) at event-epoll.c:678
#18 0x7f0b71a9adc5 in start_thread () from /lib64/libpthread.so.0
#19 0x7f0b713dfced in clone () from /lib64/libc.so.6


-- 



Respectfully

Mahdi A. Mahdi

From: mahdi.ad...@outlook.com
To: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 16:31:50 +0300
CC: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash




Many thanks,
here's the results;

(gdb) p cur_block
$15 = 4088
(gdb) p last_block
$16 = 4088
(gdb) p local->first_block
$17 = 4087
(gdb) p odirect
$18 = _gf_false
(gdb) p fd->flags
$19 = 2
(gdb) p local->call_count
$20 = 2

If you need more core dumps, i have several files i can upload.

-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 18:39:27 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Sorry I didn't make myself clear. The reason I asked YOU to do it is because I 
tried it on my system and I'm not getting the backtrace (it's all question 
marks).

Attach the core to gdb.
At the gdb prompt, go to frame 2 by typing
(gdb) f 2

There, for each of the variables i asked you to get the values of, type p 
followed by the variable name.
For instance, to get the value of the variable 'odirect', do this:

(gdb) p odirect
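
The manual steps above can also be run non-interactively, which may help when
several core files have to be checked. The sketch below only builds the gdb
command line; it relies on gdb's -batch and -ex options, while the helper name
gdb_batch_command and the two paths are placeholders that are not taken from
this thread.

```python
import shlex


def gdb_batch_command(binary, core, frame, expressions):
    """Build an argv list that selects one frame and prints each expression.

    Hypothetical helper: "binary" and "core" are paths supplied by the caller.
    """
    argv = ["gdb", "-batch", "-ex", f"frame {frame}"]
    for expr in expressions:
        argv += ["-ex", f"p {expr}"]
    # gdb expects the executable first, then the core file.
    argv += [binary, core]
    return argv


if __name__ == "__main__":
    cmd = gdb_batch_command(
        "/path/to/glusterfs",   # placeholder: binary that produced the core
        "/path/to/core",        # placeholder: core file
        2,
        ["odirect", "cur_block", "last_block"],
    )
    # Print a copy-pasteable shell command; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```

The matching debuginfo packages still need to be installed on the machine
running gdb, otherwise the frames show up as question marks as described above.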

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-03 Thread Krutika Dhananjay

Do you have a test case that consistently recreates this problem?

-Krutika

On Wed, Aug 3, 2016 at 8:32 PM, Mahdi Adnan  wrote:

> Hi,
>
>  So i have updated to 3.7.14 and i still have the same issue with NFS.
> based on what i have provided so far from logs and dumps do you think it's
> an NFS issue ? should i switch to nfs-ganesha ?
> the problem is, the current setup is used in a production environment, and
> switching the mount point of  +50 VMs from native nfs to nfs-ganesha is not
> going to be smooth and without downtime, so i really appreciate your
> thoughts on this matter.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: mahdi.ad...@outlook.com
> To: kdhan...@redhat.com
> Date: Tue, 2 Aug 2016 08:44:16 +0300
>
> CC: gluster-users@gluster.org
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
>
> Hi,
>
> The NFS just crashed again, latest bt;
>
> (gdb) bt
> #0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
> #1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
> #2  0x7f0b64ca5787 in shard_common_inode_write_do
> (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
> #3  0x7f0b64ca5a53 in
> shard_common_inode_write_post_lookup_shards_handler (frame=,
> this=) at shard.c:3769
> #4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk
> (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10,
> op_ret=0,
> op_errno=, inode=, buf=0x7f0b51407640,
> xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
> #5  0x7f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc,
> cookie=, this=, op_ret=0, op_errno=0,
> inode=0x7f0b5f1d1f58,
> stbuf=0x7f0b51407640, xattr=0x7f0b72f57648, postparent=0x7f0b514076b0)
> at dht-common.c:2174
> #6  0x7f0b651871f3 in afr_lookup_done (frame=frame@entry=0x7f0b7079a4c8,
> this=this@entry=0x7f0b60023ba0) at afr-common.c:1825
> #7  0x7f0b65187b84 in afr_lookup_metadata_heal_check 
> (frame=frame@entry=0x7f0b7079a4c8,
> this=0x7f0b60023ba0, this@entry=0xca0bd88259f5a800)
> at afr-common.c:2068
> #8  0x7f0b6518834f in afr_lookup_entry_heal 
> (frame=frame@entry=0x7f0b7079a4c8,
> this=0xca0bd88259f5a800, this@entry=0x7f0b60023ba0) at afr-common.c:2157
> #9  0x7f0b6518867d in afr_lookup_cbk (frame=0x7f0b7079a4c8,
> cookie=, this=0x7f0b60023ba0, op_ret=,
> op_errno=, inode=, buf=0x7f0b564e9940,
> xdata=0x7f0b72f708c8, postparent=0x7f0b564e99b0) at afr-common.c:2205
> #10 0x7f0b653d6e42 in client3_3_lookup_cbk (req=,
> iov=, count=, myframe=0x7f0b7076354c)
> at client-rpc-fops.c:2981
> #11 0x7f0b72a00a30 in rpc_clnt_handle_reply 
> (clnt=clnt@entry=0x7f0b603393c0,
> pollin=pollin@entry=0x7f0b50c1c2d0) at rpc-clnt.c:764
> #12 0x7f0b72a00cef in rpc_clnt_notify (trans=,
> mydata=0x7f0b603393f0, event=, data=0x7f0b50c1c2d0) at
> rpc-clnt.c:925
> #13 0x7f0b729fc7c3 in rpc_transport_notify 
> (this=this@entry=0x7f0b60349040,
> event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry
> =0x7f0b50c1c2d0)
> at rpc-transport.c:546
> #14 0x7f0b678c39a4 in socket_event_poll_in 
> (this=this@entry=0x7f0b60349040)
> at socket.c:2353
> #15 0x7f0b678c65e4 in socket_event_handler (fd=fd@entry=29,
> idx=idx@entry=17, data=0x7f0b60349040, poll_in=1, poll_out=0, poll_err=0)
> at socket.c:2466
> #16 0x7f0b72ca0f7a in event_dispatch_epoll_handler
> (event=0x7f0b564e9e80, event_pool=0x7f0b7349bf20) at event-epoll.c:575
> #17 event_dispatch_epoll_worker (data=0x7f0b60152d40) at event-epoll.c:678
> #18 0x7f0b71a9adc5 in start_thread () from /lib64/libpthread.so.0
> #19 0x7f0b713dfced in clone () from /lib64/libc.so.6
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
> --
> From: mahdi.ad...@outlook.com
> To: kdhan...@redhat.com
> Date: Mon, 1 Aug 2016 16:31:50 +0300
> CC: gluster-users@gluster.org
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
>
> Many thanks,
>
> here's the results;
>
>
> (gdb) p cur_block
> $15 = 4088
> (gdb) p last_block
> $16 = 4088
> (gdb) p local->first_block
> $17 = 4087
> (gdb) p odirect
> $18 = _gf_false
> (gdb) p fd->flags
> $19 = 2
> (gdb) p local->call_count
> $20 = 2
>
>
> If you need more core dumps, i have several files i can upload.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Mon, 1 Aug 2016 18:39:27 +0530
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
> Sorry I didn't make myself  clear. The

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-03 Thread Mahdi Adnan

Hi,
So I have updated to 3.7.14 and I still have the same issue with NFS. Based on 
what I have provided so far from logs and dumps, do you think it's an NFS issue? 
Should I switch to nfs-ganesha?
The problem is that the current setup is used in a production environment, and 
switching the mount point of 50+ VMs from native NFS to nfs-ganesha is not 
going to be smooth or without downtime, so I would really appreciate your thoughts 
on this matter.


-- 



Respectfully

Mahdi A. Mahdi



From: mahdi.ad...@outlook.com
To: kdhan...@redhat.com
Date: Tue, 2 Aug 2016 08:44:16 +0300
CC: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash




Hi,
The NFS just crashed again, latest bt;
(gdb) bt
#0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f0b64ca5787 in shard_common_inode_write_do (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
#3  0x7f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler (frame=, this=) at shard.c:3769
#4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10, op_ret=0, op_errno=, inode=, buf=0x7f0b51407640, xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
#5  0x7f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc, cookie=, this=, op_ret=0, op_errno=0, inode=0x7f0b5f1d1f58, stbuf=0x7f0b51407640, xattr=0x7f0b72f57648, postparent=0x7f0b514076b0) at dht-common.c:2174
#6  0x7f0b651871f3 in afr_lookup_done (frame=frame@entry=0x7f0b7079a4c8, this=this@entry=0x7f0b60023ba0) at afr-common.c:1825
#7  0x7f0b65187b84 in afr_lookup_metadata_heal_check (frame=frame@entry=0x7f0b7079a4c8, this=0x7f0b60023ba0, this@entry=0xca0bd88259f5a800) at afr-common.c:2068
#8  0x7f0b6518834f in afr_lookup_entry_heal (frame=frame@entry=0x7f0b7079a4c8, this=0xca0bd88259f5a800, this@entry=0x7f0b60023ba0) at afr-common.c:2157
#9  0x7f0b6518867d in afr_lookup_cbk (frame=0x7f0b7079a4c8, cookie=, this=0x7f0b60023ba0, op_ret=, op_errno=, inode=, buf=0x7f0b564e9940, xdata=0x7f0b72f708c8, postparent=0x7f0b564e99b0) at afr-common.c:2205
#10 0x7f0b653d6e42 in client3_3_lookup_cbk (req=, iov=, count=, myframe=0x7f0b7076354c) at client-rpc-fops.c:2981
#11 0x7f0b72a00a30 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f0b603393c0, pollin=pollin@entry=0x7f0b50c1c2d0) at rpc-clnt.c:764
#12 0x7f0b72a00cef in rpc_clnt_notify (trans=, mydata=0x7f0b603393f0, event=, data=0x7f0b50c1c2d0) at rpc-clnt.c:925
#13 0x7f0b729fc7c3 in rpc_transport_notify (this=this@entry=0x7f0b60349040, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f0b50c1c2d0) at rpc-transport.c:546
#14 0x7f0b678c39a4 in socket_event_poll_in (this=this@entry=0x7f0b60349040) at socket.c:2353
#15 0x7f0b678c65e4 in socket_event_handler (fd=fd@entry=29, idx=idx@entry=17, data=0x7f0b60349040, poll_in=1, poll_out=0, poll_err=0) at socket.c:2466
#16 0x7f0b72ca0f7a in event_dispatch_epoll_handler (event=0x7f0b564e9e80, event_pool=0x7f0b7349bf20) at event-epoll.c:575
#17 event_dispatch_epoll_worker (data=0x7f0b60152d40) at event-epoll.c:678
#18 0x7f0b71a9adc5 in start_thread () from /lib64/libpthread.so.0
#19 0x7f0b713dfced in clone () from /lib64/libc.so.6


-- 



Respectfully

Mahdi A. Mahdi

From: mahdi.ad...@outlook.com
To: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 16:31:50 +0300
CC: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash




Many thanks,
here's the results;

(gdb) p cur_block
$15 = 4088
(gdb) p last_block
$16 = 4088
(gdb) p local->first_block
$17 = 4087
(gdb) p odirect
$18 = _gf_false
(gdb) p fd->flags
$19 = 2
(gdb) p local->call_count
$20 = 2

If you need more core dumps, i have several files i can upload.

-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 18:39:27 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Sorry I didn't make myself  clear. The reason I asked YOU to do it is because i 
tried it on my system and im not getting the backtrace (it's all question 
marks).

Attach the core to gdb.
At the gdb prompt, go to frame 2 by typing
(gdb) f 2

There, for each of the variables i asked you to get the values of, type p 
followed by the variable name.
For instance, to get the value of the variable 'odirect', do this:

(gdb) p odirect

and gdb will print its value for you in response.

-Krutika

On Mon, Aug 1, 2016 at 4:55 PM, Mahdi Adnan  wrote:



Hi,
How to get the results of the below variables ? i cant get the results from gdb.



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 15:51:38 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Could you also prin

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-01 Thread Mahdi Adnan

Hi,
The NFS just crashed again, latest bt;
(gdb) bt
#0  0x7f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f0b64ca5787 in shard_common_inode_write_do (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
#3  0x7f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler (frame=, this=) at shard.c:3769
#4  0x7f0b64c9eff5 in shard_common_lookup_shards_cbk (frame=0x7f0b707c062c, cookie=, this=0x7f0b6002ac10, op_ret=0, op_errno=, inode=, buf=0x7f0b51407640, xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
#5  0x7f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc, cookie=, this=, op_ret=0, op_errno=0, inode=0x7f0b5f1d1f58, stbuf=0x7f0b51407640, xattr=0x7f0b72f57648, postparent=0x7f0b514076b0) at dht-common.c:2174
#6  0x7f0b651871f3 in afr_lookup_done (frame=frame@entry=0x7f0b7079a4c8, this=this@entry=0x7f0b60023ba0) at afr-common.c:1825
#7  0x7f0b65187b84 in afr_lookup_metadata_heal_check (frame=frame@entry=0x7f0b7079a4c8, this=0x7f0b60023ba0, this@entry=0xca0bd88259f5a800) at afr-common.c:2068
#8  0x7f0b6518834f in afr_lookup_entry_heal (frame=frame@entry=0x7f0b7079a4c8, this=0xca0bd88259f5a800, this@entry=0x7f0b60023ba0) at afr-common.c:2157
#9  0x7f0b6518867d in afr_lookup_cbk (frame=0x7f0b7079a4c8, cookie=, this=0x7f0b60023ba0, op_ret=, op_errno=, inode=, buf=0x7f0b564e9940, xdata=0x7f0b72f708c8, postparent=0x7f0b564e99b0) at afr-common.c:2205
#10 0x7f0b653d6e42 in client3_3_lookup_cbk (req=, iov=, count=, myframe=0x7f0b7076354c) at client-rpc-fops.c:2981
#11 0x7f0b72a00a30 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f0b603393c0, pollin=pollin@entry=0x7f0b50c1c2d0) at rpc-clnt.c:764
#12 0x7f0b72a00cef in rpc_clnt_notify (trans=, mydata=0x7f0b603393f0, event=, data=0x7f0b50c1c2d0) at rpc-clnt.c:925
#13 0x7f0b729fc7c3 in rpc_transport_notify (this=this@entry=0x7f0b60349040, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f0b50c1c2d0) at rpc-transport.c:546
#14 0x7f0b678c39a4 in socket_event_poll_in (this=this@entry=0x7f0b60349040) at socket.c:2353
#15 0x7f0b678c65e4 in socket_event_handler (fd=fd@entry=29, idx=idx@entry=17, data=0x7f0b60349040, poll_in=1, poll_out=0, poll_err=0) at socket.c:2466
#16 0x7f0b72ca0f7a in event_dispatch_epoll_handler (event=0x7f0b564e9e80, event_pool=0x7f0b7349bf20) at event-epoll.c:575
#17 event_dispatch_epoll_worker (data=0x7f0b60152d40) at event-epoll.c:678
#18 0x7f0b71a9adc5 in start_thread () from /lib64/libpthread.so.0
#19 0x7f0b713dfced in clone () from /lib64/libc.so.6


-- 



Respectfully

Mahdi A. Mahdi

From: mahdi.ad...@outlook.com
To: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 16:31:50 +0300
CC: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash




Many thanks,
here's the results;

(gdb) p cur_block
$15 = 4088
(gdb) p last_block
$16 = 4088
(gdb) p local->first_block
$17 = 4087
(gdb) p odirect
$18 = _gf_false
(gdb) p fd->flags
$19 = 2
(gdb) p local->call_count
$20 = 2

If you need more core dumps, i have several files i can upload.

-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 18:39:27 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Sorry I didn't make myself  clear. The reason I asked YOU to do it is because i 
tried it on my system and im not getting the backtrace (it's all question 
marks).

Attach the core to gdb.
At the gdb prompt, go to frame 2 by typing
(gdb) f 2

There, for each of the variables i asked you to get the values of, type p 
followed by the variable name.
For instance, to get the value of the variable 'odirect', do this:

(gdb) p odirect

and gdb will print its value for you in response.

-Krutika

On Mon, Aug 1, 2016 at 4:55 PM, Mahdi Adnan  wrote:



Hi,
How to get the results of the below variables ? i cant get the results from gdb.



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 15:51:38 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Could you also print and share the values of the following variables from the 
backtrace please:

i. cur_block
ii. last_block
iii. local->first_block
iv. odirect
v. fd->flags
vi. local->call_count

-Krutika

On Sat, Jul 30, 2016 at 5:04 PM, Mahdi Adnan  wrote:



Hi,
I would really appreciate it if someone could help me fix my NFS crash. It's 
happening a lot and it's causing lots of issues for my VMs. The problem is that 
every few hours the native NFS crashes and the volume becomes unavailable from 
the affected node unless I restart glusterd. The volume is used by VMware ESXi 
as a datastore for its VMs with the following options:

OS: CentOS 7.2
Gluster: 3.7.13

Volume Name: vlm01
Type: Dis

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-01 Thread Mahdi Adnan

Many thanks,
here's the results;

(gdb) p cur_block
$15 = 4088
(gdb) p last_block
$16 = 4088
(gdb) p local->first_block
$17 = 4087
(gdb) p odirect
$18 = _gf_false
(gdb) p fd->flags
$19 = 2
(gdb) p local->call_count
$20 = 2

If you need more core dumps, i have several files i can upload.
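
For readers following the numbers: with features.shard-block-size set to 16MB, the block indices printed above can be related to byte offsets. The sketch below assumes the shard translator derives the first block as offset / block_size and the last block as (offset + length - 1) / block_size. That formula, the helper names and the sample write are illustrative assumptions; the real offset and size of the crashing write are not shown in this thread.

```python
# Illustrative only: relates shard block indices to byte offsets.
SHARD_BLOCK_SIZE = 16 * 1024 * 1024  # features.shard-block-size: 16MB (from the volume options)


def first_block(offset: int, block_size: int = SHARD_BLOCK_SIZE) -> int:
    """Index of the shard holding the first byte of a write (assumed formula)."""
    return offset // block_size


def last_block(offset: int, length: int, block_size: int = SHARD_BLOCK_SIZE) -> int:
    """Index of the shard holding the last byte of a write (assumed formula)."""
    if length == 0:
        return first_block(offset, block_size)
    return (offset + length - 1) // block_size


def blocks_touched(offset: int, length: int, block_size: int = SHARD_BLOCK_SIZE) -> int:
    """Number of shards a single write spans."""
    return last_block(offset, length, block_size) - first_block(offset, block_size) + 1


# Hypothetical 8 KiB write that starts 4 KiB before the end of shard 4087.
offset = 4088 * SHARD_BLOCK_SIZE - 4096
length = 8192
print(first_block(offset), last_block(offset, length), blocks_touched(offset, length))
```

A write of this shape would give first_block = 4087 and last_block = 4088, i.e. two shards to write to, which is consistent with the call_count of 2 printed above.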

-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 18:39:27 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Sorry I didn't make myself  clear. The reason I asked YOU to do it is because i 
tried it on my system and im not getting the backtrace (it's all question 
marks).

Attach the core to gdb.
At the gdb prompt, go to frame 2 by typing
(gdb) f 2

There, for each of the variables i asked you to get the values of, type p 
followed by the variable name.
For instance, to get the value of the variable 'odirect', do this:

(gdb) p odirect

and gdb will print its value for you in response.

-Krutika

On Mon, Aug 1, 2016 at 4:55 PM, Mahdi Adnan  wrote:



Hi,
How to get the results of the below variables ? i cant get the results from gdb.



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 15:51:38 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Could you also print and share the values of the following variables from the 
backtrace please:

i. cur_block
ii. last_block
iii. local->first_block
iv. odirect
v. fd->flags
vi. local->call_count

-Krutika

On Sat, Jul 30, 2016 at 5:04 PM, Mahdi Adnan  wrote:



Hi,
I would really appreciate it if someone could help me fix my NFS crash. It's 
happening a lot and it's causing lots of issues for my VMs. The problem is that 
every few hours the native NFS crashes and the volume becomes unavailable from 
the affected node unless I restart glusterd. The volume is used by VMware ESXi 
as a datastore for its VMs with the following options:

OS: CentOS 7.2
Gluster: 3.7.13

Volume Name: vlm01
Type: Distributed-Replicate
Volume ID: eacd8248-dca3-4530-9aed-7714a5a114f2
Status: Started
Number of Bricks: 7 x 3 = 21
Transport-type: tcp
Bricks:
Brick1: gfs01:/bricks/b01/vlm01
Brick2: gfs02:/bricks/b01/vlm01
Brick3: gfs03:/bricks/b01/vlm01
Brick4: gfs01:/bricks/b02/vlm01
Brick5: gfs02:/bricks/b02/vlm01
Brick6: gfs03:/bricks/b02/vlm01
Brick7: gfs01:/bricks/b03/vlm01
Brick8: gfs02:/bricks/b03/vlm01
Brick9: gfs03:/bricks/b03/vlm01
Brick10: gfs01:/bricks/b04/vlm01
Brick11: gfs02:/bricks/b04/vlm01
Brick12: gfs03:/bricks/b04/vlm01
Brick13: gfs01:/bricks/b05/vlm01
Brick14: gfs02:/bricks/b05/vlm01
Brick15: gfs03:/bricks/b05/vlm01
Brick16: gfs01:/bricks/b06/vlm01
Brick17: gfs02:/bricks/b06/vlm01
Brick18: gfs03:/bricks/b06/vlm01
Brick19: gfs01:/bricks/b07/vlm01
Brick20: gfs02:/bricks/b07/vlm01
Brick21: gfs03:/bricks/b07/vlm01
Options Reconfigured:
performance.readdir-ahead: off
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
performance.strict-write-ordering: on
performance.write-behind: off
cluster.data-self-heal-algorithm: full
cluster.self-heal-window-size: 128
features.shard-block-size: 16MB
features.shard: on
auth.allow: 192.168.221.50,192.168.221.51,192.168.221.52,192.168.221.56,192.168.208.130,192.168.208.131,192.168.208.132,192.168.208.89,192.168.208.85,192.168.208.208.86
network.ping-timeout: 10

latest bt;

(gdb) bt
#0  0x7f196acab210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f196be7bcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
#3  0x7f195deb1a53 in shard_common_inode_write_post_lookup_shards_handler (frame=, this=) at shard.c:3769
#4  0x7f195deaaff5 in shard_common_lookup_shards_cbk (frame=0x7f19699f1164, cookie=, this=0x7f195802ac10, op_ret=0, op_errno=, inode=, buf=0x7f194970bc40, xdata=0x7f196c15451c, postparent=0x7f194970bcb0) at shard.c:1601
#5  0x7f195e10a141 in dht_lookup_cbk (frame=0x7f196998e7d4, cookie=, this=, op_ret=0, op_errno=0, inode=0x7f195c532b18, stbuf=0x7f194970bc40, xattr=0x7f196c15451c, postparent=0x7f194970bcb0) at dht-common.c:2174
#6  0x7f195e3931f3 in afr_lookup_done (frame=frame@entry=0x7f196997f8a4, this=this@entry=0x7f1958022a20) at afr-common.c:1825
#7  0x7f195e393b84 in afr_lookup_metadata_heal_check (frame=frame@entry=0x7f196997f8a4, this=0x7f1958022a20, this@entry=0xe3a929e0b67fa500) at afr-common.c:2068
#8  0x7f195e39434f in afr_lookup_entry_heal (frame=frame@entry=0x7f196997f8a4, this=0xe3a929e0b67fa500, this@entry=0x7f1958022a20) at afr-common.c:2157
#9  0x7f195e39467d in afr_lookup_cbk (frame=0x7f196997f8a4, cookie=, this=0x7f1958022a20, op_ret=, op_errno=, inode=, buf=0x7f195effa940,

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-01 Thread Krutika Dhananjay

Sorry I didn't make myself clear. The reason I asked YOU to do it is
because I tried it on my system and I'm not getting the backtrace (it's all
question marks).

Attach the core to gdb.
At the gdb prompt, go to frame 2 by typing
(gdb) f 2

There, for each of the variables i asked you to get the values of, type p
followed by the variable name.
For instance, to get the value of the variable 'odirect', do this:

(gdb) p odirect

and gdb will print its value for you in response.
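
The same steps can also be run non-interactively. This is a minimal sketch that only builds a gdb command line (using gdb's `-batch` and `-ex` options) to select frame 2 and print the six variables requested in this thread. The binary and core paths are hypothetical placeholders and the helper name is an illustration, not something from the thread.

```python
import shlex

# The six expressions requested in this thread.
VARIABLES = [
    "cur_block",
    "last_block",
    "local->first_block",
    "odirect",
    "fd->flags",
    "local->call_count",
]


def gdb_batch_command(binary: str, core: str, frame: int = 2) -> list[str]:
    """Build a gdb command line that selects a frame and prints each variable."""
    cmd = ["gdb", "-batch", "-ex", f"frame {frame}"]
    for var in VARIABLES:
        cmd += ["-ex", f"print {var}"]
    cmd += [binary, core]
    return cmd


# Hypothetical paths: substitute the real glusterfs binary and core file.
cmd = gdb_batch_command("/usr/sbin/glusterfs", "/path/to/core")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell on the machine that has the core file and matching debug symbols; without the symbols gdb will still show question marks.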

-Krutika

On Mon, Aug 1, 2016 at 4:55 PM, Mahdi Adnan  wrote:

> Hi,
>
> How to get the results of the below variables ? i cant get the results
> from gdb.
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
>
> --
> From: kdhan...@redhat.com
> Date: Mon, 1 Aug 2016 15:51:38 +0530
> Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org
>
>
> Could you also print and share the values of the following variables from
> the backtrace please:
>
> i. cur_block
> ii. last_block
> iii. local->first_block
> iv. odirect
> v. fd->flags
> vi. local->call_count
>
> -Krutika
>
> On Sat, Jul 30, 2016 at 5:04 PM, Mahdi Adnan 
> wrote:
>
> Hi,
>
> I really appreciate if someone can help me fix my nfs crash, its happening
> a lot and it's causing lots of issues to my VMs;
> the problem is every few hours the native nfs crash and the volume become
> unavailable from the affected node unless i restart glusterd.
> the volume is used by vmware esxi as a datastore for it's VMs with the
> following options;
>
>
> OS: CentOS 7.2
> Gluster: 3.7.13
>
> Volume Name: vlm01
> Type: Distributed-Replicate
> Volume ID: eacd8248-dca3-4530-9aed-7714a5a114f2
> Status: Started
> Number of Bricks: 7 x 3 = 21
> Transport-type: tcp
> Bricks:
> Brick1: gfs01:/bricks/b01/vlm01
> Brick2: gfs02:/bricks/b01/vlm01
> Brick3: gfs03:/bricks/b01/vlm01
> Brick4: gfs01:/bricks/b02/vlm01
> Brick5: gfs02:/bricks/b02/vlm01
> Brick6: gfs03:/bricks/b02/vlm01
> Brick7: gfs01:/bricks/b03/vlm01
> Brick8: gfs02:/bricks/b03/vlm01
> Brick9: gfs03:/bricks/b03/vlm01
> Brick10: gfs01:/bricks/b04/vlm01
> Brick11: gfs02:/bricks/b04/vlm01
> Brick12: gfs03:/bricks/b04/vlm01
> Brick13: gfs01:/bricks/b05/vlm01
> Brick14: gfs02:/bricks/b05/vlm01
> Brick15: gfs03:/bricks/b05/vlm01
> Brick16: gfs01:/bricks/b06/vlm01
> Brick17: gfs02:/bricks/b06/vlm01
> Brick18: gfs03:/bricks/b06/vlm01
> Brick19: gfs01:/bricks/b07/vlm01
> Brick20: gfs02:/bricks/b07/vlm01
> Brick21: gfs03:/bricks/b07/vlm01
> Options Reconfigured:
> performance.readdir-ahead: off
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> performance.strict-write-ordering: on
> performance.write-behind: off
> cluster.data-self-heal-algorithm: full
> cluster.self-heal-window-size: 128
> features.shard-block-size: 16MB
> features.shard: on
> auth.allow:
> 192.168.221.50,192.168.221.51,192.168.221.52,192.168.221.56,192.168.208.130,192.168.208.131,192.168.208.132,192.168.208.89,192.168.208.85,192.168.208.208.86
> network.ping-timeout: 10
>
>
> latest bt;
>
>
> (gdb) bt
> #0  0x7f196acab210 in pthread_spin_lock () from /lib64/libpthread.so.0
> #1  0x7f196be7bcd5 in fd_anonymous (inode=0x0) at fd.c:804
> #2  0x7f195deb1787 in shard_common_inode_write_do
> (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
> #3  0x7f195deb1a53 in
> shard_common_inode_write_post_lookup_shards_handler (frame=,
> this=) at shard.c:3769
> #4  0x7f195deaaff5 in shard_common_lookup_shards_cbk
> (frame=0x7f19699f1164, cookie=, this=0x7f195802ac10,
> op_ret=0,
> op_errno=, inode=, buf=0x7f194970bc40,
> xdata=0x7f196c15451c, postparent=0x7f194970bcb0) at shard.c:1601
> #5  0x7f195e10a141 in dht_lookup_cbk (frame=0x7f196998e7d4,
> cookie=, this=, op_ret=0, op_errno=0,
> inode=0x7f195c532b18,
> stbuf=0x7f194970bc40, xattr=0x7f196c15451c, postparent=0x7f194970bcb0)
> at dht-common.c:2174
> #6  0x7f195e3931f3 in afr_lookup_done (frame=frame@entry=0x7f196997f8a4,
> this=this@entry=0x7f1958022a20) at afr-common.c:1825
> #7  0x7f195e393b84 in afr_lookup_metadata_heal_check 
> (frame=frame@entry=0x7f196997f8a4,
> this=0x7f1958022a20, this@entry=0xe3a929e0b67fa500)
> at afr-common.c:2068
> #8  0x7f195e39434f in afr_lookup_entry_heal 
> (frame=frame@entry=0x7f196997f8a4,
> this=0xe3a929e0b67fa500, this@entry=0x7f1958022a20) at afr-common.c:215

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-01 Thread Mahdi Adnan

Hi,
How do I get the values of the variables below? I can't get the results from gdb.



-- 



Respectfully

Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 15:51:38 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Could you also print and share the values of the following variables from the 
backtrace please:

i. cur_block
ii. last_block
iii. local->first_block
iv. odirect
v. fd->flags
vi. local->call_count

-Krutika

On Sat, Jul 30, 2016 at 5:04 PM, Mahdi Adnan  wrote:



Hi,
I would really appreciate it if someone could help me fix my NFS crash. It's 
happening a lot and it's causing lots of issues for my VMs. The problem is that 
every few hours the native NFS crashes and the volume becomes unavailable from 
the affected node unless I restart glusterd. The volume is used by VMware ESXi 
as a datastore for its VMs with the following options:

OS: CentOS 7.2
Gluster: 3.7.13

Volume Name: vlm01
Type: Distributed-Replicate
Volume ID: eacd8248-dca3-4530-9aed-7714a5a114f2
Status: Started
Number of Bricks: 7 x 3 = 21
Transport-type: tcp
Bricks:
Brick1: gfs01:/bricks/b01/vlm01
Brick2: gfs02:/bricks/b01/vlm01
Brick3: gfs03:/bricks/b01/vlm01
Brick4: gfs01:/bricks/b02/vlm01
Brick5: gfs02:/bricks/b02/vlm01
Brick6: gfs03:/bricks/b02/vlm01
Brick7: gfs01:/bricks/b03/vlm01
Brick8: gfs02:/bricks/b03/vlm01
Brick9: gfs03:/bricks/b03/vlm01
Brick10: gfs01:/bricks/b04/vlm01
Brick11: gfs02:/bricks/b04/vlm01
Brick12: gfs03:/bricks/b04/vlm01
Brick13: gfs01:/bricks/b05/vlm01
Brick14: gfs02:/bricks/b05/vlm01
Brick15: gfs03:/bricks/b05/vlm01
Brick16: gfs01:/bricks/b06/vlm01
Brick17: gfs02:/bricks/b06/vlm01
Brick18: gfs03:/bricks/b06/vlm01
Brick19: gfs01:/bricks/b07/vlm01
Brick20: gfs02:/bricks/b07/vlm01
Brick21: gfs03:/bricks/b07/vlm01
Options Reconfigured:
performance.readdir-ahead: off
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
performance.strict-write-ordering: on
performance.write-behind: off
cluster.data-self-heal-algorithm: full
cluster.self-heal-window-size: 128
features.shard-block-size: 16MB
features.shard: on
auth.allow: 192.168.221.50,192.168.221.51,192.168.221.52,192.168.221.56,192.168.208.130,192.168.208.131,192.168.208.132,192.168.208.89,192.168.208.85,192.168.208.208.86
network.ping-timeout: 10

latest bt;

(gdb) bt
#0  0x7f196acab210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f196be7bcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
#3  0x7f195deb1a53 in shard_common_inode_write_post_lookup_shards_handler (frame=, this=) at shard.c:3769
#4  0x7f195deaaff5 in shard_common_lookup_shards_cbk (frame=0x7f19699f1164, cookie=, this=0x7f195802ac10, op_ret=0, op_errno=, inode=, buf=0x7f194970bc40, xdata=0x7f196c15451c, postparent=0x7f194970bcb0) at shard.c:1601
#5  0x7f195e10a141 in dht_lookup_cbk (frame=0x7f196998e7d4, cookie=, this=, op_ret=0, op_errno=0, inode=0x7f195c532b18, stbuf=0x7f194970bc40, xattr=0x7f196c15451c, postparent=0x7f194970bcb0) at dht-common.c:2174
#6  0x7f195e3931f3 in afr_lookup_done (frame=frame@entry=0x7f196997f8a4, this=this@entry=0x7f1958022a20) at afr-common.c:1825
#7  0x7f195e393b84 in afr_lookup_metadata_heal_check (frame=frame@entry=0x7f196997f8a4, this=0x7f1958022a20, this@entry=0xe3a929e0b67fa500) at afr-common.c:2068
#8  0x7f195e39434f in afr_lookup_entry_heal (frame=frame@entry=0x7f196997f8a4, this=0xe3a929e0b67fa500, this@entry=0x7f1958022a20) at afr-common.c:2157
#9  0x7f195e39467d in afr_lookup_cbk (frame=0x7f196997f8a4, cookie=, this=0x7f1958022a20, op_ret=, op_errno=, inode=, buf=0x7f195effa940, xdata=0x7f196c1853b0, postparent=0x7f195effa9b0) at afr-common.c:2205
#10 0x7f195e5e2e42 in client3_3_lookup_cbk (req=, iov=, count=, myframe=0x7f19652c) at client-rpc-fops.c:2981
#11 0x7f196bc0ca30 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f19583adaf0, pollin=pollin@entry=0x7f195907f930) at rpc-clnt.c:764
#12 0x7f196bc0ccef in rpc_clnt_notify (trans=, mydata=0x7f19583adb20, event=, data=0x7f195907f930) at rpc-clnt.c:925
#13 0x7f196bc087c3 in rpc_transport_notify (this=this@entry=0x7f19583bd770, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f195907f930) at rpc-transport.c:546
#14 0x7f1960acf9a4 in socket_event_poll_in (this=this@entry=0x7f19583bd770) at socket.c:2353
#15 0x7f1960ad25e4 in socket_event_handler (fd=fd@entry=25, idx=idx@entry=14, data=0x7f19583bd770, poll_in=1, poll_out=0, poll_err=0) at socket.c:2466
#16 0x7f196beacf7a in event_dispatch_epoll_handler (event=0x7f195effae80, event_pool=0x7f196dbf5f20) at event-epoll.c:575
#17 event_dispatch_epoll_worker (data=0x7f196dc41e10) at e

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-08-01 Thread Krutika Dhananjay

Could you also print and share the values of the following variables from
the backtrace please:

i. cur_block
ii. last_block
iii. local->first_block
iv. odirect
v. fd->flags
vi. local->call_count

-Krutika

On Sat, Jul 30, 2016 at 5:04 PM, Mahdi Adnan 
wrote:

> Hi,
>
> I really appreciate if someone can help me fix my nfs crash, its happening
> a lot and it's causing lots of issues to my VMs;
> the problem is every few hours the native nfs crash and the volume become
> unavailable from the affected node unless i restart glusterd.
> the volume is used by vmware esxi as a datastore for it's VMs with the
> following options;
>
>
> OS: CentOS 7.2
> Gluster: 3.7.13
>
> Volume Name: vlm01
> Type: Distributed-Replicate
> Volume ID: eacd8248-dca3-4530-9aed-7714a5a114f2
> Status: Started
> Number of Bricks: 7 x 3 = 21
> Transport-type: tcp
> Bricks:
> Brick1: gfs01:/bricks/b01/vlm01
> Brick2: gfs02:/bricks/b01/vlm01
> Brick3: gfs03:/bricks/b01/vlm01
> Brick4: gfs01:/bricks/b02/vlm01
> Brick5: gfs02:/bricks/b02/vlm01
> Brick6: gfs03:/bricks/b02/vlm01
> Brick7: gfs01:/bricks/b03/vlm01
> Brick8: gfs02:/bricks/b03/vlm01
> Brick9: gfs03:/bricks/b03/vlm01
> Brick10: gfs01:/bricks/b04/vlm01
> Brick11: gfs02:/bricks/b04/vlm01
> Brick12: gfs03:/bricks/b04/vlm01
> Brick13: gfs01:/bricks/b05/vlm01
> Brick14: gfs02:/bricks/b05/vlm01
> Brick15: gfs03:/bricks/b05/vlm01
> Brick16: gfs01:/bricks/b06/vlm01
> Brick17: gfs02:/bricks/b06/vlm01
> Brick18: gfs03:/bricks/b06/vlm01
> Brick19: gfs01:/bricks/b07/vlm01
> Brick20: gfs02:/bricks/b07/vlm01
> Brick21: gfs03:/bricks/b07/vlm01
> Options Reconfigured:
> performance.readdir-ahead: off
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> performance.strict-write-ordering: on
> performance.write-behind: off
> cluster.data-self-heal-algorithm: full
> cluster.self-heal-window-size: 128
> features.shard-block-size: 16MB
> features.shard: on
> auth.allow:
> 192.168.221.50,192.168.221.51,192.168.221.52,192.168.221.56,192.168.208.130,192.168.208.131,192.168.208.132,192.168.208.89,192.168.208.85,192.168.208.208.86
> network.ping-timeout: 10
>
>
> latest bt;
>
>
> (gdb) bt
> #0  0x7f196acab210 in pthread_spin_lock () from /lib64/libpthread.so.0
> #1  0x7f196be7bcd5 in fd_anonymous (inode=0x0) at fd.c:804
> #2  0x7f195deb1787 in shard_common_inode_write_do
> (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
> #3  0x7f195deb1a53 in
> shard_common_inode_write_post_lookup_shards_handler (frame=,
> this=) at shard.c:3769
> #4  0x7f195deaaff5 in shard_common_lookup_shards_cbk
> (frame=0x7f19699f1164, cookie=, this=0x7f195802ac10,
> op_ret=0,
> op_errno=, inode=, buf=0x7f194970bc40,
> xdata=0x7f196c15451c, postparent=0x7f194970bcb0) at shard.c:1601
> #5  0x7f195e10a141 in dht_lookup_cbk (frame=0x7f196998e7d4,
> cookie=, this=, op_ret=0, op_errno=0,
> inode=0x7f195c532b18,
> stbuf=0x7f194970bc40, xattr=0x7f196c15451c, postparent=0x7f194970bcb0)
> at dht-common.c:2174
> #6  0x7f195e3931f3 in afr_lookup_done (frame=frame@entry=0x7f196997f8a4,
> this=this@entry=0x7f1958022a20) at afr-common.c:1825
> #7  0x7f195e393b84 in afr_lookup_metadata_heal_check 
> (frame=frame@entry=0x7f196997f8a4,
> this=0x7f1958022a20, this@entry=0xe3a929e0b67fa500)
> at afr-common.c:2068
> #8  0x7f195e39434f in afr_lookup_entry_heal 
> (frame=frame@entry=0x7f196997f8a4,
> this=0xe3a929e0b67fa500, this@entry=0x7f1958022a20) at afr-common.c:2157
> #9  0x7f195e39467d in afr_lookup_cbk (frame=0x7f196997f8a4,
> cookie=, this=0x7f1958022a20, op_ret=,
> op_errno=, inode=, buf=0x7f195effa940,
> xdata=0x7f196c1853b0, postparent=0x7f195effa9b0) at afr-common.c:2205
> #10 0x7f195e5e2e42 in client3_3_lookup_cbk (req=,
> iov=, count=, myframe=0x7f19652c)
> at client-rpc-fops.c:2981
> #11 0x7f196bc0ca30 in rpc_clnt_handle_reply 
> (clnt=clnt@entry=0x7f19583adaf0,
> pollin=pollin@entry=0x7f195907f930) at rpc-clnt.c:764
> #12 0x7f196bc0ccef in rpc_clnt_notify (trans=,
> mydata=0x7f19583adb20, event=, data=0x7f195907f930) at
> rpc-clnt.c:925
> #13 0x7f196bc087c3 in rpc_transport_notify 
> (this=this@entry=0x7f19583bd770,
> event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry
> =0x7f195907f930)
> at rpc-transport.c:546
> #14 0x7f1960acf9a4 in socket_event_poll_in 
> (this=this@entry=0x7f19583bd770)
> at socket.c:2353
> #15 0x7f1960ad25e4 in socket_event_handler (fd=fd@entry=25,
> idx=idx@entry=14, data=0x7f19583bd770, poll_in=1, poll_out=0, poll_err=0)
> at socket.c:2466
> #16 0x7f196beacf7a in event_dispatch_epoll_handler
> (event=0x7f195effae80, event_pool=0x7f196dbf5f20) at event-epoll.c:575
> #17 event_dispatch_epoll_worker (data=0x7f196dc41e10) at event-epoll.c:678
> #18 0x7f196aca6dc5 in start_thread () from /lib64/libpthread.so.0

Re: [Gluster-users] Gluster 3.7.13 NFS Crash

2016-07-30 Thread Soumya Koduri


The inode stored in the shard xlator's local is NULL. CCing Krutika to comment.

Thanks,
Soumya



(gdb) bt
#0  0x7f196acab210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f196be7bcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f195deb1787 in shard_common_inode_write_do
(frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
#3  0x7f195deb1a53 in
shard_common_inode_write_post_lookup_shards_handler (frame=<optimized out>, this=<optimized out>) at shard.c:3769
#4  0x7f195deaaff5 in shard_common_lookup_shards_cbk
(frame=0x7f19699f1164, cookie=<optimized out>, this=0x7f195802ac10,
op_ret=0,
op_errno=<optimized out>, inode=<optimized out>, buf=0x7f194970bc40,
xdata=0x7f196c15451c, postparent=0x7f194970bcb0) at shard.c:1601
#5  0x7f195e10a141 in dht_lookup_cbk (frame=0x7f196998e7d4,
cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0,
inode=0x7f195c532b18,
stbuf=0x7f194970bc40, xattr=0x7f196c15451c,
postparent=0x7f194970bcb0) at dht-common.c:2174
#6  0x7f195e3931f3 in afr_lookup_done
(frame=frame@entry=0x7f196997f8a4, this=this@entry=0x7f1958022a20) at
afr-common.c:1825
#7  0x7f195e393b84 in afr_lookup_metadata_heal_check
(frame=frame@entry=0x7f196997f8a4, this=0x7f1958022a20,
this@entry=0xe3a929e0b67fa500)
at afr-common.c:2068
#8  0x7f195e39434f in afr_lookup_entry_heal
(frame=frame@entry=0x7f196997f8a4, this=0xe3a929e0b67fa500,
this@entry=0x7f1958022a20) at afr-common.c:2157
#9  0x7f195e39467d in afr_lookup_cbk (frame=0x7f196997f8a4,
cookie=<optimized out>, this=0x7f1958022a20, op_ret=<optimized out>,
op_errno=<optimized out>, inode=<optimized out>, buf=0x7f195effa940,
xdata=0x7f196c1853b0, postparent=0x7f195effa9b0) at afr-common.c:2205
#10 0x7f195e5e2e42 in client3_3_lookup_cbk (req=<optimized out>,
iov=<optimized out>, count=<optimized out>, myframe=0x7f19652c)
at client-rpc-fops.c:2981
#11 0x7f196bc0ca30 in rpc_clnt_handle_reply
(clnt=clnt@entry=0x7f19583adaf0, pollin=pollin@entry=0x7f195907f930) at
rpc-clnt.c:764
#12 0x7f196bc0ccef in rpc_clnt_notify (trans=<optimized out>,
mydata=0x7f19583adb20, event=<optimized out>, data=0x7f195907f930) at
rpc-clnt.c:925
#13 0x7f196bc087c3 in rpc_transport_notify
(this=this@entry=0x7f19583bd770,
event=event@entry=RPC_TRANSPORT_MSG_RECEIVED,
data=data@entry=0x7f195907f930)
at rpc-transport.c:546
#14 0x7f1960acf9a4 in socket_event_poll_in
(this=this@entry=0x7f19583bd770) at socket.c:2353
#15 0x7f1960ad25e4 in socket_event_handler (fd=fd@entry=25,
idx=idx@entry=14, data=0x7f19583bd770, poll_in=1, poll_out=0,
poll_err=0) at socket.c:2466
#16 0x7f196beacf7a in event_dispatch_epoll_handler
(event=0x7f195effae80, event_pool=0x7f196dbf5f20) at event-epoll.c:575
#17 event_dispatch_epoll_worker (data=0x7f196dc41e10) at event-epoll.c:678
#18 0x7f196aca6dc5 in start_thread () from /lib64/libpthread.so.0
#19 0x7f196a5ebced in clone () from /lib64/libc.so.6
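
For anyone triaging a similar core: the telling detail in the trace above is frame #1, where fd_anonymous() is entered with inode=0x0 and the next call tries to take a spinlock through that pointer. A rough way to spot such frames in a long backtrace is to scan for arguments printed as exactly 0x0. The helper below is only a sketch; the name find_null_args and the regular expressions are assumptions made for illustration and are not part of gdb or GlusterFS.

```python
import re

# Frame header, e.g. "#1  0x7f196be7bcd5 in fd_anonymous (inode=0x0) at fd.c:804".
# The address and the function name are both optional because mail clients
# often wrap the function name onto the following line.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\b\s*)?(\w+)?")
# An argument whose value is exactly 0x0, i.e. a NULL pointer.
NULL_ARG_RE = re.compile(r"(\w+)=0x0(?![0-9a-fA-F])")


def find_null_args(backtrace):
    """Return (frame_number, function, argument) for each argument printed as 0x0."""
    hits = []
    frame_no, func = None, None
    for raw in backtrace.splitlines():
        line = raw.strip()
        m = FRAME_RE.match(line)
        if m:
            frame_no, func = int(m.group(1)), m.group(2)
        elif frame_no is not None and func is None:
            # Function name was wrapped onto this continuation line.
            cont = re.match(r"\w+", line)
            func = cont.group(0) if cont else None
        if frame_no is None:
            continue
        for arg in NULL_ARG_RE.findall(line):
            hits.append((frame_no, func, arg))
    return hits


# First frames of the backtrace posted in this thread.
SAMPLE = """\
#0  0x7f196acab210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x7f196be7bcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x7f195deb1787 in shard_common_inode_write_do
(frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
"""

if __name__ == "__main__":
    for frame_no, func, arg in find_null_args(SAMPLE):
        print(f"frame #{frame_no}: {func}() called with {arg}=NULL")
```

Run against the sample, this reports only frame #1 (fd_anonymous, inode), which matches the observation that the inode passed down by the shard xlator is NULL. It is a text filter only and says nothing about why the pointer was NULL.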




The NFS logs and the core dump can be found at the Dropbox link below:
https://db.tt/rZrC9d7f


Thanks in advance.

Respectfully
Mahdi A. Mahdi



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users