Re: [Gluster-users] Gluster install using Ganesha for NFS

2017-07-07 Thread Mahdi Adnan
Hi,


Why change to storhaug ? and whats going to happen to the current setup if i 
want to update Gluster to 3.11 or beyond ?


--

Respectfully
Mahdi A. Mahdi


From: gluster-users-boun...@gluster.org  on 
behalf of Kaleb S. KEITHLEY 
Sent: Thursday, July 6, 2017 8:31:34 PM
To: Anthony Valentine; Gluster Users
Subject: Re: [Gluster-users] Gluster install using Ganesha for NFS


After 3.10 you'd need to use storhaug Which doesn't work (yet).

You need to use 3.10 for now.

On 07/06/2017 12:53 PM, Anthony Valentine wrote:
> I'm running this on CentOS 7.3
>
> [root@glustertest1 ~]# cat /etc/redhat-release
> CentOS Linux release 7.3.1611 (Core)
>
>
> Here are the software versions I have installed.
>
> [root@glustertest1 ~]# rpm -qa | egrep -i "nfs|gluster|ganesha"
> glusterfs-api-3.11.0-0.1.rc0.el7.x86_64
> glusterfs-server-3.11.0-0.1.rc0.el7.x86_64
> glusterfs-libs-3.11.0-0.1.rc0.el7.x86_64
> nfs-ganesha-gluster-2.4.5-1.el7.x86_64
> nfs-ganesha-utils-2.4.5-1.el7.x86_64
> centos-release-gluster310-1.0-1.el7.centos.noarch
> nfs-utils-1.3.0-0.33.el7_3.x86_64
> glusterfs-3.11.0-0.1.rc0.el7.x86_64
> nfs-ganesha-2.4.5-1.el7.x86_64
> libnfsidmap-0.25-15.el7.x86_64
> glusterfs-client-xlators-3.11.0-0.1.rc0.el7.x86_64
> glusterfs-cli-3.11.0-0.1.rc0.el7.x86_64
> glusterfs-fuse-3.11.0-0.1.rc0.el7.x86_64
>
> -Original Message-
> From: Kaleb S. KEITHLEY [mailto:kkeit...@redhat.com]
> Sent: Thursday, July 6, 2017 10:49 AM
> To: Anthony Valentine 
> Subject: Re: [Gluster-users] Gluster install using Ganesha for NFS
>
> On 07/06/2017 12:42 PM, Anthony Valentine wrote:
>> Hello!
>>
>> I am attempting to setup a Gluster install using Ganesha for NFS using
>> the guide found here
>> http://blog.gluster.org/2015/10/linux-scale-out-nfsv4-using-nfs-ganesh
>> a-and-glusterfs-one-step-at-a-time/
>>
>>
>> The Gluster portion is working fine, however when I try to setup
>> Ganesha I have a problem.  The guide says to run 'gluster nfs-ganesha enable'
>> however when I do, I get the following error:
>>
>>   [root@glustertest1 ~]# gluster nfs-ganesha enable
>>
>> unrecognized word: nfs-ganesha (position 0)
>>
>> Has this command changed?  If so, what is the new command?  If not,
>> why would I be getting this error?
>>
>> Is there a more recent guide that I should be following?
>
> It could use some updating but it's mostly accurate.
>
> What version of gluster? That command should work.
>
> --
>
> Kaleb
>
>
>
>

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

2017-07-07 Thread Pranith Kumar Karampuri
Ram,
   As per the code, self-heal was the only candidate which *can* do it.
Could you check logs of self-heal daemon and the mount to check if there
are any metadata heals on root?


+Sanoj

Sanoj,
   Is there any systemtap script we can use to detect which process is
removing these xattrs?

On Sat, Jul 8, 2017 at 2:58 AM, Ankireddypalle Reddy 
wrote:

> We lost the attributes on all the bricks on servers glusterfs2 and
> glusterfs3 again.
>
>
>
> [root@glusterfs2 Log_Files]# gluster volume info
>
>
>
> Volume Name: StoragePool
>
> Type: Distributed-Disperse
>
> Volume ID: 149e976f-4e21-451c-bf0f-f5691208531f
>
> Status: Started
>
> Number of Bricks: 20 x (2 + 1) = 60
>
> Transport-type: tcp
>
> Bricks:
>
> Brick1: glusterfs1sds:/ws/disk1/ws_brick
>
> Brick2: glusterfs2sds:/ws/disk1/ws_brick
>
> Brick3: glusterfs3sds:/ws/disk1/ws_brick
>
> Brick4: glusterfs1sds:/ws/disk2/ws_brick
>
> Brick5: glusterfs2sds:/ws/disk2/ws_brick
>
> Brick6: glusterfs3sds:/ws/disk2/ws_brick
>
> Brick7: glusterfs1sds:/ws/disk3/ws_brick
>
> Brick8: glusterfs2sds:/ws/disk3/ws_brick
>
> Brick9: glusterfs3sds:/ws/disk3/ws_brick
>
> Brick10: glusterfs1sds:/ws/disk4/ws_brick
>
> Brick11: glusterfs2sds:/ws/disk4/ws_brick
>
> Brick12: glusterfs3sds:/ws/disk4/ws_brick
>
> Brick13: glusterfs1sds:/ws/disk5/ws_brick
>
> Brick14: glusterfs2sds:/ws/disk5/ws_brick
>
> Brick15: glusterfs3sds:/ws/disk5/ws_brick
>
> Brick16: glusterfs1sds:/ws/disk6/ws_brick
>
> Brick17: glusterfs2sds:/ws/disk6/ws_brick
>
> Brick18: glusterfs3sds:/ws/disk6/ws_brick
>
> Brick19: glusterfs1sds:/ws/disk7/ws_brick
>
> Brick20: glusterfs2sds:/ws/disk7/ws_brick
>
> Brick21: glusterfs3sds:/ws/disk7/ws_brick
>
> Brick22: glusterfs1sds:/ws/disk8/ws_brick
>
> Brick23: glusterfs2sds:/ws/disk8/ws_brick
>
> Brick24: glusterfs3sds:/ws/disk8/ws_brick
>
> Brick25: glusterfs4sds.commvault.com:/ws/disk1/ws_brick
>
> Brick26: glusterfs5sds.commvault.com:/ws/disk1/ws_brick
>
> Brick27: glusterfs6sds.commvault.com:/ws/disk1/ws_brick
>
> Brick28: glusterfs4sds.commvault.com:/ws/disk10/ws_brick
>
> Brick29: glusterfs5sds.commvault.com:/ws/disk10/ws_brick
>
> Brick30: glusterfs6sds.commvault.com:/ws/disk10/ws_brick
>
> Brick31: glusterfs4sds.commvault.com:/ws/disk11/ws_brick
>
> Brick32: glusterfs5sds.commvault.com:/ws/disk11/ws_brick
>
> Brick33: glusterfs6sds.commvault.com:/ws/disk11/ws_brick
>
> Brick34: glusterfs4sds.commvault.com:/ws/disk12/ws_brick
>
> Brick35: glusterfs5sds.commvault.com:/ws/disk12/ws_brick
>
> Brick36: glusterfs6sds.commvault.com:/ws/disk12/ws_brick
>
> Brick37: glusterfs4sds.commvault.com:/ws/disk2/ws_brick
>
> Brick38: glusterfs5sds.commvault.com:/ws/disk2/ws_brick
>
> Brick39: glusterfs6sds.commvault.com:/ws/disk2/ws_brick
>
> Brick40: glusterfs4sds.commvault.com:/ws/disk3/ws_brick
>
> Brick41: glusterfs5sds.commvault.com:/ws/disk3/ws_brick
>
> Brick42: glusterfs6sds.commvault.com:/ws/disk3/ws_brick
>
> Brick43: glusterfs4sds.commvault.com:/ws/disk4/ws_brick
>
> Brick44: glusterfs5sds.commvault.com:/ws/disk4/ws_brick
>
> Brick45: glusterfs6sds.commvault.com:/ws/disk4/ws_brick
>
> Brick46: glusterfs4sds.commvault.com:/ws/disk5/ws_brick
>
> Brick47: glusterfs5sds.commvault.com:/ws/disk5/ws_brick
>
> Brick48: glusterfs6sds.commvault.com:/ws/disk5/ws_brick
>
> Brick49: glusterfs4sds.commvault.com:/ws/disk6/ws_brick
>
> Brick50: glusterfs5sds.commvault.com:/ws/disk6/ws_brick
>
> Brick51: glusterfs6sds.commvault.com:/ws/disk6/ws_brick
>
> Brick52: glusterfs4sds.commvault.com:/ws/disk7/ws_brick
>
> Brick53: glusterfs5sds.commvault.com:/ws/disk7/ws_brick
>
> Brick54: glusterfs6sds.commvault.com:/ws/disk7/ws_brick
>
> Brick55: glusterfs4sds.commvault.com:/ws/disk8/ws_brick
>
> Brick56: glusterfs5sds.commvault.com:/ws/disk8/ws_brick
>
> Brick57: glusterfs6sds.commvault.com:/ws/disk8/ws_brick
>
> Brick58: glusterfs4sds.commvault.com:/ws/disk9/ws_brick
>
> Brick59: glusterfs5sds.commvault.com:/ws/disk9/ws_brick
>
> Brick60: glusterfs6sds.commvault.com:/ws/disk9/ws_brick
>
> Options Reconfigured:
>
> performance.readdir-ahead: on
>
> diagnostics.client-log-level: INFO
>
> auth.allow: glusterfs1sds,glusterfs2sds,glusterfs3sds,glusterfs4sds.
> commvault.com,glusterfs5sds.commvault.com,glusterfs6sds.commvault.com
>
>
>
> Thanks and Regards,
>
> Ram
>
> *From:* Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
> *Sent:* Friday, July 07, 2017 12:15 PM
>
> *To:* Ankireddypalle Reddy
> *Cc:* Gluster Devel (gluster-de...@gluster.org); gluster-users@gluster.org
> *Subject:* Re: [Gluster-devel] gfid and volume-id extended attributes lost
>
>
>
>
>
>
>
> On Fri, Jul 7, 2017 at 9:25 PM, Ankireddypalle Reddy 
> wrote:
>
> 3.7.19
>
>
>
> These are the only callers for removexattr and only _posix_remove_xattr
> has the potential to do removexattr as posix_removexattr already makes sure
> that it is not gfid/volume-id. And surprise surprise _posix_remove_xattr
> happens only from healing code of afr/ec. And this can only happen if 

Re: [Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

2017-07-07 Thread Vijay Bellur
Do you observe any event pattern (self-healing / disk failures / reboots
etc.) after which the extended attributes are missing?

Regards,
Vijay

On Fri, Jul 7, 2017 at 5:28 PM, Ankireddypalle Reddy 
wrote:

> We lost the attributes on all the bricks on servers glusterfs2 and
> glusterfs3 again.
>
>
>
> [root@glusterfs2 Log_Files]# gluster volume info
>
>
>
> Volume Name: StoragePool
>
> Type: Distributed-Disperse
>
> Volume ID: 149e976f-4e21-451c-bf0f-f5691208531f
>
> Status: Started
>
> Number of Bricks: 20 x (2 + 1) = 60
>
> Transport-type: tcp
>
> Bricks:
>
> Brick1: glusterfs1sds:/ws/disk1/ws_brick
>
> Brick2: glusterfs2sds:/ws/disk1/ws_brick
>
> Brick3: glusterfs3sds:/ws/disk1/ws_brick
>
> Brick4: glusterfs1sds:/ws/disk2/ws_brick
>
> Brick5: glusterfs2sds:/ws/disk2/ws_brick
>
> Brick6: glusterfs3sds:/ws/disk2/ws_brick
>
> Brick7: glusterfs1sds:/ws/disk3/ws_brick
>
> Brick8: glusterfs2sds:/ws/disk3/ws_brick
>
> Brick9: glusterfs3sds:/ws/disk3/ws_brick
>
> Brick10: glusterfs1sds:/ws/disk4/ws_brick
>
> Brick11: glusterfs2sds:/ws/disk4/ws_brick
>
> Brick12: glusterfs3sds:/ws/disk4/ws_brick
>
> Brick13: glusterfs1sds:/ws/disk5/ws_brick
>
> Brick14: glusterfs2sds:/ws/disk5/ws_brick
>
> Brick15: glusterfs3sds:/ws/disk5/ws_brick
>
> Brick16: glusterfs1sds:/ws/disk6/ws_brick
>
> Brick17: glusterfs2sds:/ws/disk6/ws_brick
>
> Brick18: glusterfs3sds:/ws/disk6/ws_brick
>
> Brick19: glusterfs1sds:/ws/disk7/ws_brick
>
> Brick20: glusterfs2sds:/ws/disk7/ws_brick
>
> Brick21: glusterfs3sds:/ws/disk7/ws_brick
>
> Brick22: glusterfs1sds:/ws/disk8/ws_brick
>
> Brick23: glusterfs2sds:/ws/disk8/ws_brick
>
> Brick24: glusterfs3sds:/ws/disk8/ws_brick
>
> Brick25: glusterfs4sds.commvault.com:/ws/disk1/ws_brick
>
> Brick26: glusterfs5sds.commvault.com:/ws/disk1/ws_brick
>
> Brick27: glusterfs6sds.commvault.com:/ws/disk1/ws_brick
>
> Brick28: glusterfs4sds.commvault.com:/ws/disk10/ws_brick
>
> Brick29: glusterfs5sds.commvault.com:/ws/disk10/ws_brick
>
> Brick30: glusterfs6sds.commvault.com:/ws/disk10/ws_brick
>
> Brick31: glusterfs4sds.commvault.com:/ws/disk11/ws_brick
>
> Brick32: glusterfs5sds.commvault.com:/ws/disk11/ws_brick
>
> Brick33: glusterfs6sds.commvault.com:/ws/disk11/ws_brick
>
> Brick34: glusterfs4sds.commvault.com:/ws/disk12/ws_brick
>
> Brick35: glusterfs5sds.commvault.com:/ws/disk12/ws_brick
>
> Brick36: glusterfs6sds.commvault.com:/ws/disk12/ws_brick
>
> Brick37: glusterfs4sds.commvault.com:/ws/disk2/ws_brick
>
> Brick38: glusterfs5sds.commvault.com:/ws/disk2/ws_brick
>
> Brick39: glusterfs6sds.commvault.com:/ws/disk2/ws_brick
>
> Brick40: glusterfs4sds.commvault.com:/ws/disk3/ws_brick
>
> Brick41: glusterfs5sds.commvault.com:/ws/disk3/ws_brick
>
> Brick42: glusterfs6sds.commvault.com:/ws/disk3/ws_brick
>
> Brick43: glusterfs4sds.commvault.com:/ws/disk4/ws_brick
>
> Brick44: glusterfs5sds.commvault.com:/ws/disk4/ws_brick
>
> Brick45: glusterfs6sds.commvault.com:/ws/disk4/ws_brick
>
> Brick46: glusterfs4sds.commvault.com:/ws/disk5/ws_brick
>
> Brick47: glusterfs5sds.commvault.com:/ws/disk5/ws_brick
>
> Brick48: glusterfs6sds.commvault.com:/ws/disk5/ws_brick
>
> Brick49: glusterfs4sds.commvault.com:/ws/disk6/ws_brick
>
> Brick50: glusterfs5sds.commvault.com:/ws/disk6/ws_brick
>
> Brick51: glusterfs6sds.commvault.com:/ws/disk6/ws_brick
>
> Brick52: glusterfs4sds.commvault.com:/ws/disk7/ws_brick
>
> Brick53: glusterfs5sds.commvault.com:/ws/disk7/ws_brick
>
> Brick54: glusterfs6sds.commvault.com:/ws/disk7/ws_brick
>
> Brick55: glusterfs4sds.commvault.com:/ws/disk8/ws_brick
>
> Brick56: glusterfs5sds.commvault.com:/ws/disk8/ws_brick
>
> Brick57: glusterfs6sds.commvault.com:/ws/disk8/ws_brick
>
> Brick58: glusterfs4sds.commvault.com:/ws/disk9/ws_brick
>
> Brick59: glusterfs5sds.commvault.com:/ws/disk9/ws_brick
>
> Brick60: glusterfs6sds.commvault.com:/ws/disk9/ws_brick
>
> Options Reconfigured:
>
> performance.readdir-ahead: on
>
> diagnostics.client-log-level: INFO
>
> auth.allow: glusterfs1sds,glusterfs2sds,glusterfs3sds,glusterfs4sds.
> commvault.com,glusterfs5sds.commvault.com,glusterfs6sds.commvault.com
>
>
>
> Thanks and Regards,
>
> Ram
>
> *From:* Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
> *Sent:* Friday, July 07, 2017 12:15 PM
>
> *To:* Ankireddypalle Reddy
> *Cc:* Gluster Devel (gluster-de...@gluster.org); gluster-users@gluster.org
> *Subject:* Re: [Gluster-devel] gfid and volume-id extended attributes lost
>
>
>
>
>
>
>
> On Fri, Jul 7, 2017 at 9:25 PM, Ankireddypalle Reddy 
> wrote:
>
> 3.7.19
>
>
>
> These are the only callers for removexattr and only _posix_remove_xattr
> has the potential to do removexattr as posix_removexattr already makes sure
> that it is not gfid/volume-id. And surprise surprise _posix_remove_xattr
> happens only from healing code of afr/ec. And this can only happen if the
> source brick doesn't have gfid, which doesn't seem to match with the
> situation you explained.
>
>#   line  filename / context / line
>1   1234  

Re: [Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

2017-07-07 Thread Ankireddypalle Reddy
We lost the attributes on all the bricks on servers glusterfs2 and glusterfs3 
again.

[root@glusterfs2 Log_Files]# gluster volume info

Volume Name: StoragePool
Type: Distributed-Disperse
Volume ID: 149e976f-4e21-451c-bf0f-f5691208531f
Status: Started
Number of Bricks: 20 x (2 + 1) = 60
Transport-type: tcp
Bricks:
Brick1: glusterfs1sds:/ws/disk1/ws_brick
Brick2: glusterfs2sds:/ws/disk1/ws_brick
Brick3: glusterfs3sds:/ws/disk1/ws_brick
Brick4: glusterfs1sds:/ws/disk2/ws_brick
Brick5: glusterfs2sds:/ws/disk2/ws_brick
Brick6: glusterfs3sds:/ws/disk2/ws_brick
Brick7: glusterfs1sds:/ws/disk3/ws_brick
Brick8: glusterfs2sds:/ws/disk3/ws_brick
Brick9: glusterfs3sds:/ws/disk3/ws_brick
Brick10: glusterfs1sds:/ws/disk4/ws_brick
Brick11: glusterfs2sds:/ws/disk4/ws_brick
Brick12: glusterfs3sds:/ws/disk4/ws_brick
Brick13: glusterfs1sds:/ws/disk5/ws_brick
Brick14: glusterfs2sds:/ws/disk5/ws_brick
Brick15: glusterfs3sds:/ws/disk5/ws_brick
Brick16: glusterfs1sds:/ws/disk6/ws_brick
Brick17: glusterfs2sds:/ws/disk6/ws_brick
Brick18: glusterfs3sds:/ws/disk6/ws_brick
Brick19: glusterfs1sds:/ws/disk7/ws_brick
Brick20: glusterfs2sds:/ws/disk7/ws_brick
Brick21: glusterfs3sds:/ws/disk7/ws_brick
Brick22: glusterfs1sds:/ws/disk8/ws_brick
Brick23: glusterfs2sds:/ws/disk8/ws_brick
Brick24: glusterfs3sds:/ws/disk8/ws_brick
Brick25: glusterfs4sds.commvault.com:/ws/disk1/ws_brick
Brick26: glusterfs5sds.commvault.com:/ws/disk1/ws_brick
Brick27: glusterfs6sds.commvault.com:/ws/disk1/ws_brick
Brick28: glusterfs4sds.commvault.com:/ws/disk10/ws_brick
Brick29: glusterfs5sds.commvault.com:/ws/disk10/ws_brick
Brick30: glusterfs6sds.commvault.com:/ws/disk10/ws_brick
Brick31: glusterfs4sds.commvault.com:/ws/disk11/ws_brick
Brick32: glusterfs5sds.commvault.com:/ws/disk11/ws_brick
Brick33: glusterfs6sds.commvault.com:/ws/disk11/ws_brick
Brick34: glusterfs4sds.commvault.com:/ws/disk12/ws_brick
Brick35: glusterfs5sds.commvault.com:/ws/disk12/ws_brick
Brick36: glusterfs6sds.commvault.com:/ws/disk12/ws_brick
Brick37: glusterfs4sds.commvault.com:/ws/disk2/ws_brick
Brick38: glusterfs5sds.commvault.com:/ws/disk2/ws_brick
Brick39: glusterfs6sds.commvault.com:/ws/disk2/ws_brick
Brick40: glusterfs4sds.commvault.com:/ws/disk3/ws_brick
Brick41: glusterfs5sds.commvault.com:/ws/disk3/ws_brick
Brick42: glusterfs6sds.commvault.com:/ws/disk3/ws_brick
Brick43: glusterfs4sds.commvault.com:/ws/disk4/ws_brick
Brick44: glusterfs5sds.commvault.com:/ws/disk4/ws_brick
Brick45: glusterfs6sds.commvault.com:/ws/disk4/ws_brick
Brick46: glusterfs4sds.commvault.com:/ws/disk5/ws_brick
Brick47: glusterfs5sds.commvault.com:/ws/disk5/ws_brick
Brick48: glusterfs6sds.commvault.com:/ws/disk5/ws_brick
Brick49: glusterfs4sds.commvault.com:/ws/disk6/ws_brick
Brick50: glusterfs5sds.commvault.com:/ws/disk6/ws_brick
Brick51: glusterfs6sds.commvault.com:/ws/disk6/ws_brick
Brick52: glusterfs4sds.commvault.com:/ws/disk7/ws_brick
Brick53: glusterfs5sds.commvault.com:/ws/disk7/ws_brick
Brick54: glusterfs6sds.commvault.com:/ws/disk7/ws_brick
Brick55: glusterfs4sds.commvault.com:/ws/disk8/ws_brick
Brick56: glusterfs5sds.commvault.com:/ws/disk8/ws_brick
Brick57: glusterfs6sds.commvault.com:/ws/disk8/ws_brick
Brick58: glusterfs4sds.commvault.com:/ws/disk9/ws_brick
Brick59: glusterfs5sds.commvault.com:/ws/disk9/ws_brick
Brick60: glusterfs6sds.commvault.com:/ws/disk9/ws_brick
Options Reconfigured:
performance.readdir-ahead: on
diagnostics.client-log-level: INFO
auth.allow: 
glusterfs1sds,glusterfs2sds,glusterfs3sds,glusterfs4sds.commvault.com,glusterfs5sds.commvault.com,glusterfs6sds.commvault.com

Thanks and Regards,
Ram
From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
Sent: Friday, July 07, 2017 12:15 PM
To: Ankireddypalle Reddy
Cc: Gluster Devel (gluster-de...@gluster.org); gluster-users@gluster.org
Subject: Re: [Gluster-devel] gfid and volume-id extended attributes lost



On Fri, Jul 7, 2017 at 9:25 PM, Ankireddypalle Reddy 
> wrote:
3.7.19

These are the only callers for removexattr and only _posix_remove_xattr has the 
potential to do removexattr as posix_removexattr already makes sure that it is 
not gfid/volume-id. And surprise surprise _posix_remove_xattr happens only from 
healing code of afr/ec. And this can only happen if the source brick doesn't 
have gfid, which doesn't seem to match with the situation you explained.

   #   line  filename / context / line
   1   1234  xlators/mgmt/glusterd/src/glusterd-quota.c 
<>
 ret = sys_lremovexattr (abspath, QUOTA_LIMIT_KEY);
   2   1243  xlators/mgmt/glusterd/src/glusterd-quota.c 
<>
 ret = sys_lremovexattr (abspath, QUOTA_LIMIT_OBJECTS_KEY);
   3   6102  xlators/mgmt/glusterd/src/glusterd-utils.c 
<>
 sys_lremovexattr (path, "trusted.glusterfs.test");
   4 80  xlators/storage/posix/src/posix-handle.h <>
 op_ret = sys_lremovexattr (path, key); \
   5   5026  xlators/storage/posix/src/posix.c <<_posix_remove_xattr>>
 op_ret 

Re: [Gluster-users] Gluster failure due to "0-management: Lock not released for "

2017-07-07 Thread Victor Nomura
It’s working again! After countless hours trying to get it fixed, I just redid 
everything and tested to see what caused Gluster to fail.

 

The problem went away and there no more locks after I disabled jumbo frames and 
changed the MTU back to 1500.  If the MTU is set to 9000, Gluster was dead.  
Why? All other networking functions were fine.

 

Regards,

 

Victor Nomura

 

 

From: Atin Mukherjee [mailto:amukh...@redhat.com] 
Sent: July-04-17 10:07 PM
To: Victor Nomura
Cc: gluster-users
Subject: Re: [Gluster-users] Gluster failure due to "0-management: Lock not 
released for "

 

By any chance are you having any redundant peer entries in 
/var/lib/glusterd/peers directory? Can you please share the content of this 
folder from all the nodes?

 

On Tue, Jul 4, 2017 at 11:55 PM, Victor Nomura  wrote:

Specifically, I must stop glusterfs-server service on the other nodes in order 
to perform any gluster commands on any node.

 

From: Victor Nomura [mailto:vic...@mezine.com] 
Sent: July-04-17 9:41 AM
To: 'Atin Mukherjee'
Cc: 'gluster-users'
Subject: RE: [Gluster-users] Gluster failure due to "0-management: Lock not 
released for "

 

The nodes have all been rebooted numerous times with no difference in outcome.  
The nodes are all connected to the same switch and I also replaced it to see if 
made any difference.

 

There is no issues with connectivity network wise and no firewall in place 
between the nodes.  

 

I can’t do a gluster volume status without it timing out the moment the other 2 
nodes are connected to the switch. Which is odd.   With one node turned on and 
the others off, I can perform some volume commands but the moment any one of 
the others are connected,  a lot of commands just timeout. There’s no IP 
address conflict or anything of that nature either.

 

Seems nothing can resolve the locks.  Is there a manual  way to resolve the 
locks?

 

Regards,

 

Victor

 

 

 

 

From: Atin Mukherjee [mailto:amukh...@redhat.com] 
Sent: June-30-17 3:40 AM


To: Victor Nomura
Cc: gluster-users
Subject: Re: [Gluster-users] Gluster failure due to "0-management: Lock not 
released for "

 

 

On Thu, 29 Jun 2017 at 22:51, Victor Nomura  wrote:

Thanks for the reply.  What would be the best course of action?  The data on 
the volume isn’t important right now but I’m worried when our setup goes to 
production we don’t have the same situation and really need to recover our 
Gluster setup.

 

I’m assuming that to redo is to delete everything in the /var/lib/glusterd 
directory on each of the nodes and recreate the volume again. Essentially 
starting over.  If I leave the mount points the same and keep the data 
intact will the files still be there and accessible after? (I don’t delete the 
data on the bricks)

 

I dont think there is anything wrong at gluster stack. If you cross check the 
n/w layer and make sure its up all the time then restarting glusterd on all the 
nodes should resolve the stale locks.

 

 

Regards,

 

Victor Nomura

 

From: Atin Mukherjee [mailto:amukh...@redhat.com] 
Sent: June-27-17 12:29 AM


To: Victor Nomura
Cc: gluster-users

Subject: Re: [Gluster-users] Gluster failure due to "0-management: Lock not 
released for "

 

I had looked at the logs shared by Victor privately and it seems to be there is 
a N/W glitch in the cluster which is causing the glusterd to lose its 
connection with other peers and as a side effect to this, lot of rpc requests 
are getting bailed out resulting glusterd to end up into a stale lock and hence 
you see that some of the commands failed with "another transaction is in 
progress or locking failed."

Some examples of the symptom highlighted:

[2017-06-21 23:02:03.826858] E [rpc-clnt.c:200:call_bail] 0-management: bailing 
out frame type(Peer mgmt) op(--(2)) xid = 0x4 sent = 2017-06-21 
22:52:02.719068. timeout = 600 for 192.168.150.53:24007
[2017-06-21 23:02:03.826888] E [rpc-clnt.c:200:call_bail] 0-management: bailing 
out frame type(Peer mgmt) op(--(2)) xid = 0x4 sent = 2017-06-21 
22:52:02.716782. timeout = 600 for 192.168.150.52:24007
[2017-06-21 23:02:53.836936] E [rpc-clnt.c:200:call_bail] 0-management: bailing 
out frame type(glusterd mgmt v3) op(--(1)) xid = 0x5 sent = 2017-06-21 
22:52:47.909169. timeout = 600 for 192.168.150.53:24007
[2017-06-21 23:02:53.836991] E [MSGID: 106116] 
[glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 
gfsnode3. Please check log file for details.
[2017-06-21 23:02:53.837016] E [rpc-clnt.c:200:call_bail] 0-management: bailing 
out frame type(glusterd mgmt v3) op(--(1)) xid = 0x5 sent = 2017-06-21 
22:52:47.909175. timeout = 600 for 192.168.150.52:24007

I'd like you to request to first look at the N/W layer and rectify the problems.

 

 

On Thu, Jun 22, 2017 at 9:30 PM, Atin Mukherjee  wrote:

Could you attach glusterd.log and cmd_history.log files from all the nodes?

 

On Wed, Jun 21, 2017 at 11:40 

Re: [Gluster-users] Ganesha "Failed to create client in recovery dir" in logs

2017-07-07 Thread Soumya Koduri



On 07/07/2017 11:36 PM, Renaud Fortier wrote:

Hi all,

I have this entry in ganesha.log file on server when mounting the volume
on client :



« GLUSTER-NODE3 : ganesha.nfsd-54084[work-27] nfs4_add_clid :CLIENT ID
:EVENT :Failed to create client in recovery dir
(/var/lib/nfs/ganesha/v4recov/node0/:::192.168.2.152-(24:Linux
NFSv4.2 client-host-name)), errno=2 »


This directory is used to store client IP for state recovery post server 
crash. To fix this, please add below option in the ganesha.conf file and 
then restart nfs-ganesha servers on all the nodes.


NFS_CORE_PARAM {
clustered = false;
}

Thanks,
Soumya





But everything seems to work as expected without any other errors (so far).



Gluster 3.10

Ganesha 2.5



Should I worry about this ?



Thank you

Renaud





___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Ganesha "Failed to create client in recovery dir" in logs

2017-07-07 Thread Renaud Fortier
Hi all,
I have this entry in ganesha.log file on server when mounting the volume on 
client :

< GLUSTER-NODE3 : ganesha.nfsd-54084[work-27] nfs4_add_clid :CLIENT ID :EVENT 
:Failed to create client in recovery dir 
(/var/lib/nfs/ganesha/v4recov/node0/:::192.168.2.152-(24:Linux NFSv4.2 
client-host-name)), errno=2 >

But everything seems to work as expected without any other errors (so far).

Gluster 3.10
Ganesha 2.5

Should I worry about this ?

Thank you
Renaud

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2017-07-07 Thread Soumya Koduri

Hi,

On 07/07/2017 06:16 AM, Pat Haley wrote:


Hi All,

A follow-up question.  I've been looking at various pages on nfs-ganesha
& gluster.  Is there a version of nfs-ganesha that is recommended for
use with

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)


For glusterfs 3.7, nfs-ganesha-2.3-* version can be used.

I see the packages built in centos7 storage sig [1] but not for centos6. 
Request Niels to comment.





Thanks

Pat


On 07/05/2017 11:36 AM, Pat Haley wrote:


Hi Soumya,

(1) In http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/
I've placed the following 2 log files

etc-glusterfs-glusterd.vol.log
gdata.log

The first has repeated messages about nfs disconnects.  The second had
the .log name (but not much information).



Hmm yeah..weird ..there are not much logs in fuse mnt log file.


(2) About the gluster-NFS native server:  do you know where we can
find documentation on how to use/install it?  We haven't had success
in our searches.



Till glusterfs-3.7, gluster-NFS (gNFS) gets enabled by default. The only 
requirement is that kernel-NFS has to be disabled for gluster-NFS to 
come up. Please disable kernel-NFS server and restart glusterd to start 
gNFS. In case of any issues with starting gNFS server, please look at 
/var/log/glusterfs/nfs.log.


Thanks,
Soumya


[1] https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-3.7/
[2] https://buildlogs.centos.org/centos/6/storage/x86_64/gluster-3.7/


Thanks

Pat


On 07/04/2017 05:01 AM, Soumya Koduri wrote:



On 07/03/2017 09:01 PM, Pat Haley wrote:


Hi Soumya,

When I originally did the tests I ran tcpdump on the client.

I have rerun the tests, doing tcpdump on the server

tcpdump -i any -nnSs 0 host 172.16.1.121 -w /root/capture_nfsfail.pcap

The results are in the same place

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch experiment
capture_nfssucceed.pcap  has the results from the successful touch
experiment

The brick log files are there too.


Thanks for sharing. Looks like the error is not generated
@gluster-server side. The permission denied error was caused by
either kNFS or by fuse-mnt process or probably by the combination.

To check fuse-mnt logs, please look at
/var/log/glusterfs/.log

For eg.: if you have fuse mounted the gluster volume at /mnt/fuse-mnt
and exported it via kNFS, the log location for that fuse_mnt shall be
at /var/log/glusterfs/mnt-fuse-mnt.log


Also why not switch to either gluster-NFS native server or
NFS-Ganesha instead of using kNFS, as they are recommended NFS
servers to use with gluster?

Thanks,
Soumya



I believe we are using kernel-NFS exporting a fuse mounted gluster
volume.  I am having Steve confirm this.  I tried to find the fuse-mnt
logs but failed.  Where should I look for them?

Thanks

Pat



On 07/03/2017 07:58 AM, Soumya Koduri wrote:



On 06/30/2017 07:56 PM, Pat Haley wrote:


Hi,

I was wondering if there were any additional test we could perform to
help debug the group write-permissions issue?


Sorry for the delay. Please find response inline --



Thanks

Pat


On 06/27/2017 12:29 PM, Pat Haley wrote:


Hi Soumya,

One example, we have a common working directory dri_fleat in the
gluster volume

drwxrwsr-x 22 root dri_fleat 4.0K May  1 15:14 dri_fleat

my user (phaley) does not own that directory but is a member of the
group  dri_fleat and should have write permissions. When I go to the
nfs-mounted version and try to use the touch command I get the
following

ibfdr-compute-0-4(dri_fleat)% touch dum
touch: cannot touch `dum': Permission denied

One of the sub-directories under dri_fleat is "test" which phaley
owns

drwxrwsr-x  2 phaley   dri_fleat 4.0K May  1 15:16 test

Under this directory (mounted via nfs) user phaley can write

ibfdr-compute-0-4(test)% touch dum
ibfdr-compute-0-4(test)%

I have put the packet captures in

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch
experiment
capture_nfssucceed.pcap  has the results from the successful touch
experiment

The command I used for these was

tcpdump -i ib0 -nnSs 0 host 172.16.1.119 -w
/root/capture_nfstest.pcap


I hope these pkts were captured on the node where NFS server is
running. Could you please use '-i any' as I do not see glusterfs
traffic in the tcpdump.

Also looks like NFS v4 is used between client & nfs server. Are you
using kernel-NFS here (i.e, kernel-NFS exporting fuse mounted gluster
volume)?
If that is the case please capture fuse-mnt logs as well. This error
may well be coming from kernel-NFS itself before the request is sent
to fuse-mnt process.

FWIW, we have below option -

Option: server.manage-gids
Default Value: off
Description: Resolve groups on the server-side.

I haven't looked into what this option exactly does. But it may worth
testing with this option on.

Thanks,
Soumya




The brick log files 

[Gluster-users] Gluster 3.11 on ubuntu 16.04 not working

2017-07-07 Thread Christiane Baier

Hi There,

we have a problem with a fresh installation of gluster 3.11 on a ubuntu 
16.04 server.


we have made the installaton straight forward like it ist described on 
the gluster.org website.




in  fstab is:
/dev/sdb1 /gluster xfs defaults 0 0
knoten5:/gv0 /glusterfs glusterfs defaults,_netdev,acl,selinux 0 0

after reboot /gluster is mounted
/glusterfs not
with mount -a mounting is possible.

gluster peer status shows
Number of Peers: 1

Hostname: knoten5
Uuid: 996c9b7b-9913-4f0f-a0e2-387fbd970129
State: Peer in Cluster (Connected)

Network connectivity is okay

gluster volume info shows
Volume Name: gv0
Type: Replicate
Volume ID: 0e049b18-9fb7-4554-a4b7-b7413753af3a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: knoten4:/gluster/brick
Brick2: knoten5:/gluster/brick
Options Reconfigured:
nfs.disable: on
transport.address-family: inet

glusterfs-server is up an running.

But if I put a file in /glusterfs it won't show up on the other system?
What is wrong.

By the way, we have the same problems on a ubuntu 14.04 server with an 
older gluster (since two days).


Kind Regards
Chris
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

2017-07-07 Thread Pranith Kumar Karampuri
On Fri, Jul 7, 2017 at 9:25 PM, Ankireddypalle Reddy 
wrote:

> 3.7.19
>

These are the only callers for removexattr and only _posix_remove_xattr has
the potential to do removexattr as posix_removexattr already makes sure
that it is not gfid/volume-id. And surprise surprise _posix_remove_xattr
happens only from healing code of afr/ec. And this can only happen if the
source brick doesn't have gfid, which doesn't seem to match with the
situation you explained.

   #   line  filename / context / line
   1   1234  xlators/mgmt/glusterd/src/glusterd-quota.c
<>
 ret = sys_lremovexattr (abspath, QUOTA_LIMIT_KEY);
   2   1243  xlators/mgmt/glusterd/src/glusterd-quota.c
<>
 ret = sys_lremovexattr (abspath, QUOTA_LIMIT_OBJECTS_KEY);
   3   6102  xlators/mgmt/glusterd/src/glusterd-utils.c
<>
 sys_lremovexattr (path, "trusted.glusterfs.test");
   4 80  xlators/storage/posix/src/posix-handle.h <>
 op_ret = sys_lremovexattr (path, key); \
   5   5026  xlators/storage/posix/src/posix.c <<_posix_remove_xattr>>
 op_ret = sys_lremovexattr (filler->real_path, key);
   6   5101  xlators/storage/posix/src/posix.c <>
 op_ret = sys_lremovexattr (real_path, name);
   7   6811  xlators/storage/posix/src/posix.c <>
 sys_lremovexattr (dir_data->data, "trusted.glusterfs.test");

So there are only two possibilities:
1) Source directory in ec/afr doesn't have gfid
2) Something else removed these xattrs.

What is your volume info? May be that will give more clues.

 PS: sys_fremovexattr is called only from posix_fremovexattr(), so that
doesn't seem to be the culprit as it also have checks to guard against
gfid/volume-id removal.


>
> Thanks and Regards,
>
> Ram
>
> *From:* Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
> *Sent:* Friday, July 07, 2017 11:54 AM
>
> *To:* Ankireddypalle Reddy
> *Cc:* Gluster Devel (gluster-de...@gluster.org); gluster-users@gluster.org
> *Subject:* Re: [Gluster-devel] gfid and volume-id extended attributes lost
>
>
>
>
>
>
>
> On Fri, Jul 7, 2017 at 9:20 PM, Ankireddypalle Reddy 
> wrote:
>
> Pranith,
>
>  Thanks for looking in to the issue. The bricks were
> mounted after the reboot. One more thing that I noticed was when the
> attributes were manually set when glusterd was up then on starting the
> volume the attributes were again lost. Had to stop glusterd set attributes
> and then start glusterd. After that the volume start succeeded.
>
>
>
> Which version is this?
>
>
>
>
>
> Thanks and Regards,
>
> Ram
>
>
>
> *From:* Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
> *Sent:* Friday, July 07, 2017 11:46 AM
> *To:* Ankireddypalle Reddy
> *Cc:* Gluster Devel (gluster-de...@gluster.org); gluster-users@gluster.org
> *Subject:* Re: [Gluster-devel] gfid and volume-id extended attributes lost
>
>
>
> Did anything special happen on these two bricks? It can't happen in the
> I/O path:
> posix_removexattr() has:
>   0 if (!strcmp (GFID_XATTR_KEY, name))
> {
>
>
>   1 gf_msg (this->name, GF_LOG_WARNING, 0,
> P_MSG_XATTR_NOT_REMOVED,
>   2 "Remove xattr called on gfid for file %s",
> real_path);
>   3 op_ret = -1;
>
>   4 goto out;
>
>   5 }
>
>   6 if (!strcmp (GF_XATTR_VOL_ID_KEY, name))
> {
>   7 gf_msg (this->name, GF_LOG_WARNING, 0,
> P_MSG_XATTR_NOT_REMOVED,
>   8 "Remove xattr called on volume-id for file
> %s",
>   9 real_path);
>
>  10 op_ret = -1;
>
>  11 goto out;
>
>  12 }
>
> I just found that op_errno is not set correctly, but it can't happen in
> the I/O path, so self-heal/rebalance are off the hook.
>
> I also grepped for any removexattr of trusted.gfid from glusterd and
> didn't find any.
>
> So one thing that used to happen was that sometimes when machines reboot,
> the brick mounts wouldn't happen and this would lead to absence of both
> trusted.gfid and volume-id. So at the moment this is my wild guess.
>
>
>
>
>
> On Fri, Jul 7, 2017 at 8:39 PM, Ankireddypalle Reddy 
> wrote:
>
> Hi,
>
>We faced an issue in the production today. We had to stop the
> volume and reboot all the servers in the cluster.  Once the servers
> rebooted starting of the volume failed because the following extended
> attributes were not present on all the bricks on 2 servers.
>
> 1)  trusted.gfid
>
> 2)  trusted.glusterfs.volume-id
>
>
>
> We had to manually set these extended attributes to start the volume.  Are
> there any such known issues.
>
>
>
> Thanks and Regards,
>
> Ram
>
> ***Legal Disclaimer***
>
> "This communication may contain confidential and privileged material for
> the
>
> sole use of the intended recipient. Any unauthorized review, use or
> distribution
>
> by 

Re: [Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

2017-07-07 Thread Ankireddypalle Reddy
3.7.19

Thanks and Regards,
Ram
From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
Sent: Friday, July 07, 2017 11:54 AM
To: Ankireddypalle Reddy
Cc: Gluster Devel (gluster-de...@gluster.org); gluster-users@gluster.org
Subject: Re: [Gluster-devel] gfid and volume-id extended attributes lost



On Fri, Jul 7, 2017 at 9:20 PM, Ankireddypalle Reddy 
> wrote:
Pranith,
 Thanks for looking in to the issue. The bricks were mounted 
after the reboot. One more thing that I noticed was when the attributes were 
manually set when glusterd was up then on starting the volume the attributes 
were again lost. Had to stop glusterd set attributes and then start glusterd. 
After that the volume start succeeded.

Which version is this?


Thanks and Regards,
Ram

From: Pranith Kumar Karampuri 
[mailto:pkara...@redhat.com]
Sent: Friday, July 07, 2017 11:46 AM
To: Ankireddypalle Reddy
Cc: Gluster Devel 
(gluster-de...@gluster.org); 
gluster-users@gluster.org
Subject: Re: [Gluster-devel] gfid and volume-id extended attributes lost

Did anything special happen on these two bricks? It can't happen in the I/O 
path:
posix_removexattr() has:
  0 if (!strcmp (GFID_XATTR_KEY, name)) {
  1 gf_msg (this->name, GF_LOG_WARNING, 0, 
P_MSG_XATTR_NOT_REMOVED,
  2 "Remove xattr called on gfid for file %s", 
real_path);
  3 op_ret = -1;
  4 goto out;
  5 }
  6 if (!strcmp (GF_XATTR_VOL_ID_KEY, name)) {
  7 gf_msg (this->name, GF_LOG_WARNING, 0, 
P_MSG_XATTR_NOT_REMOVED,
  8 "Remove xattr called on volume-id for file %s",
  9 real_path);
 10 op_ret = -1;
 11 goto out;
 12 }
I just found that op_errno is not set correctly, but it can't happen in the I/O 
path, so self-heal/rebalance are off the hook.
I also grepped for any removexattr of trusted.gfid from glusterd and didn't 
find any.
So one thing that used to happen was that sometimes when machines reboot, the 
brick mounts wouldn't happen and this would lead to absence of both 
trusted.gfid and volume-id. So at the moment this is my wild guess.


On Fri, Jul 7, 2017 at 8:39 PM, Ankireddypalle Reddy 
> wrote:
Hi,
   We faced an issue in the production today. We had to stop the volume and 
reboot all the servers in the cluster.  Once the servers rebooted starting of 
the volume failed because the following extended attributes were not present on 
all the bricks on 2 servers.

1)  trusted.gfid

2)  trusted.glusterfs.volume-id

We had to manually set these extended attributes to start the volume.  Are 
there any such known issues.

Thanks and Regards,
Ram
***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**

___
Gluster-devel mailing list
gluster-de...@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel



--
Pranith
***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**



--
Pranith
***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

2017-07-07 Thread Pranith Kumar Karampuri
On Fri, Jul 7, 2017 at 9:20 PM, Ankireddypalle Reddy 
wrote:

> Pranith,
>
>  Thanks for looking in to the issue. The bricks were
> mounted after the reboot. One more thing that I noticed was when the
> attributes were manually set when glusterd was up then on starting the
> volume the attributes were again lost. Had to stop glusterd set attributes
> and then start glusterd. After that the volume start succeeded.
>

Which version is this?


>
>
> Thanks and Regards,
>
> Ram
>
>
>
> *From:* Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
> *Sent:* Friday, July 07, 2017 11:46 AM
> *To:* Ankireddypalle Reddy
> *Cc:* Gluster Devel (gluster-de...@gluster.org); gluster-users@gluster.org
> *Subject:* Re: [Gluster-devel] gfid and volume-id extended attributes lost
>
>
>
> Did anything special happen on these two bricks? It can't happen in the
> I/O path:
> posix_removexattr() has:
>   0 if (!strcmp (GFID_XATTR_KEY, name))
> {
>
>
>   1 gf_msg (this->name, GF_LOG_WARNING, 0,
> P_MSG_XATTR_NOT_REMOVED,
>   2 "Remove xattr called on gfid for file %s",
> real_path);
>   3 op_ret = -1;
>
>   4 goto out;
>
>   5 }
>
>   6 if (!strcmp (GF_XATTR_VOL_ID_KEY, name))
> {
>   7 gf_msg (this->name, GF_LOG_WARNING, 0,
> P_MSG_XATTR_NOT_REMOVED,
>   8 "Remove xattr called on volume-id for file
> %s",
>   9 real_path);
>
>  10 op_ret = -1;
>
>  11 goto out;
>
>  12 }
>
> I just found that op_errno is not set correctly, but it can't happen in
> the I/O path, so self-heal/rebalance are off the hook.
>
> I also grepped for any removexattr of trusted.gfid from glusterd and
> didn't find any.
>
> So one thing that used to happen was that sometimes when machines reboot,
> the brick mounts wouldn't happen and this would lead to absence of both
> trusted.gfid and volume-id. So at the moment this is my wild guess.
>
>
>
>
>
> On Fri, Jul 7, 2017 at 8:39 PM, Ankireddypalle Reddy 
> wrote:
>
> Hi,
>
>We faced an issue in the production today. We had to stop the
> volume and reboot all the servers in the cluster.  Once the servers
> rebooted starting of the volume failed because the following extended
> attributes were not present on all the bricks on 2 servers.
>
> 1)  trusted.gfid
>
> 2)  trusted.glusterfs.volume-id
>
>
>
> We had to manually set these extended attributes to start the volume.  Are
> there any such known issues.
>
>
>
> Thanks and Regards,
>
> Ram
>
> ***Legal Disclaimer***
>
> "This communication may contain confidential and privileged material for
> the
>
> sole use of the intended recipient. Any unauthorized review, use or
> distribution
>
> by others is strictly prohibited. If you have received the message by
> mistake,
>
> please advise the sender by reply email and delete the message. Thank you."
>
> **
>
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
>
>
> --
>
> Pranith
> ***Legal Disclaimer***
> "This communication may contain confidential and privileged material for
> the
> sole use of the intended recipient. Any unauthorized review, use or
> distribution
> by others is strictly prohibited. If you have received the message by
> mistake,
> please advise the sender by reply email and delete the message. Thank you."
> **
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

2017-07-07 Thread Ankireddypalle Reddy
Pranith,
 Thanks for looking in to the issue. The bricks were mounted 
after the reboot. One more thing that I noticed was when the attributes were 
manually set when glusterd was up then on starting the volume the attributes 
were again lost. Had to stop glusterd set attributes and then start glusterd. 
After that the volume start succeeded.

Thanks and Regards,
Ram

From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
Sent: Friday, July 07, 2017 11:46 AM
To: Ankireddypalle Reddy
Cc: Gluster Devel (gluster-de...@gluster.org); gluster-users@gluster.org
Subject: Re: [Gluster-devel] gfid and volume-id extended attributes lost

Did anything special happen on these two bricks? It can't happen in the I/O 
path:
posix_removexattr() has:
  0 if (!strcmp (GFID_XATTR_KEY, name)) {
  1 gf_msg (this->name, GF_LOG_WARNING, 0, 
P_MSG_XATTR_NOT_REMOVED,
  2 "Remove xattr called on gfid for file %s", 
real_path);
  3 op_ret = -1;
  4 goto out;
  5 }
  6 if (!strcmp (GF_XATTR_VOL_ID_KEY, name)) {
  7 gf_msg (this->name, GF_LOG_WARNING, 0, 
P_MSG_XATTR_NOT_REMOVED,
  8 "Remove xattr called on volume-id for file %s",
  9 real_path);
 10 op_ret = -1;
 11 goto out;
 12 }
I just found that op_errno is not set correctly, but it can't happen in the I/O 
path, so self-heal/rebalance are off the hook.
I also grepped for any removexattr of trusted.gfid from glusterd and didn't 
find any.
So one thing that used to happen was that sometimes when machines reboot, the 
brick mounts wouldn't happen and this would lead to absence of both 
trusted.gfid and volume-id. So at the moment this is my wild guess.


On Fri, Jul 7, 2017 at 8:39 PM, Ankireddypalle Reddy 
> wrote:
Hi,
   We faced an issue in the production today. We had to stop the volume and 
reboot all the servers in the cluster.  Once the servers rebooted starting of 
the volume failed because the following extended attributes were not present on 
all the bricks on 2 servers.

1)  trusted.gfid

2)  trusted.glusterfs.volume-id

We had to manually set these extended attributes to start the volume.  Are 
there any such known issues.

Thanks and Regards,
Ram
***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**

___
Gluster-devel mailing list
gluster-de...@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel



--
Pranith
***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

2017-07-07 Thread Pranith Kumar Karampuri
On Fri, Jul 7, 2017 at 9:15 PM, Pranith Kumar Karampuri  wrote:

> Did anything special happen on these two bricks? It can't happen in the
> I/O path:
> posix_removexattr() has:
>   0 if (!strcmp (GFID_XATTR_KEY, name))
> {
>
>
>   1 gf_msg (this->name, GF_LOG_WARNING, 0,
> P_MSG_XATTR_NOT_REMOVED,
>   2 "Remove xattr called on gfid for file %s",
> real_path);
>   3 op_ret = -1;
>
>   4 goto out;
>
>   5 }
>
>   6 if (!strcmp (GF_XATTR_VOL_ID_KEY, name))
> {
>   7 gf_msg (this->name, GF_LOG_WARNING, 0,
> P_MSG_XATTR_NOT_REMOVED,
>   8 "Remove xattr called on volume-id for file
> %s",
>   9 real_path);
>
>  10 op_ret = -1;
>
>  11 goto out;
>
>  12 }
>
> I just found that op_errno is not set correctly, but it can't happen in
> the I/O path, so self-heal/rebalance are off the hook.
>
> I also grepped for any removexattr of trusted.gfid from glusterd and
> didn't find any.
>
> So one thing that used to happen was that sometimes when machines reboot,
> the brick mounts wouldn't happen and this would lead to absence of both
> trusted.gfid and volume-id. So at the moment this is my wild guess.
>

Fix for this was to mount the bricks. But considering that you guys did
setting of the xattrs instead, I am guessing the other data was intact and
only these particular xattrs were missing? I wonder what new problem this
is.


>
>
> On Fri, Jul 7, 2017 at 8:39 PM, Ankireddypalle Reddy  > wrote:
>
>> Hi,
>>
>>We faced an issue in the production today. We had to stop the
>> volume and reboot all the servers in the cluster.  Once the servers
>> rebooted starting of the volume failed because the following extended
>> attributes were not present on all the bricks on 2 servers.
>>
>> 1)  trusted.gfid
>>
>> 2)  trusted.glusterfs.volume-id
>>
>>
>>
>> We had to manually set these extended attributes to start the volume.
>> Are there any such known issues.
>>
>>
>>
>> Thanks and Regards,
>>
>> Ram
>> ***Legal Disclaimer***
>> "This communication may contain confidential and privileged material for
>> the
>> sole use of the intended recipient. Any unauthorized review, use or
>> distribution
>> by others is strictly prohibited. If you have received the message by
>> mistake,
>> please advise the sender by reply email and delete the message. Thank
>> you."
>> **
>>
>> ___
>> Gluster-devel mailing list
>> gluster-de...@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>
>
>
> --
> Pranith
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

2017-07-07 Thread Pranith Kumar Karampuri
Did anything special happen on these two bricks? It can't happen in the I/O
path:
posix_removexattr() has:
  0 if (!strcmp (GFID_XATTR_KEY, name))
{

  1 gf_msg (this->name, GF_LOG_WARNING, 0,
P_MSG_XATTR_NOT_REMOVED,
  2 "Remove xattr called on gfid for file %s",
real_path);
  3 op_ret =
-1;
  4 goto
out;
  5
}
  6 if (!strcmp (GF_XATTR_VOL_ID_KEY, name))
{
  7 gf_msg (this->name, GF_LOG_WARNING, 0,
P_MSG_XATTR_NOT_REMOVED,
  8 "Remove xattr called on volume-id for file
%s",
  9
real_path);
 10 op_ret =
-1;
 11 goto
out;
 12 }

I just found that op_errno is not set correctly, but it can't happen in the
I/O path, so self-heal/rebalance are off the hook.

I also grepped for any removexattr of trusted.gfid from glusterd and didn't
find any.

So one thing that used to happen was that sometimes when machines reboot,
the brick mounts wouldn't happen and this would lead to absence of both
trusted.gfid and volume-id. So at the moment this is my wild guess.


On Fri, Jul 7, 2017 at 8:39 PM, Ankireddypalle Reddy 
wrote:

> Hi,
>
>We faced an issue in the production today. We had to stop the
> volume and reboot all the servers in the cluster.  Once the servers
> rebooted starting of the volume failed because the following extended
> attributes were not present on all the bricks on 2 servers.
>
> 1)  trusted.gfid
>
> 2)  trusted.glusterfs.volume-id
>
>
>
> We had to manually set these extended attributes to start the volume.  Are
> there any such known issues.
>
>
>
> Thanks and Regards,
>
> Ram
> ***Legal Disclaimer***
> "This communication may contain confidential and privileged material for
> the
> sole use of the intended recipient. Any unauthorized review, use or
> distribution
> by others is strictly prohibited. If you have received the message by
> mistake,
> please advise the sender by reply email and delete the message. Thank you."
> **
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] gfid and volume-id extended attributes lost

2017-07-07 Thread Ankireddypalle Reddy
Hi,
   We faced an issue in the production today. We had to stop the volume and 
reboot all the servers in the cluster.  Once the servers rebooted starting of 
the volume failed because the following extended attributes were not present on 
all the bricks on 2 servers.

1)  trusted.gfid

2)  trusted.glusterfs.volume-id

We had to manually set these extended attributes to start the volume.  Are 
there any such known issues.

Thanks and Regards,
Ram
***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] I/O error for one folder within the mountpoint

2017-07-07 Thread Florian Leleu
Thank you Ravi, after checking gfid within the brick I think someone
made modification inside the brick and not inside the mountpoint ...

Well, I'll try to fix it all it's all within my hands.

Thanks again, have a nice day.

 
Le 07/07/2017 à 12:28, Ravishankar N a écrit :
> On 07/07/2017 03:39 PM, Florian Leleu wrote:
>>
>> I guess you're right aboug gfid, I got that:
>>
>> [2017-07-07 07:35:15.197003] W [MSGID: 108008]
>> [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
>> 0-applicatif-replicate-0: GFID mismatch for
>> /snooper
>> b9222041-72dd-43a3-b0ab-4169dbd9a87f on applicatif-client-1 and
>> 60056f98-20f8-4949-a4ae-81cc1a139147 on applicatif-client-0
>>
>> Can you tell me how can I fix that ?If that helps I don't mind
>> deleting the whole folder snooper, I have backup.
>>
>
> The steps listed in "Fixing Directory entry split-brain:" of
> https://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/ 
> should give you an idea. It is for files whose gfids mismatch but the
> steps are similar for directories too.
> If the contents of the snooper is same on all bricks ,  you could also
> try directly deleting the directory from one of the bricks and
> immediately doing an `ls snooper` from the mount to trigger heals to
> recreate the entries.
> Hope this helps
> Ravi
>>
>> Thanks.
>>
>>
>> Le 07/07/2017 à 11:54, Ravishankar N a écrit :
>>> What does the mount log say when you get the EIO error on snooper?
>>> Check if there is a gfid mismatch on snooper directory or the files
>>> under it for all 3 bricks. In any case the mount log or the
>>> glustershd.log of the 3 nodes for the gfids you listed below should
>>> give you some idea on why the files aren't healed.
>>> Thanks.
>>>
>>> On 07/07/2017 03:10 PM, Florian Leleu wrote:

 Hi Ravi,

 thanks for your answer, sure there you go:

 # gluster volume heal applicatif info
 Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
 
 
 
 
 
 
 Status: Connected
 Number of entries: 6

 Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 Status: Connected
 Number of entries: 29

 Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
 
 
 
 
 
 
 Status: Connected
 Number of entries: 6


 # gluster volume heal applicatif info split-brain
 Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
 Status: Connected
 Number of entries in split-brain: 0

 Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
 Status: Connected
 Number of entries in split-brain: 0

 Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
 Status: Connected
 Number of entries in split-brain: 0

 Doesn't it seem odd that the first command give some different output ?

 Le 07/07/2017 à 11:31, Ravishankar N a écrit :
> On 07/07/2017 01:23 PM, Florian Leleu wrote:
>>
>> Hello everyone,
>>
>> first time on the ML so excuse me if I'm not following well the
>> rules, I'll improve if I get comments.
>>
>> We got one volume "applicatif" on three nodes (2 and 1 arbiter),
>> each following command was made on node ipvr8.xxx:
>>
>> # gluster volume info applicatif
>>  
>> Volume Name: applicatif
>> Type: Replicate
>> Volume ID: ac222863-9210-4354-9636-2c822b332504
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: ipvr7.xxx:/mnt/gluster-applicatif/brick
>> Brick2: ipvr8.xxx:/mnt/gluster-applicatif/brick
>> Brick3: ipvr9.xxx:/mnt/gluster-applicatif/brick (arbiter)
>> Options Reconfigured:
>> performance.read-ahead: on
>> performance.cache-size: 1024MB
>> performance.quick-read: off
>> performance.stat-prefetch: on
>> performance.io-cache: off
>> transport.address-family: inet
>> performance.readdir-ahead: on
>> nfs.disable: off
>>
>> # gluster volume status applicatif
>> Status of volume: applicatif
>> Gluster process TCP Port  RDMA Port 
>> Online  Pid
>> --
>> Brick ipvr7.xxx:/mnt/gluster-applicatif/
>> brick   49154 0 
>> Y   2814
>> Brick ipvr8.xxx:/mnt/gluster-applicatif/
>> brick   49154 0 
>> Y   2672
>> Brick ipvr9.xxx:/mnt/gluster-applicatif/
>> brick   49154 0 
>> Y   3424
>> NFS Server on localhost 2049  0 
>> Y   26530
>> Self-heal Daemon on localhost   N/A   N/A   
>> Y   

Re: [Gluster-users] I/O error for one folder within the mountpoint

2017-07-07 Thread Ravishankar N

On 07/07/2017 03:39 PM, Florian Leleu wrote:


I guess you're right aboug gfid, I got that:

[2017-07-07 07:35:15.197003] W [MSGID: 108008] 
[afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check] 
0-applicatif-replicate-0: GFID mismatch for 
/snooper 
b9222041-72dd-43a3-b0ab-4169dbd9a87f on applicatif-client-1 and 
60056f98-20f8-4949-a4ae-81cc1a139147 on applicatif-client-0


Can you tell me how can I fix that ?If that helps I don't mind 
deleting the whole folder snooper, I have backup.




The steps listed in "Fixing Directory entry split-brain:" of 
https://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/ 
should give you an idea. It is for files whose gfids mismatch but the 
steps are similar for directories too.
If the contents of the snooper is same on all bricks ,  you could also 
try directly deleting the directory from one of the bricks and 
immediately doing an `ls snooper` from the mount to trigger heals to 
recreate the entries.

Hope this helps
Ravi


Thanks.


Le 07/07/2017 à 11:54, Ravishankar N a écrit :
What does the mount log say when you get the EIO error on snooper? 
Check if there is a gfid mismatch on snooper directory or the files 
under it for all 3 bricks. In any case the mount log or the 
glustershd.log of the 3 nodes for the gfids you listed below should 
give you some idea on why the files aren't healed.

Thanks.

On 07/07/2017 03:10 PM, Florian Leleu wrote:


Hi Ravi,

thanks for your answer, sure there you go:

# gluster volume heal applicatif info
Brick ipvr7.xxx:/mnt/gluster-applicatif/brick






Status: Connected
Number of entries: 6

Brick ipvr8.xxx:/mnt/gluster-applicatif/brick





























Status: Connected
Number of entries: 29

Brick ipvr9.xxx:/mnt/gluster-applicatif/brick






Status: Connected
Number of entries: 6


# gluster volume heal applicatif info split-brain
Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
Status: Connected
Number of entries in split-brain: 0

Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
Status: Connected
Number of entries in split-brain: 0

Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
Status: Connected
Number of entries in split-brain: 0

Doesn't it seem odd that the first command give some different output ?

Le 07/07/2017 à 11:31, Ravishankar N a écrit :

On 07/07/2017 01:23 PM, Florian Leleu wrote:


Hello everyone,

first time on the ML so excuse me if I'm not following well the 
rules, I'll improve if I get comments.


We got one volume "applicatif" on three nodes (2 and 1 arbiter), 
each following command was made on node ipvr8.xxx:


# gluster volume info applicatif

Volume Name: applicatif
Type: Replicate
Volume ID: ac222863-9210-4354-9636-2c822b332504
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ipvr7.xxx:/mnt/gluster-applicatif/brick
Brick2: ipvr8.xxx:/mnt/gluster-applicatif/brick
Brick3: ipvr9.xxx:/mnt/gluster-applicatif/brick (arbiter)
Options Reconfigured:
performance.read-ahead: on
performance.cache-size: 1024MB
performance.quick-read: off
performance.stat-prefetch: on
performance.io-cache: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: off

# gluster volume status applicatif
Status of volume: applicatif
Gluster process TCP Port RDMA Port  
Online  Pid

--
Brick ipvr7.xxx:/mnt/gluster-applicatif/
brick   49154 0  
Y   2814

Brick ipvr8.xxx:/mnt/gluster-applicatif/
brick   49154 0  
Y   2672

Brick ipvr9.xxx:/mnt/gluster-applicatif/
brick   49154 0  
Y   3424
NFS Server on localhost 2049 0  
Y   26530
Self-heal Daemon on localhost   N/A N/AY   
26538

NFS Server on ipvr9.xxx  2049 0  Y   12238
Self-heal Daemon on ipvr9.xxxN/A N/AY   12246
NFS Server on ipvr7.xxx  2049 0  Y   2234
Self-heal Daemon on ipvr7.xxxN/A N/AY   2243

Task Status of Volume applicatif
--
There are no active volume tasks

The volume is mounted with autofs (nfs) in /home/applicatif and 
one folder is "broken":


l /home/applicatif/services/
ls: cannot access /home/applicatif/services/snooper: Input/output 
error

total 16
lrwxrwxrwx  1 applicatif applicatif9 Apr  6 15:53 config -> 
../config

lrwxrwxrwx  1 applicatif applicatif7 Apr  6 15:54 .pwd -> ../.pwd
drwxr-xr-x  3 applicatif applicatif 4096 Apr 12 10:24 querybuilder
d?  ? ?  ? ?? snooper
drwxr-xr-x  3 applicatif applicatif 4096 Jul  6 02:57 snooper_new
drwxr-xr-x 16 applicatif applicatif 4096 Jul  6 02:58 snooper_old
drwxr-xr-x  4 applicatif 

Re: [Gluster-users] I/O error for one folder within the mountpoint

2017-07-07 Thread Florian Leleu
I guess you're right aboug gfid, I got that:

[2017-07-07 07:35:15.197003] W [MSGID: 108008]
[afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
0-applicatif-replicate-0: GFID mismatch for
/snooper
b9222041-72dd-43a3-b0ab-4169dbd9a87f on applicatif-client-1 and
60056f98-20f8-4949-a4ae-81cc1a139147 on applicatif-client-0

Can you tell me how can I fix that ?If that helps I don't mind deleting
the whole folder snooper, I have backup.

Thanks.


Le 07/07/2017 à 11:54, Ravishankar N a écrit :
> What does the mount log say when you get the EIO error on snooper?
> Check if there is a gfid mismatch on snooper directory or the files
> under it for all 3 bricks. In any case the mount log or the
> glustershd.log of the 3 nodes for the gfids you listed below should
> give you some idea on why the files aren't healed.
> Thanks.
>
> On 07/07/2017 03:10 PM, Florian Leleu wrote:
>>
>> Hi Ravi,
>>
>> thanks for your answer, sure there you go:
>>
>> # gluster volume heal applicatif info
>> Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
>> 
>> 
>> 
>> 
>> 
>> 
>> Status: Connected
>> Number of entries: 6
>>
>> Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> Status: Connected
>> Number of entries: 29
>>
>> Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
>> 
>> 
>> 
>> 
>> 
>> 
>> Status: Connected
>> Number of entries: 6
>>
>>
>> # gluster volume heal applicatif info split-brain
>> Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
>> Status: Connected
>> Number of entries in split-brain: 0
>>
>> Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
>> Status: Connected
>> Number of entries in split-brain: 0
>>
>> Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
>> Status: Connected
>> Number of entries in split-brain: 0
>>
>> Doesn't it seem odd that the first command give some different output ?
>>
>> Le 07/07/2017 à 11:31, Ravishankar N a écrit :
>>> On 07/07/2017 01:23 PM, Florian Leleu wrote:

 Hello everyone,

 first time on the ML so excuse me if I'm not following well the
 rules, I'll improve if I get comments.

 We got one volume "applicatif" on three nodes (2 and 1 arbiter),
 each following command was made on node ipvr8.xxx:

 # gluster volume info applicatif
  
 Volume Name: applicatif
 Type: Replicate
 Volume ID: ac222863-9210-4354-9636-2c822b332504
 Status: Started
 Snapshot Count: 0
 Number of Bricks: 1 x (2 + 1) = 3
 Transport-type: tcp
 Bricks:
 Brick1: ipvr7.xxx:/mnt/gluster-applicatif/brick
 Brick2: ipvr8.xxx:/mnt/gluster-applicatif/brick
 Brick3: ipvr9.xxx:/mnt/gluster-applicatif/brick (arbiter)
 Options Reconfigured:
 performance.read-ahead: on
 performance.cache-size: 1024MB
 performance.quick-read: off
 performance.stat-prefetch: on
 performance.io-cache: off
 transport.address-family: inet
 performance.readdir-ahead: on
 nfs.disable: off

 # gluster volume status applicatif
 Status of volume: applicatif
 Gluster process TCP Port  RDMA Port 
 Online  Pid
 --
 Brick ipvr7.xxx:/mnt/gluster-applicatif/
 brick   49154 0 
 Y   2814
 Brick ipvr8.xxx:/mnt/gluster-applicatif/
 brick   49154 0 
 Y   2672
 Brick ipvr9.xxx:/mnt/gluster-applicatif/
 brick   49154 0 
 Y   3424
 NFS Server on localhost 2049  0 
 Y   26530
 Self-heal Daemon on localhost   N/A   N/A   
 Y   26538
 NFS Server on ipvr9.xxx  2049  0 
 Y   12238
 Self-heal Daemon on ipvr9.xxxN/A   N/A   
 Y   12246
 NFS Server on ipvr7.xxx  2049  0 
 Y   2234
 Self-heal Daemon on ipvr7.xxxN/A   N/A   
 Y   2243
  
 Task Status of Volume applicatif
 --
 There are no active volume tasks

 The volume is mounted with autofs (nfs) in /home/applicatif and one
 folder is "broken":

 l /home/applicatif/services/
 ls: cannot access /home/applicatif/services/snooper: Input/output error
 total 16
 lrwxrwxrwx  1 applicatif applicatif9 Apr  6 15:53 config ->
 ../config
 lrwxrwxrwx  1 applicatif applicatif7 Apr  6 15:54 .pwd -> ../.pwd
 drwxr-xr-x  3 applicatif applicatif 4096 Apr 12 10:24 querybuilder
 d?  ? ?  ? ?? snooper
 drwxr-xr-x  3 applicatif applicatif 4096 Jul  6 02:57 snooper_new
 drwxr-xr-x 16 applicatif 

[Gluster-users] Rebalance task fails

2017-07-07 Thread Szymon Miotk
Hello everyone,


I have problem rebalancing Gluster volume.
Gluster version is 3.7.3.
My 1x3 replicated volume become full, so I've added three more bricks
to make it 2x3 and wanted to rebalance.
But every time I start rebalancing, it fails immediately.
Rebooting Gluster nodes doesn't help.

# gluster volume rebalance  gsae_artifactory_cluster_storage start
volume rebalance: gsae_artifactory_cluster_storage: success: Rebalance
on gsae_artifactory_cluster_storage has been started successfully. Use
rebalance status command to check status of the rebalance process.
ID: b22572ff-7575-4557-8317-765f7e52d445

# gluster volume rebalance  gsae_artifactory_cluster_storage status
Node Rebalanced-files
size   scanned  failures   skipped   status
run time in secs
   -  ---
---   ---   ---   ---
 --
   localhost0
0Bytes 0 0 0   failed
 0.00
 10.239.40.90
0Bytes 0 0 0   failed
 0.00
 10.239.40.80
0Bytes 0 0 0   failed
 0.00
volume rebalance: gsae_artifactory_cluster_storage: success:

The messages in logfiles mention 'failed to get index':
[2017-07-07 10:07:18.230202] E [MSGID: 106062]
[glusterd-utils.c:7997:glusterd_volume_rebalance_use_rsp_dict]
0-glusterd: failed to get index

and then the rebalance process crashes:
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 4
time of crash:
2017-07-07 10:07:23
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.3
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x92)[0x7f24de214502]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x31d)[0x7f24de23059d]
/lib/x86_64-linux-gnu/libc.so.6(+0x36d40)[0x7f24dd612d40]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11f6b)[0x7f24dd9b2f6b]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_ref+0x19)[0x7f24de234e69]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(loc_copy+0x4a)[0x7f24de21291a]
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.3/xlator/cluster/distribute.so(dht_local_init+0x4b)[0x7f24d851f51b]
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.3/xlator/cluster/distribute.so(dht_lookup+0x91)[0x7f24d8550521]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(syncop_lookup+0x1a2)[0x7f24de258fc2]
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.3/xlator/cluster/distribute.so(gf_defrag_fix_layout+0x87)[0x7f24d85289e7]
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.3/xlator/cluster/distribute.so(gf_defrag_start_crawl+0x6d3)[0x7f24d8529ce3]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(synctask_wrap+0x12)[0x7f24de255822]
/lib/x86_64-linux-gnu/libc.so.6(+0x498b0)[0x7f24dd6258b0]


Anybody has a clue how to fix 'failed to get index' error?

Thank you in advance!
Szymon Miotk
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] op-version for reset-brick (Was: Re: [ovirt-users] Upgrading HC from 4.0 to 4.1)

2017-07-07 Thread Atin Mukherjee
You'd need to allow some more time to dig into the logs. I'll try to get
back on this by Monday.

On Fri, Jul 7, 2017 at 2:23 PM, Gianluca Cecchi 
wrote:

> On Thu, Jul 6, 2017 at 3:22 PM, Gianluca Cecchi  > wrote:
>
>> On Thu, Jul 6, 2017 at 2:16 PM, Atin Mukherjee 
>> wrote:
>>
>>>
>>>
>>> On Thu, Jul 6, 2017 at 5:26 PM, Gianluca Cecchi <
>>> gianluca.cec...@gmail.com> wrote:
>>>
 On Thu, Jul 6, 2017 at 8:38 AM, Gianluca Cecchi <
 gianluca.cec...@gmail.com> wrote:

>
> Eventually I can destroy and recreate this "export" volume again with
> the old names (ovirt0N.localdomain.local) if you give me the sequence of
> commands, then enable debug and retry the reset-brick command
>
> Gianluca
>


 So it seems I was able to destroy and re-create.
 Now I see that the volume creation uses by default the new ip, so I
 reverted the hostnames roles in the commands after putting glusterd in
 debug mode on the host where I execute the reset-brick command (do I have
 to set debug for the the nodes too?)

>>>
>>> You have to set the log level to debug for glusterd instance where the
>>> commit fails and share the glusterd log of that particular node.
>>>
>>>
>>
>> Ok, done.
>>
>> Command executed on ovirt01 with timestamp "2017-07-06 13:04:12" in
>> glusterd log files
>>
>> [root@ovirt01 export]# gluster volume reset-brick export
>> gl01.localdomain.local:/gluster/brick3/export start
>> volume reset-brick: success: reset-brick start operation successful
>>
>> [root@ovirt01 export]# gluster volume reset-brick export
>> gl01.localdomain.local:/gluster/brick3/export
>> ovirt01.localdomain.local:/gluster/brick3/export commit force
>> volume reset-brick: failed: Commit failed on ovirt02.localdomain.local.
>> Please check log file for details.
>> Commit failed on ovirt03.localdomain.local. Please check log file for
>> details.
>> [root@ovirt01 export]#
>>
>> See glusterd log files for the 3 nodes in debug mode here:
>> ovirt01: https://drive.google.com/file/d/0BwoPbcrMv8mvY1RTTG
>> p3RUhScm8/view?usp=sharing
>> ovirt02: https://drive.google.com/file/d/0BwoPbcrMv8mvSVpJUH
>> NhMzhMSU0/view?usp=sharing
>> ovirt03: https://drive.google.com/file/d/0BwoPbcrMv8mvT2xiWE
>> dQVmJNb0U/view?usp=sharing
>>
>> HIH debugging
>> Gianluca
>>
>>
> Hi Atin,
> did you have time to see the logs?
> Comparing debug enabled messages with previous ones, I see these added
> lines on nodes where commit failed after running the commands
>
> gluster volume reset-brick export 
> gl01.localdomain.local:/gluster/brick3/export
> start
> gluster volume reset-brick export 
> gl01.localdomain.local:/gluster/brick3/export
> ovirt01.localdomain.local:/gluster/brick3/export commit force
>
>
> [2017-07-06 13:04:30.221872] D [MSGID: 0] 
> [glusterd-peer-utils.c:674:gd_peerinfo_find_from_hostname]
> 0-management: Friend ovirt01.localdomain.local found.. state: 3
> [2017-07-06 13:04:30.221882] D [MSGID: 0] 
> [glusterd-peer-utils.c:167:glusterd_hostname_to_uuid]
> 0-management: returning 0
> [2017-07-06 13:04:30.221888] D [MSGID: 0] 
> [glusterd-utils.c:1039:glusterd_resolve_brick]
> 0-management: Returning 0
> [2017-07-06 13:04:30.221908] D [MSGID: 0] 
> [glusterd-utils.c:998:glusterd_brickinfo_new]
> 0-management: Returning 0
> [2017-07-06 13:04:30.221915] D [MSGID: 0] [glusterd-utils.c:1195:
> glusterd_brickinfo_new_from_brick] 0-management: Returning 0
> [2017-07-06 13:04:30.222187] D [MSGID: 0] 
> [glusterd-peer-utils.c:167:glusterd_hostname_to_uuid]
> 0-management: returning 0
> [2017-07-06 13:04:30.01] D [MSGID: 0] 
> [glusterd-utils.c:1486:glusterd_volume_brickinfo_get]
> 0-management: Returning -1
> [2017-07-06 13:04:30.07] D [MSGID: 0] 
> [store.c:459:gf_store_handle_destroy]
> 0-: Returning 0
> [2017-07-06 13:04:30.42] D [MSGID: 0] [glusterd-utils.c:1512:
> glusterd_volume_brickinfo_get_by_brick] 0-glusterd: Returning -1
> [2017-07-06 13:04:30.50] D [MSGID: 0] [glusterd-replace-brick.c:416:
> glusterd_op_perform_replace_brick] 0-glusterd: Returning -1
> [2017-07-06 13:04:30.57] C [MSGID: 106074] 
> [glusterd-reset-brick.c:372:glusterd_op_reset_brick]
> 0-management: Unable to add dst-brick: 
> ovirt01.localdomain.local:/gluster/brick3/export
> to volume: export
>
>
> Does it share up more light?
>
> Thanks,
> Gianluca
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] I/O error for one folder within the mountpoint

2017-07-07 Thread Ravishankar N
What does the mount log say when you get the EIO error on snooper? Check 
if there is a gfid mismatch on snooper directory or the files under it 
for all 3 bricks. In any case the mount log or the glustershd.log of the 
3 nodes for the gfids you listed below should give you some idea on why 
the files aren't healed.

Thanks.

On 07/07/2017 03:10 PM, Florian Leleu wrote:


Hi Ravi,

thanks for your answer, sure there you go:

# gluster volume heal applicatif info
Brick ipvr7.xxx:/mnt/gluster-applicatif/brick






Status: Connected
Number of entries: 6

Brick ipvr8.xxx:/mnt/gluster-applicatif/brick





























Status: Connected
Number of entries: 29

Brick ipvr9.xxx:/mnt/gluster-applicatif/brick






Status: Connected
Number of entries: 6


# gluster volume heal applicatif info split-brain
Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
Status: Connected
Number of entries in split-brain: 0

Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
Status: Connected
Number of entries in split-brain: 0

Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
Status: Connected
Number of entries in split-brain: 0

Doesn't it seem odd that the first command give some different output ?

Le 07/07/2017 à 11:31, Ravishankar N a écrit :

On 07/07/2017 01:23 PM, Florian Leleu wrote:


Hello everyone,

first time on the ML so excuse me if I'm not following well the 
rules, I'll improve if I get comments.


We got one volume "applicatif" on three nodes (2 and 1 arbiter), 
each following command was made on node ipvr8.xxx:


# gluster volume info applicatif

Volume Name: applicatif
Type: Replicate
Volume ID: ac222863-9210-4354-9636-2c822b332504
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ipvr7.xxx:/mnt/gluster-applicatif/brick
Brick2: ipvr8.xxx:/mnt/gluster-applicatif/brick
Brick3: ipvr9.xxx:/mnt/gluster-applicatif/brick (arbiter)
Options Reconfigured:
performance.read-ahead: on
performance.cache-size: 1024MB
performance.quick-read: off
performance.stat-prefetch: on
performance.io-cache: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: off

# gluster volume status applicatif
Status of volume: applicatif
Gluster process TCP Port  RDMA Port  
Online  Pid

--
Brick ipvr7.xxx:/mnt/gluster-applicatif/
brick   49154 0  Y   
2814

Brick ipvr8.xxx:/mnt/gluster-applicatif/
brick   49154 0  Y   
2672

Brick ipvr9.xxx:/mnt/gluster-applicatif/
brick   49154 0  Y   
3424
NFS Server on localhost 2049 0  Y   
26530

Self-heal Daemon on localhost   N/A N/AY   26538
NFS Server on ipvr9.xxx  2049 0  Y   12238
Self-heal Daemon on ipvr9.xxxN/A N/AY   12246
NFS Server on ipvr7.xxx  2049 0  Y   2234
Self-heal Daemon on ipvr7.xxxN/A N/AY   2243

Task Status of Volume applicatif
--
There are no active volume tasks

The volume is mounted with autofs (nfs) in /home/applicatif and one 
folder is "broken":


l /home/applicatif/services/
ls: cannot access /home/applicatif/services/snooper: Input/output error
total 16
lrwxrwxrwx  1 applicatif applicatif9 Apr  6 15:53 config -> 
../config

lrwxrwxrwx  1 applicatif applicatif7 Apr  6 15:54 .pwd -> ../.pwd
drwxr-xr-x  3 applicatif applicatif 4096 Apr 12 10:24 querybuilder
d?  ? ?  ? ?? snooper
drwxr-xr-x  3 applicatif applicatif 4096 Jul  6 02:57 snooper_new
drwxr-xr-x 16 applicatif applicatif 4096 Jul  6 02:58 snooper_old
drwxr-xr-x  4 applicatif applicatif 4096 Jul  4 23:45 ssnooper

I checked wether there was a heal, and it seems so:

# gluster volume heal applicatif statistics heal-count
Gathering count of entries to be healed on volume applicatif has 
been successful


Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
Number of entries: 8

Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
Number of entries: 29

Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
Number of entries: 8

But actually in the brick on each server the folder "snooper" is fine.

I tried rebooting the servers, restarting gluster after killing 
every process using it but it's not working.


Has anyone already experienced that ? Any help would be nice.



Can you share the output of `gluster volume heal  info` and 
`gluster volume heal  info split-brain`? If the second 
command shows entries, please also share the getfattr output from the 
bricks for these files (getfattr -d -m . -e hex /brick/path/to/file).

-Ravi


Thanks a lot !

--

Cordialement,

  

Florian LELEU
Responsable Hosting, 

Re: [Gluster-users] I/O error for one folder within the mountpoint

2017-07-07 Thread Florian Leleu
Hi Ravi,

thanks for your answer, sure there you go:

# gluster volume heal applicatif info
Brick ipvr7.xxx:/mnt/gluster-applicatif/brick






Status: Connected
Number of entries: 6

Brick ipvr8.xxx:/mnt/gluster-applicatif/brick





























Status: Connected
Number of entries: 29

Brick ipvr9.xxx:/mnt/gluster-applicatif/brick






Status: Connected
Number of entries: 6


# gluster volume heal applicatif info split-brain
Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
Status: Connected
Number of entries in split-brain: 0

Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
Status: Connected
Number of entries in split-brain: 0

Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
Status: Connected
Number of entries in split-brain: 0

Doesn't it seem odd that the first command give some different output ?

Le 07/07/2017 à 11:31, Ravishankar N a écrit :
> On 07/07/2017 01:23 PM, Florian Leleu wrote:
>>
>> Hello everyone,
>>
>> first time on the ML so excuse me if I'm not following well the
>> rules, I'll improve if I get comments.
>>
>> We got one volume "applicatif" on three nodes (2 and 1 arbiter), each
>> following command was made on node ipvr8.xxx:
>>
>> # gluster volume info applicatif
>>  
>> Volume Name: applicatif
>> Type: Replicate
>> Volume ID: ac222863-9210-4354-9636-2c822b332504
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: ipvr7.xxx:/mnt/gluster-applicatif/brick
>> Brick2: ipvr8.xxx:/mnt/gluster-applicatif/brick
>> Brick3: ipvr9.xxx:/mnt/gluster-applicatif/brick (arbiter)
>> Options Reconfigured:
>> performance.read-ahead: on
>> performance.cache-size: 1024MB
>> performance.quick-read: off
>> performance.stat-prefetch: on
>> performance.io-cache: off
>> transport.address-family: inet
>> performance.readdir-ahead: on
>> nfs.disable: off
>>
>> # gluster volume status applicatif
>> Status of volume: applicatif
>> Gluster process TCP Port  RDMA Port 
>> Online  Pid
>> --
>> Brick ipvr7.xxx:/mnt/gluster-applicatif/
>> brick   49154 0 
>> Y   2814
>> Brick ipvr8.xxx:/mnt/gluster-applicatif/
>> brick   49154 0 
>> Y   2672
>> Brick ipvr9.xxx:/mnt/gluster-applicatif/
>> brick   49154 0 
>> Y   3424
>> NFS Server on localhost 2049  0 
>> Y   26530
>> Self-heal Daemon on localhost   N/A   N/A   
>> Y   26538
>> NFS Server on ipvr9.xxx  2049  0  Y  
>> 12238
>> Self-heal Daemon on ipvr9.xxxN/A   N/AY  
>> 12246
>> NFS Server on ipvr7.xxx  2049  0  Y  
>> 2234
>> Self-heal Daemon on ipvr7.xxxN/A   N/AY  
>> 2243
>>  
>> Task Status of Volume applicatif
>> --
>> There are no active volume tasks
>>
>> The volume is mounted with autofs (nfs) in /home/applicatif and one
>> folder is "broken":
>>
>> l /home/applicatif/services/
>> ls: cannot access /home/applicatif/services/snooper: Input/output error
>> total 16
>> lrwxrwxrwx  1 applicatif applicatif9 Apr  6 15:53 config -> ../config
>> lrwxrwxrwx  1 applicatif applicatif7 Apr  6 15:54 .pwd -> ../.pwd
>> drwxr-xr-x  3 applicatif applicatif 4096 Apr 12 10:24 querybuilder
>> d?  ? ?  ? ?? snooper
>> drwxr-xr-x  3 applicatif applicatif 4096 Jul  6 02:57 snooper_new
>> drwxr-xr-x 16 applicatif applicatif 4096 Jul  6 02:58 snooper_old
>> drwxr-xr-x  4 applicatif applicatif 4096 Jul  4 23:45 ssnooper
>>
>> I checked wether there was a heal, and it seems so:
>>
>> # gluster volume heal applicatif statistics heal-count
>> Gathering count of entries to be healed on volume applicatif has been
>> successful
>>
>> Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
>> Number of entries: 8
>>
>> Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
>> Number of entries: 29
>>
>> Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
>> Number of entries: 8
>>
>> But actually in the brick on each server the folder "snooper" is fine.
>>
>> I tried rebooting the servers, restarting gluster after killing every
>> process using it but it's not working.
>>
>> Has anyone already experienced that ? Any help would be nice.
>>
>
> Can you share the output of `gluster volume heal  info` and
> `gluster volume heal  info split-brain`? If the second
> command shows entries, please also share the getfattr output from the
> bricks for these files (getfattr -d -m . -e hex /brick/path/to/file).
> -Ravi
>>
>> Thanks a lot !
>>
>> -- 
>>
>> Cordialement,
>>
>>  
>>
>> Florian LELEU
>> Responsable Hosting, Cognix Systems
>>
>> *Rennes* | 

Re: [Gluster-users] GluserFS WORM hardlink

2017-07-07 Thread Karthik Subrahmanya
Hi,

If I did not misunderstood, you are saying that WORM is not allowing to
create hard links for the files.
Answering it based on that assumption.
If the volume level or file level WORM feature is enabled and the file is
in WORM/WORM-Retained state,
then those files should be immutable and hence we don't allow to create
hardlinks for those files.

If I did not clarify your doubt or misunderstood your question, could you
please elaborate?

Regards,
Karthik

On Fri, Jul 7, 2017 at 2:30 PM, 최두일  wrote:

> GlusterFS WORM hard links will not be created
>
> OS is CentOS7
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] I/O error for one folder within the mountpoint

2017-07-07 Thread Ravishankar N

On 07/07/2017 01:23 PM, Florian Leleu wrote:


Hello everyone,

first time on the ML so excuse me if I'm not following well the rules, 
I'll improve if I get comments.


We got one volume "applicatif" on three nodes (2 and 1 arbiter), each 
following command was made on node ipvr8.xxx:


# gluster volume info applicatif

Volume Name: applicatif
Type: Replicate
Volume ID: ac222863-9210-4354-9636-2c822b332504
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ipvr7.xxx:/mnt/gluster-applicatif/brick
Brick2: ipvr8.xxx:/mnt/gluster-applicatif/brick
Brick3: ipvr9.xxx:/mnt/gluster-applicatif/brick (arbiter)
Options Reconfigured:
performance.read-ahead: on
performance.cache-size: 1024MB
performance.quick-read: off
performance.stat-prefetch: on
performance.io-cache: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: off

# gluster volume status applicatif
Status of volume: applicatif
Gluster process TCP Port  RDMA Port 
Online  Pid

--
Brick ipvr7.xxx:/mnt/gluster-applicatif/
brick   49154 0 Y   2814
Brick ipvr8.xxx:/mnt/gluster-applicatif/
brick   49154 0 Y   2672
Brick ipvr9.xxx:/mnt/gluster-applicatif/
brick   49154 0 Y   3424
NFS Server on localhost 2049  0 Y   26530
Self-heal Daemon on localhost   N/A   N/A Y   26538
NFS Server on ipvr9.xxx  2049  0 Y   12238
Self-heal Daemon on ipvr9.xxxN/A   N/A Y   12246
NFS Server on ipvr7.xxx  2049  0 Y   2234
Self-heal Daemon on ipvr7.xxxN/A   N/A Y   2243

Task Status of Volume applicatif
--
There are no active volume tasks

The volume is mounted with autofs (nfs) in /home/applicatif and one 
folder is "broken":


l /home/applicatif/services/
ls: cannot access /home/applicatif/services/snooper: Input/output error
total 16
lrwxrwxrwx  1 applicatif applicatif9 Apr  6 15:53 config -> ../config
lrwxrwxrwx  1 applicatif applicatif7 Apr  6 15:54 .pwd -> ../.pwd
drwxr-xr-x  3 applicatif applicatif 4096 Apr 12 10:24 querybuilder
d?  ? ?  ? ?? snooper
drwxr-xr-x  3 applicatif applicatif 4096 Jul  6 02:57 snooper_new
drwxr-xr-x 16 applicatif applicatif 4096 Jul  6 02:58 snooper_old
drwxr-xr-x  4 applicatif applicatif 4096 Jul  4 23:45 ssnooper

I checked wether there was a heal, and it seems so:

# gluster volume heal applicatif statistics heal-count
Gathering count of entries to be healed on volume applicatif has been 
successful


Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
Number of entries: 8

Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
Number of entries: 29

Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
Number of entries: 8

But actually in the brick on each server the folder "snooper" is fine.

I tried rebooting the servers, restarting gluster after killing every 
process using it but it's not working.


Has anyone already experienced that ? Any help would be nice.



Can you share the output of `gluster volume heal  info` and 
`gluster volume heal  info split-brain`? If the second command 
shows entries, please also share the getfattr output from the bricks for 
these files (getfattr -d -m . -e hex /brick/path/to/file).

-Ravi


Thanks a lot !

--

Cordialement,

  

Florian LELEU
Responsable Hosting, Cognix Systems

*Rennes* | Brest | Saint-Malo | Paris
florian.le...@cognix-systems.com 

Tél. : 02 99 27 75 92


Facebook Cognix Systems 
Twitter Cognix Systems 
Logo Cognix Systems 



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] I/O error for one folder within the mountpoint

2017-07-07 Thread Florian Leleu
Hello everyone,

first time on the ML so excuse me if I'm not following well the rules,
I'll improve if I get comments.

We got one volume "applicatif" on three nodes (2 and 1 arbiter), each
following command was made on node ipvr8.xxx:

# gluster volume info applicatif
 
Volume Name: applicatif
Type: Replicate
Volume ID: ac222863-9210-4354-9636-2c822b332504
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ipvr7.xxx:/mnt/gluster-applicatif/brick
Brick2: ipvr8.xxx:/mnt/gluster-applicatif/brick
Brick3: ipvr9.xxx:/mnt/gluster-applicatif/brick (arbiter)
Options Reconfigured:
performance.read-ahead: on
performance.cache-size: 1024MB
performance.quick-read: off
performance.stat-prefetch: on
performance.io-cache: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: off

# gluster volume status applicatif
Status of volume: applicatif
Gluster process TCP Port  RDMA Port  Online  Pid
--
Brick ipvr7.xxx:/mnt/gluster-applicatif/
brick   49154 0  Y  
2814
Brick ipvr8.xxx:/mnt/gluster-applicatif/
brick   49154 0  Y  
2672
Brick ipvr9.xxx:/mnt/gluster-applicatif/
brick   49154 0  Y  
3424
NFS Server on localhost 2049  0  Y  
26530
Self-heal Daemon on localhost   N/A   N/AY  
26538
NFS Server on ipvr9.xxx  2049  0  Y   12238
Self-heal Daemon on ipvr9.xxxN/A   N/AY   12246
NFS Server on ipvr7.xxx  2049  0  Y   2234
Self-heal Daemon on ipvr7.xxxN/A   N/AY   2243
 
Task Status of Volume applicatif
--
There are no active volume tasks

The volume is mounted with autofs (nfs) in /home/applicatif and one
folder is "broken":

l /home/applicatif/services/
ls: cannot access /home/applicatif/services/snooper: Input/output error
total 16
lrwxrwxrwx  1 applicatif applicatif9 Apr  6 15:53 config -> ../config
lrwxrwxrwx  1 applicatif applicatif7 Apr  6 15:54 .pwd -> ../.pwd
drwxr-xr-x  3 applicatif applicatif 4096 Apr 12 10:24 querybuilder
d?  ? ?  ? ?? snooper
drwxr-xr-x  3 applicatif applicatif 4096 Jul  6 02:57 snooper_new
drwxr-xr-x 16 applicatif applicatif 4096 Jul  6 02:58 snooper_old
drwxr-xr-x  4 applicatif applicatif 4096 Jul  4 23:45 ssnooper

I checked wether there was a heal, and it seems so:

# gluster volume heal applicatif statistics heal-count
Gathering count of entries to be healed on volume applicatif has been
successful

Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
Number of entries: 8

Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
Number of entries: 29

Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
Number of entries: 8

But actually in the brick on each server the folder "snooper" is fine.

I tried rebooting the servers, restarting gluster after killing every
process using it but it's not working.

Has anyone already experienced that ? Any help would be nice.

Thanks a lot !

-- 

Cordialement,



Florian LELEU
Responsable Hosting, Cognix Systems

*Rennes* | Brest | Saint-Malo | Paris
florian.le...@cognix-systems.com 

Tél. : 02 99 27 75 92


Facebook Cognix Systems 
Twitter Cognix Systems 
Logo Cognix Systems 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] GluserFS WORM hardlink

2017-07-07 Thread 최두일
GlusterFS WORM hard links will not be created
OS is CentOS7
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Community Meeting 2017-07-05 Minutes

2017-07-07 Thread Kaushal M
Hi all,

The meeting minutes and logs for the community meeting held on
Wednesday are available at the links below. [1][2][3][4]

We had a good showing this meeting. Thank you everyone who attended
this meeting.

Our next meeting will be on 19th July. Everyone is welcome to attend.
The meeting note pad is available at [5] to add your topics for
discussion.

Thanks,
Kaushal

[1]: Minutes: 
https://meetbot-raw.fedoraproject.org/gluster-meeting/2017-07-05/gluster_community_meeting_2017-07-05.2017-07-05-15.02.html
[2]: Minutes (text):
https://meetbot-raw.fedoraproject.org/gluster-meeting/2017-07-05/gluster_community_meeting_2017-07-05.2017-07-05-15.02.txt
[3]: Log: 
https://meetbot.fedoraproject.org/gluster-meeting/2017-07-05/gluster_community_meeting_2017-07-05.2017-07-05-15.02.log.html
[4]: https://github.com/gluster/glusterfs/wiki/Community-Meeting-2017-07-05
[5]: https://bit.ly/gluster-community-meetings

Meeting summary
---
* Experimental features  (kshlm, 15:06:16)

* Test cases contribution from community  (kshlm, 15:11:40)
  * ACTION: The person leading the application specific tests cases
should start the survey to collect info on applications used with
gluster  (kshlm, 15:39:13)
  * ACTION: kshlm to find out who that person is  (kshlm, 15:39:26)

* ndevos will check with Eric Harney about the Openstack Gluster efforts
  (kshlm, 15:45:17)
  * ACTION: ndevos will check with Eric Harney about the Openstack
Gluster efforts  (kshlm, 15:46:23)

* nigelb will document kkeithley's build process for glusterfs packages
  (kshlm, 15:47:21)
  * ACTION: nigelb will document the walkthrough given by kkeithley on
building packages  (kshlm, 15:48:58)
  * 3.11.1 was tagged. But there hasn't been a release announcement that
I've seen yet.  (kshlm, 15:50:58)
  * LINK:
http://lists.gluster.org/pipermail/gluster-users/2017-June/031618.html
(amye, 15:51:56)
  * 3.11.1 was announced.  (kshlm, 15:52:32)
  * LINK:
http://lists.gluster.org/pipermail/gluster-users/2017-June/031400.html
is the last post I saw on this  (amye, 15:59:42)

Meeting ended at 16:07:53 UTC.




Action Items

* The person leading the application specific tests cases should start
  the survey to collect info on applications used with gluster
* kshlm to find out who that person is
* ndevos will check with Eric Harney about the Openstack Gluster efforts
* nigelb will document the walkthrough given by kkeithley on building
  packages




Action Items, by person
---
* kkeithley
  * nigelb will document the walkthrough given by kkeithley on building
packages
* kshlm
  * kshlm to find out who that person is
* ndevos
  * ndevos will check with Eric Harney about the Openstack Gluster
efforts
* **UNASSIGNED**
  * The person leading the application specific tests cases should start
the survey to collect info on applications used with gluster




People Present (lines said)
---
* kshlm (95)
* ndevos (30)
* amye (18)
* kkeithley (15)
* zodbot (3)
* vbellur (3)
* jstrunk (2)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users