Re: [Gluster-users] Files not healing & missing their extended attributes - Help!

Ashish Pandey Sun, 01 Jul 2018 19:38:25 -0700

The only problem at the moment is that the arbiter brick is offline. Your only 
concern should be completing the maintenance of the arbiter brick ASAP. 
Bring this brick back up, start a full heal or an index heal, and the volume 
will return to a healthy state.
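
As a rough sketch of that sequence, assuming the arbiter's brick process does 
not come back on its own once the node returns (the run and recover_volume 
names and the DRY_RUN switch are only illustrative, they are not Gluster 
features):

```shell
# Print the commands instead of running them when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# recover_volume is a hypothetical helper name, not a Gluster command.
recover_volume() {
    vol="$1"
    # "start ... force" restarts brick processes of the volume that are down.
    run gluster volume start "$vol" force
    # Check that all three bricks now show Online = Y.
    run gluster volume status "$vol"
    # Index heal: works through the entries already queued for healing.
    # ("gluster volume heal VOLNAME full" would crawl the whole volume instead.)
    run gluster volume heal "$vol"
    # The number of entries should fall as the heal progresses.
    run gluster volume heal "$vol" info
}

# Usage: recover_volume engine
```

Running it with DRY_RUN=1 first shows what would be executed without touching 
the volume.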

--- 
Ashish 

----- Original Message -----

From: "Gambit15" <dougti+glus...@gmail.com> 
To: "Ashish Pandey" <aspan...@redhat.com> 
Cc: "gluster-users" <gluster-users@gluster.org> 
Sent: Monday, July 2, 2018 1:45:01 AM 
Subject: Re: [Gluster-users] Files not healing & missing their extended 
attributes - Help! 

Hi Ashish, 

The output is below. It's a rep 2+1 volume. The arbiter is offline for 
maintenance at the moment, however quorum is met & no files are reported as in 
split-brain (it hosts VMs, so files aren't accessed concurrently). 

====================== 
[root@v0 glusterfs]# gluster volume info engine 

Volume Name: engine 
Type: Replicate 
Volume ID: 279737d3-3e5a-4ee9-8d4a-97edcca42427 
Status: Started 
Snapshot Count: 0 
Number of Bricks: 1 x (2 + 1) = 3 
Transport-type: tcp 
Bricks: 
Brick1: s0:/gluster/engine/brick 
Brick2: s1:/gluster/engine/brick 
Brick3: s2:/gluster/engine/arbiter (arbiter) 
Options Reconfigured: 
nfs.disable: on 
performance.readdir-ahead: on 
transport.address-family: inet 
performance.quick-read: off 
performance.read-ahead: off 
performance.io-cache: off 
performance.stat-prefetch: off 
cluster.eager-lock: enable 
network.remote-dio: enable 
cluster.quorum-type: auto 
cluster.server-quorum-type: server 
storage.owner-uid: 36 
storage.owner-gid: 36 
performance.low-prio-threads: 32 

====================== 

[root@v0 glusterfs]# gluster volume heal engine info 
Brick s0:/gluster/engine/brick 
/__DIRECT_IO_TEST__ 
/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent 
/98495dbc-a29c-4893-b6a0-0aa70860d0c9 
<LIST TRUNCATED FOR BREVITY> 
Status: Connected 
Number of entries: 34 

Brick s1:/gluster/engine/brick 
<SAME AS ABOVE - TRUNCATED FOR BREVITY> 
Status: Connected 
Number of entries: 34 

Brick s2:/gluster/engine/arbiter 
Status: Transport endpoint is not connected 
Number of entries: - 

====================== 
=== PEER V0 === 

[root@v0 glusterfs]# getfattr -m . -d -e hex /gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent 
getfattr: Removing leading '/' from absolute path names 
# file: gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent 
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000 
trusted.afr.engine-client-2=0x0000000000000000000024e8 
trusted.gfid=0xdb9afb92d2bc49ed8e34dcd437ba7be2 
trusted.glusterfs.dht=0x000000010000000000000000ffffffff 

[root@v0 glusterfs]# getfattr -m . -d -e hex /gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/* 
getfattr: Removing leading '/' from absolute path names 
# file: gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.lockspace
security.selinux=0x73797374656d5f753a6f626a6563745f723a6675736566735f743a733000 

# file: gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.metadata
security.selinux=0x73797374656d5f753a6f626a6563745f723a6675736566735f743a733000 

=== PEER V1 === 

[root@v1 glusterfs]# getfattr -m . -d -e hex /gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent 
getfattr: Removing leading '/' from absolute path names 
# file: gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent 
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000 
trusted.afr.engine-client-2=0x0000000000000000000024ec 
trusted.gfid=0xdb9afb92d2bc49ed8e34dcd437ba7be2 
trusted.glusterfs.dht=0x000000010000000000000000ffffffff 

====================== 

cmd_history.log-20180701: 

[2018-07-01 03:11:38.461175] : volume heal engine full : SUCCESS 
[2018-07-01 03:11:51.151891] : volume heal data full : SUCCESS 

glustershd.log-20180701: 
<LOGS FROM 06/01 TRUNCATED> 
[2018-07-01 07:15:04.779122] I [MSGID: 100011] [glusterfsd.c:1396:reincarnate] 
0-glusterfsd: Fetching the volume file from server... 

glustershd.log: 
[2018-07-01 07:15:04.779693] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 
0-glusterfs: No change in volfile, continuing 

That's the *only* message in glustershd.log today. 

====================== 

[root@v0 glusterfs]# gluster volume status engine 
Status of volume: engine 
Gluster process                        TCP Port  RDMA Port  Online  Pid 
------------------------------------------------------------------------------ 
Brick s0:/gluster/engine/brick         49154     0          Y       2816 
Brick s1:/gluster/engine/brick         49154     0          Y       3995 
Self-heal Daemon on localhost          N/A       N/A        Y       2919 
Self-heal Daemon on s1                 N/A       N/A        Y       4013 

Task Status of Volume engine 
------------------------------------------------------------------------------ 
There are no active volume tasks 

====================== 

Okay, so actually only the directory ha_agent is listed for healing (not its 
contents), & that does have attributes set. 
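
For anyone reading the getfattr output above: as far as I understand AFR's 
changelog format, each trusted.afr.<volume>-client-N value is three big-endian 
32-bit counters (pending data, metadata and entry operations) blaming brick N, 
counted from 0. A minimal sketch of decoding one in shell; decode_afr is a 
hypothetical helper name, not a Gluster tool:

```shell
# decode_afr is a hypothetical helper, not part of Gluster.
# Assumes the value is 12 bytes as printed by "getfattr -e hex":
# data, metadata and entry counters, each a big-endian 32-bit integer.
decode_afr() {
    afr_hex="${1#0x}"                      # drop the leading 0x
    if [ "${#afr_hex}" -ne 24 ]; then
        echo "expected 24 hex digits, got: $1" >&2
        return 1
    fi
    afr_data=$(printf '%s' "$afr_hex" | cut -c1-8)
    afr_meta=$(printf '%s' "$afr_hex" | cut -c9-16)
    afr_entry=$(printf '%s' "$afr_hex" | cut -c17-24)
    # $((0x...)) converts each hex field to decimal.
    echo "data=$((0x$afr_data)) metadata=$((0x$afr_meta)) entry=$((0x$afr_entry))"
}

# Usage, with the value reported on v0:
# decode_afr 0x0000000000000000000024e8
```

On that reading, both data bricks record pending entry operations (0x24e8 = 
9448 on v0, 0x24ec = 9452 on v1) against client-2, the third brick in the 
list, which is the offline arbiter, while trusted.afr.dirty is all zeros. 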

Many thanks for the reply! 

On 1 July 2018 at 15:34, Ashish Pandey < aspan...@redhat.com > wrote: 

You have not mentioned the volume type or configuration, and fixing this issue 
will require a lot of other information. 

1 - What is the volume type and configuration? 
2 - Provide the output of gluster v <volname> info. 
3 - Provide the heal info output. 
4 - Provide the extended attributes (getfattr) of one of the files that needs 
healing, from all the bricks. 
5 - What led to the files needing healing? 
6 - Provide the output of gluster v <volname> status. 
7 - Provide the glustershd.log output from just after you run a full heal or 
index heal. 
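
Most of the items above can be gathered in one go on each brick host. A sketch, 
with the log path assumed to be the default /var/log/glusterfs/glustershd.log 
and with run, collect_heal_diagnostics and DRY_RUN being illustrative names 
only:

```shell
# Print the commands instead of running them when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# collect_heal_diagnostics is a hypothetical helper name.
# $1 = volume name, $2 = path of one affected file on the local brick.
collect_heal_diagnostics() {
    vol="$1"
    brick_file="$2"
    run gluster volume info "$vol"             # items 1 and 2
    run gluster volume heal "$vol" info        # item 3
    run getfattr -m . -d -e hex "$brick_file"  # item 4, repeat on every brick
    run gluster volume status "$vol"           # item 6
    # Item 7; the log location is an assumption, adjust if yours differs.
    run tail -n 50 /var/log/glusterfs/glustershd.log
}

# Usage: collect_heal_diagnostics <volname> <path-to-file-on-brick>
```

Item 5 (what led to the heal) still has to be answered by hand. 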

---- 
Ashish 

From: "Gambit15" < dougti+glus...@gmail.com > 
To: "gluster-users" < gluster-users@gluster.org > 
Sent: Sunday, July 1, 2018 11:50:16 PM 
Subject: [Gluster-users] Files not healing & missing their extended attributes 
- Help! 

Hi Guys, 
I had to restart our datacenter yesterday, but since doing so a number of the 
files on my gluster share have been stuck, marked as healing. After no signs of 
progress, I manually set off a full heal last night, but after 24hrs, nothing's 
happened. 

The gluster logs all look normal, and there're no messages about failed 
connections or heal processes kicking off. 

I checked the listed files' extended attributes on their bricks today, and they 
only show the selinux attribute. There's none of the trusted.* attributes I'd 
expect. 
The healthy files on the bricks do have their extended attributes though. 

I'm guessing that perhaps the files somehow lost their attributes, and gluster 
is no longer able to work out what to do with them? It's not logged any errors, 
warnings, or anything else out of the normal though, so I've no idea what the 
problem is or how to resolve it. 

I've got 16 hours to get this sorted before the start of work, Monday. Help! 

_______________________________________________ 
Gluster-users mailing list 
Gluster-users@gluster.org 
http://lists.gluster.org/mailman/listinfo/gluster-users 

