In my case I was able to delete the hard links in the .glusterfs folders of the 
bricks and it seems to have done the trick, thanks!

From: Karthik Subrahmanya [mailto:ksubr...@redhat.com]
Sent: Monday, October 23, 2017 1:52 AM
To: Jim Kinney <jim.kin...@gmail.com>; Matt Waymack <mwaym...@nsgdv.com>
Cc: gluster-users <Gluster-users@gluster.org>
Subject: Re: [Gluster-users] gfid entries in volume heal info that do not heal

Hi Jim & Matt,
Can you also check the link count in the stat output of those hardlink
entries in the .glusterfs folder on the bricks?
If the link count is 1 on all the bricks for those entries, then they are
orphaned entries and you can delete those hardlinks.
To be on the safe side, take a backup before deleting any of the entries.
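For anyone following along, the link-count check can be scripted. A minimal sketch of the "link count 1 means orphan" idea, demonstrated on a throwaway directory (all paths below are demo assumptions; on a real brick you would point find at <brick>/.glusterfs instead):

```shell
# Demo of finding orphaned GFID entries: regular files under .glusterfs
# whose hard-link count is 1 (no remaining link to a real file).
brick=$(mktemp -d)
mkdir -p "$brick/.glusterfs/00/00"
# orphan: a GFID file with no other hard link
touch "$brick/.glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421"
# healthy: the GFID file is a second hard link to the data file
touch "$brick/file.txt"
ln "$brick/file.txt" "$brick/.glusterfs/00/00/1086db94-0000-0000-0000-000000000000"
find "$brick/.glusterfs" -type f -links 1   # prints only the orphan
```

On a real brick, review the list before deleting anything, per the backup advice above.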
Regards,
Karthik

On Fri, Oct 20, 2017 at 3:18 AM, Jim Kinney <jim.kin...@gmail.com> wrote:
I've been following this particular thread as I have a similar issue (a RAID6
array failed out with 3 dead drives at once while a 12 TB load was being copied
into one mounted space - what a mess).

I have >700K GFID entries that have no path data:
Example:
getfattr -d -e hex -m . .glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421
# file: .glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.bit-rot.version=0x020000000000000059b1b316000270e7
trusted.gfid=0x0000a5ef5af7401b84b5ff2a51c10421
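As an aside, the trusted.gfid value and the backend path encode the same thing; the .glusterfs location is just the GFID re-punctuated. A quick sketch of the mapping (pure string manipulation, no gluster needed):

```shell
# Derive the .glusterfs backend path from a raw trusted.gfid hex value:
# <first 2 hex chars>/<next 2>/<gfid formatted as a UUID (8-4-4-4-12)>.
gfid_hex=0000a5ef5af7401b84b5ff2a51c10421
uuid=$(echo "$gfid_hex" | sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/')
echo ".glusterfs/${gfid_hex:0:2}/${gfid_hex:2:2}/${uuid}"
# -> .glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421
```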

[root@bmidata1 brick]# getfattr -d -n trusted.glusterfs.pathinfo -e hex -m . .glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421
.glusterfs/00/00/0000a5ef-5af7-401b-84b5-ff2a51c10421: trusted.glusterfs.pathinfo: No such attribute

I had to totally rebuild the dead RAID array and did a copy from the live one 
before activating gluster on the rebuilt system. I accidentally copied over the 
.glusterfs folder from the working side
(replica 2 only for now - adding an arbiter node as soon as I can get this one
cleaned up).

I've run the methods from
"http://docs.gluster.org/en/latest/Troubleshooting/gfid-to-path/" with no
results using random GFIDs. A full systematic run using the script from method 3
crashes with a "too many nested links" error (or something similar).

When I run gluster volume heal volname info, I get 700K+ GFIDs. This is gluster
3.8.4 on CentOS 7.3.

Should I just remove the contents of the .glusterfs folder on both nodes,
restart gluster, and run an ls/stat on every file?


When I run a heal, it no longer has a decreasing number of files to heal so 
that's an improvement over the last 2-3 weeks :-)
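For reference, the "ls/stat on every file" idea boils down to a crawl that touches each entry, since client-side lookups are what give self-heal a chance to rebuild missing pieces. A sketch, run here against a throwaway directory (on a real volume the crawl must start from the client mount point, e.g. /mnt/<volname>, never from a brick - that path is an assumption to adjust):

```shell
# A lookup-triggering crawl: stat every path under a root directory.
# Demonstrated on a temp tree; substitute the FUSE mount point in real use.
root=$(mktemp -d)
mkdir -p "$root/a/b"
touch "$root/a/f1" "$root/a/b/f2"
find "$root" -exec stat --format='%n links=%h' {} + | wc -l   # 5 entries visited
```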

On Tue, 2017-10-17 at 14:34 +0000, Matt Waymack wrote:

Attached is the heal log for the volume as well as the shd log.

Run these commands on all the bricks of the replica pair to get the attrs set 
on the backend.

[root@tpc-cent-glus1-081017 ~]# getfattr -d -e hex -m . /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
getfattr: Removing leading '/' from absolute path names
# file: exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gv0-client-2=0x000000000000000100000000
trusted.gfid=0x108694dbc0394b7cbd3dad6a15d811a2
trusted.gfid2path.9a2f5ada22eb9c45=0x38633262623330322d323466332d346463622d393630322d3839356136396461363131662f435f564f4c2d623030312d693637342d63642d63772e6d6435
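Worth noting: the trusted.gfid2path value is just hex-encoded text of the form "<parent-dir-gfid>/<basename>", so it can be decoded to see where the file lives. A sketch (assumes xxd is available, as shipped with vim-common):

```shell
# Decode a trusted.gfid2path hex payload (0x prefix removed) back to text.
hex=38633262623330322d323466332d346463622d393630322d3839356136396461363131662f435f564f4c2d623030312d693637342d63642d63772e6d6435
decoded=$(echo "$hex" | xxd -r -p)
echo "$decoded"
# -> 8c2bb302-24f3-4dcb-9602-895a69da611f/C_VOL-b001-i674-cd-cw.md5
```

The part before the slash is the parent directory's GFID, which can in turn be resolved the same way the file GFIDs are.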

[root@tpc-cent-glus2-081017 ~]# getfattr -d -e hex -m . /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
getfattr: Removing leading '/' from absolute path names
# file: exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gv0-client-2=0x000000000000000100000000
trusted.gfid=0x108694dbc0394b7cbd3dad6a15d811a2
trusted.gfid2path.9a2f5ada22eb9c45=0x38633262623330322d323466332d346463622d393630322d3839356136396461363131662f435f564f4c2d623030312d693637342d63642d63772e6d6435

[root@tpc-arbiter1-100617 ~]# getfattr -d -e hex -m . /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
getfattr: /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2: No such file or directory
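If I read the AFR changelog layout correctly (an assumption worth verifying against the gluster docs for your version), a trusted.afr.* value packs three 32-bit big-endian counters: pending data, metadata, and entry operations. A sketch decoding the value shown above:

```shell
# Split trusted.afr.gv0-client-2 (0x prefix removed) into three 32-bit
# counters. The data|metadata|entry ordering is an assumption from AFR's
# documented changelog format.
v=000000000000000100000000
printf 'data=%d metadata=%d entry=%d\n' \
  $((16#${v:0:8})) $((16#${v:8:8})) $((16#${v:16:8}))
# -> data=0 metadata=1 entry=0
```

A non-zero counter against a client that never clears is consistent with a heal that cannot complete, e.g. because the target entry is missing on the arbiter as shown above.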

[root@tpc-cent-glus1-081017 ~]# getfattr -d -e hex -m . /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
getfattr: Removing leading '/' from absolute path names
# file: exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gv0-client-11=0x000000000000000100000000
trusted.gfid=0xe0c56bf78bfe46cabde1e46b92d33df3
trusted.gfid2path.be3ba24c3ef95ff2=0x63323366353834652d353566652d343033382d393131622d3866373063656334616136662f435f564f4c2d623030332d69313331342d63642d636d2d63722e6d6435

[root@tpc-cent-glus2-081017 ~]# getfattr -d -e hex -m . /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
getfattr: Removing leading '/' from absolute path names
# file: exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gv0-client-11=0x000000000000000100000000
trusted.gfid=0xe0c56bf78bfe46cabde1e46b92d33df3
trusted.gfid2path.be3ba24c3ef95ff2=0x63323366353834652d353566652d343033382d393131622d3866373063656334616136662f435f564f4c2d623030332d69313331342d63642d636d2d63722e6d6435

[root@tpc-arbiter1-100617 ~]# getfattr -d -e hex -m . /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
getfattr: /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3: No such file or directory

And the output of "gluster volume heal <volname> info split-brain"

[root@tpc-cent-glus1-081017 ~]# gluster volume heal gv0 info split-brain
Brick tpc-cent-glus1-081017:/exp/b1/gv0
Status: Connected
Number of entries in split-brain: 0

Brick tpc-cent-glus2-081017:/exp/b1/gv0
Status: Connected
Number of entries in split-brain: 0

Brick tpc-arbiter1-100617:/exp/b1/gv0
Status: Connected
Number of entries in split-brain: 0

Brick tpc-cent-glus1-081017:/exp/b2/gv0
Status: Connected
Number of entries in split-brain: 0

Brick tpc-cent-glus2-081017:/exp/b2/gv0
Status: Connected
Number of entries in split-brain: 0

Brick tpc-arbiter1-100617:/exp/b2/gv0
Status: Connected
Number of entries in split-brain: 0

Brick tpc-cent-glus1-081017:/exp/b3/gv0
Status: Connected
Number of entries in split-brain: 0

Brick tpc-cent-glus2-081017:/exp/b3/gv0
Status: Connected
Number of entries in split-brain: 0

Brick tpc-arbiter1-100617:/exp/b3/gv0
Status: Connected
Number of entries in split-brain: 0

Brick tpc-cent-glus1-081017:/exp/b4/gv0
Status: Connected
Number of entries in split-brain: 0

Brick tpc-cent-glus2-081017:/exp/b4/gv0
Status: Connected
Number of entries in split-brain: 0

Brick tpc-arbiter1-100617:/exp/b4/gv0
Status: Connected
Number of entries in split-brain: 0

-Matt

From: Karthik Subrahmanya [mailto:ksubr...@redhat.com]
Sent: Tuesday, October 17, 2017 1:26 AM
To: Matt Waymack <mwaym...@nsgdv.com>
Cc: gluster-users <Gluster-users@gluster.org>
Subject: Re: [Gluster-users] gfid entries in volume heal info that do not heal

Hi Matt,

Run these commands on all the bricks of the replica pair to get the attrs set on the backend.

On the bricks of the first replica set:
getfattr -d -e hex -m . <brick path>/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2

On the fourth replica set:
getfattr -d -e hex -m . <brick path>/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3

Also run "gluster volume heal <volname>" once and send the shd log.
And the output of "gluster volume heal <volname> info split-brain".

Regards,
Karthik

On Mon, Oct 16, 2017 at 9:51 PM, Matt Waymack <mwaym...@nsgdv.com> wrote:

OK, so here’s my output of the volume info and the heal info. I have not yet
tracked down the physical location of these files; any tips on finding them would
be appreciated, but I’m definitely just wanting them gone. I forgot to mention
earlier that the cluster is running 3.12 and was upgraded from 3.10; these
files were likely stuck like this when it was on 3.10.

[root@tpc-cent-glus1-081017 ~]# gluster volume info gv0

Volume Name: gv0
Type: Distributed-Replicate
Volume ID: 8f07894d-e3ab-4a65-bda1-9d9dd46db007
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (2 + 1) = 12
Transport-type: tcp
Bricks:
Brick1: tpc-cent-glus1-081017:/exp/b1/gv0
Brick2: tpc-cent-glus2-081017:/exp/b1/gv0
Brick3: tpc-arbiter1-100617:/exp/b1/gv0 (arbiter)
Brick4: tpc-cent-glus1-081017:/exp/b2/gv0
Brick5: tpc-cent-glus2-081017:/exp/b2/gv0
Brick6: tpc-arbiter1-100617:/exp/b2/gv0 (arbiter)
Brick7: tpc-cent-glus1-081017:/exp/b3/gv0
Brick8: tpc-cent-glus2-081017:/exp/b3/gv0
Brick9: tpc-arbiter1-100617:/exp/b3/gv0 (arbiter)
Brick10: tpc-cent-glus1-081017:/exp/b4/gv0
Brick11: tpc-cent-glus2-081017:/exp/b4/gv0
Brick12: tpc-arbiter1-100617:/exp/b4/gv0 (arbiter)
Options Reconfigured:
nfs.disable: on
transport.address-family: inet

[root@tpc-cent-glus1-081017 ~]# gluster volume heal gv0 info
Brick tpc-cent-glus1-081017:/exp/b1/gv0
<gfid:108694db-c039-4b7c-bd3d-ad6a15d811a2>
<gfid:6d5ade20-8996-4de2-95d5-20ef98004742>
<gfid:bc6cdc3d-5c46-4597-a7eb-282b21e9bdd5>
<gfid:3c2ff4d1-3662-4214-8f21-f8f47dbdbf06>
<gfid:053e2fb1-bc89-476e-a529-90dffa39963c>
<removed to save scrolling>
Status: Connected
Number of entries: 118

Brick tpc-cent-glus2-081017:/exp/b1/gv0
<gfid:108694db-c039-4b7c-bd3d-ad6a15d811a2>
<gfid:6d5ade20-8996-4de2-95d5-20ef98004742>
<gfid:bc6cdc3d-5c46-4597-a7eb-282b21e9bdd5>
<gfid:3c2ff4d1-3662-4214-8f21-f8f47dbdbf06>
<gfid:053e2fb1-bc89-476e-a529-90dffa39963c>
<removed to save scrolling>
Status: Connected
Number of entries: 118

Brick tpc-arbiter1-100617:/exp/b1/gv0
Status: Connected
Number of entries: 0

Brick tpc-cent-glus1-081017:/exp/b2/gv0
Status: Connected
Number of entries: 0

Brick tpc-cent-glus2-081017:/exp/b2/gv0
Status: Connected
Number of entries: 0

Brick tpc-arbiter1-100617:/exp/b2/gv0
Status: Connected
Number of entries: 0

Brick tpc-cent-glus1-081017:/exp/b3/gv0
Status: Connected
Number of entries: 0

Brick tpc-cent-glus2-081017:/exp/b3/gv0
Status: Connected
Number of entries: 0

Brick tpc-arbiter1-100617:/exp/b3/gv0
Status: Connected
Number of entries: 0

Brick tpc-cent-glus1-081017:/exp/b4/gv0
<gfid:e0c56bf7-8bfe-46ca-bde1-e46b92d33df3>
<gfid:6f0a0549-8669-46de-8823-d6677fdca8e3>
<gfid:d0e2fb2a-21b5-4ea8-a578-0801280b2530>
<gfid:48bff79c-7bc2-4dc5-8b7f-4401b27fdf5a>
<gfid:5902593d-a059-4ec7-b18b-7a2ab5c49a50>
<gfid:cb821178-4621-4fcf-90f3-5b5c2ad7f756>
<gfid:6aea0805-8dd1-437c-b922-52c9d11e488a>
<gfid:f4076a37-2e2f-4d7a-90dd-0a3560a4bdff>
<gfid:51ff7386-a550-4971-957c-b42c4d915e9f>
<gfid:4309f7b8-3a9d-4bc8-ba2b-799f8a02611b>
<gfid:b76746ec-6d7d-4ea3-a001-c96672a4d47e>
<gfid:f8de26e7-d17d-41e0-adcd-e7d24ed74ac8>
<gfid:8e2c4540-e0b4-4006-bb5d-aacd57f8f21b>
<gfid:183ebefb-b827-4cbc-b42b-bfd136d5cabb>
<gfid:88d492fe-bfbd-4463-ba55-0582d0ad671b>
<gfid:e3a6c068-d48b-44b5-9480-245a69648a9b>
<gfid:4aab9c6a-22d2-469a-a688-7b0a8784f4b1>
<gfid:c6d182f2-7e46-4502-a0d2-b92824caa4de>
<gfid:eb546f93-e9d6-4a59-ac35-6139b5c40919>
<gfid:6043e381-7edf-4569-bc37-e27dd13549d2>
<gfid:52090dc7-7a3c-40f9-9c54-3395f5158eab>
<gfid:ecceee46-4310-421e-b56e-5fe46bd5263c>
<gfid:354aea57-4b40-47fc-8ede-1d7e3b7501b4>
<gfid:d43284d4-86aa-42ff-98b8-f6340b407d9d>
Status: Connected
Number of entries: 24

Brick tpc-cent-glus2-081017:/exp/b4/gv0
<gfid:e0c56bf7-8bfe-46ca-bde1-e46b92d33df3>
<gfid:6f0a0549-8669-46de-8823-d6677fdca8e3>
<gfid:d0e2fb2a-21b5-4ea8-a578-0801280b2530>
<gfid:48bff79c-7bc2-4dc5-8b7f-4401b27fdf5a>
<gfid:5902593d-a059-4ec7-b18b-7a2ab5c49a50>
<gfid:cb821178-4621-4fcf-90f3-5b5c2ad7f756>
<gfid:6aea0805-8dd1-437c-b922-52c9d11e488a>
<gfid:f4076a37-2e2f-4d7a-90dd-0a3560a4bdff>
<gfid:51ff7386-a550-4971-957c-b42c4d915e9f>
<gfid:4309f7b8-3a9d-4bc8-ba2b-799f8a02611b>
<gfid:b76746ec-6d7d-4ea3-a001-c96672a4d47e>
<gfid:f8de26e7-d17d-41e0-adcd-e7d24ed74ac8>
<gfid:8e2c4540-e0b4-4006-bb5d-aacd57f8f21b>
<gfid:183ebefb-b827-4cbc-b42b-bfd136d5cabb>
<gfid:88d492fe-bfbd-4463-ba55-0582d0ad671b>
<gfid:e3a6c068-d48b-44b5-9480-245a69648a9b>
<gfid:4aab9c6a-22d2-469a-a688-7b0a8784f4b1>
<gfid:c6d182f2-7e46-4502-a0d2-b92824caa4de>
<gfid:eb546f93-e9d6-4a59-ac35-6139b5c40919>
<gfid:6043e381-7edf-4569-bc37-e27dd13549d2>
<gfid:52090dc7-7a3c-40f9-9c54-3395f5158eab>
<gfid:ecceee46-4310-421e-b56e-5fe46bd5263c>
<gfid:354aea57-4b40-47fc-8ede-1d7e3b7501b4>
<gfid:d43284d4-86aa-42ff-98b8-f6340b407d9d>
Status: Connected
Number of entries: 24

Brick tpc-arbiter1-100617:/exp/b4/gv0
Status: Connected
Number of entries: 0

Thank you for your help!

From: Karthik Subrahmanya [mailto:ksubr...@redhat.com]
Sent: Monday, October 16, 2017 10:27 AM
To: Matt Waymack <mwaym...@nsgdv.com>
Cc: gluster-users <Gluster-users@gluster.org>
Subject: Re: [Gluster-users] gfid entries in volume heal info that do not heal

Hi Matt,

The files might be in split-brain. Could you please send the outputs of these?
gluster volume info <volname>
gluster volume heal <volname> info

Also send the getfattr output of the files listed in the heal info output, from all the bricks of that replica pair:
getfattr -d -e hex -m . <file path on brick>

Thanks & Regards,
Karthik

On 16-Oct-2017 8:16 PM, "Matt Waymack" <mwaym...@nsgdv.com> wrote:

Hi all,

I have a volume where the output of volume heal info shows several gfid entries
to be healed, but they’ve been there for weeks and have not healed. Any normal
file that shows up on the heal info does get healed as expected, but these gfid
entries do not. Is there any way to remove these orphaned entries from the
volume so they are no longer stuck in the heal process?

Thank you!

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

