Re: [Gluster-users] Gfid mismatch detected - but no split brain - how to solve?

2020-06-01 Thread Karthik Subrahmanya
Hi,

I am assuming that you are using one of the maintained versions of gluster.

GFID split-brains can be resolved using one of the methods in the
split-brain resolution CLI as explained in the section "3. Resolution of
split-brain using gluster CLI" of
https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/.
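
As a rough sketch of what that section describes, the three policies take
the following form. This is a dry run that only prints the commands. The
volume name USER-HOME is an assumption inferred from the
"0-USER-HOME-replicate-0" prefix in your logs, and the file path and brick
are hypothetical placeholders; split_brain_cmd is just a helper name made
up for this example.

```shell
#!/bin/sh
# Dry run: prints the heal commands instead of executing them.
VOLNAME="USER-HOME"        # assumption: inferred from the log prefix
FILE="/path/to/file"       # placeholder: absolute path as seen from the mount

split_brain_cmd() {
    # $1 is the policy, the remaining arguments are that policy's arguments
    echo "gluster volume heal $VOLNAME split-brain $*"
}

# Policy 1: the bigger file is taken as the source
split_brain_cmd bigger-file "$FILE"
# Policy 2: the copy with the latest mtime is taken as the source
split_brain_cmd latest-mtime "$FILE"
# Policy 3: one brick is named explicitly as the source (placeholder brick)
split_brain_cmd source-brick "HOSTNAME:/brick/path" "$FILE"
```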

Things to note when using this CLI to resolve GFID split-brains:
- You cannot use the GFID of the file as an argument with any of the CLI
options to resolve a GFID split-brain. The argument must be the absolute
path, as seen from the mount point, of the file considered as source.
- With the source-brick option there is no way to resolve all GFID
split-brains in one shot by omitting the file path, as can be done when
resolving data or metadata split-brains. For each file in GFID
split-brain, run the CLI with the policy you want to use.
- Resolving a directory GFID split-brain using the CLI with the
"source-brick" option in a "distributed-replicated" volume needs to be done
explicitly on every subvolume where the directory is in GFID split-brain.
Directories get created on all the subvolumes, and using one particular
brick as source for a directory GFID split-brain heals the directory only
for that brick's subvolume. The other subvolumes must then be healed using
the brick which has the same GFID as the brick that was used as source for
the first subvolume.
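
The per-file requirement above could be scripted along these lines. This is
only a sketch under assumptions: the volume name USER-HOME is inferred from
the log prefix, rider-ring8 is used as source brick only because you said
the most recent data is there, and heal_one / heal_all are helper names
invented for this example. It defaults to a dry run that prints the
commands; set DRY_RUN=0 only after checking the output.

```shell
#!/bin/sh
VOLNAME="USER-HOME"   # assumption: inferred from the log prefix
# assumption: brick reported to hold the most recent data
SOURCE_BRICK="rider-ring8:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME"
DRY_RUN="${DRY_RUN:-1}"

heal_one() {
    # $1 = absolute path of the file as seen from the mount point
    case "$1" in
        /*) ;;
        *)  echo "skipping '$1': not an absolute path" >&2
            return 1 ;;
    esac
    if [ "$DRY_RUN" = 1 ]; then
        # Dry run: print the command only
        echo gluster volume heal "$VOLNAME" split-brain source-brick "$SOURCE_BRICK" "$1"
    else
        gluster volume heal "$VOLNAME" split-brain source-brick "$SOURCE_BRICK" "$1"
    fi
}

heal_all() {
    # Reads one path per line from stdin and runs the CLI once per file,
    # since source-brick cannot heal all GFID split-brains in one shot.
    while IFS= read -r path; do
        if [ -n "$path" ]; then
            heal_one "$path" || echo "not healed: $path" >&2
        fi
    done
}
```

With a file files.txt (hypothetical) holding one mount-relative absolute
path per line, "heal_all < files.txt" prints one command per file. For
directories, running "getfattr -n trusted.gfid -e hex" on the directory's
path on each brick is one way to see which bricks hold the same GFID before
picking the source brick for the other subvolumes.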

Regards,
Karthik

On Sat, May 30, 2020 at 3:39 AM lejeczek wrote:

> hi Guys
>
> I'm seeing "Gfid mismatch detected" in the logs but no split
> brain indicated (4-way replica)
>
> Brick
> swir-ring8:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME
> Status: Connected
> Total Number of entries: 22
> Number of entries in heal pending: 22
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick
> whale-ring8:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME
> Status: Connected
> Total Number of entries: 22
> Number of entries in heal pending: 22
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick
> rider-ring8:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME
> Status: Connected
> Total Number of entries: 0
> Number of entries in heal pending: 0
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick dzien:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME
> Status: Connected
> Total Number of entries: 10
> Number of entries in heal pending: 10
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> On swir-ring8:
> ...
> The message "E [MSGID: 108008]
> [afr-self-heal-common.c:384:afr_gfid_split_brain_source]
> 0-USER-HOME-replicate-0: Gfid mismatch detected for
> /lock_file>,
> 37b2456f-5216-4679-ac5c-4908b24f895a on USER-HOME-client-15
> and ba8f87ed-9bf3-404e-8d67-2631923e1645 on
> USER-HOME-client-13." repeated 2 times between [2020-05-29
> 21:47:49.034935] and [2020-05-29 21:47:49.079480]
> The message "E [MSGID: 108008]
> [afr-self-heal-common.c:384:afr_gfid_split_brain_source]
> 0-USER-HOME-replicate-0: Gfid mismatch detected for
> /t>,
> d7a4ed01-139b-4df3-8070-31bd620a6f15 on USER-HOME-client-15
> and d794b6ba-2a1d-4043-bb31-b98b22692763 on
> USER-HOME-client-13." repeated 2 times between [2020-05-29
> 21:47:49.126173] and [2020-05-29 21:47:49.155432]
> The message "E [MSGID: 108008]
> [afr-self-heal-common.c:384:afr_gfid_split_brain_source]
> 0-USER-HOME-replicate-0: Gfid mismatch detected for
> /Tables.docx>,
> 344febd8-c89c-4bf3-8ad8-6494c2189c43 on USER-HOME-client-15
> and 48d5b12b-03f4-46bf-bed1-9f8f88815615 on
> USER-HOME-client-13." repeated 2 times between [2020-05-29
> 21:47:49.194061] and [2020-05-29 21:47:49.239896]
> The message "E [MSGID: 108008]
> [afr-self-heal-entry.c:257:afr_selfheal_detect_gfid_and_type_mismatch]
> 0-USER-HOME-replicate-0: Skipping conservative merge on the
> file." repeated 8 times between [2020-05-29 21:47:49.037812]
> and [2020-05-29 21:47:49.240423]
> ...
>
> On whale-ring8:
> ...
> The message "E [MSGID: 108008]
> [afr-self-heal-common.c:384:afr_gfid_split_brain_source]
> 0-USER-HOME-replicate-0: Gfid mismatch detected for
> /pcs>,
> a83d0e5f-ef3a-40ab-be7b-784538d150be on USER-HOME-client-15
> and 89af3d31-81fa-4242-b8f7-0f49fd5fe57b on
> USER-HOME-client-13." repeated 2 times between [2020-05-29
> 21:45:46.152052] and [2020-05-29 21:45:46.422393]
> The message "E [MSGID: 108008]
> [afr-self-heal-common.c:384:afr_gfid_split_brain_source]
> 0-USER-HOME-replicate-0: Gfid mismatch detected for
> /history_database>,
> 81ebb0d5-264a-4eba-984a-e18673b43826 on USER-HOME-client-15
> and 2498a303-8937-43c3-939e-5e1d786b07fa on
> USER-HOME-client-13." repeated 2 times between [2020-05-29
> 21:45:46.167704] and [2020-05-29 21:45:46.437702]
> The message "E [MSGID: 108008]
> [afr-self-heal-common.c:384:afr_gfid_split_brain_source]
> 0-USER-HOME-replicate-0: Gfid mismatch detected for
> /client-state>,
> 

[Gluster-users] Gfid mismatch detected - but no split brain - how to solve?

2020-05-29 Thread lejeczek
hi Guys

I'm seeing "Gfid mismatch detected" in the logs but no split
brain indicated (4-way replica)

Brick
swir-ring8:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME
Status: Connected
Total Number of entries: 22
Number of entries in heal pending: 22
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick
whale-ring8:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME
Status: Connected
Total Number of entries: 22
Number of entries in heal pending: 22
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick
rider-ring8:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick dzien:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME
Status: Connected
Total Number of entries: 10
Number of entries in heal pending: 10
Number of entries in split-brain: 0
Number of entries possibly healing: 0

On swir-ring8:
...
The message "E [MSGID: 108008]
[afr-self-heal-common.c:384:afr_gfid_split_brain_source]
0-USER-HOME-replicate-0: Gfid mismatch detected for
/lock_file>,
37b2456f-5216-4679-ac5c-4908b24f895a on USER-HOME-client-15
and ba8f87ed-9bf3-404e-8d67-2631923e1645 on
USER-HOME-client-13." repeated 2 times between [2020-05-29
21:47:49.034935] and [2020-05-29 21:47:49.079480]
The message "E [MSGID: 108008]
[afr-self-heal-common.c:384:afr_gfid_split_brain_source]
0-USER-HOME-replicate-0: Gfid mismatch detected for
/t>,
d7a4ed01-139b-4df3-8070-31bd620a6f15 on USER-HOME-client-15
and d794b6ba-2a1d-4043-bb31-b98b22692763 on
USER-HOME-client-13." repeated 2 times between [2020-05-29
21:47:49.126173] and [2020-05-29 21:47:49.155432]
The message "E [MSGID: 108008]
[afr-self-heal-common.c:384:afr_gfid_split_brain_source]
0-USER-HOME-replicate-0: Gfid mismatch detected for
/Tables.docx>,
344febd8-c89c-4bf3-8ad8-6494c2189c43 on USER-HOME-client-15
and 48d5b12b-03f4-46bf-bed1-9f8f88815615 on
USER-HOME-client-13." repeated 2 times between [2020-05-29
21:47:49.194061] and [2020-05-29 21:47:49.239896]
The message "E [MSGID: 108008]
[afr-self-heal-entry.c:257:afr_selfheal_detect_gfid_and_type_mismatch]
0-USER-HOME-replicate-0: Skipping conservative merge on the
file." repeated 8 times between [2020-05-29 21:47:49.037812]
and [2020-05-29 21:47:49.240423]
...

On whale-ring8:
...
The message "E [MSGID: 108008]
[afr-self-heal-common.c:384:afr_gfid_split_brain_source]
0-USER-HOME-replicate-0: Gfid mismatch detected for
/pcs>,
a83d0e5f-ef3a-40ab-be7b-784538d150be on USER-HOME-client-15
and 89af3d31-81fa-4242-b8f7-0f49fd5fe57b on
USER-HOME-client-13." repeated 2 times between [2020-05-29
21:45:46.152052] and [2020-05-29 21:45:46.422393]
The message "E [MSGID: 108008]
[afr-self-heal-common.c:384:afr_gfid_split_brain_source]
0-USER-HOME-replicate-0: Gfid mismatch detected for
/history_database>,
81ebb0d5-264a-4eba-984a-e18673b43826 on USER-HOME-client-15
and 2498a303-8937-43c3-939e-5e1d786b07fa on
USER-HOME-client-13." repeated 2 times between [2020-05-29
21:45:46.167704] and [2020-05-29 21:45:46.437702]
The message "E [MSGID: 108008]
[afr-self-heal-common.c:384:afr_gfid_split_brain_source]
0-USER-HOME-replicate-0: Gfid mismatch detected for
/client-state>,
fe86c057-c74d-417f-9c2c-6e6eb9778851 on USER-HOME-client-15
and a66f2714-c2a0-4bdc-8786-ad5b93e0e988 on
USER-HOME-client-13." repeated 2 times between [2020-05-29
21:45:46.144242] and [2020-05-29 21:45:46.442526]
The message "E [MSGID: 108008]
[afr-self-heal-common.c:384:afr_gfid_split_brain_source]
0-USER-HOME-replicate-0: Gfid mismatch detected for
/history_database.1>,
9826d8ad-fecc-4dd7-bc1f-87d0eff23d73 on USER-HOME-client-15
and 81ebb0d5-264a-4eba-984a-e18673b43826 on
USER-HOME-client-13." repeated 3 times between [2020-05-29
21:45:46.162016] and [2020-05-29 21:45:46.476935]
...

On rider-ring8:
...
[2020-05-29 21:46:53.122929] E [MSGID: 114031]
[client-rpc-fops_v2.c:1548:client4_0_xattrop_cbk]
0-QEMU_VMs-client-3: remote operation failed. Path:

(6f01098f-e8db-4f63-a661-86b4d02d937f) [Permission denied]
[2020-05-29 21:46:53.124148] E [MSGID: 114031]
[client-rpc-fops_v2.c:1548:client4_0_xattrop_cbk]
0-QEMU_VMs-client-4: remote operation failed. Path:

(6f01098f-e8db-4f63-a661-86b4d02d937f) [Permission denied]
[2020-05-29 21:46:53.133566] I [MSGID: 108026]
[afr-self-heal-entry.c:898:afr_selfheal_entry_do]
0-QEMU_VMs-replicate-0: performing entry selfheal on
e0121f76-2452-44dc-b1a6-82b46cc9ec79
[2020-05-29 21:46:53.145991] E [MSGID: 114031]
[client-rpc-fops_v2.c:1548:client4_0_xattrop_cbk]
0-QEMU_VMs-client-3: remote operation failed. Path:

(3f0239ac-e027-4a0c-b271-431e76ad97b1) [Permission denied]
[2020-05-29 21:46:53.147110] E [MSGID: 114031]
[client-rpc-fops_v2.c:1548:client4_0_xattrop_cbk]
0-QEMU_VMs-client-4: remote operation failed. Path:

(3f0239ac-e027-4a0c-b271-431e76ad97b1) [Permission denied]

I'm 100% certain that the most recent data is on rider-ring8.
Any expert could