Hi Richard,

Thanks for the information. As the getfattr output shows, there is a gfid mismatch for the file: on brick-1 and brick-2 the gfids are the same, while on brick-3 the gfid is different. This is not considered a split-brain because we have two good copies here. Gluster 3.10 has no way to resolve this situation other than manual intervention [1]. Basically, what you need to do is remove the file and its gfid hardlink from brick-3 (treating the brick-3 entry as the bad one). When you then do a lookup for the file from the mount, the entry will be recreated on that brick from the good copies.
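As a rough sketch of those steps, assuming brick-3 is sphere-four (the copy whose trusted.gfid differs) and that its gfid hardlink sits in the usual .glusterfs/<first two hex>/<next two hex>/<gfid> location on the brick (derived here from the trusted.gfid shown in your sphere-four output); please verify both paths on your system before removing anything.

On sphere-four, remove the bad copy and its gfid hardlink:

rm /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
rm /srv/gluster_home/brick/.glusterfs/da/1c/da1c94b1-6435-44b1-8d5b-6f4654f60bf5

Then trigger a lookup from a client mount (/mnt/home below is only a placeholder for your actual mount point) so the entry gets recreated and healed from the good bricks:

stat /mnt/home/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
gluster volume heal home

After that, "gluster volume heal home info" should go back to zero entries.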
From 3.12 onwards we have ways to resolve this situation with the CLI option [2] and with the favorite-child-policy [3]. For the time being you can use [1] to resolve it, and if you can consider upgrading to 3.12, that would give you options to handle these scenarios.
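For reference, on 3.12 the resolution would look roughly like one of the following. This is only a sketch based on the generic split-brain heal CLI and the cluster.favorite-child-policy volume option; please check the 3.12 documentation for the exact syntax that applies to gfid mismatches.

Pick the good copy explicitly (sphere-five holds one of the two matching gfids):

gluster volume heal home split-brain source-brick sphere-five:/srv/gluster_home/brick /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4

Or let such mismatches be resolved automatically according to a policy:

gluster volume set home cluster.favorite-child-policy mtime

Note that the file path given to the heal command is the path as seen from the root of the volume, not the brick path.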
[1] http://docs.gluster.org/en/latest/Troubleshooting/split-brain/#fixing-directory-entry-split-brain
[2] https://review.gluster.org/#/c/17485/
[3] https://review.gluster.org/#/c/16878/

HTH,
Karthik

On Thu, Oct 26, 2017 at 12:40 PM, Richard Neuboeck <h...@tbi.univie.ac.at> wrote:
> Hi Karthik,
>
> thanks for taking a look at this. I'm not working with gluster long
> enough to make heads or tails out of the logs. The logs are attached to
> this mail and here is the other information:
>
> # gluster volume info home
>
> Volume Name: home
> Type: Replicate
> Volume ID: fe6218ae-f46b-42b3-a467-5fc6a36ad48a
> Status: Started
> Snapshot Count: 1
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: sphere-six:/srv/gluster_home/brick
> Brick2: sphere-five:/srv/gluster_home/brick
> Brick3: sphere-four:/srv/gluster_home/brick
> Options Reconfigured:
> features.barrier: disable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.stat-prefetch: on
> performance.cache-samba-metadata: on
> performance.cache-invalidation: on
> performance.md-cache-timeout: 600
> network.inode-lru-limit: 90000
> performance.cache-size: 1GB
> performance.client-io-threads: on
> cluster.lookup-optimize: on
> cluster.readdir-optimize: on
> features.quota: on
> features.inode-quota: on
> features.quota-deem-statfs: on
> cluster.server-quorum-ratio: 51%
>
> [root@sphere-four ~]# getfattr -d -e hex -m . /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
> getfattr: Removing leading '/' from absolute path names
> # file: srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x020000000000000059df20a40006f989
> trusted.gfid=0xda1c94b1643544b18d5b6f4654f60bf5
> trusted.glusterfs.quota.48e9eea6-cda6-4e53-bb4a-72059debf4c2.contri.1=0x0000000000009a000000000000000001
> trusted.pgfid.48e9eea6-cda6-4e53-bb4a-72059debf4c2=0x00000001
>
> [root@sphere-five ~]# getfattr -d -e hex -m . /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
> getfattr: Removing leading '/' from absolute path names
> # file: srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.home-client-4=0x000000010000000100000000
> trusted.bit-rot.version=0x020000000000000059df1f310006ce63
> trusted.gfid=0xea8ecfd195fd4e48b994fd0a2da226f9
> trusted.glusterfs.quota.48e9eea6-cda6-4e53-bb4a-72059debf4c2.contri.1=0x0000000000009a000000000000000001
> trusted.pgfid.48e9eea6-cda6-4e53-bb4a-72059debf4c2=0x00000001
>
> [root@sphere-six ~]# getfattr -d -e hex -m . /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
> getfattr: Removing leading '/' from absolute path names
> # file: srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.home-client-4=0x000000010000000100000000
> trusted.bit-rot.version=0x020000000000000059df11cd000548ec
> trusted.gfid=0xea8ecfd195fd4e48b994fd0a2da226f9
> trusted.glusterfs.quota.48e9eea6-cda6-4e53-bb4a-72059debf4c2.contri.1=0x0000000000009a000000000000000001
> trusted.pgfid.48e9eea6-cda6-4e53-bb4a-72059debf4c2=0x00000001
>
> Cheers
> Richard
>
> On 26.10.17 07:41, Karthik Subrahmanya wrote:
> > Hey Richard,
> >
> > Could you share the following information please?
> > 1. gluster volume info <volname>
> > 2. getfattr output of that file from all the bricks
> >    getfattr -d -e hex -m . <brickpath/filepath>
> > 3. glustershd & glfsheal logs
> >
> > Regards,
> > Karthik
> >
> > On Thu, Oct 26, 2017 at 10:21 AM, Amar Tumballi <atumb...@redhat.com> wrote:
> >
> > On a side note, try the recently released health report tool and see if
> > it diagnoses any issues in the setup. Currently you may have to run
> > it on all three machines.
> >
> > On 26-Oct-2017 6:50 AM, "Amar Tumballi" <atumb...@redhat.com> wrote:
> >
> > Thanks for this report. This week many of the developers are at
> > Gluster Summit in Prague, will be checking this and respond next
> > week. Hope that's fine.
> >
> > Thanks,
> > Amar
> >
> > On 25-Oct-2017 3:07 PM, "Richard Neuboeck" <h...@tbi.univie.ac.at> wrote:
> >
> > Hi Gluster Gurus,
> >
> > I'm using a gluster volume as home for our users. The volume is
> > replica 3, running on CentOS 7, gluster version 3.10
> > (3.10.6-1.el7.x86_64). Clients are running Fedora 26 and also
> > gluster 3.10 (3.10.6-3.fc26.x86_64).
> >
> > During the data backup I got an I/O error on one file. Manually
> > checking for this file on a client confirms this:
> >
> > ls -l romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/
> > ls: cannot access 'romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4':
> > Input/output error
> > total 2015
> > -rw-------. 1 romanoch tbi 998211 Sep 15 18:44 previous.js
> > -rw-------. 1 romanoch tbi  65222 Oct 17 17:57 previous.jsonlz4
> > -rw-------. 1 romanoch tbi 149161 Oct  1 13:46 recovery.bak
> > -?????????? ? ?        ?        ?            ? recovery.baklz4
> >
> > Out of curiosity I checked all the bricks for this file. It's
> > present there. Making a checksum shows that the file is different on
> > one of the three replica servers.
> >
> > Querying healing information shows that the file should be healed:
> > # gluster volume heal home info
> > Brick sphere-six:/srv/gluster_home/brick
> > /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
> > Status: Connected
> > Number of entries: 1
> >
> > Brick sphere-five:/srv/gluster_home/brick
> > /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
> > Status: Connected
> > Number of entries: 1
> >
> > Brick sphere-four:/srv/gluster_home/brick
> > Status: Connected
> > Number of entries: 0
> >
> > Manually triggering heal doesn't report an error but also does not
> > heal the file.
> > # gluster volume heal home
> > Launching heal operation to perform index self heal on volume home
> > has been successful
> >
> > Same with a full heal
> > # gluster volume heal home full
> > Launching heal operation to perform full self heal on volume home
> > has been successful
> >
> > According to the split brain query that's not the problem:
> > # gluster volume heal home info split-brain
> > Brick sphere-six:/srv/gluster_home/brick
> > Status: Connected
> > Number of entries in split-brain: 0
> >
> > Brick sphere-five:/srv/gluster_home/brick
> > Status: Connected
> > Number of entries in split-brain: 0
> >
> > Brick sphere-four:/srv/gluster_home/brick
> > Status: Connected
> > Number of entries in split-brain: 0
> >
> > I have no idea why this situation arose in the first place and also
> > no idea how to solve this problem. I would highly appreciate any
> > helpful feedback I can get.
> >
> > The only mention in the logs matching this file is a rename operation:
> > /var/log/glusterfs/bricks/srv-gluster_home-brick.log:[2017-10-23
> > 09:19:11.561661] I [MSGID: 115061]
> > [server-rpc-fops.c:1022:server_rename_cbk] 0-home-server: 5266153:
> > RENAME /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.jsonlz4
> > (48e9eea6-cda6-4e53-bb4a-72059debf4c2/recovery.jsonlz4) ->
> > /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
> > (48e9eea6-cda6-4e53-bb4a-72059debf4c2/recovery.baklz4), client:
> > romulus.tbi.univie.ac.at-11894-2017/10/18-07:06:07:206366-home-client-3-0-0,
> > error-xlator: home-posix [No data available]
> >
> > I enabled directory quotas the same day this problem showed up but
> > I'm not sure how quotas could have an effect like this (maybe unless
> > the limit is reached but that's also not the case).
> >
> > Thanks again if anyone has an idea.
> > Cheers
> > Richard
> > --
> > /dev/null
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users