3.5.2, installed via rpm from the official gluster repo, running on an Amazon AMI.

Thanks
Ramesh

> On Sep 6, 2014, at 7:29 AM, Pranith Kumar Karampuri <pkara...@redhat.com> wrote:
>
> What is the glusterfs version where you ran into this issue?
>
> Pranith
>> On 09/05/2014 11:52 PM, Ramesh Natarajan wrote:
>> I have a replicate glusterfs setup on 3 bricks (replicate = 3). I have
>> client and server quorum turned on. I rebooted one of the 3 bricks. When it
>> came back up, the client started throwing error messages that one of the
>> files went into split-brain.
>>
>> When I check the file sizes and sha1sum on the bricks, 2 of the 3 bricks
>> have the same value. So by quorum logic the first brick should have healed
>> with this information. But I don't see that happening. Can someone please
>> tell me if this is expected behavior?
>>
>> Can someone please tell me if I have things misconfigured...
>>
>> thanks
>> Ramesh
>>
>> My config is as below.
>>
>> [root@ip-172-31-12-218 ~]# gluster volume info
>>
>> Volume Name: PL1
>> Type: Replicate
>> Volume ID: a7aabae0-c6bc-40a9-8b26-0498d488ee39
>> Status: Started
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: 172.31.38.189:/data/vol1/gluster-data
>> Brick2: 172.31.16.220:/data/vol1/gluster-data
>> Brick3: 172.31.12.218:/data/vol1/gluster-data
>> Options Reconfigured:
>> performance.cache-size: 2147483648
>> nfs.addr-namelookup: off
>> network.ping-timeout: 12
>> cluster.server-quorum-type: server
>> nfs.enable-ino32: on
>> cluster.quorum-type: auto
>> cluster.server-quorum-ratio: 51%
>>
>> Volume Name: PL2
>> Type: Replicate
>> Volume ID: fadb3671-7a92-40b7-bccd-fbacf672f6dc
>> Status: Started
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: 172.31.38.189:/data/vol2/gluster-data
>> Brick2: 172.31.16.220:/data/vol2/gluster-data
>> Brick3: 172.31.12.218:/data/vol2/gluster-data
>> Options Reconfigured:
>> performance.cache-size: 2147483648
>> nfs.addr-namelookup: off
>> network.ping-timeout: 12
>> cluster.server-quorum-type: server
>> nfs.enable-ino32: on
>> cluster.quorum-type: auto
>> cluster.server-quorum-ratio: 51%
>> [root@ip-172-31-12-218 ~]#
>>
>> I have 2 clients, each mounting one of the volumes. At no time is the same
>> volume mounted by more than one client.
>>
>> mount -t glusterfs -o \
>>   defaults,enable-ino32,direct-io-mode=disable,log-level=WARNING,log-file=/var/log/gluster.log,backupvolfile-server=172.31.38.189,backupvolfile-server=172.31.12.218,background-qlen=256 \
>>   172.31.16.220:/PL2 /mnt/vm
>>
>> I restarted Brick 1 (172.31.38.189) and when it came up, one of the files
>> on the PL2 volume went into split-brain.
>>
>> [2014-09-05 17:59:42.997308] W [afr-open.c:209:afr_open] 0-PL2-replicate-0: failed to open as split brain seen, returning EIO
>> [2014-09-05 17:59:42.997350] W [fuse-bridge.c:2209:fuse_writev_cbk] 0-glusterfs-fuse: 3359683: WRITE => -1 (Input/output error)
>> [2014-09-05 17:59:42.997476] W [fuse-bridge.c:690:fuse_truncate_cbk] 0-glusterfs-fuse: 3359684: FTRUNCATE() ERR => -1 (Input/output error)
>> [2014-09-05 17:59:42.997647] W [fuse-bridge.c:2209:fuse_writev_cbk] 0-glusterfs-fuse: 3359686: WRITE => -1 (Input/output error)
>> [2014-09-05 17:59:42.997783] W [fuse-bridge.c:1214:fuse_err_cbk] 0-glusterfs-fuse: 3359687: FLUSH() ERR => -1 (Input/output error)
>> [2014-09-05 17:59:44.009187] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 0-PL2-replicate-0: Unable to self-heal contents of '/apache_cp_mm1/logs/access_log.2014-09-05-17_00_00' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 1 ] [ 3398 0 0 ] [ 3398 0 0 ] ]
>> [2014-09-05 17:59:44.011116] E [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 0-PL2-replicate-0: backgroung data self heal failed, on /apache_cp_mm1/logs/access_log.2014-09-05-17_00_00
>> [2014-09-05 17:59:44.011480] W [afr-open.c:209:afr_open] 0-PL2-replicate-0: failed to open as split brain seen, returning EIO
>>
>> Starting time of crawl: Fri Sep 5 17:55:32 2014
>>
>> Ending time of crawl: Fri Sep 5 17:55:33 2014
>>
>> Type of crawl: INDEX
>> No. of entries healed: 4
>> No. of entries in split-brain: 1
>> No. of heal failed entries: 0
>> [root@ip-172-31-16-220 ~]# gluster volume heal PL2 info
>> Brick ip-172-31-38-189:/data/vol2/gluster-data/
>> /apache_cp_mm1/logs/mm1.access_log.2014-09-05-17_00_00
>> Number of entries: 1
>>
>> Brick ip-172-31-16-220:/data/vol2/gluster-data/
>> /apache_cp_mm1/logs/mm1.access_log.2014-09-05-17_00_00
>> Number of entries: 1
>>
>> Brick ip-172-31-12-218:/data/vol2/gluster-data/
>> /apache_cp_mm1/logs/mm1.access_log.2014-09-05-17_00_00
>> Number of entries: 1
>>
>> BRICK 1
>> =======
>>
>> [root@ip-172-31-38-189 ~]# sha1sum access_log.2014-09-05-17_00_00
>> aa72d0f3949700f67b61d3c58fdbc75b772d607b access_log.2014-09-05-17_00_00
>>
>> [root@ip-172-31-38-189 ~]# ls -al
>> total 12760
>> dr-xr-x--- 3 root root 4096 Sep 5 17:42 .
>> dr-xr-xr-x 24 root root 4096 Sep 5 17:34 ..
>> -rw-r----- 1 root root 13019808 Sep 5 17:42 access_log.2014-09-05-17_00_00
>>
>> [root@ip-172-31-38-189 ~]# getfattr -d -m . -e hex /data/vol2/gluster-data/apache_cp_mm1/logs/access_log.2014-09-05-17_00_00
>> getfattr: Removing leading '/' from absolute path names
>> # file: data/vol2/gluster-data/apache_cp_mm1/logs/access_log.2014-09-05-17_00_00
>> trusted.afr.PL2-client-0=0x000000000000000000000000
>> trusted.afr.PL2-client-1=0x000000010000000000000000
>> trusted.afr.PL2-client-2=0x000000010000000000000000
>> trusted.gfid=0xea950263977e46bf89a0ef631ca139c2
>>
>> BRICK 2
>> =======
>>
>> [root@ip-172-31-16-220 ~]# sha1sum access_log.2014-09-05-17_00_00
>> 0f7b72f77a792b5c2b68456c906cf7b93287f0d6 access_log.2014-09-05-17_00_00
>>
>> [root@ip-172-31-16-220 ~]# getfattr -d -m . -e hex /data/vol2/gluster-data/apache_cp_mm1/logs/access_log.2014-09-05-17_00_00
>> getfattr: Removing leading '/' from absolute path names
>> # file: data/vol2/gluster-data/apache_cp_mm1/logs/access_log.2014-09-05-17_00_00
>> trusted.afr.PL2-client-0=0x00000d460000000000000000
>> trusted.afr.PL2-client-1=0x000000000000000000000000
>> trusted.afr.PL2-client-2=0x000000000000000000000000
>> trusted.gfid=0xea950263977e46bf89a0ef631ca139c2
>>
>> BRICK 3
>> =======
>>
>> [root@ip-172-31-12-218 ~]# sha1sum access_log.2014-09-05-17_00_00
>> 0f7b72f77a792b5c2b68456c906cf7b93287f0d6 access_log.2014-09-05-17_00_00
>>
>> [root@ip-172-31-12-218 ~]# getfattr -d -m . -e hex /data/vol2/gluster-data/apache_cp_mm1/logs/access_log.2014-09-05-17_00_00
>> getfattr: Removing leading '/' from absolute path names
>> # file: data/vol2/gluster-data/apache_cp_mm1/logs/access_log.2014-09-05-17_00_00
>> trusted.afr.PL2-client-0=0x00000d460000000000000000
>> trusted.afr.PL2-client-1=0x000000000000000000000000
>> trusted.afr.PL2-client-2=0x000000000000000000000000
>> trusted.gfid=0xea950263977e46bf89a0ef631ca139c2
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
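
For anyone reading the getfattr output above: the trusted.afr.PL2-client-N values can be decoded into the same "Pending matrix" that the self-heal log prints. The following is a minimal sketch, not GlusterFS code. It assumes each trusted.afr value is 12 bytes holding three big-endian 32-bit counters (data, metadata, entry), which is how the AFR changelog is commonly described and which agrees with the numbers in the log. The helper names decode_afr and mutual_accusations are made up for illustration.

```python
import struct


def decode_afr(hex_value):
    """Return (data, metadata, entry) pending counters from one trusted.afr value.

    Assumption: the value is 12 bytes, i.e. three unsigned 32-bit
    big-endian integers in the order data, metadata, entry.
    """
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected a 12-byte trusted.afr value")
    return struct.unpack(">III", raw)


# Values copied from the getfattr output in this thread (volume PL2).
# Row i is brick i+1; column j is trusted.afr.PL2-client-j on that brick.
XATTRS = [
    ["0x000000000000000000000000", "0x000000010000000000000000", "0x000000010000000000000000"],
    ["0x00000d460000000000000000", "0x000000000000000000000000", "0x000000000000000000000000"],
    ["0x00000d460000000000000000", "0x000000000000000000000000", "0x000000000000000000000000"],
]

# Keep only the data counter (index 0) of each value.
matrix = [[decode_afr(value)[0] for value in row] for row in XATTRS]


def mutual_accusations(m):
    """List index pairs (i, j) where copy i has pending ops for j and j for i."""
    n = len(m)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if m[i][j] > 0 and m[j][i] > 0]


conflicts = mutual_accusations(matrix)

print(matrix)
print(conflicts)
```

The decoded data counters come out as [ [ 0 1 1 ] [ 3398 0 0 ] [ 3398 0 0 ] ], the same as the pending matrix in the log. Brick 1 records pending operations against bricks 2 and 3, while bricks 2 and 3 each record pending operations against brick 1. That mutual blame looks like the reason the file is reported as a possible split-brain even though two of the three copies have identical checksums.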