Re: [Gluster-users] after upgrade to 3.6.7 : Internal error xfs_attr3_leaf_write_verify
On Thu, Dec 03, 2015 at 03:16:54PM -0500, Vijay Bellur wrote:
> Looks like an issue with xfs. Adding Brian to check if it is a familiar
> problem.
>
> Regards,
> Vijay
>
> ----- Original Message -----
> > From: "Dietmar Putz"
> > To: gluster-users@gluster.org
> > Sent: Thursday, December 3, 2015 6:06:11 AM
> > Subject: [Gluster-users] after upgrade to 3.6.7 : Internal error xfs_attr3_leaf_write_verify
> >
> > Hello all,
> >
> > On 1st December I upgraded two 6-node clusters from glusterfs 3.5.6 to 3.6.7.
> > All of them are equal in hw, os and patch level, currently running Ubuntu
> > 14.04 LTS via a do-release-upgrade from 12.04 LTS (this was done before the
> > gfs upgrade to 3.5.6, not directly before upgrading to 3.6.7).
> > Because of a geo-replication issue, all of the nodes have rsync 3.1.1.3
> > installed instead of the 3.1.0 that comes from the repositories. This is the
> > only deviation from the Ubuntu repositories for 14.04 LTS.
> > Since the upgrade to gfs 3.6.7, glusterd on two nodes of the same cluster
> > goes offline after an xfs_attr3_leaf_write_verify error on the underlying
> > bricks, as shown below.
> > This happens about every 4-5 hours after the problem is worked around by an
> > umount/remount of the brick. It makes no difference whether xfs_check or
> > xfs_repair is run before the remount; xfs_check / xfs_repair did not show
> > any faults. The underlying hw is a RAID 5 volume on an LSI 9271-8i, and
> > MegaCLI does not show any errors.
> > The syslog does not show more than the dmesg output below.
> > Every time, the same two nodes of the same cluster are affected.
> > As shown in dmesg and syslog, the system reports the
> > xfs_attr3_leaf_write_verify error about 38 min before finally giving up.
> > For both events I could not find corresponding events in the gluster logs.
> > This is strange... the gluster setup grew historically from 3.2.5 and 3.3 to
> > 3.4.6/7, which ran well for months; gfs 3.5.6 ran for about two weeks, and
> > the upgrade to 3.6.7 was done because of a geo-repl log flood.
> > Even though I have no hint/evidence that this is caused by gfs 3.6.7,
> > somehow I believe that this is the case...
> > Has anybody experienced such an error, or does anyone have hints for getting
> > out of this big problem...?
> > Unfortunately, the affected cluster is the master of a geo-replication which
> > has not been running well since the update from gfs 3.4.7... Fortunately,
> > both affected gluster nodes are not in the same sub-volume.
> >
> > Any help is appreciated...
> >
> > Best regards
> > Dietmar
> >
> > ...
> > - root@gluster-ger-ber-10 /var/log $ dmesg -T
> > ...
> > [Wed Dec 2 12:43:47 2015] XFS (sdc1): xfs_log_force: error 5 returned.
> > [Wed Dec 2 12:43:48 2015] XFS (sdc1): xfs_log_force: error 5 returned.
> > [Wed Dec 2 12:45:58 2015] XFS (sdc1): Mounting Filesystem
> > [Wed Dec 2 12:45:58 2015] XFS (sdc1): Starting recovery (logdev: internal)
> > [Wed Dec 2 12:45:59 2015] XFS (sdc1): Ending recovery (logdev: internal)
> > [Wed Dec 2 13:11:53 2015] XFS (sdc1): Mounting Filesystem
> > [Wed Dec 2 13:11:54 2015] XFS (sdc1): Ending clean mount
> > [Wed Dec 2 13:12:29 2015] init: statd main process (25924) killed by KILL signal
> > [Wed Dec 2 13:12:29 2015] init: statd main process ended, respawning
> > [Wed Dec 2 13:13:24 2015] init: statd main process (13433) killed by KILL signal
> > [Wed Dec 2 13:13:24 2015] init: statd main process ended, respawning
> > [Wed Dec 2 17:22:28 2015] 8807076b1000: 00 00 00 00 00 00 00 00 fb ee 00 00 00 00 00 00
> > [Wed Dec 2 17:22:28 2015] 8807076b1010: 10 00 00 00 00 20 0f e0 00 00 00 00 00 00 00 00
> > [Wed Dec 2 17:22:28 2015] 8807076b1020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [Wed Dec 2 17:22:28 2015] 8807076b1030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [Wed Dec 2 17:22:28 2015] XFS (sdc1): Internal error xfs_attr3_leaf_write_verify at line 216 of file /build/linux-XHaR1x/linux-3.13.0/fs/xfs/xfs_attr_leaf.c. Caller 0xa01a66f0

That's a write verifier error on an extended attribute write. The purpose of the write verifier is to check metadata structure immediately prior to write submission. Failure means some kind of corruption has occurred in memory, and the filesystem shuts down to prevent any further damage.

Is this an upstream stable 3.13 kernel or a distro kernel? You could try something more recent and see if it resolves the issue.
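As Brian notes, the verifier failure is the one line that matters in the capture above; the repeated xfs_log_force noise is just the aftermath. A minimal sketch of pulling that event out of a saved `dmesg -T` capture (the inlined sample lines are taken from this thread; in practice you would read from a file saved with `dmesg -T > dmesg.txt`):

```shell
# Count write-verifier failures in a dmesg capture; the here-doc stands
# in for the real capture file.
verifier_hits=$(grep -c 'Internal error xfs_attr3_leaf_write_verify' <<'EOF'
[Wed Dec 2 17:22:28 2015] XFS (sdc1): Internal error xfs_attr3_leaf_write_verify at line 216 of file /build/linux-XHaR1x/linux-3.13.0/fs/xfs/xfs_attr_leaf.c. Caller 0xa01a66f0
[Wed Dec 2 12:43:47 2015] XFS (sdc1): xfs_log_force: error 5 returned.
EOF
)
echo "verifier failures: $verifier_hits"
```

The kernel source path in the message (`/build/linux-XHaR1x/linux-3.13.0/...`) also answers Brian's question: this is the Ubuntu distro build of 3.13, not an upstream stable kernel.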
Re: [Gluster-users] after upgrade to 3.6.7 : Internal error xfs_attr3_leaf_write_verify
Hi Saravana,

we have been having this issue since upgrading from glusterfs 3.4.7 to 3.5.6 to 3.6.7. We are now trying to downgrade the kernel. The bug is tracked here:
https://bugs.launchpad.net/ubuntu/+source/linux-lts-trusty/+bug/1468039

Regards
Julius

On 06.12.2015 19:00, Saravanakumar Arumugam wrote:
> Hi,
>
> This seems like an XFS filesystem issue. Can you report this error to the xfs mailing list?
>
> Thanks,
> Saravana
>
> On 12/06/2015 05:23 AM, Julius Thomas wrote:
> > Dear Gluster Users,
> >
> > after fixing the problem from my colleague's last mail by upgrading to
> > kernel 3.19.0-39-generic, in case the xfs tree had changes relevant to
> > this bug, the xfs filesystem crashed again after 4-5 hours on several peers.
> >
> > Does anyone have recommendations for fixing this problem? Are there known
> > issues with xfs and Ubuntu 14.04? What is the latest stable release of
> > gluster 3, v3.6.3?
>
> You can find the latest gluster here:
> http://download.gluster.org/pub/gluster/glusterfs/LATEST/
> and follow the link here for Ubuntu:
> http://download.gluster.org/pub/gluster/glusterfs/LATEST/Ubuntu/
>
> > Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018838] XFS (sdc1): Metadata corruption detected at xfs_attr3_leaf_write_verify+0xe5/0x100 [xfs], block 0x44458e670
> > Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018879] XFS (sdc1): Unmount and run xfs_repair
> > Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018895] XFS (sdc1): First 64 bytes of corrupted metadata buffer:
> > Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018916] 880417ff3000: 00 00 00 00 00 00 00 00 fb ee 00 00 00 00 00 00
> > Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018956] 880417ff3010: 10 00 00 00 00 20 0f e0 00 00 00 00 00 00 00 00
> > Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018984] 880417ff3020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.019011] 880417ff3030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.019041] XFS (sdc1): xfs_do_force_shutdown(0x8) called from line 1249 of file /build/linux-lts-vivid-1jarlV/linux-lts-vivid-3.19.0/fs/xfs/xfs_buf.c. Return address = 0xc02bbd22
> > Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.019044] XFS (sdc1): Corruption of in-memory data detected. Shutting down filesystem
> > Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.019069] XFS (sdc1): Please umount the filesystem and rectify the problem(s)
> > Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.069906] XFS (sdc1): xfs_log_force: error -5 returned.
> > Dec 5 21:15:08 gluster-ger-ber-11 gluster-export[4447]: [2015-12-05 21:15:08.797327] M [posix-helpers.c:1559:posix_health_check_thread_proc] 0-ger-ber-01-posix: health-check failed, going down
> > Dec 5 21:15:18 gluster-ger-ber-11 kernel: [16594.068660] XFS (sdc1): xfs_log_force: error -5 returned.
> > Dec 5 21:15:38 gluster-ger-ber-11 gluster-export[4447]: [2015-12-05 21:15:38.797422] M [posix-helpers.c:1564:posix_health_check_thread_proc] 0-ger-ber-01-posix: still alive! -> SIGTERM
> > Dec 5 21:15:48 gluster-ger-ber-11 kernel: [16624.119428] XFS (sdc1): xfs_log_force: error -5 returned.
> > Dec 5 21:16:18 gluster-ger-ber-11 kernel: [16654.170134] XFS (sdc1): xfs_log_force: error -5 returned.
> > Dec 5 21:16:48 gluster-ger-ber-11 kernel: [16684.220834] XFS (sdc1): xfs_log_force: error -5 returned.
> > Dec 5 21:17:01 gluster-ger-ber-11 CRON[17656]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
> > [... XFS (sdc1): xfs_log_force: error -5 returned, repeating every ~30 s through 21:23:49 ...]
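Since the thread has now seen the same failure on both the trusty 3.13 kernels and the lts-vivid 3.19.0-39 kernel, it may be worth flagging a brick host's running kernel before redeploying. A hedged sketch (the version list below is only what this thread reports, not an authoritative affected list):

```shell
# Flag kernels reported in this thread as showing the
# xfs_attr3_leaf_write_verify failure. Hypothetical helper logic;
# in practice: running=$(uname -r)
running="3.19.0-39-generic"
case "$running" in
  3.13.*|3.19.0-39-*) verdict="reported-affected" ;;
  *)                  verdict="not-reported" ;;
esac
echo "$running: $verdict"
```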
Re: [Gluster-users] after upgrade to 3.6.7 : Internal error xfs_attr3_leaf_write_verify
Hi,

This seems like an XFS filesystem issue. Can you report this error to the xfs mailing list?

Thanks,
Saravana

On 12/06/2015 05:23 AM, Julius Thomas wrote:
> Dear Gluster Users,
>
> after fixing the problem from my colleague's last mail by upgrading to
> kernel 3.19.0-39-generic, in case the xfs tree had changes relevant to
> this bug, the xfs filesystem crashed again after 4-5 hours on several peers.
>
> Does anyone have recommendations for fixing this problem? Are there known
> issues with xfs and Ubuntu 14.04? What is the latest stable release of
> gluster 3, v3.6.3?

You can find the latest gluster here:
http://download.gluster.org/pub/gluster/glusterfs/LATEST/
and follow the link here for Ubuntu:
http://download.gluster.org/pub/gluster/glusterfs/LATEST/Ubuntu/

> Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018838] XFS (sdc1): Metadata corruption detected at xfs_attr3_leaf_write_verify+0xe5/0x100 [xfs], block 0x44458e670
> Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018879] XFS (sdc1): Unmount and run xfs_repair
> Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018895] XFS (sdc1): First 64 bytes of corrupted metadata buffer:
> Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018916] 880417ff3000: 00 00 00 00 00 00 00 00 fb ee 00 00 00 00 00 00
> Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018956] 880417ff3010: 10 00 00 00 00 20 0f e0 00 00 00 00 00 00 00 00
> Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018984] 880417ff3020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.019011] 880417ff3030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.019041] XFS (sdc1): xfs_do_force_shutdown(0x8) called from line 1249 of file /build/linux-lts-vivid-1jarlV/linux-lts-vivid-3.19.0/fs/xfs/xfs_buf.c. Return address = 0xc02bbd22
> Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.019044] XFS (sdc1): Corruption of in-memory data detected. Shutting down filesystem
> Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.019069] XFS (sdc1): Please umount the filesystem and rectify the problem(s)
> Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.069906] XFS (sdc1): xfs_log_force: error -5 returned.
> Dec 5 21:15:08 gluster-ger-ber-11 gluster-export[4447]: [2015-12-05 21:15:08.797327] M [posix-helpers.c:1559:posix_health_check_thread_proc] 0-ger-ber-01-posix: health-check failed, going down
> Dec 5 21:15:18 gluster-ger-ber-11 kernel: [16594.068660] XFS (sdc1): xfs_log_force: error -5 returned.
> Dec 5 21:15:38 gluster-ger-ber-11 gluster-export[4447]: [2015-12-05 21:15:38.797422] M [posix-helpers.c:1564:posix_health_check_thread_proc] 0-ger-ber-01-posix: still alive! -> SIGTERM
> Dec 5 21:15:48 gluster-ger-ber-11 kernel: [16624.119428] XFS (sdc1): xfs_log_force: error -5 returned.
> Dec 5 21:16:18 gluster-ger-ber-11 kernel: [16654.170134] XFS (sdc1): xfs_log_force: error -5 returned.
> Dec 5 21:16:48 gluster-ger-ber-11 kernel: [16684.220834] XFS (sdc1): xfs_log_force: error -5 returned.
> Dec 5 21:17:01 gluster-ger-ber-11 CRON[17656]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
> [... XFS (sdc1): xfs_log_force: error -5 returned, repeating every ~30 s through 21:25:19 ...]
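Before reporting to the xfs list, it helps to separate the one real corruption event from the follow-on noise: every `xfs_log_force: error -5` after the shutdown just means the filesystem is already down, not that new corruption occurred. A small sketch over a syslog excerpt (sample lines inlined from this thread):

```shell
# Count post-shutdown xfs_log_force errors in a syslog excerpt;
# a steady ~30 s cadence is the shut-down fs refusing I/O, nothing new.
force_errors=$(grep -c 'xfs_log_force: error -5' <<'EOF'
Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.069906] XFS (sdc1): xfs_log_force: error -5 returned.
Dec 5 21:15:18 gluster-ger-ber-11 kernel: [16594.068660] XFS (sdc1): xfs_log_force: error -5 returned.
Dec 5 21:15:48 gluster-ger-ber-11 kernel: [16624.119428] XFS (sdc1): xfs_log_force: error -5 returned.
EOF
)
echo "xfs_log_force errors: $force_errors"
```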
Re: [Gluster-users] after upgrade to 3.6.7 : Internal error xfs_attr3_leaf_write_verify
Dear Gluster Users,

after fixing the problem from my colleague's last mail by upgrading to kernel 3.19.0-39-generic, in case the xfs tree had changes relevant to this bug, the xfs filesystem crashed again after 4-5 hours on several peers.

Does anyone have recommendations for fixing this problem? Are there known issues with xfs and Ubuntu 14.04? What is the latest stable release of gluster 3, v3.6.3?

Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018838] XFS (sdc1): Metadata corruption detected at xfs_attr3_leaf_write_verify+0xe5/0x100 [xfs], block 0x44458e670
Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018879] XFS (sdc1): Unmount and run xfs_repair
Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018895] XFS (sdc1): First 64 bytes of corrupted metadata buffer:
Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018916] 880417ff3000: 00 00 00 00 00 00 00 00 fb ee 00 00 00 00 00 00
Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018956] 880417ff3010: 10 00 00 00 00 20 0f e0 00 00 00 00 00 00 00 00
Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.018984] 880417ff3020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.019011] 880417ff3030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.019041] XFS (sdc1): xfs_do_force_shutdown(0x8) called from line 1249 of file /build/linux-lts-vivid-1jarlV/linux-lts-vivid-3.19.0/fs/xfs/xfs_buf.c. Return address = 0xc02bbd22
Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.019044] XFS (sdc1): Corruption of in-memory data detected. Shutting down filesystem
Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.019069] XFS (sdc1): Please umount the filesystem and rectify the problem(s)
Dec 5 21:14:48 gluster-ger-ber-11 kernel: [16564.069906] XFS (sdc1): xfs_log_force: error -5 returned.
Dec 5 21:15:08 gluster-ger-ber-11 gluster-export[4447]: [2015-12-05 21:15:08.797327] M [posix-helpers.c:1559:posix_health_check_thread_proc] 0-ger-ber-01-posix: health-check failed, going down
Dec 5 21:15:18 gluster-ger-ber-11 kernel: [16594.068660] XFS (sdc1): xfs_log_force: error -5 returned.
Dec 5 21:15:38 gluster-ger-ber-11 gluster-export[4447]: [2015-12-05 21:15:38.797422] M [posix-helpers.c:1564:posix_health_check_thread_proc] 0-ger-ber-01-posix: still alive! -> SIGTERM
Dec 5 21:15:48 gluster-ger-ber-11 kernel: [16624.119428] XFS (sdc1): xfs_log_force: error -5 returned.
Dec 5 21:16:18 gluster-ger-ber-11 kernel: [16654.170134] XFS (sdc1): xfs_log_force: error -5 returned.
Dec 5 21:16:48 gluster-ger-ber-11 kernel: [16684.220834] XFS (sdc1): xfs_log_force: error -5 returned.
Dec 5 21:17:01 gluster-ger-ber-11 CRON[17656]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
[... XFS (sdc1): xfs_log_force: error -5 returned, repeating every ~30 s through 21:25:19 ...]

On 03.12.2015 12:06, Dietmar Putz wrote:
> Hello all,
>
> On 1st December I upgraded two 6-node clusters from glusterfs 3.5.6 to 3.6.7.
> All of them are equal in hw, os and patch level, currently running Ubuntu
> 14.04 LTS via a do-release-upgrade from 12.04 LTS (this was done before the
> gfs upgrade to 3.5.6, not directly before upgrading to
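The syslog above also shows how the brick actually goes offline: the brick process's posix health checker (the emergency-level `M` entries from `gluster-export[4447]`) notices the dead filesystem ~20 s after the XFS shutdown and SIGTERMs the brick 30 s later. A sketch of pulling just those events out of a syslog excerpt (sample lines inlined from this thread):

```shell
# Extract the glusterfsd health-check events that take the brick down
# after the XFS shutdown.
health_events=$(grep -c 'posix_health_check_thread_proc' <<'EOF'
Dec 5 21:15:08 gluster-ger-ber-11 gluster-export[4447]: [2015-12-05 21:15:08.797327] M [posix-helpers.c:1559:posix_health_check_thread_proc] 0-ger-ber-01-posix: health-check failed, going down
Dec 5 21:15:38 gluster-ger-ber-11 gluster-export[4447]: [2015-12-05 21:15:38.797422] M [posix-helpers.c:1564:posix_health_check_thread_proc] 0-ger-ber-01-posix: still alive! -> SIGTERM
EOF
)
echo "health-check events: $health_events"
```

This explains why the gluster logs show no corruption of their own: gluster is only reacting to the filesystem going away underneath it.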
Re: [Gluster-users] after upgrade to 3.6.7 : Internal error xfs_attr3_leaf_write_verify
Looks like an issue with xfs. Adding Brian to check if it is a familiar problem.

Regards,
Vijay

----- Original Message -----
> From: "Dietmar Putz"
> To: gluster-users@gluster.org
> Sent: Thursday, December 3, 2015 6:06:11 AM
> Subject: [Gluster-users] after upgrade to 3.6.7 : Internal error xfs_attr3_leaf_write_verify
>
> Hello all,
>
> On 1st December I upgraded two 6-node clusters from glusterfs 3.5.6 to 3.6.7.
> All of them are equal in hw, os and patch level, currently running Ubuntu
> 14.04 LTS via a do-release-upgrade from 12.04 LTS (this was done before the
> gfs upgrade to 3.5.6, not directly before upgrading to 3.6.7).
> Because of a geo-replication issue, all of the nodes have rsync 3.1.1.3
> installed instead of the 3.1.0 that comes from the repositories. This is the
> only deviation from the Ubuntu repositories for 14.04 LTS.
> Since the upgrade to gfs 3.6.7, glusterd on two nodes of the same cluster
> goes offline after an xfs_attr3_leaf_write_verify error on the underlying
> bricks, as shown below.
> This happens about every 4-5 hours after the problem is worked around by an
> umount/remount of the brick. It makes no difference whether xfs_check or
> xfs_repair is run before the remount; xfs_check / xfs_repair did not show
> any faults. The underlying hw is a RAID 5 volume on an LSI 9271-8i, and
> MegaCLI does not show any errors.
> The syslog does not show more than the dmesg output below.
> Every time, the same two nodes of the same cluster are affected.
> As shown in dmesg and syslog, the system reports the
> xfs_attr3_leaf_write_verify error about 38 min before finally giving up.
> For both events I could not find corresponding events in the gluster logs.
> This is strange... the gluster setup grew historically from 3.2.5 and 3.3 to
> 3.4.6/7, which ran well for months; gfs 3.5.6 ran for about two weeks, and
> the upgrade to 3.6.7 was done because of a geo-repl log flood.
> Even though I have no hint/evidence that this is caused by gfs 3.6.7,
> somehow I believe that this is the case...
> Has anybody experienced such an error, or does anyone have hints for getting
> out of this big problem...?
> Unfortunately, the affected cluster is the master of a geo-replication which
> has not been running well since the update from gfs 3.4.7... Fortunately,
> both affected gluster nodes are not in the same sub-volume.
>
> Any help is appreciated...
>
> Best regards
> Dietmar
>
> [ 09:32:29 ] - root@gluster-ger-ber-10 /var/log $ gluster volume info
>
> Volume Name: ger-ber-01
> Type: Distributed-Replicate
> Volume ID: 6a071cfa-b150-4f0b-b1ed-96ab5d4bd671
> Status: Started
> Number of Bricks: 3 x 2 = 6
> Transport-type: tcp
> Bricks:
> Brick1: gluster-ger-ber-11-int:/gluster-export
> Brick2: gluster-ger-ber-12-int:/gluster-export
> Brick3: gluster-ger-ber-09-int:/gluster-export
> Brick4: gluster-ger-ber-10-int:/gluster-export
> Brick5: gluster-ger-ber-07-int:/gluster-export
> Brick6: gluster-ger-ber-08-int:/gluster-export
> Options Reconfigured:
> changelog.changelog: on
> geo-replication.ignore-pid-check: on
> cluster.min-free-disk: 200GB
> geo-replication.indexing: on
> auth.allow: 10.0.1.*,188.138.82.*,188.138.123.*,82.193.249.198,82.193.249.200,31.7.178.137,31.7.178.135,31.7.180.109,31.7.180.98,82.199.147.*,104.155.22.202,104.155.30.201,104.155.5.117,104.155.11.253,104.155.15.34,104.155.25.145,146.148.120.255,31.7.180.148
> nfs.disable: off
> performance.cache-refresh-timeout: 2
> performance.io-thread-count: 32
> performance.cache-size: 1024MB
> performance.read-ahead: on
> performance.cache-min-file-size: 0
> network.ping-timeout: 10
> [ 09:32:52 ] - root@gluster-ger-ber-10 /var/log $
>
> [ 19:10:55 ] - root@gluster-ger-ber-10 /var/log $ gluster volume status
> Status of volume: ger-ber-01
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------
> Brick gluster-ger-ber-11-int:/gluster-export    49152   Y       15994
> Brick gluster-ger-ber-12-int:/gluster-export    N/A     N       N/A
> Brick gluster-ger-ber-09-int:/gluster-export    49152   Y       10965
> Brick gluster-ger-ber-10-int:/gluster-export    N/A     N       N/A
> Brick gluster-ger-ber-07-int:/gluster-export    49152   Y       18542
> Brick gluster-ger-ber-08-int:/gluster-export    49152   Y       20275
> NFS Server on localhost                         2049    Y       13658
> Self-heal Daemon on localhost                   N/A     Y       13666
> NFS Server on gluster-ger-ber-09-int            2049    Y       13503
> Self-heal Daemon on gluster-ger-ber-09-int      N/A     Y       13511
> NFS Server on gluster-ger-ber-07-int            2049    Y       21526
> Self-heal Daemon on gluster-ger-ber-07-int      N/A     Y       21534
> NFS Server on gluster-ger
[Gluster-users] after upgrade to 3.6.7 : Internal error xfs_attr3_leaf_write_verify
Hello all,

on 1st December I upgraded two 6-node clusters from glusterfs 3.5.6 to 3.6.7. All of them are equal in hw, os and patch level, currently running Ubuntu 14.04 LTS via a do-release-upgrade from 12.04 LTS (this was done before the gfs upgrade to 3.5.6, not directly before upgrading to 3.6.7).

Because of a geo-replication issue, all of the nodes have rsync 3.1.1.3 installed instead of the 3.1.0 that comes from the repositories. This is the only deviation from the Ubuntu repositories for 14.04 LTS.

Since the upgrade to gfs 3.6.7, glusterd on two nodes of the same cluster goes offline after an xfs_attr3_leaf_write_verify error on the underlying bricks, as shown below. This happens about every 4-5 hours after the problem is worked around by an umount/remount of the brick. It makes no difference whether xfs_check or xfs_repair is run before the remount; xfs_check / xfs_repair did not show any faults. The underlying hw is a RAID 5 volume on an LSI 9271-8i, and MegaCLI does not show any errors. The syslog does not show more than the dmesg output below.

Every time, the same two nodes of the same cluster are affected. As shown in dmesg and syslog, the system reports the xfs_attr3_leaf_write_verify error about 38 min before finally giving up. For both events I could not find corresponding events in the gluster logs.

This is strange... the gluster setup grew historically from 3.2.5 and 3.3 to 3.4.6/7, which ran well for months; gfs 3.5.6 ran for about two weeks, and the upgrade to 3.6.7 was done because of a geo-repl log flood. Even though I have no hint/evidence that this is caused by gfs 3.6.7, somehow I believe that this is the case...

Has anybody experienced such an error, or does anyone have hints for getting out of this big problem...? Unfortunately, the affected cluster is the master of a geo-replication which has not been running well since the update from gfs 3.4.7... Fortunately, both affected gluster nodes are not in the same sub-volume.

Any help is appreciated...

Best regards
Dietmar

[ 09:32:29 ] - root@gluster-ger-ber-10 /var/log $ gluster volume info

Volume Name: ger-ber-01
Type: Distributed-Replicate
Volume ID: 6a071cfa-b150-4f0b-b1ed-96ab5d4bd671
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: gluster-ger-ber-11-int:/gluster-export
Brick2: gluster-ger-ber-12-int:/gluster-export
Brick3: gluster-ger-ber-09-int:/gluster-export
Brick4: gluster-ger-ber-10-int:/gluster-export
Brick5: gluster-ger-ber-07-int:/gluster-export
Brick6: gluster-ger-ber-08-int:/gluster-export
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
cluster.min-free-disk: 200GB
geo-replication.indexing: on
auth.allow: 10.0.1.*,188.138.82.*,188.138.123.*,82.193.249.198,82.193.249.200,31.7.178.137,31.7.178.135,31.7.180.109,31.7.180.98,82.199.147.*,104.155.22.202,104.155.30.201,104.155.5.117,104.155.11.253,104.155.15.34,104.155.25.145,146.148.120.255,31.7.180.148
nfs.disable: off
performance.cache-refresh-timeout: 2
performance.io-thread-count: 32
performance.cache-size: 1024MB
performance.read-ahead: on
performance.cache-min-file-size: 0
network.ping-timeout: 10
[ 09:32:52 ] - root@gluster-ger-ber-10 /var/log $

[ 19:10:55 ] - root@gluster-ger-ber-10 /var/log $ gluster volume status
Status of volume: ger-ber-01
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------
Brick gluster-ger-ber-11-int:/gluster-export    49152   Y       15994
Brick gluster-ger-ber-12-int:/gluster-export    N/A     N       N/A
Brick gluster-ger-ber-09-int:/gluster-export    49152   Y       10965
Brick gluster-ger-ber-10-int:/gluster-export    N/A     N       N/A
Brick gluster-ger-ber-07-int:/gluster-export    49152   Y       18542
Brick gluster-ger-ber-08-int:/gluster-export    49152   Y       20275
NFS Server on localhost                         2049    Y       13658
Self-heal Daemon on localhost                   N/A     Y       13666
NFS Server on gluster-ger-ber-09-int            2049    Y       13503
Self-heal Daemon on gluster-ger-ber-09-int      N/A     Y       13511
NFS Server on gluster-ger-ber-07-int            2049    Y       21526
Self-heal Daemon on gluster-ger-ber-07-int      N/A     Y       21534
NFS Server on gluster-ger-ber-08-int            2049    Y       24004
Self-heal Daemon on gluster-ger-ber-08-int      N/A     Y       24011
NFS Server on gluster-ger-ber-11-int            2049    Y       18944
Self-heal Daemon on gluster-ger-ber-11-int      N/A     Y       18952
NFS Server on gluster-ger-ber-12-int            2049    Y       19138
Self-heal Daemon on gluster-ger-ber-12-int      N/A     Y       19146

Task Status of Volume ger-ber-01
------------------------------------------------------------------
There are no active volume tasks

- root@gluster-ger-ber-10 /var/log $
- root@gluster-ger-ber-10 /var/log $ dmesg -T
...
[Wed Dec 2 12:43:47 2015] XFS (sdc1): xfs_log_force: error 5 returned.
[Wed Dec 2 12:43:48 2015] XFS (sdc1): xfs_log_force: error 5 returned.
[Wed Dec 2 12:45:58 2015] XFS (sdc1): Mounting Filesystem
[Wed Dec 2 12:45
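The `gluster volume status` output above is how the failure shows up cluster-wide: the two bricks on the affected nodes report `N/A / N`. A small sketch for listing offline bricks from that output (the inlined rows are the ones from this thread; in practice you would pipe `gluster volume status` into the awk):

```shell
# Print the brick paths whose Online column is "N".
# Columns: "Brick" <host:path> <port> <online> <pid>
offline=$(awk '$1 == "Brick" && $4 == "N" { print $2 }' <<'EOF'
Brick gluster-ger-ber-11-int:/gluster-export    49152   Y       15994
Brick gluster-ger-ber-12-int:/gluster-export    N/A     N       N/A
Brick gluster-ger-ber-09-int:/gluster-export    49152   Y       10965
Brick gluster-ger-ber-10-int:/gluster-export    N/A     N       N/A
EOF
)
echo "$offline"
```

As Dietmar notes, the two offline bricks land in different replica pairs, which is why the volume stays available despite both failures.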