Hi Bob,

> -----Original Message-----
> From: Users [mailto:users-boun...@clusterlabs.org] On Behalf Of Bob Peterson
> Sent: 21 October 2019 21:02
> To: Cluster Labs - All topics related to open-source clustering welcomed <users@clusterlabs.org>
> Subject: Re: [ClusterLabs] gfs2: fsid=xxxx:work.3: fatal: filesystem consistency error
>
> ----- Original Message -----
> > Hello List,
> >
> > I got a gfs2 file system consistency error from one user, who is using
> > kernel 4.12.14-95.29-default on SLE12SP4 (x86_64).
> > The error message is as below:
> >
> > 2019-09-26T10:22:10.333792+02:00 node4 kernel: [ 3456.176234] gfs2: fsid=xxxx:work.3: fatal: filesystem consistency error
> > 2019-09-26T10:22:10.333806+02:00 node4 kernel: [ 3456.176234]   inode = 280 342097926
> > 2019-09-26T10:22:10.333807+02:00 node4 kernel: [ 3456.176234]   function = gfs2_dinode_dealloc, file = ../fs/gfs2/super.c, line = 1459
> > 2019-09-26T10:22:10.333808+02:00 node4 kernel: [ 3456.176235] gfs2: fsid=xxxx:work.3: about to withdraw this file system
> >
> > I looked at super.c; the related code is:
> >
> > 1451 static int gfs2_dinode_dealloc(struct gfs2_inode *ip)
> > 1452 {
> > 1453         struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
> > 1454         struct gfs2_rgrpd *rgd;
> > 1455         struct gfs2_holder gh;
> > 1456         int error;
> > 1457
> > 1458         if (gfs2_get_inode_blocks(&ip->i_inode) != 1) {
> > 1459                 gfs2_consist_inode(ip);    <<== here
> > 1460                 return -EIO;
> > 1461         }
> >
> > It looks like upstream may have already fixed this bug? Who can help point
> > out which patches need to be back-ported?
> >
> > Thanks
> > Gang
>
> Hi,
>
> Yes, we have made lots of patches since the 4.12 kernel, some of which may
> be relevant. However, that error often indicates file system corruption.
> (It means the block count for a dinode became corrupt.)
>
> I've been working on a set of problems caused whenever gfs2 replays one of
> its journals during recovery, with a wide variety of symptoms, including
> that one. So it might be one of those. Some of my resulting patches are
> already pushed upstream, but I'm not yet at the point where I can push
> them all.
>
> I recommend doing a fsck.gfs2 on the volume to ensure consistency.
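The customer ran fsck.gfs2 as you suggested. For reference, the repair went
roughly along these lines (a sketch only: the device path and resource name
are illustrative, and it assumes the filesystem is a Pacemaker-managed
resource that has to be stopped on every node before fsck.gfs2 runs):

    # stop the clustered filesystem resource on all nodes (crmsh syntax)
    crm resource stop fs-work
    # check and repair the gfs2 volume, answering yes to every repair prompt
    fsck.gfs2 -y /dev/vg_cluster/lv_work
    # bring the filesystem resource back online
    crm resource start fs-work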
The repair succeeded, but every time the application workload starts
(concurrent writes), the filesystem becomes inaccessible again. That also
makes the stop operation of the application resource fail, which in turn
gets the node fenced. Do you have any suggestions in this case? It looks
like there is a serious bug triggered by concurrent writes under some
stress.

Thanks
Gang

> Regards,
>
> Bob Peterson
>
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/