Re: [Gluster-users] NFS not start on localhost
On 10/19/2014 06:56 PM, Niels de Vos wrote: On Sat, Oct 18, 2014 at 01:24:12PM +0200, Demeter Tibor wrote:

Hi,

[root@node0 ~]# tail -n 20 /var/log/glusterfs/nfs.log
[2014-10-18 07:41:06.136035] E [graph.c:307:glusterfs_graph_init] 0-nfs-server: initializing translator failed
[2014-10-18 07:41:06.136040] E [graph.c:502:glusterfs_graph_activate] 0-graph: init failed
pending frames:
frame : type(0) op(0)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-10-18 07:41:06
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.2

This definitely is a gluster/nfs issue. For whatever reason, the gluster/nfs server crashes :-/ The log does not show enough details; some more lines before this are needed.

I wonder if the crash is due to a cleanup after the translator initialization failure. The complete logs might help in understanding why the initialization failed.

-Vijay

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors
On 10/19/2014 06:05 PM, Anirban Ghoshal wrote: I see. Thanks a tonne for the thorough explanation! :) I can see that our setup would be vulnerable here, because the logger on one server is not generally aware of the state of the replica on the other server. So it is possible that the log files may have been renamed before heal had a chance to kick in. Could I also request the bug ID (should there be one) against which you are coding up the fix, so that we could get a notification once it is passed?

This bug was reported by Red Hat QE and the bug is cloned upstream. I copied the relevant content so you would understand the context: https://bugzilla.redhat.com/show_bug.cgi?id=1154491

Pranith

Also, as an aside, is O_DIRECT supposed to prevent this from occurring, if one were to make allowance for the performance hit?

Unfortunately no :-(. As far as I understand, that was the only work-around.

Pranith

Thanks again,
Anirban

*From:* Pranith Kumar Karampuri
*To:* Anirban Ghoshal
*Subject:* Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors
*Sent:* Sun, Oct 19, 2014 9:01:58 AM

On 10/19/2014 01:36 PM, Anirban Ghoshal wrote: It is possible, yes, because these are actually a kind of log files. I suppose, like other logging frameworks, these files can remain open for a considerable period and then get renamed to support log-rotate semantics. That said, I might need to check with the team that actually manages the logging framework to be sure; I only take care of the file-system stuff. I can tell you for sure on Monday. If it is the same race that you mention, is there a fix for it?

Thanks,
Anirban

I am working on the fix.
RCA:
0) Let's say the file 'abc.log' is opened for writing on replica pair (brick-0, brick-1).
1) brick-0 goes down.
2) abc.log is renamed to abc.log.1.
3) brick-0 comes back up.
4) A re-open of the old abc.log happens from the mount to brick-0.
5) Self-heal kicks in, deletes the old abc.log, and creates and syncs abc.log.1.
6) But the mount is still writing to the deleted 'old abc.log' on brick-0, so abc.log.1 on brick-0 remains at the same size while abc.log.1 keeps increasing on brick-1. This leads to a size-mismatch split-brain on abc.log.1.

The race happens between steps 4) and 5). If 5) happens before 4), no split-brain will be observed.

Work-around:
0) Take a backup of the good abc.log.1 file from brick-1. (Just being paranoid.)
Do either of the following to make sure the stale file that is open gets closed:
1-a) Take the brick process with the bad file down using kill -9 (in my example, brick-0).
1-b) Introduce a temporary disconnect between the mount and brick-0.
(I would choose 1-a.)
2) Remove the bad file (abc.log.1) and its gfid-backend-file from brick-0.
3) Bring the brick back up (gluster volume start force) / restore the connection, and let it heal by doing a 'stat' on the file abc.log.1 from the mount.

This bug has existed since 2012, from the first time I implemented rename/hard-link self-heal. It is difficult to re-create; I have to put break-points at several places in the process to hit the race.

Pranith

Thanks,
Anirban

*From:* Pranith Kumar Karampuri
*To:* Anirban Ghoshal
*Subject:* Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors
*Sent:* Sun, Oct 19, 2014 5:42:24 AM

On 10/18/2014 04:36 PM, Anirban Ghoshal wrote: Hi, Yes, they do, and considerably. I'd forgotten to mention that in my last email. Their mtimes, however, as far as I could tell on the separate servers, seemed to coincide.

Thanks,
Anirban

Are these files always open? And is it possible that the file could have been renamed when one of the bricks was offline?
I know of a race which can introduce this one. Just trying to find out if it is the same case.

Pranith

*From:* Pranith Kumar Karampuri
*To:* Anirban Ghoshal ; gluster-users@gluster.org
*Subject:* Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors
*Sent:* Sat, Oct 18, 2014 12:26:08 AM

hi,
Could you see if the size of the file mismatches?
Pranith

On 10/18/2014 04:20 AM, Anirban Ghoshal wrote: Hi everyone, I have this really confusing split-brain here that's bothering me. I am running glusterfs 3.4.2 over linux 2.6.34. I have a replica 2 volume 'testvol'. It seems I cannot read/stat/edit the file in question, and `gluster volume heal testvol info split-brain` shows nothing. Here are the logs from the fuse-mount for the volume:

[2014-09-29 07:53:02.867111] W [fuse-bridge.c:1172:fuse_err_cbk] 0-glusterfs-fuse: 4560969: FLUSH() ERR => -1 (Input/output error)
[2014-09-29 07:54:16.007799] W [page.c:991:__ioc_page_error] 0-testvol-io-cache: page error for pag
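The rename race described in this thread hinges on POSIX semantics: a writer holding an open file descriptor keeps writing to the same inode even after the file is renamed (or unlinked), which is exactly why the mount keeps feeding the stale abc.log on brick-0 after log rotation. A minimal standalone sketch of that behavior (file names mirror the example above; run in a scratch directory):

```shell
# Writes through an already-open fd follow the inode, not the name.
cd "$(mktemp -d)"

exec 3>abc.log           # the "logger" opens abc.log for writing
echo "line 1" >&3

mv abc.log abc.log.1     # log rotation renames the file...
echo "line 2" >&3        # ...but the open fd still appends to the same inode

exec 3>&-                # close the descriptor
cat abc.log.1            # both lines ended up in the renamed file;
                         # no new abc.log was recreated
```

On a healthy replica both bricks see the same rename, so this is harmless; the split-brain arises only when one brick re-opens the old name while self-heal replays the rename, as laid out in the RCA.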
Re: [Gluster-users] NFS not start on localhost
On Sat, Oct 18, 2014 at 01:24:12PM +0200, Demeter Tibor wrote:
> Hi,
>
> [root@node0 ~]# tail -n 20 /var/log/glusterfs/nfs.log
> [2014-10-18 07:41:06.136035] E [graph.c:307:glusterfs_graph_init]
> 0-nfs-server: initializing translator failed
> [2014-10-18 07:41:06.136040] E [graph.c:502:glusterfs_graph_activate]
> 0-graph: init failed
> pending frames:
> frame : type(0) op(0)
>
> patchset: git://git.gluster.com/glusterfs.git
> signal received: 11
> time of crash: 2014-10-18 07:41:06
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.5.2

This definitely is a gluster/nfs issue. For whatever reason, the gluster/nfs server crashes :-/ The log does not show enough details; some more lines before this are needed.

There might be an issue where the NFS RPC-services can not register. I think I have seen similar crashes before, but never found the cause. You should check with the 'rpcinfo' command to see if there are any NFS RPC-services registered (nfs, lockd, mount, lockmgr). If there are any, verify that there are no other NFS processes running; this includes NFS-mounts in /etc/fstab and similar.

Could you file a bug and attach the full (gzipped) nfs.log? Try to explain as many details of the setup as you can, and add a link to the archives of this thread. Please post the URL of the bug in a response to this thread. A crashing process is never good, even when it could be caused by external processes.

Link to file a bug:
- https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS&component=nfs&version=3.5.2

Thanks, Niels

> Regards,
>
> Demeter Tibor
>
> Email: tdemeter@itsmart.hu
> Skype: candyman_78
> Phone: +36 30 462 0500
> Web: www.itsmart.hu
>
> IT SMART KFT.
> 2120 Dunakeszi Wass Albert utca 2. I. em 9.
> Telefon: +36 30 462-0500 Fax: +36 27 637-486
>
> ----- Original message -----
>
> > Maybe share the last 15-20 lines of your /var/log/glusterfs/nfs.log for the
> > consideration of everyone on the list? Thanks.
> > > From: Demeter Tibor
> > > To: Anirban Ghoshal
> > > Cc: gluster-users
> > > Subject: Re: [Gluster-users] NFS not start on localhost
> > > Sent: Sat, Oct 18, 2014 10:36:36 AM
> >
> > Hi,
> >
> > I've tried out these things:
> >
> > - nfs.disable on-off
> > - iptables disable
> > - volume stop-start
> >
> > but same. So, when I make a new volume, everything is fine.
> > After reboot the NFS won't listen on localhost (only on the server that has brick0).
> >
> > Centos7 with latest ovirt
> >
> > Regards,
> > Tibor
> >
> > ----- Original message -----
> >
> > > It happens with me sometimes. Try `tail -n 20 /var/log/glusterfs/nfs.log`.
> > > You will probably find something out that will help your cause. In general,
> > > if you just wish to start the thing up without going into the why of it, try
> > > `gluster volume set engine nfs.disable on` followed by `gluster volume set
> > > engine nfs.disable off`. It does the trick quite often for me because it is
> > > a polite way to ask mgmt/glusterd to try and respawn the NFS server process
> > > if need be. But keep in mind that this will cause an (albeit small) service
> > > interruption to all clients accessing volume engine over NFS.
> > >
> > > Thanks,
> > > Anirban
> > >
> > > On Saturday, 18 October 2014 1:03 AM, Demeter Tibor
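The two diagnostics suggested in this thread can be sketched as a short check script. The `rpcinfo` scan looks for NFS-related RPC registrations that could conflict with gluster/nfs; the `gluster volume set` toggle (shown as comments, since it needs a live glusterd) is Anirban's trick for respawning the NFS server process. The volume name 'engine' comes from the thread; the service-name pattern is an assumption based on typical rpcbind output.

```shell
# Look for NFS-related RPC services already registered with rpcbind.
# If any show up while gluster/nfs is down, another NFS implementation
# (kernel nfsd, an /etc/fstab NFS mount) may be holding the registration.
REG=$(rpcinfo -p 2>/dev/null | grep -E 'nfs|mountd|nlockmgr|status' || true)
if [ -n "$REG" ]; then
    echo "NFS RPC services registered:"
    echo "$REG"
else
    echo "no NFS RPC services registered (or rpcbind not running)"
fi

# Politely ask mgmt/glusterd to respawn its NFS server (brief interruption
# for NFS clients of the volume):
# gluster volume set engine nfs.disable on
# gluster volume set engine nfs.disable off
```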
Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors
On 10/19/2014 01:36 PM, Anirban Ghoshal wrote: It is possible, yes, because these are actually a kind of log files. I suppose, like other logging frameworks, these files can remain open for a considerable period and then get renamed to support log-rotate semantics. That said, I might need to check with the team that actually manages the logging framework to be sure; I only take care of the file-system stuff. I can tell you for sure on Monday. If it is the same race that you mention, is there a fix for it?

Thanks,
Anirban

I am working on the fix.

RCA:
0) Let's say the file 'abc.log' is opened for writing on replica pair (brick-0, brick-1).
1) brick-0 goes down.
2) abc.log is renamed to abc.log.1.
3) brick-0 comes back up.
4) A re-open of the old abc.log happens from the mount to brick-0.
5) Self-heal kicks in, deletes the old abc.log, and creates and syncs abc.log.1.
6) But the mount is still writing to the deleted 'old abc.log' on brick-0, so abc.log.1 on brick-0 remains at the same size while abc.log.1 keeps increasing on brick-1. This leads to a size-mismatch split-brain on abc.log.1.

The race happens between steps 4) and 5). If 5) happens before 4), no split-brain will be observed.

Work-around:
0) Take a backup of the good abc.log.1 file from brick-1. (Just being paranoid.)
Do either of the following to make sure the stale file that is open gets closed:
1-a) Take the brick process with the bad file down using kill -9 (in my example, brick-0).
1-b) Introduce a temporary disconnect between the mount and brick-0.
(I would choose 1-a.)
2) Remove the bad file (abc.log.1) and its gfid-backend-file from brick-0.
3) Bring the brick back up (gluster volume start force) / restore the connection, and let it heal by doing a 'stat' on the file abc.log.1 from the mount.

This bug has existed since 2012, from the first time I implemented rename/hard-link self-heal. It is difficult to re-create; I have to put break-points at several places in the process to hit the race.
Pranith

Thanks,
Anirban

*From:* Pranith Kumar Karampuri
*To:* Anirban Ghoshal
*Subject:* Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors
*Sent:* Sun, Oct 19, 2014 5:42:24 AM

On 10/18/2014 04:36 PM, Anirban Ghoshal wrote: Hi, Yes, they do, and considerably. I'd forgotten to mention that in my last email. Their mtimes, however, as far as I could tell on the separate servers, seemed to coincide.

Thanks,
Anirban

Are these files always open? And is it possible that the file could have been renamed when one of the bricks was offline? I know of a race which can introduce this one. Just trying to find out if it is the same case.

Pranith

*From:* Pranith Kumar Karampuri
*To:* Anirban Ghoshal ; gluster-users@gluster.org
*Subject:* Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors
*Sent:* Sat, Oct 18, 2014 12:26:08 AM

hi,
Could you see if the size of the file mismatches?
Pranith

On 10/18/2014 04:20 AM, Anirban Ghoshal wrote: Hi everyone, I have this really confusing split-brain here that's bothering me. I am running glusterfs 3.4.2 over linux 2.6.34. I have a replica 2 volume 'testvol'. It seems I cannot read/stat/edit the file in question, and `gluster volume heal testvol info split-brain` shows nothing.
Here are the logs from the fuse-mount for the volume:

[2014-09-29 07:53:02.867111] W [fuse-bridge.c:1172:fuse_err_cbk] 0-glusterfs-fuse: 4560969: FLUSH() ERR => -1 (Input/output error)
[2014-09-29 07:54:16.007799] W [page.c:991:__ioc_page_error] 0-testvol-io-cache: page error for page = 0x7fd5c8529d20 & waitq = 0x7fd5c8067d40
[2014-09-29 07:54:16.007854] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 4561103: READ => -1 (Input/output error)
[2014-09-29 07:54:16.008018] W [page.c:991:__ioc_page_error] 0-testvol-io-cache: page error for page = 0x7fd5c8607ee0 & waitq = 0x7fd5c8067d40
[2014-09-29 07:54:16.008056] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 4561104: READ => -1 (Input/output error)
[2014-09-29 07:54:16.008233] W [page.c:991:__ioc_page_error] 0-testvol-io-cache: page error for page = 0x7fd5c8066f30 & waitq = 0x7fd5c8067d40
[2014-09-29 07:54:16.008269] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 4561105: READ => -1 (Input/output error)
[2014-09-29 07:54:16.008800] W [page.c:991:__ioc_page_error] 0-testvol-io-cache: page error for page = 0x7fd5c860bcf0 & waitq = 0x7fd5c863b1f0
[2014-09-29 07:54:16.008839] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 4561107: READ => -1 (Input/output error)
[2014-09-29 07:54:16.009365] W [page.c:991:__ioc_page_error] 0-testvol-io-cache: page error for page = 0x7fd5c85fd120 & waitq = 0x7fd5c8067d40
[2014-09-29 07:54:16.009413] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 4561109: READ => -1 (In
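Step 2) of the work-around above removes the bad file *and* its "gfid-backend-file". On disk, that backend file is the hard link GlusterFS keeps under the brick's .glusterfs directory, addressed by the file's gfid as .glusterfs/&lt;first two hex chars&gt;/&lt;next two&gt;/&lt;full gfid&gt;. A small helper to compute that path; the brick path and gfid value below are illustrative, not from the thread:

```shell
# Compute the .glusterfs backend path for a given gfid on a given brick.
# Layout assumption: <brick>/.glusterfs/<gfid[0:2]>/<gfid[2:4]>/<gfid>
gfid_backend_path() {
    brick=$1
    gfid=$2
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "${gfid:0:2}" "${gfid:2:2}" "$gfid"
}

gfid_backend_path /bricks/brick0 8f3b1a4e-0000-4c9d-9d2a-000000000000
# -> /bricks/brick0/.glusterfs/8f/3b/8f3b1a4e-0000-4c9d-9d2a-000000000000
```

On a live brick, the gfid itself can be read from the file's extended attributes (e.g. `getfattr -n trusted.gfid -e hex <file>`) and reformatted into the dashed UUID form before feeding it to the helper.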