Re: [Gluster-users] NFS not start on localhost

2014-10-19 Thread Vijay Bellur

On 10/19/2014 06:56 PM, Niels de Vos wrote:

On Sat, Oct 18, 2014 at 01:24:12PM +0200, Demeter Tibor wrote:

Hi,

[root@node0 ~]# tail -n 20 /var/log/glusterfs/nfs.log
[2014-10-18 07:41:06.136035] E [graph.c:307:glusterfs_graph_init] 0-nfs-server: 
initializing translator failed
[2014-10-18 07:41:06.136040] E [graph.c:502:glusterfs_graph_activate] 0-graph: 
init failed
pending frames:
frame : type(0) op(0)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-10-18 07:41:06
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.2


This definitely is a gluster/nfs issue. For whatever reason, the
gluster/nfs server crashes :-/ The log does not show enough detail;
some more lines before this error are needed.



I wonder if the crash is due to a cleanup after the translator 
initialization failure. The complete logs might help in understanding 
why the initialization failed.
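(If it helps, a minimal sketch for pulling that extra context, assuming the default nfs.log path shown in the paste above:

grep -B 30 'initializing translator failed' /var/log/glusterfs/nfs.log   # 30 lines of context before the failure
gzip -c /var/log/glusterfs/nfs.log > /tmp/nfs.log.gz                     # full log, compressed, ready to attach to a bug report)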


-Vijay

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors

2014-10-19 Thread Pranith Kumar Karampuri


On 10/19/2014 06:05 PM, Anirban Ghoshal wrote:
I see. Thanks a tonne for the thorough explanation! :) I can see that 
our setup would be vulnerable here because the logger on one server is 
not generally aware of the state of the replica on the other server. 
So, it is possible that the log files may have been renamed before 
heal had a chance to kick in.


Could I also ask you for the bug ID (should there be one) against 
which you are coding up the fix, so that we could get a notification 
once it is passed?


This bug was reported by Red Hat QE and has been cloned upstream. I 
copied the relevant content so you can understand the context:

https://bugzilla.redhat.com/show_bug.cgi?id=1154491

Pranith


Also, as an aside, is O_DIRECT supposed to prevent this from occurring 
if one were to make allowance for the performance hit?



Unfortunately no :-(. As far as I understand, that was the only work-around.

Pranith


Thanks again,
Anirban



From: Pranith Kumar Karampuri
To: Anirban Ghoshal;
Subject: Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors
Sent: Sun, Oct 19, 2014 9:01:58 AM


On 10/19/2014 01:36 PM, Anirban Ghoshal wrote:
It is possible, yes, because these are actually a kind of log file. 
I suppose, like in other logging frameworks, these files can remain open 
for a considerable period and then get renamed to support log-rotate 
semantics.


That said, I might need to check with the team that actually manages 
the logging framework to be sure. I only take care of the file-system 
stuff. I can tell you for sure Monday.


If it is the same race that you mention, is there a fix for it?

Thanks,
Anirban



I am working on the fix.

RCA:
0) Let's say the file 'abc.log' is opened for writing on the replica pair 
(brick-0, brick-1)

1) brick-0 went down
2) abc.log is renamed to abc.log.1
3) brick-0 comes back up
4) re-open on old abc.log happens from mount to brick-0
5) self-heal kicks in and deletes old abc.log and creates and syncs 
abc.log.1
6) But the mount is still writing to the deleted 'old abc.log' on 
brick-0, so abc.log.1 stays the same size there while abc.log.1 
keeps growing on brick-1. This leads to a size-mismatch 
split-brain on abc.log.1.


The race happens between steps 4) and 5). If 5) happens before 4), no 
split-brain will be observed.


Work-around:

0) Take backup of good abc.log.1 file from brick-1. (Just being paranoid)

Do either of the following two steps to make sure the stale open file 
is closed:
1-a) Take down the brick process with the bad file using kill -9 
 (in my example, brick-0).

1-b) Introduce a temporary disconnect between the mount and brick-0.
(I would choose 1-a)
2) Remove the bad file (abc.log.1) and its gfid backend file from brick-0.
3) Bring the brick back up (gluster volume start  
force) / restore the connection, and let it heal by doing a 'stat' on the 
file abc.log.1 on the mount.
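
A rough shell sketch of those work-around steps, for illustration only (the brick paths, mount point, and volume name below are placeholders, not taken from this thread; adapt them to your setup):

# 0) on the server with the good copy, back it up
cp /bricks/testvol/brick1/abc.log.1 /root/abc.log.1.bak

# 1-a) on the server with the bad copy, find the brick PID (last column of
#      'gluster volume status') and kill that brick process
gluster volume status testvol
kill -9 <brick-pid>

# 2) note the file's gfid, then remove the file and its hard-link under .glusterfs
#    (the backend link normally sits at .glusterfs/<aa>/<bb>/<gfid>, where aa and bb
#    are the first two byte pairs of the gfid)
getfattr -n trusted.gfid -e hex /bricks/testvol/brick0/abc.log.1
rm /bricks/testvol/brick0/abc.log.1
rm /bricks/testvol/brick0/.glusterfs/<aa>/<bb>/<gfid>

# 3) restart the brick and trigger the heal from the mount
gluster volume start testvol force
stat /mnt/testvol/abc.log.1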


This bug has existed since 2012, when I first implemented 
rename/hard-link self-heal. It is difficult to re-create; I have to 
put breakpoints at several places in the process to hit the race.


Pranith



Thanks,
Anirban


From: Pranith Kumar Karampuri
To: Anirban Ghoshal;
Subject: Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors
Sent: Sun, Oct 19, 2014 5:42:24 AM


On 10/18/2014 04:36 PM, Anirban Ghoshal wrote:

Hi,

Yes, they do, and considerably. I'd forgotten to mention that in my 
last email. Their mtimes, however, as far as I could tell on the 
separate servers, seemed to coincide.


Thanks,
Anirban




Are these files always open? And is it possible that the file could 
have been renamed while one of the bricks was offline? I know of a 
race which can introduce this. Just trying to find out if it is the 
same case.


Pranith




From: Pranith Kumar Karampuri
To: Anirban Ghoshal; gluster-users@gluster.org
Subject: Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors
Sent: Sat, Oct 18, 2014 12:26:08 AM

Hi,
  Could you check whether the size of the file mismatches on the two bricks?

Pranith
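
(A minimal sketch of such a check, run directly on each brick; the brick paths and file name here are hypothetical placeholders:

ls -l /bricks/testvol/brick0/<file>         # compare the size reported on each server's brick
getfattr -d -m . -e hex /bricks/testvol/brick0/<file>   # the AFR changelog xattrs are also worth a look)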

On 10/18/2014 04:20 AM, Anirban Ghoshal wrote:

Hi everyone,

I have this really confusing split-brain here that's bothering me. 
I am running glusterfs 3.4.2 over Linux 2.6.34. I have a replica 2 
volume 'testvol'. It seems I cannot read/stat/edit the file 
in question, and `gluster volume heal testvol info split-brain` 
shows nothing. Here are the logs from the fuse mount for the volume:


[2014-09-29 07:53:02.867111] W [fuse-bridge.c:1172:fuse_err_cbk] 
0-glusterfs-fuse: 4560969: FLUSH() ERR => -1 (Input/output error)
[2014-09-29 07:54:16.007799] W [page.c:991:__ioc_page_error] 
0-testvol-io-cache: page error for pag

Re: [Gluster-users] NFS not start on localhost

2014-10-19 Thread Niels de Vos
On Sat, Oct 18, 2014 at 01:24:12PM +0200, Demeter Tibor wrote:
> Hi, 
> 
> [root@node0 ~]# tail -n 20 /var/log/glusterfs/nfs.log 
> [2014-10-18 07:41:06.136035] E [graph.c:307:glusterfs_graph_init] 
> 0-nfs-server: initializing translator failed 
> [2014-10-18 07:41:06.136040] E [graph.c:502:glusterfs_graph_activate] 
> 0-graph: init failed 
> pending frames: 
> frame : type(0) op(0) 
> 
> patchset: git://git.gluster.com/glusterfs.git 
> signal received: 11 
> time of crash: 2014-10-18 07:41:06 
> configuration details: 
> argp 1 
> backtrace 1 
> dlfcn 1 
> fdatasync 1 
> libpthread 1 
> llistxattr 1 
> setfsid 1 
> spinlock 1 
> epoll.h 1 
> xattr.h 1 
> st_atim.tv_nsec 1 
> package-string: glusterfs 3.5.2 

This definitely is a gluster/nfs issue. For whatever reason, the
gluster/nfs server crashes :-/ The log does not show enough detail;
some more lines before this error are needed.

There might be an issue where the NFS RPC-services cannot register. I
think I have seen similar crashes before, but never found the cause. You
should check with the 'rpcinfo' command to see if there are any NFS
RPC-services registered (nfs, lockd, mount, lockmgr). If there are any,
verify that no other NFS processes are running; this includes
NFS mounts in /etc/fstab and similar.
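
Something along these lines should show it (a sketch; the CentOS 7 service name is an assumption based on the reporter's distro):

rpcinfo -p localhost | egrep 'nfs|mountd|nlockmgr|status'   # which NFS-related RPC services are registered?
systemctl status nfs-server                                 # is the kernel NFS server running?
grep -i nfs /etc/fstab                                      # any NFS mounts configured at boot?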

Could you file a bug and attach the full (gzipped) nfs.log? Try to explain
as many details of the setup as you can, and add a link to the archives
of this thread. Please post the URL of the bug in a response to this
thread. A crashing process is never good, even when it could be caused
by external processes.

Link to file a bug:
- 
https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS&component=nfs&version=3.5.2

Thanks,
Niels


> 
> Regards: 
> 
> Demeter Tibor 
> 
> Email: tdemeter@itsmart.hu 
> Skype: candyman_78 
> Phone: +36 30 462 0500 
> Web: www.itsmart.hu 
> 
> IT SMART KFT. 
> 2120 Dunakeszi Wass Albert utca 2. I. em 9. 
> Telefon: +36 30 462-0500 Fax: +36 27 637-486 
> 
> [EN] This message and any attachments are confidential and privileged and 
> intended for the use of the addressee only. If you have received this 
> communication in error, please notify the sender by reply e-mail and delete 
> this message from your system. Please note that Internet e-mail guarantees 
> neither the confidentiality nor the proper receipt of the message sent. The 
> data deriving from our correspondence with you are included in a file of 
> ITSMART Ltd whose exclusive purpose is to manage the communications of the 
> company; under the understanding that, in maintaining said correspondence, 
> you authorize the treatment of such data for the mentioned purpose. You are 
> entitled to exercise your rights of access, rectification, cancellation and 
> opposition by addressing such written application to the address above. 
> 
> - Original message -
> 
> > Maybe share the last 15-20 lines of your /var/log/glusterfs/nfs.log for the
> > consideration of everyone on the list? Thanks.
> 
> > From: Demeter Tibor ;
> > To: Anirban Ghoshal ;
> > Cc: gluster-users ;
> > Subject: Re: [Gluster-users] NFS not start on localhost
> > Sent: Sat, Oct 18, 2014 10:36:36 AM
> 
> > 
> > Hi,
> 
> > I've tried out these things:
> 
> > - nfs.disable on-off
> > - iptables disable
> > - volume stop-start
> 
> > but it's the same.
> > So, when I make a new volume, everything is fine.
> > After a reboot, NFS won't listen on localhost (only on the server that has brick0).
> 
> > CentOS 7 with the latest oVirt
> 
> > Regards,
> 
> > Tibor
> 
> > - Original message -
> 
> > > It happens to me sometimes. Try `tail -n 20 /var/log/glusterfs/nfs.log`.
> > > You will probably find something there that will help your cause. In
> > > general, if you just wish to start the thing up without going into the
> > > why of it, try `gluster volume set engine nfs.disable on` followed by
> > > `gluster volume set engine nfs.disable off`. It does the trick quite
> > > often for me because it is a polite way to ask mgmt/glusterd to try and
> > > respawn the NFS server process if need be. But keep in mind that this
> > > will cause an (albeit small) service interruption for all clients
> > > accessing volume engine over NFS.
> > 
> 
> > > Thanks,
> > 
> > > Anirban
> > 
> 
> > > On Saturday, 18 October 2014 1:03 AM, Demeter Tibor

Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors

2014-10-19 Thread Anirban Ghoshal
I see. Thanks a tonne for the thorough explanation! :) I can see that our setup 
would be vulnerable here because the logger on one server is not generally 
aware of the state of the replica on the other server. So, it is possible that 
the log files may have been renamed before heal had a chance to kick in. 

Could I also ask you for the bug ID (should there be one) against which you 
are coding up the fix, so that we could get a notification once it is passed?

Also, as an aside, is O_DIRECT supposed to prevent this from occurring if one 
were to make allowance for the performance hit? 

Thanks again,
Anirban
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors

2014-10-19 Thread Pranith Kumar Karampuri


On 10/19/2014 01:36 PM, Anirban Ghoshal wrote:
It is possible, yes, because these are actually a kind of log file. I 
suppose, like in other logging frameworks, these files can remain open for 
a considerable period and then get renamed to support log-rotate 
semantics.


That said, I might need to check with the team that actually manages 
the logging framework to be sure. I only take care of the file-system 
stuff. I can tell you for sure Monday.


If it is the same race that you mention, is there a fix for it?

Thanks,
Anirban



I am working on the fix.

RCA:
0) Let's say the file 'abc.log' is opened for writing on the replica pair 
(brick-0, brick-1)

1) brick-0 went down
2) abc.log is renamed to abc.log.1
3) brick-0 comes back up
4) re-open on old abc.log happens from mount to brick-0
5) self-heal kicks in and deletes old abc.log and creates and syncs 
abc.log.1
6) But the mount is still writing to the deleted 'old abc.log' on 
brick-0, so abc.log.1 stays the same size there while abc.log.1 
keeps growing on brick-1. This leads to a size-mismatch split-brain on 
abc.log.1.


The race happens between steps 4) and 5). If 5) happens before 4), no 
split-brain will be observed.


Work-around:

0) Take backup of good abc.log.1 file from brick-1. (Just being paranoid)

Do either of the following two steps to make sure the stale open file 
is closed:
1-a) Take down the brick process with the bad file using kill -9  
(in my example, brick-0).

1-b) Introduce a temporary disconnect between the mount and brick-0.
(I would choose 1-a)
2) Remove the bad file (abc.log.1) and its gfid backend file from brick-0.
3) Bring the brick back up (gluster volume start  
force) / restore the connection, and let it heal by doing a 'stat' on the 
file abc.log.1 on the mount.


This bug has existed since 2012, when I first implemented 
rename/hard-link self-heal. It is difficult to re-create; I have to put 
breakpoints at several places in the process to hit the race.


Pranith


Thanks,
Anirban


From: Pranith Kumar Karampuri
To: Anirban Ghoshal;
Subject: Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors
Sent: Sun, Oct 19, 2014 5:42:24 AM


On 10/18/2014 04:36 PM, Anirban Ghoshal wrote:

Hi,

Yes, they do, and considerably. I'd forgotten to mention that in my 
last email. Their mtimes, however, as far as I could tell on the separate 
servers, seemed to coincide.


Thanks,
Anirban




Are these files always open? And is it possible that the file could 
have been renamed while one of the bricks was offline? I know of a race 
which can introduce this. Just trying to find out if it is the same case.


Pranith




From: Pranith Kumar Karampuri
To: Anirban Ghoshal; gluster-users@gluster.org
Subject: Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors
Sent: Sat, Oct 18, 2014 12:26:08 AM

Hi,
  Could you check whether the size of the file mismatches on the two bricks?

Pranith

On 10/18/2014 04:20 AM, Anirban Ghoshal wrote:

Hi everyone,

I have this really confusing split-brain here that's bothering me. I 
am running glusterfs 3.4.2 over Linux 2.6.34. I have a replica 2 
volume 'testvol'. It seems I cannot read/stat/edit the file 
in question, and `gluster volume heal testvol info split-brain` 
shows nothing. Here are the logs from the fuse mount for the volume:


[2014-09-29 07:53:02.867111] W [fuse-bridge.c:1172:fuse_err_cbk] 
0-glusterfs-fuse: 4560969: FLUSH() ERR => -1 (Input/output error)
[2014-09-29 07:54:16.007799] W [page.c:991:__ioc_page_error] 
0-testvol-io-cache: page error for page = 0x7fd5c8529d20 & waitq = 
0x7fd5c8067d40
[2014-09-29 07:54:16.007854] W [fuse-bridge.c:2089:fuse_readv_cbk] 
0-glusterfs-fuse: 4561103: READ => -1 (Input/output error)
[2014-09-29 07:54:16.008018] W [page.c:991:__ioc_page_error] 
0-testvol-io-cache: page error for page = 0x7fd5c8607ee0 & waitq = 
0x7fd5c8067d40
[2014-09-29 07:54:16.008056] W [fuse-bridge.c:2089:fuse_readv_cbk] 
0-glusterfs-fuse: 4561104: READ => -1 (Input/output error)
[2014-09-29 07:54:16.008233] W [page.c:991:__ioc_page_error] 
0-testvol-io-cache: page error for page = 0x7fd5c8066f30 & waitq = 
0x7fd5c8067d40
[2014-09-29 07:54:16.008269] W [fuse-bridge.c:2089:fuse_readv_cbk] 
0-glusterfs-fuse: 4561105: READ => -1 (Input/output error)
[2014-09-29 07:54:16.008800] W [page.c:991:__ioc_page_error] 
0-testvol-io-cache: page error for page = 0x7fd5c860bcf0 & waitq = 
0x7fd5c863b1f0
[2014-09-29 07:54:16.008839] W [fuse-bridge.c:2089:fuse_readv_cbk] 
0-glusterfs-fuse: 4561107: READ => -1 (Input/output error)
[2014-09-29 07:54:16.009365] W [page.c:991:__ioc_page_error] 
0-testvol-io-cache: page error for page = 0x7fd5c85fd120 & waitq = 
0x7fd5c8067d40
[2014-09-29 07:54:16.009413] W [fuse-bridge.c:2089:fuse_readv_cbk] 
0-glusterfs-fuse: 4561109: READ => -1 (In

Re: [Gluster-users] Split-brain seen with [0 0] pending matrix and io-cache page errors

2014-10-19 Thread Anirban Ghoshal
It is possible, yes, because these are actually a kind of log file. I suppose, 
like in other logging frameworks, these files can remain open for a considerable 
period and then get renamed to support log-rotate semantics. 

That said, I might need to check with the team that actually manages the 
logging framework to be sure. I only take care of the file-system stuff. I can 
tell you for sure Monday. 

If it is the same race that you mention, is there a fix for it?

Thanks,
Anirban
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users