Re: [Gluster-users] Can't delete or add files when a node fails.

Craig Carl Fri, 15 Oct 2010 16:28:02 -0700

Brian - 
I was able to reproduce this behavior here. I talked to engineering and I have 
an explanation for you. Because you are replicating, all of your file 
operations via the Gluster mount point will still be successfully committed. 
There is no POSIX-compliant way for Gluster to send your application an error 
message and a commit for the write at the same time. As long as either node of 
a replicate pair is running, you won't see errors accessing your data from the 
Gluster mount point. On the back end you will see the behavior you are seeing 
because Gluster needs to know what changes to replicate to the failed storage 
server when it comes back up. 
Please let me know if this answers your questions.
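
To make the explanation above concrete, here is a toy model in Python. It is 
only a sketch of the idea and not Gluster's actual implementation: the names 
Brick, ReplicaPair, pending and heal are invented for illustration. An 
operation succeeds as long as one brick of the pair accepts it, and whatever 
the failed brick missed is remembered so it can be replayed later.

```python
import errno


class Brick:
    """Toy stand-in for one storage server's export directory (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.files = {}
        self.writable = True  # set to False to mimic a read-only remount

    def write(self, path, data):
        if not self.writable:
            raise OSError(errno.EROFS, "Read-only file system", path)
        self.files[path] = data

    def delete(self, path):
        if not self.writable:
            raise OSError(errno.EROFS, "Read-only file system", path)
        self.files.pop(path, None)


class ReplicaPair:
    """Toy replicate pair: commit if any brick accepts, log what the other missed."""

    def __init__(self, first, second):
        self.bricks = [first, second]
        self.pending = {first.name: [], second.name: []}

    def _apply(self, op, *args):
        accepted = 0
        for brick in self.bricks:
            try:
                getattr(brick, op)(*args)
                accepted += 1
            except OSError:
                # Remember the change so it can be replayed on this brick later.
                self.pending[brick.name].append((op, args))
        if accepted == 0:
            # Only when no brick accepted the operation does the caller see an error.
            raise OSError(errno.EIO, "no replica accepted the operation")

    def write(self, path, data):
        self._apply("write", path, data)

    def delete(self, path):
        self._apply("delete", path)

    def heal(self, brick):
        """Replay the operations a recovered brick missed, in order."""
        ops, self.pending[brick.name] = self.pending[brick.name], []
        for op, args in ops:
            getattr(brick, op)(*args)
```

In this model the client never sees an error while one brick is healthy, yet 
the failed brick keeps deleted files and lacks new ones until heal() is 
called. That matches the symptoms reported on the back end.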




Thanks, 

Craig 

-- 
Craig Carl 
Senior Systems Engineer; Gluster, Inc. 
Cell - ( 408) 829-9953 (California, USA) 
Office - ( 408) 770-1884 
Gtalk - [email protected] 
Twitter - @gluster 
Installing Gluster Storage Platform, the movie! 
http://rackerhacker.com/2010/08/11/one-month-with-glusterfs-in-production/ 



From: "Craig Carl" <[email protected]> 
To: "Brian Hirt" <[email protected]> 
Cc: [email protected] 
Sent: Tuesday, October 12, 2010 5:12:24 PM 
Subject: Re: [Gluster-users] Can't delete or add files when a node fails. 

Brian, 
Give me a few days to reproduce the bug on 3.0.5 and 3.1 and I'll file bug 
reports and get a time estimate from engineering. 



Thanks, 

Craig 




From: "Brian Hirt" <[email protected]> 
To: "Craig Carl" <[email protected]> 
Cc: [email protected] 
Sent: Monday, October 11, 2010 2:34:58 PM 
Subject: Re: [Gluster-users] Can't delete or add files when a node fails. 

How is this expected? The filesystem hasn't disappeared; it has just become 
read-only, and glusterfsd is still running and silently failing when read 
operations are attempted. Gluster opens the files, gets a read-only error 
message back from the kernel, and simply ignores it. This is not expected at 
all, and I have a hard time believing it has anything to do with FUSE. 


The default behavior on most Linux distros when they detect a problem with the 
filesystem is to remount the filesystem read-only. 
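
As an illustration of how a process could notice such a remount, the sketch 
below checks the mount flags reported by statvfs. It uses Python's standard 
os.statvfs and os.ST_RDONLY (Unix only). The function names are made up for 
this example, and nothing here is taken from the Gluster source.

```python
import os


def flags_read_only(f_flag):
    """Return True if the statvfs flag word has the read-only bit set."""
    return bool(f_flag & os.ST_RDONLY)


def is_mounted_read_only(path):
    """Return True if the filesystem containing path is mounted read-only."""
    return flags_read_only(os.statvfs(path).f_flag)
```

A server process could run a check like this against its export directory 
(the path is whatever the brick exports) and report the brick as failed 
instead of ignoring the error.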


--brian 




On Oct 11, 2010, at 3:27 PM, Craig Carl wrote: 




Brian - 
This is to be expected. If the filesystem `disappears` from under Gluster, 
Gluster will need to be restarted in order to reconnect to it. This appears to 
be a FUSE limitation. 



Thanks, 

Craig 




From: "Brian Hirt" < [email protected] > 
To: [email protected] 
Sent: Friday, October 8, 2010 7:01:58 AM 
Subject: [Gluster-users] Can't delete or add files when a node fails. 

I am trying to track down a problem I reported on the list last week and 
discovered a new problem during my testing. 

If you have a four-node setup with replicate/distribute and one of the nodes 
has a filesystem failure, the operating system will typically remount the 
filesystem read-only. When this happens, glusterfsd is still running on the 
failed machine, but it doesn't seem to recognize that there is a problem. If 
you try to create new files from a client and do an ls, you will see that some 
of the files don't appear. Conversely, if you remove files from the client, 
they will still be there along with their content. 
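
One way to see the symptom on the back end is to compare the export 
directories of the two bricks in a replicate pair, which should normally hold 
the same files. The following is a minimal sketch, assuming both export 
directories are reachable from one machine (for example over NFS or after 
copying a listing). The function names are hypothetical.

```python
import os


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def compare_replicas(brick_a, brick_b):
    """Return (files only on brick_a, files only on brick_b), each sorted."""
    a, b = list_files(brick_a), list_files(brick_b)
    return sorted(a - b), sorted(b - a)
```

If brick_b is the one that went read-only, the first list shows files created 
after the failure and the second shows files that were deleted from the client 
but are still present on the failed brick.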

This is trivial to reproduce by remounting the filesystem read-only on one of 
the bricks. If you are on a typical Linux install and the gluster export 
directory is part of the root filesystem, you would only need to run 
'mount -o remount,abort /'. 

Considering that this is a very typical path for failure, I would expect 
gluster to handle this properly. 

Regards, 

Brian Hirt 


_______________________________________________ 
Gluster-users mailing list 
[email protected] 
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users 

