Hi All, 

On EC volumes, we have been seeing an interesting bug caused by a subtle race
between rmdir and inodelk, which leads to an EIO error.
Pranith, Xavi and I discussed this and have some possible solutions.
We would like your input on the bug and on the proposed solutions.

1 - Consider rmdir on /a/b and chown on /a/b from 2 different clients/processes.
rmdir /a/b takes a lock on "a" and deletes "b".
However, chown /a/b takes a lock on "b" to do the setattr fop. Now, on a
(4+2) EC volume, inodelk might get ENOENT from 3 bricks (if rmdir /a/b has
already succeeded on these 3 bricks) and might get the lock on the
remaining 3 bricks.

As an operation must succeed on at least 4 bricks, chown fails with EIO.
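
To make the arithmetic concrete, here is a toy Python model of the case
above. It is only a sketch, not the EC translator code: the names
combine_answers and QUORUM are hypothetical, and the rule "success on at
least 4 bricks, otherwise EIO" is a simplification of the behaviour
described in this mail.

```python
import errno

QUORUM = 4  # the "4" in a (4+2) volume: bricks on which the fop must succeed


def combine_answers(answers, quorum=QUORUM):
    """Toy model: `answers` holds one entry per brick, 0 for success or an
    errno value for failure. Returns 0 if enough bricks succeeded, else EIO."""
    successes = sum(1 for a in answers if a == 0)
    return 0 if successes >= quorum else errno.EIO


# rmdir /a/b has already removed "b" on 3 bricks, so the inodelk issued for
# chown fails there with ENOENT and is granted on the other 3 bricks.
split_answers = [errno.ENOENT] * 3 + [0] * 3
result = combine_answers(split_answers)  # only 3 successes, below quorum
```

With a 3/3 split neither side reaches 4 bricks, so the model reports EIO
even though every failing brick gave the same, more meaningful error.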

This can be solved on the EC side while processing callbacks: based on the
errors received, we can decide which error should be passed on. In the above
case, sending ENOENT could be safer.
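
One way the callback-side decision could look, again as a toy Python model
rather than actual EC code. The function name pick_result and the exact
policy (pass ENOENT on only when every failing brick reported ENOENT, and
fall back to EIO otherwise) are assumptions made for illustration; the
real rule is still open for discussion.

```python
import errno

QUORUM = 4  # bricks on which the fop must succeed on a (4+2) volume


def pick_result(answers, quorum=QUORUM):
    """Toy model: `answers` holds one entry per brick, 0 for success or an
    errno value for failure. Returns the value to pass on to the caller."""
    successes = sum(1 for a in answers if a == 0)
    if successes >= quorum:
        return 0
    failures = {a for a in answers if a != 0}
    # Assumed policy: if all failing bricks agree that the entry is gone,
    # report ENOENT instead of the generic EIO.
    if failures == {errno.ENOENT}:
        return errno.ENOENT
    return errno.EIO


race_result = pick_result([errno.ENOENT] * 3 + [0] * 3)
mixed_result = pick_result([errno.ENOENT, errno.ENOTCONN, errno.ENOENT, 0, 0, 0])
```

In this sketch the 3/3 split from case 1 yields ENOENT, while a mix of
different errors still yields EIO.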

2 - rmdir /a/b and rmdir /a/b/c come from 2 different clients/processes.
Now, suppose "c" has already been deleted by some other process, so
rmdir /a/b succeeds.
At this point, it is possible that /a/b has been deleted and the inode for "b"
has been purged on 3 bricks. At this time the inodelk on "b" arrives for
rmdir /a/b/c. It fails on 3 bricks and gets the lock on the remaining 3.
In this case again, we get EIO.

To solve this, it was suggested to take a lock on the parent as well as on
the entry to be deleted. So in the above case, when we do rmdir /a/b/c we
will take locks on both "b" and "c". For rmdir /a/b we will take locks on
"a" and "b". This will certainly impact performance, but at the moment it
looks like a feasible solution.
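
The two-lock idea can be sketched with a single-node toy model in Python.
This is not the EC locking code: the in-memory tree, the per-path locks
table and the rmdir helper are all hypothetical, and taking the parent
lock before the entry lock is an assumed ordering. The point is only that
rmdir /a/b and rmdir /a/b/c now both need the lock on "b", so they are
serialized against each other.

```python
import errno
import threading

# Hypothetical in-memory namespace: directory path -> set of child names.
tree = {"/a": {"b"}, "/a/b": {"c"}, "/a/b/c": set()}
# One lock per directory, standing in for an inodelk on that inode.
locks = {path: threading.Lock() for path in tree}


def rmdir(parent, name):
    """Remove parent/name while holding locks on both parent and entry."""
    child = f"{parent}/{name}"
    with locks[parent]:        # lock on the parent, as rmdir does today
        with locks[child]:     # proposed extra lock on the entry itself
            if child not in tree:
                return errno.ENOENT
            if tree[child]:
                return errno.ENOTEMPTY
            tree[parent].discard(name)
            del tree[child]
            return 0


first = rmdir("/a", "b")     # "b" still contains "c"
second = rmdir("/a/b", "c")  # locks "b" and "c"
third = rmdir("/a", "b")     # locks "a" and "b"; "b" is empty now
fourth = rmdir("/a/b", "c")  # "c" is already gone
```

Because both operations contend on "b", the inodelk for rmdir /a/b/c is
either granted before "b" is removed or fails consistently afterwards,
rather than succeeding on some bricks and failing on others.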

---- 
Ashish 




_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
