Re: [Gluster-devel] NetBSD's read-subvol-entry.t spurious failures explained

2015-03-09 Thread Emmanuel Dreyfus
On Fri, Mar 06, 2015 at 05:55:34PM +0530, Ravishankar N wrote:
> After bringing brick0 up, and performing ls abc/def, does afr_do_readdir()
> get called for def?
> If it does, then AFR will send lookup to both bricks via
> afr_inode_refresh()

I think I tracked down the real problem. I now understand there is some
kind of race condition where we can enter afr_replies_interpret() with 
inode's ia_gfid and ia_type unset (that is, set to IA_INVAL). As a result, 
this code in afr_replies_interpret() will fail to go the
AFR_ENTRY_TRANSACTION route: 

afr_accused_fill (this, replies[i].xdata, data_accused,
                  (inode->ia_type == IA_IFDIR) ?
                   AFR_ENTRY_TRANSACTION : AFR_DATA_TRANSACTION);

The accused array does not take into account the entry bits in 
trusted.afr.patchy-client-0 from brick1, and brick0 is considered clean.

Using replies[i].poststat.ia_type instead of inode->ia_type fixes
the problem. Here is the patch for that. Please review:
http://review.gluster.org/9831/
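
To make the failure mode concrete, here is a minimal sketch of the
selection logic. It is not the GlusterFS source: the enums are simplified
stand-ins, accuses() is a hypothetical helper, and the layout of the
pending counters (data, metadata, entry, indexed by transaction type) is
an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the GlusterFS enums (illustration only). */
typedef enum { IA_INVAL = 0, IA_IFREG, IA_IFDIR } ia_type_t;
typedef enum {
    AFR_DATA_TRANSACTION = 0,
    AFR_METADATA_TRANSACTION,
    AFR_ENTRY_TRANSACTION
} txn_type_t;

/* Same ternary as in the afr_replies_interpret() call quoted above. */
static txn_type_t txn_type_for(ia_type_t type)
{
    return (type == IA_IFDIR) ? AFR_ENTRY_TRANSACTION
                              : AFR_DATA_TRANSACTION;
}

/*
 * Hypothetical helper: returns 1 if the pending counters blame the peer
 * for the transaction type implied by `type`.
 * Assumed layout: pending[0]=data, pending[1]=metadata, pending[2]=entry.
 */
static int accuses(const uint32_t pending[3], ia_type_t type)
{
    assert(pending != NULL);
    return pending[txn_type_for(type)] != 0;
}
```

With only the entry counter set, accuses(pending, IA_INVAL) returns 0
because the data counter is inspected, while accuses(pending, IA_IFDIR)
returns 1. That is why reading the type from the reply's poststat
instead of the not-yet-filled inode makes the accusation visible.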

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NetBSD's read-subvol-entry.t spurious failures explained

2015-03-06 Thread Emmanuel Dreyfus
On Fri, Mar 06, 2015 at 05:55:34PM +0530, Ravishankar N wrote:
> > On NetBSD I can see that AFR never gets trusted.afr.patchy-client-0
> > and always thinks brick0 is fine. AFR randomly picks brick0 or brick1
> > to list directory content, and when it picks brick0 the test fails.
> After bringing brick0 up, and performing ls abc/def, does afr_do_readdir()
> get called for def?

Yes, it is. 

> If it does, then AFR will send lookup to both bricks via
> afr_inode_refresh() ,

How is it supposed to happen? I can see I do not get into 
afr_inode_refresh_do() after visiting afr_do_readdir().

-- 
Emmanuel Dreyfus
m...@netbsd.org


[Gluster-devel] NetBSD's read-subvol-entry.t spurious failures explained

2015-03-06 Thread Emmanuel Dreyfus
Hi

I tracked down the spurious failures of read-subvol-entry.t on NetBSD.

Here is what should happen: we have a volume with brick0 and brick1.
We disable self-heal, kill brick0, create a file in a directory, 
restart brick0, and we list directory content to check we find the file.

The tested mechanism is that in brick1, trusted.afr.patchy-client-0
accuses brick0 of being outdated, hence AFR should rule out brick0 
for listing directory content, and it should use brick1 which contains
the file we look for.
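
As a rough sketch of that mechanism (not the actual AFR code): once the
xattrs have been turned into an "accused" array, only bricks that nobody
accuses are candidates for reads. The function name readable_bricks()
and the array representation are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: accused[i] != 0 means another brick's
 * trusted.afr.* xattr blames brick i. The indices of the bricks that
 * are still eligible for reads are written to out[] (which must hold
 * at least n entries) and their count is returned.
 */
static size_t readable_bricks(const unsigned char *accused, size_t n,
                              int *out)
{
    size_t count = 0;

    assert(accused != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!accused[i])
            out[count++] = (int)i;
    }
    return count;
}
```

With accused = {1, 0} only brick1 remains, which is what the test
expects. With accused = {0, 0}, which is what the client sees if the
xattr never reaches it, both bricks remain and either may be picked.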

On NetBSD I can see that AFR never gets trusted.afr.patchy-client-0
and always thinks brick0 is fine. AFR randomly picks brick0 or brick1
to list directory content, and when it picks brick0 the test fails.

The reason why trusted.afr.patchy-client-0 is not there is that the
node is cached in kernel FUSE from an earlier lookup. The TTL obtained
at that time tells the kernel this node is still valid, hence the
kernel does not send the new lookup to GlusterFS. Since GlusterFS uses
lookups to refresh the client view of xattrs, it sticks with the older
value where brick0 was not yet outdated, and trusted.afr.patchy-client-0 is 
unset.
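
The caching rule I am describing can be sketched as follows. This is
not NetBSD kernel code: struct cached_entry and needs_lookup() are
hypothetical names, and the only point is that no lookup is sent to the
filesystem while the TTL has not expired.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical cached name entry, as the kernel side might keep it. */
struct cached_entry {
    time_t fetched_at; /* when the lookup reply was received */
    time_t ttl;        /* validity, in seconds, given by the filesystem */
};

/* A new lookup is sent to the filesystem only once the TTL has expired. */
static bool needs_lookup(const struct cached_entry *e, time_t now)
{
    assert(e != NULL);
    return now >= e->fetched_at + e->ttl;
}
```

In the failing runs the ls happens while needs_lookup() would still be
false, so GlusterFS never gets the chance to refresh its view of the
xattr.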

Questions:

1) Is NetBSD behavior wrong here? It got a TTL for a node, I understand
it should not send lookups to the filesystem until the TTL is expired.

2) How to fix it? If NetBSD behavior is correct, then I guess the test
only succeeds on Linux by chance and we only need to fix the test.
The change below flushes the kernel cache before looking for the file:

--- a/tests/basic/afr/read-subvol-entry.t
+++ b/tests/basic/afr/read-subvol-entry.t
@@ -26,6 +26,7 @@ TEST kill_brick $V0 $H0 $B0/brick0
 
 TEST touch $M0/abc/def/ghi
 TEST $CLI volume start $V0 force
+( cd $M0 && umount $M0 )
 EXPECT_WITHIN $PROCESS_UP_TIMEOUT ghi echo `ls $M0/abc/def/`
 
 #Cleanup



-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] NetBSD's read-subvol-entry.t spurious failures explained

2015-03-06 Thread Emmanuel Dreyfus
Ravishankar N ravishan...@redhat.com wrote:

> But since in the test case, we are doing a 'volume start force', this
> code path doesn't seem to be hit and looks like we are calling 
> local->readfn() from afr_read_txn(). But read_subvol still is 1 (i.e. the
> 2nd brick). Is that the case for you too? i.e. does afr_readdir_wind()
> get called for subvol=1?

When the test fails, afr_readdir_wind() is always called with
subvol=0. When it succeeds, it is called with subvol=1.


-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org