Re: [Gluster-devel] Spurious failure report for master branch - 2015-03-03

2015-03-06 Thread Justin Clift
On 4 Mar 2015, at 15:25, Shyam srang...@redhat.com wrote: On 03/03/2015 11:27 PM, Justin Clift wrote: 2 x Coredumps * * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk5/ IP - 104.130.74.142 This coredump run also failed on: *

Re: [Gluster-devel] NetBSD's read-subvol-entry.t spurious failures explained

2015-03-06 Thread Emmanuel Dreyfus
On Fri, Mar 06, 2015 at 05:55:34PM +0530, Ravishankar N wrote: On NetBSD I can see that AFR never gets trusted.afr.patchy-client-0 and always thinks brick0 is fine. AFR randomly picks brick0 or brick1 to list directory content, and when it picks brick0 the test fails. After bringing brick0

[Gluster-devel] NetBSD's read-subvol-entry.t spurious failures explained

2015-03-06 Thread Emmanuel Dreyfus
Hi, I tracked down the spurious failures of read-subvol-entry.t on NetBSD. Here is what should happen: we have a volume with brick0 and brick1. We disable self-heal, kill brick0, create a file in a directory, restart brick0, and list the directory content to check that we find the file. The tested
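The scenario in this thread can be illustrated with a small toy model. This is only a sketch under stated assumptions, not GlusterFS code: the names `run_scenario`, `list_directory` and the `pending` set are hypothetical, and `pending` merely stands in for the trusted.afr.patchy-client-0 marker mentioned in the thread. It shows that when the marker is recorded, reads avoid brick0 and the file is found, and that when the marker is missing, a random pick of brick0 returns a listing without the file.

```python
import random


def list_directory(bricks, pending, rng):
    """Pick a read brick among those not marked as pending, then list it."""
    # Bricks accused by a pending marker are excluded from reads.
    candidates = sorted(name for name in bricks if name not in pending)
    chosen = rng.choice(candidates)
    return chosen, sorted(bricks[chosen])


def run_scenario(marker_recorded, seed):
    """Replay the test steps on two in-memory bricks (toy model only)."""
    rng = random.Random(seed)
    bricks = {"brick0": set(), "brick1": set()}
    # brick0 is killed, then a file is created: only brick1 receives it.
    bricks["brick1"].add("file")
    # If the marker is recorded, brick0 is known to have missed the write.
    pending = {"brick0"} if marker_recorded else set()
    # brick0 restarts with self-heal disabled, so it still lacks the file.
    return list_directory(bricks, pending, rng)


if __name__ == "__main__":
    for seed in range(5):
        print("marker recorded:", run_scenario(True, seed))
        print("marker missing: ", run_scenario(False, seed))
```

In this model the run with the marker missing fails only on the seeds where brick0 is picked, which matches the intermittent nature of the failure described in the thread.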

Re: [Gluster-devel] NetBSD hanging regression tests

2015-03-06 Thread Emmanuel Dreyfus
Emmanuel Dreyfus m...@netbsd.org wrote: Obviously something went wrong. Perhaps there should be a timeout there, and/or a check that write() does not fail? Submitted here: http://review.gluster.org/9825 -- Emmanuel Dreyfus http://hcpnet.free.fr/pubz m...@netbsd.org

Re: [Gluster-devel] NetBSD's read-subvol-entry.t spurious failures explained

2015-03-06 Thread Emmanuel Dreyfus
Ravishankar N ravishan...@redhat.com wrote: But since in the test case we are doing a 'volume start force', this code path doesn't seem to be hit, and it looks like we are calling local->readfn() from afr_read_txn(). But read_subvol is still 1 (i.e. the 2nd brick). Is that the case for you too?

Re: [Gluster-devel] NetBSD hanging regression tests

2015-03-06 Thread Emmanuel Dreyfus
Hi. Recently NetBSD regression tests started hanging quite frequently. Here is an example: http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/1679/ The offending test is root-squash-self-heal.t, which starts a never-ending glfsheal process: PID LID WCHAN STAT LTIME

Re: [Gluster-devel] Spurious failure report for master branch - 2015-03-03

2015-03-06 Thread Pranith Kumar Karampuri
On 03/04/2015 09:57 AM, Justin Clift wrote: Ran 20 x regression tests on our GlusterFS master branch code as of a few hours ago, commit 95d5e60afb29aedc29909340e7564d54a6a247c2. 5 of them were successful (25%), 15 of them failed in various ways (75%). We need to get this down to about 5% or