Re: [Gluster-devel] Spurious failure report for master branch - 2015-03-03

2015-03-04 Thread Nithya Balachandran
Fix for the spurious bug-1117851.t failure at 
http://review.gluster.org/#/c/9798/

Regards,
Nithya


- Original Message -
From: Justin Clift jus...@gluster.org
To: Nithya Balachandran nbala...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org
Sent: Wednesday, 4 March, 2015 10:12:17 AM
Subject: Re: [Gluster-devel] Spurious failure report for master branch - 
2015-03-03

Thanks. :)

If you need a VM set up in Rackspace to investigate on, it's easy
to do.  Let me know if so. :)

+ Justin


On 4 Mar 2015, at 04:37, Nithya Balachandran nbala...@redhat.com wrote:
 I'll take a look at tests/bugs/distribute/bug-1117851.t
 
 Regards,
 Nithya
 
 - Original Message -
 From: Justin Clift jus...@gluster.org
 To: Gluster Devel gluster-devel@gluster.org
 Sent: Wednesday, 4 March, 2015 9:57:00 AM
 Subject: [Gluster-devel] Spurious failure report for master branch -  
 2015-03-03
 
 Ran 20 x regression tests on our GlusterFS master branch code
 as of a few hours ago, commit 95d5e60afb29aedc29909340e7564d54a6a247c2.
 
 5 of them were successful (25%), 15 of them failed in various ways
 (75%).
 
 We need to get this down to about 5% or less (preferably 0%), as it's
 killing our development iteration speed.  We're wasting huge amounts
 of time working around this. :(
 
 
 Spurious failures
 *
 
  * 5 x tests/bugs/distribute/bug-1117851.t
(Wstat: 0 Tests: 24 Failed: 1)
Failed test:  15
 
This one is causing a 25% failure rate all by itself. :(
 
This needs fixing soon. :)
 
 
  * 3 x tests/bugs/geo-replication/bug-877293.t
(Wstat: 0 Tests: 15 Failed: 1)
Failed test:  11
 
  * 2 x tests/basic/afr/entry-self-heal.t  
(Wstat: 0 Tests: 180 Failed: 2)
Failed tests:  127-128
 
  * 1 x tests/basic/ec/ec-12-4.t   
(Wstat: 0 Tests: 541 Failed: 2)
Failed tests:  409, 441
 
  * 1 x tests/basic/fops-sanity.t  
(Wstat: 0 Tests: 11 Failed: 1)
Failed test:  10
 
  * 1 x tests/basic/uss.t  
(Wstat: 0 Tests: 160 Failed: 1)
Failed test:  26
 
  * 1 x tests/performance/open-behind.t
(Wstat: 0 Tests: 17 Failed: 1)
Failed test:  17
 
  * 1 x tests/bugs/distribute/bug-884455.t 
(Wstat: 0 Tests: 22 Failed: 1)
Failed test:  11
 
  * 1 x tests/bugs/fuse/bug-1126048.t  
(Wstat: 0 Tests: 12 Failed: 1)
Failed test:  10
 
  * 1 x tests/bugs/quota/bug-1038598.t 
(Wstat: 0 Tests: 28 Failed: 1)
Failed test:  28
 
 
 2 x Coredumps
 *
 
  * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk5/
 
IP - 104.130.74.142
 
This coredump run also failed on:
 
  * tests/basic/fops-sanity.t  
(Wstat: 0 Tests: 11 Failed: 1)
Failed test:  10
 
  * tests/bugs/glusterfs-server/bug-861542.t   
(Wstat: 0 Tests: 13 Failed: 1)
Failed test:  10
 
  * tests/performance/open-behind.t
(Wstat: 0 Tests: 17 Failed: 1)
Failed test:  17
 
  * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk8/
 
IP - 104.130.74.143
 
This coredump run also failed on:
 
  * tests/basic/afr/entry-self-heal.t  
(Wstat: 0 Tests: 180 Failed: 2)
Failed tests:  127-128
 
  * tests/bugs/glusterfs-server/bug-861542.t   
(Wstat: 0 Tests: 13 Failed: 1)
Failed test:  10
 
 Both VMs are also online, in case they're useful to log into
 for investigation (root / the jenkins slave pw).
 
 If they're not, please let me know so I can blow them away. :)
 
 
 1 x hung host
 *
 
 Hung on tests/bugs/posix/bug-1113960.t
 
 root  12497  1290  0 Mar03 ?  S  0:00  \_ /bin/bash /opt/qa/regression.sh
 root  12504 12497  0 Mar03 ?  S  0:00  \_ /bin/bash ./run-tests.sh
 root  12519 12504  0 Mar03 ?  S  0:03  \_ /usr/bin/perl /usr/bin/prove -rf --timer ./tests
 root  22018 12519  0 00:17 ?  S  0:00  \_ /bin/bash ./tests/bugs/posix/bug-1113960.t
 root  30002 22018  0 01:57 ?  S  0:00  \_ mv /mnt/glusterfs/0/longernamedir1/longernamedir2/longernamedir3/
 
 This VM (23.253.53.111) is still online + untouched (still hung),
 if someone wants to log in to investigate.  (root / the jenkins
 slave pw)
 
 Hope that's helpful. :)
 
 Regards and best wishes,
 
 Justin Clift
 
 --
 GlusterFS - http://www.gluster.org
 
 An open source, distributed file system scaling to several
 petabytes, and handling thousands of clients.

Re: [Gluster-devel] Spurious failure report for master branch - 2015-03-03

2015-03-04 Thread Shyam

On 03/03/2015 11:27 PM, Justin Clift wrote:

2 x Coredumps
*

   * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk5/

 IP - 104.130.74.142

 This coredump run also failed on:

   * tests/basic/fops-sanity.t  
   (Wstat: 0 Tests: 11 Failed: 1)
 Failed test:  10

   * tests/bugs/glusterfs-server/bug-861542.t   
   (Wstat: 0 Tests: 13 Failed: 1)
 Failed test:  10

   * tests/performance/open-behind.t
   (Wstat: 0 Tests: 17 Failed: 1)
 Failed test:  17


FWIW, this is the same as 
https://bugzilla.redhat.com/show_bug.cgi?id=1195415




   * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk8/

 IP - 104.130.74.143

 This coredump run also failed on:

   * tests/basic/afr/entry-self-heal.t  
   (Wstat: 0 Tests: 180 Failed: 2)
 Failed tests:  127-128

   * tests/bugs/glusterfs-server/bug-861542.t   
   (Wstat: 0 Tests: 13 Failed: 1)
 Failed test:  10


So is this one, i.e. the same as 
https://bugzilla.redhat.com/show_bug.cgi?id=1195415


Shyam
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] How does read-subvol-entry.t work?

2015-03-04 Thread Emmanuel Dreyfus
On Wed, Mar 04, 2015 at 10:31:06AM +0530, Ravishankar N wrote:
 Not sure, CC'ing Atin who might be able to shed some light on the glusterd
 logs. If the brick gets restarted as you say, the brick log will also
 contain something like "I [glusterfsd.c:1959:main]
 0-/usr/local/sbin/glusterfsd: Started running /usr/local/sbin/glusterfsd",
 plus the graph information, etc. Does it? And does volume status show the
 brick as online again?

See my other message: this is not our problem; the brick restarts
because we restart it...

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] IMP: GlusterD uses liburcu lists from now on.

2015-03-04 Thread Kaushal M
I forgot to mention this earlier. Anyone who has a patch in review that
involves GlusterD will need to rebase it. Sorry for the
inconvenience.

~kaushal

On Wed, Mar 4, 2015 at 1:20 PM, Kaushal M kshlms...@gmail.com wrote:

 Review http://review.gluster.org/9624 just got merged. This is the first
 actual code change using liburcu within GlusterD.

 This change replaces the usage of the libglusterfs list data structures and
 APIs with the data structures and APIs provided by liburcu. The
 replacement is mostly a case of prefixing the libglusterfs list data
 structure and API names.

 We chose to do a complete replacement within GlusterD to avoid confusing
 developers. We could have used the liburcu lists only for the lists
 we wanted to protect with RCU, but that would have required more effort from
 developers to decide which list API to use for a given list.

 The liburcu APIs and data structures have a `cds_` prefix, and are
 otherwise the same as their libglusterfs counterparts. For example:
 list_head -> cds_list_head
 INIT_LIST_HEAD -> CDS_INIT_LIST_HEAD
 list_for_each -> cds_list_for_each
 list_entry -> cds_list_entry
 etc.
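 
 To make the mapping concrete, here is a minimal, self-contained sketch
 of the same list idiom written against liburcu (illustrative only: the
 peer_t structure and its fields are made up, not actual GlusterD code):
 
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <urcu/list.h>
 
 typedef struct {
         char hostname[64];
         struct cds_list_head list;               /* was: struct list_head */
 } peer_t;
 
 int main (void)
 {
         struct cds_list_head peers;
         peer_t *peer = NULL, *tmp = NULL;
 
         CDS_INIT_LIST_HEAD (&peers);             /* was: INIT_LIST_HEAD */
 
         peer = calloc (1, sizeof (*peer));
         if (peer == NULL)
                 return 1;
         strncpy (peer->hostname, "node1", sizeof (peer->hostname) - 1);
         cds_list_add_tail (&peer->list, &peers); /* was: list_add_tail */
 
         /* was: list_for_each_entry */
         cds_list_for_each_entry (peer, &peers, list)
                 printf ("peer: %s\n", peer->hostname);
 
         /* was: list_for_each_entry_safe + list_del */
         cds_list_for_each_entry_safe (peer, tmp, &peers, list) {
                 cds_list_del (&peer->list);
                 free (peer);
         }
 
         return 0;
 }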

 The above change just lays the base for the actual RCU protection
 changes, which will follow soon (beginning with protection
 for peerinfos).

 Thanks.

 ~kaushal

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] How does read-subvol-entry.t work?

2015-03-04 Thread Emmanuel Dreyfus
On Tue, Mar 03, 2015 at 07:47:15AM +0530, Ravishankar N wrote:
 If the afr xattrs on the dir are clean on all bricks, then the read
 subvolume is chosen by afr_read_subvol_select_by_policy().
 But in this case, since the second brick is the only source, readdirs will
 have to use that as the read subvolume.

Here is my understanding so far: when listing $M0/abc/def, brick0 is used
(while it should not be), because afr_replies_interpret() gets this in the
reply from brick1:
data_accused[0] = 0 (it should be 1)
data_accused[1] = 0

data_accused[0] comes from the trusted.afr.patchy-client-0 xattr of /abc/def.
That attribute is correctly set.

I added dict_dump_to_log() in server_lookup_cbk() and client3_3_lookup_cbk()
to dump the xattr for /abc/def
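
A minimal sketch of that instrumentation (illustrative only: the helper
is an assumption, not the actual diff; the real change simply calls
dict_dump_to_log() on the xattr dict inside the two callbacks):

#include "dict.h"        /* dict_t, dict_dump_to_log() */

/* Called on the xattr dict of the /abc/def lookup reply, from both
 * server_lookup_cbk() and client3_3_lookup_cbk().  dict_dump_to_log()
 * walks the dict and logs each key/value pair, which is what produces
 * the ((key:value)...) lines below. */
static void
dump_lookup_xattrs (dict_t *xattrs)
{
        if (xattrs != NULL)
                dict_dump_to_log (xattrs);
}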

In server_lookup_cbk() I get:
((glusterfs.inodelk-count:0)(glusterfs.entrylk-count:0)
(glusterfs.parent-entrylk:0)(trusted.afr.patchy-client-1:)
(trusted.afr.patchy-client-0:)(glusterfs.open-fd-count:0)
(trusted.glusterfs.dht:))

In client3_3_lookup_cbk() I have only this left:
((trusted.glusterfs.dht:))

I will now try to see what I have on the wire.


-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Spurious failure report for master branch - 2015-03-03

2015-03-04 Thread Nithya Balachandran
Hi,

I had a look at tests/bugs/distribute/bug-1117851.t.

The test fails at:

EXPECT_WITHIN 75 "done" cat $M0/status_0


The test uses a status file to check whether the file rename operation (in
which 1000 files are renamed), which runs in the background, has finished.
The status file $M0/status_0 is created before the rename begins and the
string "running" is written to it. Once the rename is done, the string
"done" is written to the file.
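
In outline the pattern is (a simplified sketch, not the verbatim test
script; the file names are illustrative):

echo "running" > $M0/status_0

# Rename the files in the background; overwrite the status file once
# the loop finishes.
(for i in $(seq 1 1000); do
    mv $M0/file_$i $M0/newfile_$i
done
echo "done" > $M0/status_0) &

# Poll with cat for up to 75 seconds until the file reads "done".
EXPECT_WITHIN 75 "done" cat $M0/status_0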

So it turns out the renames are actually finishing well in time - roughly 40
seconds. But the status_0 file is not present, so cat fails on the file. The
logs for the two failed regression runs confirm this
(http://build.gluster.org/job/rackspace-regression-2GB/951/console and 
http://build.gluster.org/job/rackspace-regression-2GB/983/console). 

cat: /mnt/glusterfs/0/status_0: No such file or directory
[14:53:50] ./tests/bugs/distribute/bug-1117851.t 
. 
not ok 15 Got  instead of done
Failed 1/24 subtests

The test runs successfully on my local setup and has failed only twice on the
VM Justin provided (out of about 50 runs), so I am still looking into why it
cannot find the file.


Regards,
Nithya


[Gluster-devel] REMINDER: Weekly Gluster Community meeting in 50 mins

2015-03-04 Thread Justin Clift
Hi all,

In about 50 minutes the regular weekly Gluster Community IRC meeting
begins.  Everyone is welcome to join in. :)

Meeting details:

  * Location: #gluster-meeting on Freenode IRC
  * Date: every Wednesday
  * Time: 12:00 UTC, 13:00 CET (in your terminal, run: date -d "12:00 UTC")
  * Agenda: https://public.pad.fsfe.org/p/gluster-community-meetings

Currently the following items are listed:

  * Roll Call
  * Status of last week's action items
  * GlusterFS 3.6
  * GlusterFS 3.5
  * GlusterFS 3.4
  * GlusterFS Next
  * Open Floor

The last topic has space for additions by any Community Member. If you
have a suitable topic to discuss, please add it to the agenda. :)

Regards and best wishes,

Justin Clift

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] REMINDER: Weekly Gluster Community meeting in 50 mins

2015-03-04 Thread Justin Clift
On 4 Mar 2015, at 11:11, Justin Clift jus...@gluster.org wrote:
 Hi all,
 
 In about 50 minutes the regular weekly Gluster Community IRC meeting
 begins.  Everyone is welcome to join in. :)
 
 Meeting details:
 
  * Location: #gluster-meeting on Freenode IRC
  * Date: every Wednesday
  * Time: 12:00 UTC, 13:00 CET (in your terminal, run: date -d "12:00 UTC")
  * Agenda: https://public.pad.fsfe.org/p/gluster-community-meetings

Thanks everyone for attending.  Pretty active meeting with a bunch of
people. :)

Let's see if we can get the spurious failure count down significantly
by next meeting. :)

Meeting Summary:

  
http://meetbot.fedoraproject.org/gluster-meeting/2015-03-04/gluster-meeting.2015-03-04-12.00.html

Full Log:

  
http://meetbot.fedoraproject.org/gluster-meeting/2015-03-04/gluster-meeting.2015-03-04-12.00.log.html

Regards and best wishes,

Justin Clift

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel