Re: [Gluster-devel] Spurious failure report for master branch - 2015-03-03
Fix for the spurious bug-1117851.t failure is at http://review.gluster.org/#/c/9798/

Regards,
Nithya

----- Original Message -----
From: Justin Clift jus...@gluster.org
To: Nithya Balachandran nbala...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org
Sent: Wednesday, 4 March, 2015 10:12:17 AM
Subject: Re: [Gluster-devel] Spurious failure report for master branch - 2015-03-03

Thanks. :) If you need a VM set up in Rackspace to investigate on, it's easy to do. Let me know if so. :)

+ Justin

On 4 Mar 2015, at 04:37, Nithya Balachandran nbala...@redhat.com wrote:

I'll take a look at tests/bugs/distribute/bug-1117851.t

Regards,
Nithya

----- Original Message -----
From: Justin Clift jus...@gluster.org
To: Gluster Devel gluster-devel@gluster.org
Sent: Wednesday, 4 March, 2015 9:57:00 AM
Subject: [Gluster-devel] Spurious failure report for master branch - 2015-03-03

Ran 20 x regression tests on our GlusterFS master branch code as of a few hours ago, commit 95d5e60afb29aedc29909340e7564d54a6a247c2.

5 of them were successful (25%), 15 of them failed in various ways (75%). We need to get this down to about 5% or less (preferably 0%), as it's killing our development iteration speed. We're wasting huge amounts of time working around this. :(

Spurious failures:

 * 5 x tests/bugs/distribute/bug-1117851.t (Wstat: 0 Tests: 24 Failed: 1)
   Failed test: 15
   This one is causing a 25% failure rate all by itself. :( This needs fixing soon. :)
 * 3 x tests/bugs/geo-replication/bug-877293.t (Wstat: 0 Tests: 15 Failed: 1)
   Failed test: 11
 * 2 x tests/basic/afr/entry-self-heal.t (Wstat: 0 Tests: 180 Failed: 2)
   Failed tests: 127-128
 * 1 x tests/basic/ec/ec-12-4.t (Wstat: 0 Tests: 541 Failed: 2)
   Failed tests: 409, 441
 * 1 x tests/basic/fops-sanity.t (Wstat: 0 Tests: 11 Failed: 1)
   Failed test: 10
 * 1 x tests/basic/uss.t (Wstat: 0 Tests: 160 Failed: 1)
   Failed test: 26
 * 1 x tests/performance/open-behind.t (Wstat: 0 Tests: 17 Failed: 1)
   Failed test: 17
 * 1 x tests/bugs/distribute/bug-884455.t (Wstat: 0 Tests: 22 Failed: 1)
   Failed test: 11
 * 1 x tests/bugs/fuse/bug-1126048.t (Wstat: 0 Tests: 12 Failed: 1)
   Failed test: 10
 * 1 x tests/bugs/quota/bug-1038598.t (Wstat: 0 Tests: 28 Failed: 1)
   Failed test: 28

2 x Coredumps:

 * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk5/ (IP - 104.130.74.142)
   This coredump run also failed on:
   * tests/basic/fops-sanity.t (Wstat: 0 Tests: 11 Failed: 1) Failed test: 10
   * tests/bugs/glusterfs-server/bug-861542.t (Wstat: 0 Tests: 13 Failed: 1) Failed test: 10
   * tests/performance/open-behind.t (Wstat: 0 Tests: 17 Failed: 1) Failed test: 17
 * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk8/ (IP - 104.130.74.143)
   This coredump run also failed on:
   * tests/basic/afr/entry-self-heal.t (Wstat: 0 Tests: 180 Failed: 2) Failed tests: 127-128
   * tests/bugs/glusterfs-server/bug-861542.t (Wstat: 0 Tests: 13 Failed: 1) Failed test: 10

Both VMs are also online, in case they're useful to log into for investigation (root / the jenkins slave pw). If they're not, please let me know so I can blow them away. :)

1 x hung host:

Hung on tests/bugs/posix/bug-1113960.t:

  root 12497  1290 0 Mar03 ? S 0:00 \_ /bin/bash /opt/qa/regression.sh
  root 12504 12497 0 Mar03 ? S 0:00   \_ /bin/bash ./run-tests.sh
  root 12519 12504 0 Mar03 ? S 0:03     \_ /usr/bin/perl /usr/bin/prove -rf --timer ./tests
  root 22018 12519 0 00:17 ? S 0:00       \_ /bin/bash ./tests/bugs/posix/bug-1113960.t
  root 30002 22018 0 01:57 ? S 0:00         \_ mv /mnt/glusterfs/0/longernamedir1/longernamedir2/longernamedir3/

This VM (23.253.53.111) is still online + untouched (still hung), if someone wants to log in to investigate. (root / the jenkins slave pw)

Hope that's helpful. :)

Regards and best wishes,

Justin Clift

--
GlusterFS - http://www.gluster.org
An open source, distributed file system scaling to several petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift
Re: [Gluster-devel] Spurious failure report for master branch - 2015-03-03
On 03/03/2015 11:27 PM, Justin Clift wrote:
> 2 x Coredumps:
>
>  * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk5/ (IP - 104.130.74.142)
>    This coredump run also failed on:
>    * tests/basic/fops-sanity.t (Wstat: 0 Tests: 11 Failed: 1) Failed test: 10
>    * tests/bugs/glusterfs-server/bug-861542.t (Wstat: 0 Tests: 13 Failed: 1) Failed test: 10
>    * tests/performance/open-behind.t (Wstat: 0 Tests: 17 Failed: 1) Failed test: 17

FWIW, this is the same as https://bugzilla.redhat.com/show_bug.cgi?id=1195415

>  * http://mirror.salasaga.org/gluster/master/2015-03-03/bulk8/ (IP - 104.130.74.143)
>    This coredump run also failed on:
>    * tests/basic/afr/entry-self-heal.t (Wstat: 0 Tests: 180 Failed: 2) Failed tests: 127-128
>    * tests/bugs/glusterfs-server/bug-861542.t (Wstat: 0 Tests: 13 Failed: 1) Failed test: 10

So is this one, i.e., the same as https://bugzilla.redhat.com/show_bug.cgi?id=1195415

Shyam
Re: [Gluster-devel] How does read-subvol-entry.t work?
On Wed, Mar 04, 2015 at 10:31:06AM +0530, Ravishankar N wrote:
> Not sure, CC'ing Atin who might be able to shed some light on the
> glusterd logs. If the brick gets restarted as you say, the brick log
> will also contain something like
>
>   I [glusterfsd.c:1959:main] 0-/usr/local/sbin/glusterfsd: Started running /usr/local/sbin/glusterfsd
>
> and the graph information etc. Does it? And does volume status show
> the brick as online again?

See my other message: this is not our problem. The brick restarts because we restart it...

--
Emmanuel Dreyfus
m...@netbsd.org
Re: [Gluster-devel] IMP: GlusterD uses liburcu lists from now on.
I forgot to mention this earlier: anyone who has a patch in review that involves GlusterD will need to rebase it. Sorry for the inconvenience.

~kaushal

On Wed, Mar 4, 2015 at 1:20 PM, Kaushal M kshlms...@gmail.com wrote:
> Review http://review.gluster.org/9624 just got merged. This is the
> first actual code change to use liburcu within GlusterD.
>
> This change replaces the usage of the libglusterfs list data structures
> and APIs with the data structures and APIs provided by liburcu. The
> replacement is mostly a case of prefixing the libglusterfs list data
> structure and API names. We chose to do a complete replacement within
> GlusterD to prevent confusion for developers. We could have used the
> liburcu lists just for the lists we wanted to protect with RCU, but
> that would have required more effort from developers to decide which
> list API to use for a given list.
>
> The liburcu APIs and data structures have a `cds_` prefix, and are
> otherwise the same as their libglusterfs counterparts. For example:
>
>   list_head      -> cds_list_head
>   INIT_LIST_HEAD -> CDS_INIT_LIST_HEAD
>   list_for_each  -> cds_list_for_each
>   list_entry     -> cds_list_entry
>
> and so on. This change just lays the base for the introduction of the
> actual RCU protection changes, which will follow soon (beginning with
> protection for peerinfos).
>
> Thanks.
> ~kaushal
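[Editor's note: for readers unfamiliar with liburcu, below is a minimal, self-contained sketch of what the rename looks like in practice. This is not GlusterD code: the `peer` struct and its fields are invented for illustration, and the plain `cds_` list operations shown here are not yet RCU-protected traversals, matching Kaushal's point that this change only lays the base. The list macros live in urcu/list.h from the userspace-rcu package and are header-only, so this compiles without linking against liburcu.]

    /* Sketch of the mechanical prefix rename described above.
     * Build: gcc demo.c -o demo  (needs the userspace-rcu headers) */
    #include <stdio.h>
    #include <urcu/list.h>

    struct peer {
            char                 name[32];
            struct cds_list_head list;      /* was: struct list_head */
    };

    int
    main (void)
    {
            struct cds_list_head peers;
            struct peer          p1 = { .name = "peer-one" };
            struct peer          p2 = { .name = "peer-two" };
            struct peer         *p  = NULL;

            CDS_INIT_LIST_HEAD (&peers);           /* was: INIT_LIST_HEAD */
            cds_list_add_tail (&p1.list, &peers);  /* was: list_add_tail */
            cds_list_add_tail (&p2.list, &peers);

            /* was: list_for_each_entry */
            cds_list_for_each_entry (p, &peers, list)
                    printf ("%s\n", p->name);

            return 0;
    }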
Re: [Gluster-devel] How does read-subvol-entry.t work?
On Tue, Mar 03, 2015 at 07:47:15AM +0530, Ravishankar N wrote:
> If the afr xattrs on the dir are clean on all bricks, then the dir is
> chosen by afr_read_subvol_select_by_policy(). But in this case, since
> the second brick is the only source, readdirs will have to use that as
> the read subvolume.

Here is my understanding so far: when listing $M0/abc/def, brick0 is used (while it should not be), because afr_replies_interpret() gets in the reply from brick1:

  data_accused[0] = 0 (it should be 1)
  data_accused[1] = 0

data_accused[0] comes from the trusted.afr.patchy-client-0 xattr of /abc/def. That attribute is correctly set.

I added dict_dump_to_log() in server_lookup_cbk() and client3_3_lookup_cbk() to dump the xattrs for /abc/def. In server_lookup_cbk() I get:

  ((glusterfs.inodelk-count:0)(glusterfs.entrylk-count:0)
   (glusterfs.parent-entrylk:0)(trusted.afr.patchy-client-1:)
   (trusted.afr.patchy-client-0:)(glusterfs.open-fd-count:0)
   (trusted.glusterfs.dht:))

In client3_3_lookup_cbk() I only have left:

  ((trusted.glusterfs.dht:))

I will now try to see what I have on the wire.

--
Emmanuel Dreyfus
m...@netbsd.org
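[Editor's note: below is a greatly simplified, hypothetical sketch of the accounting Emmanuel is describing -- not actual afr code. The array names echo his mail; the pending values and selection loop are invented for illustration. Each brick's lookup reply carries trusted.afr.<volume>-client-<i> xattrs; a non-zero pending count for brick i marks it "accused", and accused bricks must not be picked as the read subvolume.]

    #include <stdio.h>
    #include <stdint.h>

    #define BRICKS 2

    int
    main (void)
    {
            /* Pending counts parsed from the reply's trusted.afr.*
             * xattrs (values invented). A correct reply from brick1
             * should blame brick0: */
            uint32_t pending[BRICKS] = { 1, 0 };
            int      data_accused[BRICKS];

            for (int i = 0; i < BRICKS; i++)
                    data_accused[i] = (pending[i] != 0);

            /* Pick the first non-accused brick as the read subvolume. */
            for (int i = 0; i < BRICKS; i++) {
                    if (!data_accused[i]) {
                            printf ("read subvolume: brick%d\n", i);
                            break;
                    }
            }

            /* In the failing case above, the trusted.afr.* xattrs never
             * reach the client, so pending[] parses as all zeroes, nobody
             * is accused, and brick0 is wrongly chosen. */
            return 0;
    }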
Re: [Gluster-devel] Spurious failure report for master branch - 2015-03-03
Hi,

I had a look at tests/bugs/distribute/bug-1117851.t. The test fails at:

  EXPECT_WITHIN 75 done cat $M0/status_0

The test uses a status file to check whether the file rename operation (in which 1000 files are renamed), which runs in the background, is over. The status file $M0/status_0 is created before the rename begins, and the string "running" is written to it. Once the rename is done, the string "done" is written to the file.

It turns out the renames are actually finishing well in time - roughly 40 seconds. But the status_0 file is not present, so cat fails on the file. The logs for the two regression runs that failed confirm this (http://build.gluster.org/job/rackspace-regression-2GB/951/console and http://build.gluster.org/job/rackspace-regression-2GB/983/console):

  cat: /mnt/glusterfs/0/status_0: No such file or directory
  [14:53:50] ./tests/bugs/distribute/bug-1117851.t .
  not ok 15 Got "" instead of "done"
  Failed 1/24 subtests

The test runs successfully on my local setup and has failed only twice on the VM Justin provided (out of about 50 runs), so I am still looking into why it cannot find the file.

Regards,
Nithya
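[Editor's note: below is a hypothetical shell sketch of the status-file pattern Nithya describes -- not the actual bug-1117851.t; the file names and the rename loop are simplified. EXPECT_WITHIN is the test framework's retry-until-match helper.]

    #!/bin/bash
    # A background job reports progress through a status file on the
    # mount, and the test polls that file until it reads "done".

    M0=/mnt/glusterfs/0   # mount point, as used by the test framework

    do_renames () {
            echo running > "$M0/status_0"    # status file created up front
            for i in $(seq 1 1000); do
                    mv "$M0/file_$i" "$M0/renamed_$i"
            done
            echo done > "$M0/status_0"       # rewritten once renames finish
    }

    do_renames &

    # The real test waits with the framework's EXPECT_WITHIN, which
    # retries the command until its output matches or the timeout (75s)
    # expires:
    #
    #   EXPECT_WITHIN 75 "done" cat $M0/status_0
    #
    # The spurious failure is that cat sometimes finds no status_0 at
    # all, even though the renames completed well within the timeout.
    wait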
[Gluster-devel] REMINDER: Weekly Gluster Community meeting in 50 mins
Hi all,

In about 50 minutes the regular weekly Gluster Community IRC meeting begins. Everyone is welcome to join in. :)

Meeting details:

 * Location: #gluster-meeting on Freenode IRC
 * Date: every Wednesday
 * Time: 12:00 UTC, 13:00 CET (in your terminal, run: date -d "12:00 UTC")
 * Agenda: https://public.pad.fsfe.org/p/gluster-community-meetings

Currently the following items are listed:

 * Roll Call
 * Status of last week's action items
 * GlusterFS 3.6
 * GlusterFS 3.5
 * GlusterFS 3.4
 * GlusterFS Next
 * Open Floor

The last topic has space for additions by any Community Member. If you have a suitable topic to discuss, please add it to the agenda. :)

Regards and best wishes,

Justin Clift

--
GlusterFS - http://www.gluster.org
An open source, distributed file system scaling to several petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift
Re: [Gluster-devel] REMINDER: Weekly Gluster Community meeting in 50 mins
On 4 Mar 2015, at 11:11, Justin Clift jus...@gluster.org wrote:
> Hi all,
>
> In about 50 minutes the regular weekly Gluster Community IRC meeting
> begins. Everyone is welcome to join in. :)

Thanks everyone for attending. Pretty active meeting, with a bunch of people. :) Let's see if we can get the spurious failure count down significantly by next meeting. :)

Meeting Summary:
http://meetbot.fedoraproject.org/gluster-meeting/2015-03-04/gluster-meeting.2015-03-04-12.00.html

Full Log:
http://meetbot.fedoraproject.org/gluster-meeting/2015-03-04/gluster-meeting.2015-03-04-12.00.log.html

Regards and best wishes,

Justin Clift

--
GlusterFS - http://www.gluster.org
An open source, distributed file system scaling to several petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift