[Gluster-devel] Test email pls ignore
Ignore this, it's just a test for measuring a delay issue with mailman. + Justin -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [gluster-devel] Documentation Tooling Review
On 23 Aug 2016, at 20:27, Justin Clift wrote:
> On 11 Aug 2016, at 21:23, Amye Scavarda wrote:
>> The Red Hat Gluster Storage documentation team and I had a conversation
>> about how we can make our upstream documentation more consistent and
>> improved for our users, and they're willing to work with us to find
>> where the major gaps are in our documentation. This is awesome! But
>> it's going to take some work on our side to make this a reality.
>>
>> One piece that's come up is that we should probably look towards
>> changing our current tooling for this. It turns out that search on our
>> ReadTheDocs instance is failing because we're using Markdown, and this
>> is a known issue. It doesn't look like it's going to be fixed anytime
>> soon.
>>
>> Rather than continue to try to make RTD serve our needs, I'd like to
>> propose the following changes to where our documentation lives and in
>> what language: I'd much rather pattern after docs.openshift.org, move
>> to ASCIIdoc, and use ASCIIbinder as our engine to power this. That
>> gives us control over the infrastructure underneath our documentation,
>> maintains our existing git workflow for adding to documentation, and
>> matches other communities that we work closely with. I'm mindful that
>> there's a burden of migration again, but we'll be able to resolve a lot
>> of the challenges we currently have with documentation: more control
>> over layout, the ability to change the structure to make it more user
>> friendly, and the freedom to use our own search however we see fit.
>>
>> I'm happy to take comments on this proposal. Over the next week, I'll
>> be reviewing the level of effort it would take to migrate to ASCIIdocs
>> and ASCIIbinder, with the goal being to have this in place by end of
>> September.
>>
>> Thoughts?
> It's probably worth considering GitBook instead:
>
>   https://www.gitbook.com
>
> Example here:
>
>   http://tutorial.djangogirls.org/en/index.html
>
> Pros:
>
> * Works with Markdown & ASCIIdoc
>
>   No need to convert the existing docs to a new format,
>   and the already learned Markdown skills don't need relearning
>
> * Also fully Open Source
>
>   https://github.com/GitbookIO/gitbook/
>
> * Searching works very well
>
>   Try searching on the Django Girls tutorial above for "Python".
>   Correct results are returned in small fractions of a second.
>
> * Has well developed plugins to enable things like inline
>   videos, interactive exercises (and more)
>
>   https://plugins.gitbook.com
>
> * Can be self hosted, or hosted on the GitBooks infrastructure
>
> * Doesn't require Ruby, unlike ASCIIbinder which is written in it.

An extra "Pro" pointed out to me offline:

* You can log in with GitHub and post comments on each line

  Example here:

    https://docs.lacona.io/docs/basics/getting-started.html

  Note the green line there, with a helpful comment added to the side
  of that.

Seems like a good way for people to review/revise docs, for polishing &
tweaking.

> Cons:
>
> * It's written in Node.js instead
>
>   Not sure that's any better than Ruby
>
> It seems a better polished solution than docs.openshift.org is using,
> and would probably require less effort for the Gluster docs to be
> adapted to.
>
> Thoughts? :)

+ Justin
Re: [Gluster-devel] [gluster-devel] Documentation Tooling Review
On 11 Aug 2016, at 21:23, Amye Scavarda wrote:
> The Red Hat Gluster Storage documentation team and I had a conversation
> about how we can make our upstream documentation more consistent and
> improved for our users, and they're willing to work with us to find
> where the major gaps are in our documentation. This is awesome! But
> it's going to take some work on our side to make this a reality.
>
> One piece that's come up is that we should probably look towards
> changing our current tooling for this. It turns out that search on our
> ReadTheDocs instance is failing because we're using Markdown, and this
> is a known issue. It doesn't look like it's going to be fixed anytime
> soon.
>
> Rather than continue to try to make RTD serve our needs, I'd like to
> propose the following changes to where our documentation lives and in
> what language: I'd much rather pattern after docs.openshift.org, move
> to ASCIIdoc, and use ASCIIbinder as our engine to power this. That
> gives us control over the infrastructure underneath our documentation,
> maintains our existing git workflow for adding to documentation, and
> matches other communities that we work closely with. I'm mindful that
> there's a burden of migration again, but we'll be able to resolve a lot
> of the challenges we currently have with documentation: more control
> over layout, the ability to change the structure to make it more user
> friendly, and the freedom to use our own search however we see fit.
>
> I'm happy to take comments on this proposal. Over the next week, I'll
> be reviewing the level of effort it would take to migrate to ASCIIdocs
> and ASCIIbinder, with the goal being to have this in place by end of
> September.
>
> Thoughts?
It's probably worth considering GitBook instead:

  https://www.gitbook.com

Example here:

  http://tutorial.djangogirls.org/en/index.html

Pros:

* Works with Markdown & ASCIIdoc

  No need to convert the existing docs to a new format,
  and the already learned Markdown skills don't need relearning

* Also fully Open Source

  https://github.com/GitbookIO/gitbook/

* Searching works very well

  Try searching on the Django Girls tutorial above for "Python".
  Correct results are returned in small fractions of a second.

* Has well developed plugins to enable things like inline
  videos, interactive exercises (and more)

  https://plugins.gitbook.com

* Can be self hosted, or hosted on the GitBooks infrastructure

* Doesn't require Ruby, unlike ASCIIbinder which is written in it.

Cons:

* It's written in Node.js instead

  Not sure that's any better than Ruby

It seems a better polished solution than docs.openshift.org is using,
and would probably require less effort for the Gluster docs to be
adapted to.

Thoughts? :)

+ Justin
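[Editor's note: for anyone evaluating the GitBook option mentioned above, a GitBook project is driven by a small book.json at the repository root, after which the existing Markdown tree can be rendered to static HTML. The values below are purely illustrative, not an agreed Gluster configuration:]

```json
{
  "title": "GlusterFS Documentation",
  "description": "Illustrative GitBook config only - not a proposed setup",
  "plugins": ["search"]
}
```

With a file like this in place, the (legacy, open-source) GitBook toolchain builds the site from the Markdown sources, which is what makes the "no format conversion needed" point above attractive.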
[Gluster-devel] Justin's last day at Red Hat today ;)
Hi us, It's my last day at Red Hat today, so I've just adjusted the jus...@gluster.org email address to redirect things to jus...@postgresql.org instead. So, people can still email me. I do have some Gluster things I'd like to finish off, it's just I need a bit of a break first. ;) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-infra] NetBSD regressions not being triggered for patches
On 18 Jun 2015, at 16:57, Emmanuel Dreyfus wrote:
> Niels de Vos wrote:
>
>> I'm not sure what limitation you mean. Did we reach the limit of slaves
>> that Jenkins can reasonably address?
>
> No I mean its inability to catch a new DNS record.

Priority wise, my suggestion would be to first get Gerrit and Jenkins
migrated to one of the two new servers (probably put them in separate
VMs). If the DNS problem does turn out to be the dodgy iWeb hardware
firewall, then this fixes the DNS issue. (If not... well damn!)

Assuming that does work :), then getting the other server set up with new
VMs and such would be the next thing to do. That's my thinking anyway.

For reference, these are the main hardware specs for the two boxes:

formicary.gluster.org <-- for Gerrit/Jenkins/whatever

* 2 x Intel Xeon CPU E5-2640 v3 @ 2.60GHz (8 physical cores per CPU)
* 32GB ECC RAM
* 2 x ~560GB SAS HDDs
* 1 x Intel 2P X520/2P I350 rNDC network card
  * Seems to be a 4 port 10GbE card. The mgmt console says 2 ports are
    up, and two down. Guessing this means only two ports are cabled up.

ci.gluster.org <-- for VMs

* 2 x Intel Xeon E5-2650 v3 @ 2.30GHz (10 physical cores per CPU)
* 96GB ECC RAM
* 4 x ~560GB SAS HDDs
* 1 x Intel 2P X520/2P I350 rNDC network card
  * Seems to be a 4 port 10GbE card. The mgmt console says 2 ports are
    up, and two down. Guessing this means only two ports are cabled up.

Hope this is useful info. ;)

+ Justin
Re: [Gluster-devel] [Gluster-infra] NetBSD regressions not being triggered for patches
On 18 Jun 2015, at 09:19, Niels de Vos wrote:
> On Thu, Jun 18, 2015 at 12:57:05AM +0100, Justin Clift wrote:
>> On 17 Jun 2015, at 20:14, Niels de Vos wrote:
>>> On Wed, Jun 17, 2015 at 03:14:31PM +0200, Michael Scherer wrote:
>>>> On Wednesday 17 June 2015 at 11:58 +0100, Justin Clift wrote:
>>>>> On 17 Jun 2015, at 10:53, Michael Scherer wrote:
>>>>>> On Wednesday 17 June 2015 at 11:48 +0200, Michael Scherer wrote:
>>>>>>> On Wednesday 17 June 2015 at 08:20 +0200, Emmanuel Dreyfus wrote:
>>>>>>>> Venky Shankar wrote:
>>>>>>>>
>>>>>>>>> If that's the case, then I'll vote for this even if it takes
>>>>>>>>> some time to get things in workable state.
>>>>>>>>
>>>>>>>> See my other mail about this: you enter a new slave VM in the
>>>>>>>> DNS and it does not resolve, or sometimes you get 20s delays. I
>>>>>>>> am convinced this is the reason why Jenkins bugs.
>>>>>>>
>>>>>>> But cloud.gluster.org is handled by Rackspace, not sure how much
>>>>>>> control we have for it (not sure even where to start there).
>>>>>>
>>>>>> So I cannot change the DNS destination.
>>>>>>
>>>>>> What I can do is to create a new DNS zone, and then we can
>>>>>> delegate as we want. And migrate some slaves and not others, and
>>>>>> see how it goes?
>>>>>>
>>>>>> slaves.gluster.org would be ok for everybody?
>>>>>
>>>>> Try it out, and see if it works. :)
>>>>>
>>>>> On the "scaling the infrastructure" side of things, are the two
>>>>> OSAS servers for Gluster still available?
>>>>
>>>> They are online.
>>>> $ ssh r...@ci.gluster.org uptime
>>>> 09:13:37 up 33 days, 16:34, 0 users, load average: 0,00, 0,01, 0,05
>>>
>>> Can it run some Jenkins Slave VMs too?
>>
>> There are two boxes. A pretty beefy one for running Jenkins slave VMs
>> (probably about 40 VMs simultaneously), and a slightly less beefy one
>> for running Jenkins/Gerrit/whatever.
>
> Good to know, but it would be much more helpful if someone could
> install VMs there and add them to the Jenkins instance... Who can do
> that, or who can guide someone else to get it done?

Misc has the keys. :)

+ Justin
Re: [Gluster-devel] [Gluster-infra] NetBSD regressions not being triggered for patches
On 17 Jun 2015, at 20:14, Niels de Vos wrote:
> On Wed, Jun 17, 2015 at 03:14:31PM +0200, Michael Scherer wrote:
>> On Wednesday 17 June 2015 at 11:58 +0100, Justin Clift wrote:
>>> On 17 Jun 2015, at 10:53, Michael Scherer wrote:
>>>> On Wednesday 17 June 2015 at 11:48 +0200, Michael Scherer wrote:
>>>>> On Wednesday 17 June 2015 at 08:20 +0200, Emmanuel Dreyfus wrote:
>>>>>> Venky Shankar wrote:
>>>>>>
>>>>>>> If that's the case, then I'll vote for this even if it takes some
>>>>>>> time to get things in workable state.
>>>>>>
>>>>>> See my other mail about this: you enter a new slave VM in the DNS
>>>>>> and it does not resolve, or sometimes you get 20s delays. I am
>>>>>> convinced this is the reason why Jenkins bugs.
>>>>>
>>>>> But cloud.gluster.org is handled by Rackspace, not sure how much
>>>>> control we have for it (not sure even where to start there).
>>>>
>>>> So I cannot change the DNS destination.
>>>>
>>>> What I can do is to create a new DNS zone, and then we can delegate
>>>> as we want. And migrate some slaves and not others, and see how it
>>>> goes?
>>>>
>>>> slaves.gluster.org would be ok for everybody?
>>>
>>> Try it out, and see if it works. :)
>>>
>>> On the "scaling the infrastructure" side of things, are the two OSAS
>>> servers for Gluster still available?
>>
>> They are online.
>> $ ssh r...@ci.gluster.org uptime
>> 09:13:37 up 33 days, 16:34, 0 users, load average: 0,00, 0,01, 0,05
>
> Can it run some Jenkins Slave VMs too?

There are two boxes. A pretty beefy one for running Jenkins slave VMs
(probably about 40 VMs simultaneously), and a slightly less beefy one for
running Jenkins/Gerrit/whatever.

+ Justin
[Gluster-devel] Open source-based object storage startup Minio raises $3.3M from Nexus, others
Kind of interesting. This is a new startup by AB, one of the initial Gluster guys: http://www.vccircle.com/news/technology/2015/06/17/open-source-based-object-storage-startup-minio-raises-33m-nexus-others Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-infra] NetBSD regressions not being triggered for patches
On 17 Jun 2015, at 10:53, Michael Scherer wrote:
> On Wednesday 17 June 2015 at 11:48 +0200, Michael Scherer wrote:
>> On Wednesday 17 June 2015 at 08:20 +0200, Emmanuel Dreyfus wrote:
>>> Venky Shankar wrote:
>>>
>>>> If that's the case, then I'll vote for this even if it takes some
>>>> time to get things in workable state.
>>>
>>> See my other mail about this: you enter a new slave VM in the DNS and
>>> it does not resolve, or sometimes you get 20s delays. I am convinced
>>> this is the reason why Jenkins bugs.
>>
>> But cloud.gluster.org is handled by Rackspace, not sure how much
>> control we have for it (not sure even where to start there).
>
> So I cannot change the DNS destination.
>
> What I can do is to create a new DNS zone, and then we can delegate as
> we want. And migrate some slaves and not others, and see how it goes?
>
> slaves.gluster.org would be ok for everybody?

Try it out, and see if it works. :)

On the "scaling the infrastructure" side of things, are the two OSAS
servers for Gluster still available? If so, we should get them online
ASAP, as that will give us ~40 new VMs + get us out of iWeb (which I
suspect is the problem).

Regards and best wishes,

Justin Clift
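[Editor's note: for illustration, the delegation Michael describes amounts to a couple of NS records in the parent gluster.org zone, roughly as below. The nameserver names are invented placeholders, not a real proposal:]

```
; Hypothetical records in the gluster.org zone file, delegating the new
; slaves.gluster.org subdomain to nameservers the project controls:
slaves.gluster.org.    IN  NS  ns1.slaves-dns.example.org.
slaves.gluster.org.    IN  NS  ns2.slaves-dns.example.org.
```

Once delegated, slave records could be added or changed in the new zone without touching the Rackspace-managed cloud.gluster.org zone at all.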
Re: [Gluster-devel] [Gluster-infra] NetBSD regressions not being triggered for patches
On 17 Jun 2015, at 07:29, Kaushal M wrote: > cloud.gluster.org is served by Rackspace Cloud DNS. AFAICT, there is > no readily available option to do zone transfers from it. We might > have to contact the Rackspace support to find out if they can do it as > a special request. Contacting Rackspace support is very easy, and they're normally very responsive. They have an online support ticket submission thing in the Rackspace UI. Often they get back to us with meaningful responses in less than 15-20 minutes. Please go ahead and submit a ticket. :) (Btw - I suspect the DNS issue is likely related to the hardware firewall in the iWeb infrastructure. It's probably acting up. :<). Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Notes on "brick multiplexing" in 4.0
On 17 Jun 2015, at 07:21, Kaushal M wrote:
> One more question. I keep hearing about QoS for volumes as a feature.
> How will we guarantee service quality for all the bricks from a single
> server? Even if we weren't doing QoS, we'd need to make sure that
> operations on one brick don't DoS the others. We already keep hearing
> from users about self-healing causing problems for the clients.

Any idea if there's a clear pattern of network vs disk traffic vs
something else causing that? (Excess network or disk traffic could easily
cause it in theory I guess, but practical data would be useful. :>)

> Self-healing and rebalance running simultaneously on multiple volumes
> in a multiplexed bricks environment would most likely be disastrous.

Not sure how that's different from now, given those operations can
already run simultaneously in the current approach. This is us having a
chance to think this stuff through and work out a solution now. :)

+ Justin
Re: [Gluster-devel] Notes on "brick multiplexing" in 4.0
On 15 Jun 2015, at 20:35, Jeff Darcy wrote:
> I've written up some thoughts about how to have multiple bricks sharing
> a single process/port, since this is necessary to support other 4.0
> features and is likely to be a bit tricky to implement. Comments
> welcome here:
>
> https://goo.gl/27L9I5

Reading through that, it sounds like a well thought out approach.

Did you consider a super-lightweight version first, which only has a
process listening on one port for multiplexing traffic, and then passes
the traffic to individual brick processes running on the server? eg
similar to how common IPv4 NAT works, but for gluster traffic :)

Regards and best wishes,

Justin Clift
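[Editor's note: to make the NAT analogy concrete, the lightweight version suggested above boils down to a routing table keyed on the requested brick, consulted by the single listener before splicing each connection through to a per-brick process. Below is a minimal sketch of just the lookup step; the brick names and port numbers are invented, and this is not how Gluster actually dispatches connections:]

```python
# Hypothetical sketch of the "NAT for gluster traffic" idea: one
# well-known port owns a table mapping brick identifiers to the private
# ports of individual brick processes. All names/ports are invented.

BRICK_PORTS = {
    "server1:/bricks/vol0-a": 49152,
    "server1:/bricks/vol1-a": 49153,
}

def route_brick(brick_id):
    """Return the backend port for a brick, or None if it's unknown."""
    return BRICK_PORTS.get(brick_id)

# A real implementation would then forward the accepted connection to
# this port (much as IPv4 NAT rewrites a packet's destination address);
# only the lookup step is shown here.
if __name__ == "__main__":
    print(route_brick("server1:/bricks/vol0-a"))  # prints 49152
```

The trade-off versus true multiplexing (bricks sharing one process) is that this keeps per-brick process isolation while still presenting a single port to the outside.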
Re: [Gluster-devel] Gluster IPv6 bugfixes (Bug 1117886)
That'd be Awesome. :) + Justin On 15 Jun 2015, at 20:53, Richard Wareing wrote: > Hey Nithin, > > We have IPv6 going as well (v3.4.x & v3.6.x), so I might be able to help out > here and perhaps combine our efforts. We did something similar here, however > we also tackled the NFS side of the house, which required a bunch of changes > due to how port registration w/ portmapper changed in IPv6 vs IPv4. You > effectively have to use "libtirpc" to do all the port registrations with IPv6. > > We can offer up our patches for this work and hopefully things can be > combined such that end-users can simply do "vol set > transport-address-family " and voila they have whatever support > they desire. > > I'll see if we can get this posted to bug 1117886 this week. > > Richard > > > > From: gluster-devel-boun...@gluster.org [gluster-devel-boun...@gluster.org] > on behalf of Nithin Kumar Dabilpuram [nithind1...@yahoo.in] > Sent: Saturday, June 13, 2015 9:12 PM > To: gluster-devel@gluster.org > Subject: [Gluster-devel] Gluster IPv6 bugfixes (Bug 1117886) > > > > > Hi, > > Can I contribute to this bug fix ? I've worked on Gluster IPv6 functionality > bugs in 3.3.2 in my past organization and was able to successfully bring up > gluster on IPv6 link local addresses as well. > > Please find my work in progress patch. I'll raise gerrit review once testing > is done. I was successfully able to create volumes with 3 peers and add > bricks. I'll continue testing other basic functionality and see what needs to > be modified. Any other suggestions ? > > Brief info about the patch: > Here I'm trying to use "transport.address-family" option in > /etc/glusterfs/glusterd.vol file and then propagate the same to server and > client vol files and their translators. 
> > In this way when user mentions "transport.address-family inet6" in its > glusterd.vol file, all glusterd servers open AF_INET6 sockets and then the > same information is stored in glusterd_volinfo and used when generating vol > config files. > > -thanks > Nithin > > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
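[Editor's note: based on Nithin's description, the user-visible side of the patch would presumably be one extra option in /etc/glusterfs/glusterd.vol, roughly as below. The surrounding options are the stock ones from a typical glusterd.vol; treat the exact spelling as unconfirmed until the patch lands in Gerrit:]

```
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket
    # Proposed: make glusterd open AF_INET6 sockets, and propagate the
    # same address family into the generated server/client vol files
    option transport.address-family inet6
end-volume
```

As described in the mail, glusterd would store this in glusterd_volinfo and reuse it when generating volume config files, so bricks and clients follow suit.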
[Gluster-devel] A Year with Go
Potentially relevant to a GlusterD rewrite, since we've mentioned Go as a possibility a few times: https://vagabond.github.io/rants/2015/06/05/a-year-with-go/ https://news.ycombinator.com/item?id=9668302 + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Spurious regression: Checking on 3 test cases
On 26 May 2015, at 16:14, Niels de Vos wrote: > On Tue, May 26, 2015 at 10:17:05AM -0400, Jiffin Thottan wrote: > Testing can probably be done by adding a delay in the mount path. Either > a gdb script or systemtap that delays the execution of the thread that > handles the mount. Sounds like this is a real problem, that could happen in production systems (even if rarely). If that's the case, maybe add it to the test skipping code in run-tests.sh for now, and re-enable it when the Real Fix is found? (for 3.7.1/2?) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-infra] Regression failure in volume-snapshot-clone.t
There are two extra CentOS 6 VMs online for debugging stuff with, but
they're both in use at the moment:

* slave0.cloud.gluster.org
* slave1.cloud.gluster.org

Sachin Pandit, Raghavendra Bhat, and Krutika Dhananjay are using them.
Ping them, and organise with them when you can use them.

I was intending to turn them off today, but it sounds like they should be
left on for a while longer for people to investigate with.

Regards and best wishes,

Justin Clift

On 21 May 2015, at 14:22, Avra Sengupta wrote:
> Hi,
>
> Can I get access to a rackspace VM so that I can debug this particular
> testcase on it?
>
> Regards,
> Avra
>
> Forwarded Message
> Subject: Re: [Gluster-devel] Regression failure in
>          volume-snapshot-clone.t
> Date: Thu, 21 May 2015 17:08:05 +0530
> From: Vijay Bellur
> To: Avra Sengupta, gluster Devel, Atin Mukherjee,
>     Krishnan Parthasarathi, Rajesh Joseph
>
> On 05/21/2015 02:44 PM, Avra Sengupta wrote:
>> Hi,
>>
>> I am not able to reproduce this failure in my set-up. I am aware that
>> Atin was able to do so successfully a few days back, and I tried
>> something similar with the following loop:
>>
>>   for i in {1..100}; do
>>     export DEBUG=1
>>     prove -r ./tests/basic/volume-snapshot-clone.t > out.log
>>     lines=`grep -c "All tests successful" out.log`
>>     if [ "$lines" != "1" ]; then
>>       echo "TESTCASE FAILED. BREAKING"
>>       break
>>     fi
>>   done
>>
>> I have been running this for about an hour and a half, and will
>> continue doing so. But till now I have not encountered a failure.
>> Could anyone please point out if I am missing something obvious here?
>
> Some tests fail more frequently in the rackspace VMs where we run
> regressions. Please drop a note on the gluster-infra ML if you want to
> take one such VM offline from Jenkins and run tests there.
>
> -Vijay
>
> ___
> Gluster-infra mailing list
> gluster-in...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-infra
[Gluster-devel] Gluster Summit recordings
(Didn't see this mentioned elsewhere) The video recordings (using a tablet resting on the desk) for the Gluster Summit sessions in Barcelona are here: https://www.youtube.com/channel/UCngUyL3KPYz8M2n7rDJWU0w Thanks to Spot for providing the tablet for most of them, and uploading them too. :) Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-infra] Downtime for Jenkins
On 17 May 2015, at 13:36, Vijay Bellur wrote:
> On 05/17/2015 02:32 PM, Vijay Bellur wrote:
>> [Adding gluster-devel]
>>
>> On 05/16/2015 11:31 PM, Niels de Vos wrote:
>>> On Sat, May 16, 2015 at 06:32:00PM +0200, Niels de Vos wrote:
>>>> It seems that many failures of the regression tests (at least for
>>>> NetBSD) are caused by failing to reconnect to the slave. Jenkins
>>>> tries to keep a control connection open to the slaves, and
>>>> reconnects when the connection terminates. I do not know why the
>>>> connection is disrupted, but I can see that Jenkins is not able to
>>>> resolve the hostname of the slave.
>>>>
>>>> For example, from (well, you have to find the older logs, Jenkins
>>>> seems to have automatically reconnected)
>>>> http://build.gluster.org/computer/nbslave72.cloud.gluster.org-v2/log :
>>>>
>>>> java.io.IOException: There was a problem while connecting to
>>>> nbslave71.cloud.gluster.org:22
>>>> ...
>>>> Caused by: java.net.UnknownHostException:
>>>> nbslave71.cloud.gluster.org: Name or service not known
>>>>
>>>> The error in the console log of the regression test is less
>>>> helpful, it only states the disconnection failure:
>>>> http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/5408/console
>>>
>>> In fact, this looks very much related to these reports:
>>>
>>> - https://issues.jenkins-ci.org/browse/JENKINS-19619 duplicate of 18879
>>> - https://issues.jenkins-ci.org/browse/JENKINS-18879
>>>
>>> This problem should be fixed in Jenkins 1.524 and newer. Time to
>>> upgrade Jenkins too?
>>
>> Yes, I have started an upgrade. Please expect a downtime for Jenkins
>> during the upgrade.
>>
>> I will update once the activity is complete.
>
> Upgrade to Jenkins v1.613 is now complete and Jenkins seems to be
> largely doing fine. Several Jenkins plugins have also been updated to
> their latest versions. During the course of the upgrade, I noticed that
> we were using the deprecated 'gerrit approve' interface to communicate
> the status of a smoke run. I have changed that to use 'gerrit review',
> and this seems to have addressed the problem of smoke tests not
> reporting status back to gerrit.
>
> There were a few instances of Jenkins not being able to launch slaves
> through ssh, but it was later successful upon automatic retries. We
> will need to watch this behaviour to see if the problem persists and
> gets in the way of normal functioning.
>
> Manu - can you please verify and report back if the NetBSD slaves work
> better with the upgraded Jenkins master?
>
> All - please drop a note on gluster-infra if you happen to notice
> problems with Jenkins.

Good stuff. :)

+ Justin
Re: [Gluster-devel] [Gluster-users] Gluster 3.7.0 released
On 16 May 2015, at 10:07, Niels de Vos wrote:
> The 3.7.0 release should be ready for usual production deployments.

There is no way this is the case. We've done a huge amount of feature
additions with little testing outside our own regression tests. We simply
don't know how it functions in people's environments.

"Feature addition" doesn't mean existing features weren't touched and
adapted to suit... they definitely were. So even existing features have
had code changes.

There is no way this should be run in production environments (or
anywhere people really depend on their data) without extensive testing in
a non-production environment first. If someone doesn't have a
non-production environment to test stuff in first, stick with 3.6.3 (and
the later 3.6.x series) for now, until other people have shaken out the
first "show stopper" 3.7.x bugs.

We *could* be super lucky and have managed to create a defect-free
release purely by sheer awesomeness. But we're more likely to see
unicorns running around In Real Life tomorrow.

Just my opinion... ;)

Regards and best wishes,

Justin Clift
Re: [Gluster-devel] [Gluster-users] Gluster 3.7.0 released
On 14 May 2015, at 10:19, Vijay Bellur wrote:
> Hi All,
>
> I am happy to announce that Gluster 3.7.0 is now generally available.
> 3.7.0 contains several new features and is one of our more feature
> packed releases in recent times. The release notes [1] contain a
> description of new functionality added to 3.7.0. In addition to
> features, 3.7.0 also contains several bug fixes and minor improvements.
> It is highly recommended to test 3.7.0 thoroughly for your use cases
> before deploying in production.
>
> Gluster 3.7.0 can be downloaded from [2]. Upgrade instructions can be
> found at [3]. Packages for various distributions will be available
> shortly at the download site.

3.7.0 won't be packaged into Ubuntu LTS nor CentOS EPEL, will it? (I'm
meaning their official external repos, not download.gluster.org)

If there's any chance they might be, can we get that blocked until later
in the 3.7.x series, so people on 3.6.3 aren't automatically upgraded via
package update?

Regards and best wishes,

Justin Clift
Re: [Gluster-devel] Proposal for improving throughput for regression test
On 8 May 2015, at 18:41, Pranith Kumar Karampuri wrote: >> Break the regression tests into parts that can be run in parallel. >> >> So, instead of the regression testing for a particular CR going from the >> first test to the last in a serial sequence, we break it up into a number >> of chunks (dir based?) and make each of these a task. >> >> That won't reduce the overall number of tests, but it should get the time >> down for the result to be finished. >> >> Caveat: we're going to need more VMs, as once things start queueing up >> it's not going to help. :/ > Raghavendra Talur (CCed) did some work on this earlier by using more Docker > instances on a single VM to get the running time under an hour. Interesting idea. Any idea if this Docker approach could be made to work on CentOS 6 for our existing VMs? + Justin
Re: [Gluster-devel] Regression failure release-3.7 for tests/basic/afr/entry-self-heal.t
On 8 May 2015, at 18:37, Pranith Kumar Karampuri wrote: > On 05/08/2015 10:53 PM, Justin Clift wrote: >> Seems like a new one, so it's been added to the Etherpad. >> >> http://build.gluster.org/job/regression-test-burn-in/23/console > This looks a lot like the data-self-heal.t test, where healing fails to > happen because both threads end up not getting enough locks to perform > the heal in the self-heal domain. Taking blocking locks seems like an easy > solution, but that would decrease self-heal throughput, so Ravi and I are > still thinking about the best way to solve this problem. It will take some > time. I can add this and data-self-heal.t to the bad tests for now, if that > helps. Sure. Do you still need this VM, or can I give it to someone else? :) + Justin
[Gluster-devel] Regression failure release-3.7 for tests/basic/afr/entry-self-heal.t
Seems like a new one, so it's been added to the Etherpad. http://build.gluster.org/job/regression-test-burn-in/23/console It's on a new slave VM (slave1), which has been disconnected in Jenkins so it can be investigated. It's using our standard Jenkins auth. + Justin
Re: [Gluster-devel] Proposal for improving throughput for regression test
On 8 May 2015, at 17:37, Atin Mukherjee wrote: On 05/08/2015 08:54 PM, Justin Clift wrote: >> On 8 May 2015, at 10:02, Mohammed Rafi K C wrote: >>> Hi All, >>> >>> As we all know, our regression tests are killing us. On average, one >>> regression run takes approximately two and a half hours to complete. >>> So I guess this is the right time to think about enhancing our >>> regression. >>> >>> Proposal 1: >>> >>> Create a new option for the daemons to specify that they are running in >>> test mode, so we can skip the fsync calls used for data durability. >>> >>> Proposal 2: >>> >>> Use IP addresses instead of host names, because it takes a good amount >>> of time to resolve host names, and that sometimes causes spurious >>> failures. >>> >>> Proposal 3: >>> Each component has a lot of .t files and there is redundancy in tests. >>> We can rework them into a smaller set of tests that covers unit testing >>> for a component, and run full regression runs once a day (nightly). >>> >>> Please provide your inputs on the proposed ideas, and feel free to add >>> new ones. >> >> Proposal 4: >> >> Break the regression tests into parts that can be run in parallel. >> >> So, instead of the regression testing for a particular CR going from the >> first test to the last in a serial sequence, we break it up into a number >> of chunks (dir based?) and make each of these a task. >> >> That won't reduce the overall number of tests, but it should get the time >> down for the result to be finished. >> >> Caveat: we're going to need more VMs, as once things start queueing up >> it's not going to help. :/ > This could be really effective and I've been thinking about it for quite > a long time :) Yeah, the idea wasn't first thought of by me. This seems like a good place to suggest it though. 
:) + Justin
Re: [Gluster-devel] gluster crashes in dht_getxattr_cbk() due to null pointer dereference.
Thanks Paul. That's for an ancient series of GlusterFS (3.4.x) we're not really looking to release further updates for. If that's the version you guys are running in your production environment, have you looked into moving to a newer release series? + Justin On 8 May 2015, at 10:55, Paul Guo wrote: > Hi, > > gdb debugging shows the root cause seems to be quite straightforward. The > gluster version is 3.4.5 and the stack: > > #0 0x7eff735fe354 in dht_getxattr_cbk (frame=0x7eff775b6360, > cookie=, this=, op_ret=<value optimized out>, op_errno=0, >xattr=, xdata=0x0) at dht-common.c:2043 > 2043 DHT_STACK_UNWIND (getxattr, frame, local->op_ret, > op_errno, > Missing separate debuginfos, use: debuginfo-install > glibc-2.12-1.80.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64 > krb5-libs-1.9-33.el6.x86_64 libcom_err-1.41.12-12.el6.x86_64 > libgcc-4.4.6-4.el6.x86_64 libselinux-2.0.94-5.3.el6.x86_64 > openssl-1.0.1e-16.el6_5.14.x86_64 zlib-1.2.3-27.el6.x86_64 > (gdb) bt > #0 0x7eff735fe354 in dht_getxattr_cbk (frame=0x7eff775b6360, > cookie=, this=, op_ret=<value optimized out>, op_errno=0, >xattr=, xdata=0x0) at dht-common.c:2043 > #1 0x7eff7383c168 in afr_getxattr_cbk (frame=0x7eff7756ab58, > cookie=, this=, op_ret=0, > op_errno=0, dict=0x7eff76f21dc8, xdata=0x0) >at afr-inode-read.c:618 > #2 0x7eff73d8 in client3_3_getxattr_cbk (req=, > iov=, count=, > myframe=0x7eff77554d4c) at client-rpc-fops.c:1115 > #3 0x003de700d6f5 in rpc_clnt_handle_reply (clnt=0xc36ad0, > pollin=0x14b21560) at rpc-clnt.c:771 > #4 0x003de700ec6f in rpc_clnt_notify (trans=, > mydata=0xc36b00, event=, data=) at > rpc-clnt.c:891 > #5 0x003de700a4e8 in rpc_transport_notify (this=, > event=, data=) at > rpc-transport.c:497 > #6 0x7eff74af6216 in socket_event_poll_in (this=0xc46530) at > socket.c:2118 > #7 0x7eff74af7c3d in socket_event_handler (fd=, > idx=, data=0xc46530, poll_in=1, poll_out=0, poll_err=0) > at socket.c:2230 > #8 0x003de785e907 in event_dispatch_epoll_handler (event_pool=0xb70e90) > at 
event-epoll.c:384 > #9 event_dispatch_epoll (event_pool=0xb70e90) at event-epoll.c:445 > #10 0x00406818 in main (argc=4, argv=0x7fff24878238) at > glusterfsd.c:1934 > > See dht_getxattr_cbk() (below). When frame->local is equal to 0, gluster > jumps to the label "out" where, when it accesses local->xattr (i.e. 0->xattr), > it crashes. Note in DHT_STACK_UNWIND()->STACK_UNWIND_STRICT(), fn looks fine. > > (gdb) p __local > $11 = (dht_local_t *) 0x0 > (gdb) p frame->local > $12 = (void *) 0x0 > (gdb) p fn > $1 = (fop_getxattr_cbk_t) 0x7eff7298c940 > > I did not read the dht code much so I have no idea whether a zero frame->local > is normal or not, but from the code's perspective this is an obvious bug and > it still exists in the latest glusterfs workspace. > > The following code change is a simple fix, but maybe there's a better one. > -if (is_last_call (this_call_cnt)) { > +if (is_last_call (this_call_cnt) && local != NULL) { > > Similar issues exist in other functions also, e.g. stripe_getxattr_cbk() (I > did not check all the code). > > int > dht_getxattr_cbk (call_frame_t *frame, void *cookie, xlator_t *this, > int op_ret, int op_errno, dict_t *xattr, dict_t *xdata) > { >int this_call_cnt = 0; >dht_local_t *local = NULL; > >VALIDATE_OR_GOTO (frame, out); >VALIDATE_OR_GOTO (frame->local, out); > >.. > > out: >if (is_last_call (this_call_cnt)) { >DHT_STACK_UNWIND (getxattr, frame, local->op_ret, op_errno, > local->xattr, NULL); >} >return 0; > }
Re: [Gluster-devel] Proposal for improving throughput for regression test
On 8 May 2015, at 10:02, Mohammed Rafi K C wrote: > Hi All, > > As we all know, our regression tests are killing us. On average, one > regression run takes approximately two and a half hours to complete. > So I guess this is the right time to think about enhancing our > regression. > > Proposal 1: > > Create a new option for the daemons to specify that they are running in > test mode, so we can skip the fsync calls used for data durability. > > Proposal 2: > > Use IP addresses instead of host names, because it takes a good amount > of time to resolve host names, and that sometimes causes spurious > failures. > > Proposal 3: > Each component has a lot of .t files and there is redundancy in tests. > We can rework them into a smaller set of tests that covers unit testing > for a component, and run full regression runs once a day (nightly). > > Please provide your inputs on the proposed ideas, and feel free to add > new ones. Proposal 4: Break the regression tests into parts that can be run in parallel. So, instead of the regression testing for a particular CR going from the first test to the last in a serial sequence, we break it up into a number of chunks (dir based?) and make each of these a task. That won't reduce the overall number of tests, but it should get the time down for the result to be finished. Caveat: we're going to need more VMs, as once things start queueing up it's not going to help. :/ + Justin
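Proposal 4 could be prototyped with nothing more than xargs -P. A toy sketch of the idea, not the real test harness: the "chunks" here are fake directories standing in for per-directory groups of .t files, and the runner is a stub shell script.

```shell
# Fake four test "chunks", then run them with up to 4 parallel jobs,
# the way regression test dirs could be farmed out as separate tasks.
tmp=$(mktemp -d)
for d in chunk1 chunk2 chunk3 chunk4; do
    mkdir -p "$tmp/$d"
    printf 'echo ran %s\n' "$d" > "$tmp/$d/run.sh"
done
# -P 4 keeps up to four chunk jobs running concurrently.
result=$(ls -d "$tmp"/*/ | xargs -P 4 -I{} sh {}run.sh)
echo "$result"
rm -rf "$tmp"
```

With a real harness, each chunk would invoke prove on its directory and report its own pass/fail status back to Jenkins.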
Re: [Gluster-devel] Proposal for improving throughput for regression test
On 8 May 2015, at 16:19, Jeff Darcy wrote: >> Proposal 2: >> >> Use ip address instead of host name, because it takes some good amount >> of time to resolve from host name, and even some times causes spurious >> failure. > > If resolution is taking a long time, that's probably fixable in the > test machine configuration. Reading a few lines from /etc/hosts should > take only a trivial amount of time. Ahhh, I didn't know that DNS resolution was a problem. Yeah, we can hard-code entries in /etc/hosts on the slave machines if that would help. (This is easy to do.) + Justin
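For reference, this is the sort of thing that could be dropped into /etc/hosts on the slaves. The hostnames and addresses below are made-up examples, and the sketch writes to a scratch file rather than the real /etc/hosts so it is safe to try:

```shell
hosts=$(mktemp)
# Pin the build master and slave hostnames to fixed addresses so the
# regression tests never wait on (or get spurious failures from) DNS.
cat >> "$hosts" <<'EOF'
192.0.2.10  build.gluster.org
192.0.2.20  slave20.cloud.gluster.org
192.0.2.21  slave21.cloud.gluster.org
EOF
grep -c 'gluster.org' "$hosts"
```

On the real slaves the same lines (with the actual IPs) would simply be appended to /etc/hosts by whoever has root.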
Re: [Gluster-devel] good job on fixing heavy hitters in spurious regressions
On 8 May 2015, at 04:15, Pranith Kumar Karampuri wrote: > 2) If the same test fails on different patches more than 'x' number of times > we should do something drastic. Let us decide on 'x' and what the drastic > measure is. Sure. That number is 0. If it fails more than 0 times on different patches, we have a problem that needs resolving as an immediate priority. > Some good things I found this time around compared to the 3.6.0 release: > 1) Failing the regression on the first failure is helping locate the failure > logs really fast > 2) More people chipped in fixing tests that are not at all their > responsibility, which is always great to see. Cool. :) + Justin
Re: [Gluster-devel] good job on fixing heavy hitters in spurious regressions
On 8 May 2015, at 13:16, Jeff Darcy wrote: > Perhaps the change that's needed > is to make the fixing of likely-spurious test failures a higher > priority than adding new features. YES! A million times Yes. We need to move this project to operating with _0 regression failures_ as the normal state of things for master and release branches. Regression failures for CR's in development... sure, that's a normal part of development. But any time a regression failure happens in _master_ or a release branch should be a case of _get this fixed pronto_. + Justin
Re: [Gluster-devel] Upstream master and 3.7 branch build broken
On 8 May 2015, at 15:52, Shyam wrote: > Shyam > P.S: Sending this to the devel list for those not looking at the IRC ATM Excellent, thanks. :) + Justin
Re: [Gluster-devel] A HowTo for setting up network encryption with GlusterFS
On 7 May 2015, at 16:53, Kaushal M wrote: > Forgot the link. :D > > [1]: https://kshlm.in/network-encryption-in-glusterfs/ Any interest in copying that onto the main wiki? :) + Justin
[Gluster-devel] Jupyter notebook support on GitHub
This could be really useful for us: https://github.com/blog/1995-github-jupyter-notebooks-3 GitHub now supports Jupyter notebooks directly. Just as Markdown (.md) files are displayed in their rendered format, Jupyter notebook (.ipynb) files now are too. This should make for better docs for us, as we can illustrate intro material and other technical concepts with graphics now instead of just ASCII art. :) + Justin
[Gluster-devel] Fwd: [sqlite] SQLite 3.8.10 enters testing
Fuzzy testing has been added to SQLite's standard testing strategy. Wonder if it'd be useful for us too... ? + Justin Begin forwarded message: > From: Simon Slavin > Subject: Re: [sqlite] SQLite 3.8.10 enters testing > Date: 4 May 2015 22:03:59 BST > To: General Discussion of SQLite Database > > Reply-To: General Discussion of SQLite Database > > > > On 4 May 2015, at 8:23pm, Richard Hipp wrote: > >> A list of changes (still being revised and updated) is at >> (https://www.sqlite.org/draft/releaselog/3_8_10.html). > > "Because of its past success, AFL became a standard part of the testing > strategy for SQLite beginning with version 3.8.10. There is at least one > instance of AFL running against SQLite continuously, 24/7/365, trying new > randomly mutated inputs against SQLite at a rate of a few hundred to a few > thousand per second. Billions of inputs have been tried, but AFL's > instrumentation has narrowed them down to less than 20,000 test cases that > cover all distinct behaviors. Newly discovered test cases are periodically > captured and added to the TCL test suite." > > Heh. Mister Zalewski can be proud. > > Simon. > ___ > sqlite-users mailing list > sqlite-us...@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [Gluster-devel] Regression test failures - Call for Action
On 5 May 2015, at 03:40, Jeff Darcy wrote: Jeff's patch failed again with the same problem: >> http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/4531/console > > Wouldn't have expected anything different. This one looks like a > problem in the Jenkins/Gerrit infrastructure. This kind of error message at the end of a failure log indicates the VM has self-disconnected from Jenkins and needs rebooting. Haven't found any other way to fix it. :/ Happens with both CentOS and NetBSD regression runs. [...] ^ FATAL: Unable to delete script file /var/tmp/hudson8377790745169807524.sh hudson.util.IOException2: remote file operation failed: /var/tmp/hudson8377790745169807524.sh at hudson.remoting.Channel@2bae0315:nbslave72.cloud.gluster.org at hudson.FilePath.act(FilePath.java:900) at hudson.FilePath.act(FilePath.java:877) at hudson.FilePath.delete(FilePath.java:1262) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60) [...] + Justin
Re: [Gluster-devel] Regression test failures - Call for Action
On 4 May 2015, at 08:06, Vijay Bellur wrote: > Hi All, > > There has been a spate of regression test failures (due to broken tests or > race conditions showing up) in the recent past [1] and I am inclined to block > 3.7.0 GA along with acceptance of patches until we fix *all* regression test > failures. We seem to have reached a point where this seems to be the only way > to restore sanity to our regression runs. > > I plan to put this into effect 24 hours from now i.e. around 0700 UTC on > 05/05. Thoughts? Please do this. :) + Justin > Thanks, > Vijay > > [1] https://public.pad.fsfe.org/p/gluster-spurious-failures
Re: [Gluster-devel] netbsd regression logs
On 1 May 2015, at 16:08, Emmanuel Dreyfus wrote: > Pranith Kumar Karampuri wrote: > >> I was not able to re-create the glupy failure. I see that netbsd >> is not archiving logs like the linux regression. Do you mind adding that? >> I think Kaushal and Vijay did this for the Linux regressions, so CC them. > > They are archived, in /archives/logs/ on the regression VMs. It's just > that you have to get them through sftp. Is it easy to add web access for them? (eg nginx or whatever) We have the nginx rule for the CentOS ones around somewhere if it'd help. + Justin > -- > Emmanuel Dreyfus > http://hcpnet.free.fr/pubz > m...@netbsd.org
Re: [Gluster-devel] Configuration Error during gerrit login
I'm hoping this is mostly due to bugs in the older version of Gerrit + GitHub plugin we're using. We'll upgrade in a few weeks, and see how it goes then... ;) + Justin On 1 May 2015, at 03:38, Gaurav Garg wrote: > Hi, > > I was also having the same problems many times; I fixed it the following way: > > > 1. Go to https://github.com/settings/applications and revoke the > authorization for 'Gerrit Instance for Gluster Community' > 2. Clean up all cookies for github and review.gluster.org > 3. Go to https://review.gluster.org/ and sign in again. You'll be asked to > sign in to Github again and provide authorization > > > - Original Message - > From: "Vijay Bellur" > To: "Gluster Devel" > Sent: Friday, May 1, 2015 12:31:38 AM > Subject: [Gluster-devel] Configuration Error during gerrit login > > Ran into "Configuration Error" several times today. The error message > states: > > "The HTTP server did not provide the username in the GITHUB_USER header > when it forwarded the request to Gerrit Code Review..." > > Switching browsers was useful for me to overcome the problem. Annoying > for sure, but we seem to have a workaround :). > > HTH, > Vijay
Re: [Gluster-devel] REMINDER: Weekly Gluster Community meeting is in 30 mins!
On 29 Apr 2015, at 12:30, Justin Clift wrote: > Reminder!!! > > The weekly Gluster Community meeting is in 30 mins, in > #gluster-meeting on IRC. > > This is a completely public meeting, everyone is encouraged > to attend and be a part of it. :) Thanks to everyone for attending! * 3.6.3 has been released and announced (thanks raghu!) 3.6.4beta1 will be available in a week or so. * 3.7.0beta1 (tarball) has been released. We're still working on packages for it. ;) This will go into Fedora Rawhide too. * 3.5.4beta1 will likely be ready by the start of next week. * Tigert is working on some new GlusterFS website layout ideas. Preview here: https://glusternew-tigert.rhcloud.com He'll start a mailing list thread about it shortly. Meeting log: https://meetbot.fedoraproject.org/gluster-meeting/2015-04-29/gluster-meeting.2015-04-29-12.01.html Regards and best wishes, Justin Clift
[Gluster-devel] REMINDER: Weekly Gluster Community meeting is in 30 mins!
Reminder!!! The weekly Gluster Community meeting is in 30 mins, in #gluster-meeting on IRC. This is a completely public meeting, everyone is encouraged to attend and be a part of it. :) To add agenda items, just add them to the main text of the Etherpad, and be at the meeting. :) https://public.pad.fsfe.org/p/gluster-community-meetings Regards and best wishes, Justin Clift
Re: [Gluster-devel] This looks like an interesting Gerrit option we could turn on
On 29 Apr 2015, at 08:05, Niels de Vos wrote: > On Wed, Apr 29, 2015 at 02:40:54AM -0400, Jeff Darcy wrote: label.Label-Name.copyAllScoresOnTrivialRebase If true, all scores for the label are copied forward when a new patch set is uploaded that is a trivial rebase. A new patch set is considered as trivial rebase if the commit message is the same as in the previous patch set and if it has the same code delta as the previous patch set. This is the case if the change was rebased onto a different parent. This can be used to enable sticky approvals, reducing turn-around for trivial rebases prior to submitting a change. Defaults to false. >> >> "Same code delta" is a bit slippery. It can't be determined from the >> patch itself, because at least line numbers and diff context will have >> changed and would need to be ignored to say something's the same. I >> think forwarding scores is valuable enough that I'm in favor of turning >> this option on, but we should maintain awareness that scores might get >> forwarded in some cases where perhaps they shouldn't. > > Indeed, and I would be in favour of copying the scores after a rebase > for Code-Review only. We should still have Jenkins run the regression > tests so that in the (rare) event a patch gets incorrectly applied, > either building fails, or the change in behaviour gets detected. Yeah. Reading that section of the Gerrit manual more, it seems to be an option that gets turned on "per label". So, we can turn it on for the Code Review label, but leave it off for the Verified label. That should make sure that even "trivial rebases" get retested by the smoke and regression tests. + Justin
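If we go ahead, the per-label setup in the project.config on refs/meta/config would presumably look something like this. A sketch based on the Gerrit 2.9 config-labels docs, with the option on for Code-Review only, as discussed:

```
[label "Code-Review"]
    copyAllScoresOnTrivialRebase = true
[label "Verified"]
    copyAllScoresOnTrivialRebase = false
```

With that split, a trivial rebase keeps its +1/+2 reviews but loses its Verified score, so Jenkins has to re-run smoke and regression before the change can be submitted.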
[Gluster-devel] This looks like an interesting Gerrit option we could turn on
This sounds like it might be useful for us: https://gerrit-documentation.storage.googleapis.com/Documentation/2.9.4/config-labels.html#label_copyAllScoresOnTrivialRebase Yes/no/? + Justin
Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd
Does this mean we're officially no longer supporting 32-bit architectures? (Or is that just on x86?) + Justin On 28 Apr 2015, at 12:45, Kaushal M wrote: > Found the problem. The NetBSD slaves are running a 32-bit kernel and > userspace. > ``` > nbslave7a# uname -p > i386 > ``` > > Because of this CAA_BITS_PER_LONG is set to 32 and the case for size 8 > isn't compiled in uatomic_add_return. Even though the underlying > (virtual) hardware has 64-bit support, and supports the required > 8-byte wide instruction, it cannot be used because we are running on a > 32-bit kernel with a 32-bit userspace. > > Manu, was there any particular reason why you chose 32-bit NetBSD? If there > are none, can you please replace the VMs with 64-bit NetBSD. Until > then you can keep mgmt_v3-locks.t disabled. > > ~kaushal > > On Tue, Apr 28, 2015 at 4:56 PM, Kaushal M wrote: >> I seem to have found the issue. >> >> The uatomic_add_return function is defined in urcu/uatomic.h as >> ``` >> /* uatomic_add_return */ >> >> static inline __attribute__((always_inline)) >> unsigned long __uatomic_add_return(void *addr, unsigned long val, >>int len) >> { >> switch (len) { >> case 1: >> { >> unsigned char result = val; >> >> __asm__ __volatile__( >> "lock; xaddb %1, %0" >> : "+m"(*__hp(addr)), "+q" (result) >> : >> : "memory"); >> return result + (unsigned char)val; >> } >> case 2: >> { >> unsigned short result = val; >> >> __asm__ __volatile__( >> "lock; xaddw %1, %0" >> : "+m"(*__hp(addr)), "+r" (result) >> : >> : "memory"); >> return result + (unsigned short)val; >> } >> case 4: >> { >> unsigned int result = val; >> >> __asm__ __volatile__( >> "lock; xaddl %1, %0" >> : "+m"(*__hp(addr)), "+r" (result) >> : >> : "memory"); >> return result + (unsigned int)val; >> } >> #if (CAA_BITS_PER_LONG == 64) >> case 8: >> { >> unsigned long result = val; >> >> __asm__ __volatile__( >> "lock; xaddq %1, %0" >> : "+m"(*__hp(addr)), "+r" (result) >> : >> : "memory"); >> return result + (unsigned long)val; >> } >> 
#endif >> } >> /* >>* generate an illegal instruction. Cannot catch this with >>* linker tricks when optimizations are disabled. >>*/ >> __asm__ __volatile__("ud2"); >> return 0; >> } >> ``` >> >> As we can see, uatomic_add_return uses different assembly instructions >> to perform the add based on the size of the datatype of the value. If >> the size of the value doesn't exactly match one of the sizes in the >> switch case, it deliberately generates a SIGILL. >> >> The case for size 8, is conditionally compiled as we can see above. >> From the backtrace Atin provided earlier, we see that the size of the >> value is indeed 8 (we use uint64_t). Because we had a SIGILL, we can >> conclude that the case for size 8 wasn't compiled. >> >> I don't know why this compilation didn't (or as this is in a header >> file, doesn't) happen on the NetBSD slaves and this is something I'd >> like to find out. >> >> ~kaushal >> >> On Tue, Apr 28, 2015 at 1:50 PM, Anand Nekkunti wrote: >>> >>> On 04/28/2015 01:40 PM, Emmanuel Dreyfus wrote: On Tue, Apr 28, 2015 at 01:37:42PM +0530, Anand Nekkunti wrote: > >__asm__ is for to write assembly code in c (gcc), > __volatile__(:::) > compiler level barrier to force the compiler not to do reorder the > instructions(to avoid optimization ) . Sure, but the gory details should be of no interest to the developer engaged in debug: if it crashes this is probably because it is called with wrong arguments, hence the question: ccing gluster-devel >> >> new_peer->generation = uatomic_add_return (&conf->generation, >> 1); >> Are new_peer->generation and conf->generation sane? 
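Two quick checks distinguish the situation described above, where the virtual hardware is 64-bit capable but the installed kernel and userland are 32-bit (the exact output will of course vary per machine, and NetBSD used uname -p in the thread where Linux typically uses uname -m):

```shell
# Machine/processor architecture as the kernel reports it,
# e.g. "i386" on the 32-bit NetBSD slaves vs "x86_64"/"amd64" on 64-bit.
uname -m
# Width of a C long in the userland, which is what liburcu's
# CAA_BITS_PER_LONG follows; "32" here means the 8-byte uatomic case
# is compiled out and a 64-bit uatomic_add_return will hit the ud2 trap.
getconf LONG_BIT
```

Running both catches the mismatched case: 64-bit capable hardware (or even a 64-bit kernel) with a 32-bit userland still reports LONG_BIT as 32, and that is the value that matters for this crash.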
[Gluster-devel] Just did a bunch of Gerrit account merging... pls let me know if anything goes wrong for you
Some people are still having trouble logging into Gerrit, so I've just gone through and cleaned up some duplicate entries, old data, and similar. If Gerrit suddenly starts misbehaving for you now, please let me know. (In theory it shouldn't... but I don't trust "theory" in this at all ;>) Regards and best wishes, Justin Clift
Re: [Gluster-devel] Hung regression jobs
On 23 Apr 2015, at 01:18, Jeff Darcy wrote: > I just had to clean up a couple of these - 7327 and 7331. Fortunately, > they both seem to have gone on their merry way instead of dying. Both > were in the pre-mount stage of their setup, but did have mounts active > and gsyncd processes running (in one case multiple of them). I suspect > that this is related to the fact that the new geo-rep tests call "exit" > directly instead of returning errors (see geo-rep-helpers.c:192) and > don't use bash's "trap ... EXIT" functionality to ensure proper cleanup. > Thus, whatever was mounted or running when they failed will remain > mounted or running to trip up the next test. > > If one of your regression jobs seems to be hung, either log in to the > slave machine yourself or contact someone who can, so the offending > mounts/processes can be unmounted/killed. Ahhh yeah, this makes sense. The scripting in Jenkins for launching regression tests should probably be tweaked to also kill any left-over geo-rep stuff. I'm focused elsewhere atm, so won't be looking at this myself. But anyone with a Jenkins login is able to. Just muck around with the script here to add geo-rep bits: http://build.gluster.org/job/rackspace-regression-2GB-triggered/ (remember to comment any changes, for traceability) :) + Justin
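[Editor's note: the `trap ... EXIT` approach Jeff describes can be sketched as below. This is a hypothetical illustration, not the actual geo-rep test code; it just shows why trap-based cleanup survives an early `exit` while manual cleanup does not.]

```shell
# Hypothetical sketch: register cleanup with "trap ... EXIT" so it runs
# even when a helper calls "exit" early. In the real geo-rep tests the
# cleanup would unmount volumes and kill leftover gsyncd processes;
# here it just removes a temp dir.

workdir=$(mktemp -d)

(
  # The subshell stands in for one test run.
  trap 'rm -rf "$workdir"' EXIT   # cleanup fires on ANY exit path
  touch "$workdir/mounted"        # pretend something was set up
  exit 1                          # a helper bails out early
)

# Despite the early exit, nothing is left behind to trip up the next test.
[ -d "$workdir" ] && echo "leaked: $workdir" || echo "cleaned up"
```

Running this prints "cleaned up"; delete the `trap` line and it reports the leaked directory instead, which is exactly the stale-mount situation described above.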
Re: [Gluster-devel] Should we try alternating our Weekly Community meeting times?
On 23 Apr 2015, at 05:47, Joe Julian wrote: > I suggested it. Some other people in North America besides just myself > expressed an interest in being involved, but could not make early (or very > early) morning meetings. Since the globe has this cool spherical feature I > thought it might be a good idea to try to get involvement from the dark side. > :-) Would you be ok to chair the first meeting of the Dark Side of Gluster Community Meetings? :) + Justin
Re: [Gluster-devel] auto retry of failed test fails again !!
On 22 Apr 2015, at 15:39, Jeff Darcy wrote: >> As we know, we have a patch from Manu which re-triggers a given failed >> test. The idea was to reduce the burden of re-triggering the regression, >> but I've been noticing it is failing in 2nd attempt as well and I've >> seen this happening multiple times for patch [1]. I am not sure whether >> I am damn unlucky or we have a real problem here. >> >> Any thoughts? > > Many of the most common spurious failures seem timing-related. Since > the timing on a particular node is unlikely to change between the first > failure and the retry, neither is the result. Running the retries on > another node might work, but would be very complex to implement. For > the time being, I think our best bet is for the tests identified in > this patch to be excluded/ignored on NetBSD as well: > > http://review.gluster.org/#/c/10322/ > > That might significantly cut down on the false negatives. When tests > still fail, we're stuck with re-triggering manually. Jenkins has some kind of API (haven't looked at it), so we might be able to do something with the API to automatically add the failed CR to the regression queue again. That would have a reasonable chance of running it on a different node. + Justin
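[Editor's note: Jenkins' remote-access API does allow queueing a build with an HTTP POST, which is roughly what an auto-requeue hook would need. The sketch below only prints the command it would run; the job name comes from our existing regression job, but the parameter names are assumptions for illustration, not a tested recipe.]

```shell
# Hypothetical sketch: requeue a failed change on the regression job via
# Jenkins' remote API. This prints the curl command instead of running it,
# so nothing is actually triggered. Parameter names are assumed.
retrigger_cmd() {
    local change=$1 patchset=$2
    echo "curl -X POST" \
        "http://build.gluster.org/job/rackspace-regression-2GB-triggered/buildWithParameters" \
        "--data CHANGE=${change}" \
        "--data PATCHSET=${patchset}"
}

retrigger_cmd 10322 1
```

A hook listening for failed runs could call something like this, and since Jenkins picks any free slave for a queued build, the retry would have a decent chance of landing on a different node.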
[Gluster-devel] geo-rep regression tests take *ages*?
Just noticed something a bit weird on the regression tests for CentOS 6.x:

[13:28:44] ./tests/features/weighted-rebalance.t ... ok 23 s
[13:46:50] ./tests/geo-rep/georep-rsync-changelog.t ok 1086 s
[14:06:53] ./tests/geo-rep/georep-rsync-hybrid.t ... ok 1203 s
[14:08:36] ./tests/geo-rep/georep-setup.t .. ok 103 s
[14:26:35] ./tests/geo-rep/georep-tarssh-changelog.t ... ok 1079 s

That's on: http://build.gluster.org/job/rackspace-regression-2GB-triggered/7285/console Those 3x 1000+ second regression tests are adding 56+ minutes to the total regression test time. That's not how it should be, is it? Regards and best wishes, Justin Clift
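[Editor's note: the "56+ minutes" figure checks out; summing the three ~1000-second geo-rep runs from the console log above:]

```shell
# Sum the three long geo-rep test durations reported in the console log.
total=$((1086 + 1203 + 1079))
echo "${total} seconds, i.e. $((total / 60))+ minutes"
```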
[Gluster-devel] Should we try alternating our Weekly Community meeting times?
Today's Weekly Community Meeting had an interesting suggestion: There are members of the other hemisphere that would like to be active in the community but cannot attend meetings at this hour. I propose alternating meetings by 12 hours, 0:00 and 12:00 UTC. It's a decent suggestion, and we're definitely willing to try it out if we know people will attend the "other" meeting. :) Do we have volunteers to be at a 0:00 UTC meeting for Gluster? We also need someone to volunteer to be the meeting chair to run it (at least the first time). Who's up for it? :) Regards and best wishes, Justin Clift
Re: [Gluster-devel] REMINDER: Weekly Gluster Community meeting is in one hour!
On 22 Apr 2015, at 11:59, Justin Clift wrote: > Reminder!!! > > The weekly Gluster Community meeting is in 30 mins, in > #gluster-meeting on IRC. > > This is a completely public meeting, everyone is encouraged > to attend and be a part of it. :) Thanks everyone who attended. Quite a few attendees and we covered lots of useful stuff. :) * GlusterFS 3.6.3 should be released in the next few days. Yay! :) * GlusterFS 3.7.0beta1 *and* 3.6.4beta1 should be released by the end of this week. * GlusterFS 3.5.4beta1 should be released fairly soon. Hoping for the end of this week too, but we'll see. Meeting logs: https://meetbot.fedoraproject.org/gluster-meeting/2015-04-22/gluster-meeting.2015-04-22-12.01.html Thanks to everyone who attended + participated. :) Regards and best wishes, Justin Clift
[Gluster-devel] REMINDER: Weekly Gluster Community meeting is in one hour!
Reminder!!! The weekly Gluster Community meeting is in 30 mins, in #gluster-meeting on IRC. This is a completely public meeting, everyone is encouraged to attend and be a part of it. :) To add Agenda items: just add them to the main text of the Etherpad, and be at the meeting. :) https://public.pad.fsfe.org/p/gluster-community-meetings Regards and best wishes, Justin Clift
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
On 22 Apr 2015, at 09:24, Anoop C S wrote: > On 04/22/2015 12:46 PM, Justin Clift wrote: >> On 22 Apr 2015, at 07:42, Justin Clift wrote: >>> On 20 Apr 2015, at 04:43, Aravinda wrote: >>>> Is it not possible to view the patches if not logged in? I think public >>>> access(read only) need to be enabled. >>> >>> It *does* seem to be possible after all. :) >>> >>> Our test instance for Gerrit (http://newgerritv2.cloud.gluster.org) is now >>> running the very latest release of Gerrit + the GitHub auth plugin, and >>> that allows anonymous read access. >> >> It turns out the settings to make this work were already present in >> the version of Gerrit we're using... so they've just been turned on. >> >> Anonymous read-only access should now be working. :) >> >> Signing out should now be working properly too. (yay) > > Anonymous read-only access and sign out works fine now.. :) Awesome. :)
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
On 22 Apr 2015, at 07:42, Justin Clift wrote: > On 20 Apr 2015, at 04:43, Aravinda wrote: >> Is it not possible to view the patches if not logged in? I think public >> access(read only) need to be enabled. > > It *does* seem to be possible after all. :) > > Our test instance for Gerrit (http://newgerritv2.cloud.gluster.org) is now > running the very latest release of Gerrit + the GitHub auth plugin, and > that allows anonymous read access. It turns out the settings to make this work were already present in the version of Gerrit we're using... so they've just been turned on. Anonymous read-only access should now be working. :) Signing out should now be working properly too. (yay) + Justin
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
On 20 Apr 2015, at 04:43, Aravinda wrote: > Is it not possible to view the patches if not logged in? I think public > access(read only) need to be enabled. It *does* seem to be possible after all. :) Our test instance for Gerrit (http://newgerritv2.cloud.gluster.org) is now running the very latest release of Gerrit + the GitHub auth plugin, and that allows anonymous read access. So, we might be upgrading shortly. ;) Regards and best wishes, Justin Clift
[Gluster-devel] How to fix GITHUB_USER error when signing in
We're still tweaking settings with the new GitHub Authentication for Gerrit (and looking for a way to enable anonymous read access). If you get this error when trying to sign in, it means you need to clear your browser cookies: The HTTP server did not provide the username in the GITHUB_USER header when it forwarded the request to Gerrit Code Review. I suspect it's something to do with OAuth settings being cached in the cookie, which become invalid any time they're changed server side. So... once we get the bits all figured out it shouldn't happen again. *fingers crossed* + Justin
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
On 21 Apr 2015, at 09:44, Vijay Bellur wrote: > gerrit-trigger plugin needs "Event Streaming" capability to run with Gerrit > 2.7+. This has been added now. > > All tests (smoke, NetBSD tests) should be functional now. Please let us know > if you notice anything amiss. Awesome. :) + Justin
Re: [Gluster-devel] regressions on release-3.7 ?
On 20 Apr 2015, at 20:02, Vijay Bellur wrote: > On 04/21/2015 12:19 AM, Justin Clift wrote: >> On 20 Apr 2015, at 18:53, Jeff Darcy wrote: >>>> I propose that we don't drop test units but provide an ack to patches >>>> that have known regression failures. >>> >>> IIRC maintainers have had permission to issue such overrides since a >>> community meeting some months ago, but such overrides have remained >>> rare. What should we do to ensure that currently failing Jenkins >>> results are checked and (if necessary) overridden in a consistent >>> and timely fashion, without putting all of that burden directly on >>> your shoulders? Some sort of "officer of the day" rotation? An >>> Etherpad work queue? Something else? >> >> An Etherpad is probably a good basis for doing the listing. No >> preferences personally for how it gets attended to though. :) >> > > Another option would be to maintain a file with this list in the tests > directory. run-tests.sh can lookup this file to determine whether it should > continue or bail out. Good thinking. :) + Justin
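[Editor's note: Vijay's known-failures file could look something like the sketch below. The file name, helper name, and test names are made up for illustration; this is not the actual run-tests.sh.]

```shell
# Hypothetical sketch: run-tests.sh consults a list of known-bad tests
# and skips them rather than failing the whole run. A temp file stands
# in for a checked-in "known failures" list in the tests directory.
list=$(mktemp)
echo "tests/bugs/foo.t" > "$list"

known_failure() {
    grep -qx "$1" "$list"
}

for t in tests/bugs/foo.t tests/bugs/bar.t; do
    if known_failure "$t"; then
        echo "SKIP $t (known failure)"
    else
        echo "RUN $t"
    fi
done
```

Whether a known failure should be skipped entirely or run-but-ignored is a policy choice; the lookup itself is the cheap part.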
Re: [Gluster-devel] regressions on release-3.7 ?
On 20 Apr 2015, at 18:53, Jeff Darcy wrote: >> I propose that we don't drop test units but provide an ack to patches >> that have known regression failures. > > IIRC maintainers have had permission to issue such overrides since a > community meeting some months ago, but such overrides have remained > rare. What should we do to ensure that currently failing Jenkins > results are checked and (if necessary) overridden in a consistent > and timely fashion, without putting all of that burden directly on > your shoulders? Some sort of "officer of the day" rotation? An > Etherpad work queue? Something else? An Etherpad is probably a good basis for doing the listing. No preferences personally for how it gets attended to though. :) + Justin
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
On 20 Apr 2015, at 08:11, Atin Mukherjee wrote: > On 04/20/2015 08:35 AM, Vijay Bellur wrote: >> The procedure for migration from an admin perspective is quite involved >> and account migrations are better done in batches. Instead of mailing >> any of us directly, can you please update the gerrit migration etherpad >> [1] once you have signed in using github? This might be a slightly more >> optimal way of doing this migration :). We will pick up details from the >> etherpad at a regular frequency. >> > There are three sets of problems that we noticed in the migration process: > > 1. Forbidden access when you try to sign in with github > 2. Multiple accounts upon successful github signing > 3. Unable to view files in patchsets - 404 error > > We have the fix for 1 & 2, please do mention in the etherpad [1] if you > fall into any of these categories. > > Vijay is working on point 3 and will keep you posted once he finds a solution. Gerrit is up and running now (thanks hagarth and ndevos). Seems to be working decently too. :) I have the process to merge new GitHub userid's into existing accounts fairly well optimised now too. So, if you need your account created either add yourself to the etherpad or email me to get it done. :) https://public.pad.fsfe.org/p/gluster-gerrit-migration We're still working through Jenkins stuff at the moment... so not a lot in the way of smoke nor regression tests happening just yet. + Justin
Re: [Gluster-devel] regressions on release-3.7 ?
On 20 Apr 2015, at 14:14, Jeff Darcy wrote: >> The same problems that affect mainline are affecting release-3.7 too. We >> need to get over this soon. > > I think it's time to start skipping (or even deleting) some tests. For > example, volume-snapshot-clone.t alone is responsible for a huge number > of spurious failures. It's for a feature that people don't even seem to > know we have, and isn't sufficient for us to declare that feature > supportable, so the only real effect of the test's existence is these > spurious regression-test failures. Do you mean deleting the tests temporarily (to let other stuff pass without being held up by it), or permanently? > In other cases (e.g. uss.t) bugs in > the test and/or the feature itself must still be fixed before we can > release 3.7 but that doesn't necessarily mean we need to run that test > for every unrelated change. > > The purpose of a regression test is to catch unanticipated problems with > a patch. A test that fails for its own unrelated reasons provides > little or no information of that nature, and is therefore best treated > as if no test for that feature/fix had ever existed. That's still bad > and still worthy of correction, but at least it doesn't interfere with > everyone else's work. That makes sense to me... as long as any temporarily-deleted tests have their root cause(s) found and fixed before we release 3.7. :) + Justin
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
On 20 Apr 2015, at 04:43, Aravinda wrote: > Is it not possible to view the patches if not logged in? I think public > access(read only) need to be enabled. In theory it's supposed to be. :) However looking at the etherpad there are lots of people getting "Forbidden". I'm not sure why yet, but will start looking into it shortly (after coffee). :) + Justin > ~aravinda > > On 04/20/2015 08:35 AM, Vijay Bellur wrote: >> On 04/20/2015 04:25 AM, Justin Clift wrote: >>> The good news: >>> >>> 1) Gerrit is kind of :/ updated. The very very latest versions >>> (released Friday) don't work properly for us. So, we're running >>> on the slightly older v2.9.4 release of Gerrit. >>> >>> It's a lot newer than what we were running though. ;) >>> >>> 2) The GitHub integration seems to be working. When you next go to >>> http://review.gluster.org, it'll get you to authenticate via >>> GitHub. >>> >>> The bad news: >>> >>> 1) The first time you authenticate to GitHub it will create a brand >>> new account for you, that doesn't have many useful permissions. >>> >>> You will need to email Vijay, Humble, or myself with the account >>> number it creates for you + with your GitHub username. >>> >>> Your account number will probably be something like 10006xx. >>> Mine was 1000668. >>> >>> This new account id needs to be merged into your existing one >>> manually by a Gerrit admin. It's not hard, and only needs to be >>> done once. :) >>> >> >> The procedure for migration from an admin perspective is quite involved and account migrations are better done in batches. Instead of mailing any of us directly, can you please update the gerrit migration etherpad [1] once you have signed in using github? This might be a slightly more optimal way of doing this migration :). We will pick up details from the etherpad at a regular frequency. >> >> Thanks for taking the trouble & apologies for any inconvenience caused in advance!
>> >> Regards, >> Vijay >> >> [1] https://public.pad.fsfe.org/p/gluster-gerrit-migration
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
The good news: 1) Gerrit is kind of :/ updated. The very very latest versions (released Friday) don't work properly for us. So, we're running on the slightly older v2.9.4 release of Gerrit. It's a lot newer than what we were running though. ;) 2) The GitHub integration seems to be working. When you next go to http://review.gluster.org, it'll get you to authenticate via GitHub. The bad news: 1) The first time you authenticate to GitHub it will create a brand new account for you, that doesn't have many useful permissions. You will need to email Vijay, Humble, or myself with the account number it creates for you + with your GitHub username. Your account number will probably be something like 10006xx. Mine was 1000668. This new account id needs to be merged into your existing one manually by a Gerrit admin. It's not hard, and only needs to be done once. :) 2) Jenkins... didn't even get close to looking at it. So the Jenkins server is out of action for now. :/ The version of Jenkins we're running *may* not be compatible with our new Gerrit version (unsure). Will find out in the morning (after sleep, which I'm really needing atm). + Justin On 19 Apr 2015, at 11:38, Justin Clift wrote: > Gerrit and Jenkins are going to be shutting off pretty soon. > > So, any job running in Jenkins will be aborted. ;) > > *Please don't* submit new CR's, or run any new Jenkins jobs > from now until the upgrade is finished. > > Even if you see our Gerrit or Jenkins online, don't do stuff > with it. ;) > > + Justin > > > On 18 Apr 2015, at 19:30, Justin Clift wrote: >> Our Gerrit and Jenkins instances will be getting updated >> tomorrow. (yay!) >> >> It's not very straightforward to do though, so I'll >> probably shut them down tomorrow morning and they _may_ >> be offline for a large part of the day. >> >> Note - They have to be kept offline from when I do the >> initial backup for updating, until it's ready. >> >> I wish there was a better way... but there doesn't seem >> to be.
:/ >> >> Sorry in advance, etc. >> >> Regards and best wishes, >> >> Justin Clift
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
Gerrit and Jenkins are going to be shutting off pretty soon. So, any job running in Jenkins will be aborted. ;) *Please don't* submit new CR's, or run any new Jenkins jobs from now until the upgrade is finished. Even if you see our Gerrit or Jenkins online, don't do stuff with it. ;) + Justin On 18 Apr 2015, at 19:30, Justin Clift wrote: > Our Gerrit and Jenkins instances will be getting updated > tomorrow. (yay!) > > It's not very straightforward to do though, so I'll > probably shut them down tomorrow morning and they _may_ > be offline for a large part of the day. > > Note - They have to be kept offline from when I do the > initial backup for updating, until it's ready. > > I wish there was a better way... but there doesn't seem > to be. :/ > > Sorry in advance, etc. > > Regards and best wishes, > > Justin Clift > > -- > GlusterFS - http://www.gluster.org > > An open source, distributed file system scaling to several > petabytes, and handling thousands of clients. > > My personal twitter: twitter.com/realjustinclift > > ___ > Gluster-infra mailing list > gluster-in...@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-infra
[Gluster-devel] Gerrit and Jenkins likely unavailable most of Sunday
Our Gerrit and Jenkins instances will be getting updated tomorrow. (yay!) It's not very straightforward to do though, so I'll probably shut them down tomorrow morning and they _may_ be offline for a large part of the day. Note - They have to be kept offline from when I do the initial backup for updating, until it's ready. I wish there was a better way... but there doesn't seem to be. :/ Sorry in advance, etc. Regards and best wishes, Justin Clift
Re: [Gluster-devel] How to search frequently asked questions in gluster mailing lists.
On 18 Apr 2015, at 16:01, Niels de Vos wrote: > On Sat, Apr 18, 2015 at 03:14:41PM +0100, Justin Clift wrote: >> On 18 Apr 2015, at 07:49, Raghavendra Talur >> wrote: >> >>> Use the second search box, the one below "Google search for Gluster". >>> Works for me on both Chrome and Firefox on Android and Fedora 21. >>> Please try again and let me know :) >> >> Errr, which second search box? :) >> >> http://ded.ninja/gluster/custom_google_search_screenshot.png > > This one, use Fedora and Firefox? > > http://i.imgur.com/cCGmOdK.png Are you guys not using ad blockers? + Justin
Re: [Gluster-devel] How to search frequently asked questions in gluster mailing lists.
On 18 Apr 2015, at 07:49, Raghavendra Talur wrote: > > Use the second search box, the one below "Google search for Gluster". > Works for me on both Chrome and Firefox on Android and Fedora 21. > Please try again and let me know :) Errr, which second search box? :) http://ded.ninja/gluster/custom_google_search_screenshot.png + Justin
Re: [Gluster-devel] Request to mailing list Gluster-devel rejected
On 17 Apr 2015, at 22:25, Lars Ingebrigtsen wrote: > Justin Clift writes: > >> Would you be ok to update the mailing list address you guys are >> using for gluster-devel? It's using the previous mailing list >> address, which is no longer in use. ;) > > What's the new address? It's: Gluster Devel :) Regards and best wishes, Justin Clift
Re: [Gluster-devel] How to search frequently asked questions in gluster mailing lists.
On 17 Apr 2015, at 19:08, Raghavendra Talur wrote: > I had created a custom Google search for it once but never went back to > finish tweaking it. Here is the link > https://cse.google.co.in:443/cse/publicurl?cx=011584940093792490158:gn39i94_4la Tried that here... and it doesn't seem to work for me. It gives a page where I can type in a search term (eg "glusterd"), but the next page says "Loading" and never comes back. :( Tried in Opera (latest release) on OSX 10.9. Any ideas? + Justin
Re: [Gluster-devel] WARNING: patches marked as Verified might not be
On 17 Apr 2015, at 16:02, Jeff Darcy wrote: > As mentioned in a previous email, recent changes to speed up regression > testing have uncovered a problem with how we track the verified status > of patches. Specifically, the fault sequence is as follows: > > 1) Regression fails, "Gluster Build System" adds V-1 > > 2) NetBSD regression passes, "NetBSD Build System" adds V+1 > > 3) Smoke passes, "Gluster Build System" erases the V-1 > > The order of the first two might be reversed. What's new is that > regression never used to finish before smoke, and now it usually does > (for the failure case). So, reviewers/committers, please be sure to > check the last *regression* result before assuming that green check mark > is valid. I'm periodically going through the list of recently reviewed > patches and manually fixing up the status, but I can't catch all of them > and TBH don't know how much longer I'll keep trying. > > Gerrit/Jenkins maintainers: we have three options for a long term > solution. > > a) Run smoke and regression as separate Gerrit users, require > concurrence for a patch to be considered V+1. This is what we > already do for NetBSD. It's simple, and seems to work well. > > b) Stop triggering regression directly from the Gerrit event, trigger it > from a successful smoke completion instead. This is a bit more > complicated, but has the additional benefit that regression *won't > even run* if smoke fails (saving resources). > > c) Write a script to seek out and destroy improperly marked patches > (basically automate what I've been doing the last couple of days). > It'll work, but it still leaves a window when patches are improperly > marked. We should only consider it if we run into significant > problems with the other two approaches. > > Anyone else have any preference for (a) vs. (b)? I can't implement (a)
I could implement (b), but I don't want to go messing with the > Jenkins configuration unless/until we have a consensus that it's the > right thing to do. (a) sounds good to me. Vijay can set up the Gerrit account if he's agreeable. :) I'm focused on other Gerrit issues atm. :( + Justin
[Gluster-devel] Google OpenID stops working this Sunday
Hi all, Using Google "OpenID 2.0" for authentication will stop working from 20th April (this Monday). This is a bit of a problem for us, as many of our developers use it to authenticate with our Gerrit instance. We have some potential ways forward: * Switch to using GitHub OAuth instead This is currently working in a test VM. Seems ok so far. * Switch to using Google "OpenID Connect" instead Haven't yet gotten this working in our test VM, but am looking into it. Gerrit doesn't seem to let us use multiple authentication providers... it seems like we need to pick one OR the other here. :( (I could be wrong) Personally, I think we should choose the GitHub OAuth method, since we're all going to need GitHub accounts anyway for the Gluster Forge v2. So, keeps things simple from that perspective. ;) It will probably be a bit messy the *first time* we all log in, as we'll need to merge our new GitHub account info with our existing accounts. After that though, we should be good. Is anyone really against this idea? Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Regression-test pipeline report
On 17 Apr 2015, at 01:50, Jeff Darcy wrote: > Lastly, I have a lead on some of the core dumps that have occurred > during regression tests. See the following bug for details. > > https://bugzilla.redhat.com/show_bug.cgi?id=1212660 Awesome. :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Auto-retry failed tests
On 16 Apr 2015, at 05:28, Emmanuel Dreyfus wrote: > Hi > > We all know regression spurious failures are a problem. In order to > minimize their impact, NetBSD regression restarts the whole test suite in > case of error so that spurious failures do not cause an undeserved > verified=-1 vote to be cast. > > This takes time, and as a consequence the netbsd7_regression backlog > gets huge in the afternoon. > > I proposed this change to improve the situation: modify run-tests.sh to > retry only the failed tests. That behavior is off by default and can be > enabled using run-tests.sh -r > http://review.gluster.org/10128/ > > Having that one merged would not change anything about the way Linux > regression is run right now, and it would let me make the NetBSD > regression much faster (until the day spurious regressions are fixed). The concept sounds really useful. :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
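The proposed -r behaviour can be sketched roughly as follows. run_test and the test names are stand-ins, not the real run-tests.sh internals; t2 fails once to play the part of a spurious failure:

```shell
# A test that fails only on its first run, simulating a spurious failure.
marker="/tmp/spurious.$$"
run_test() {
    if [ "$1" = "t2" ] && [ ! -e "$marker" ]; then
        touch "$marker"    # remember that t2 already failed once
        return 1
    fi
    return 0
}

# First pass: run everything, remember only the failures.
failed=""
for t in t1 t2 t3; do
    run_test "$t" || failed="$failed $t"
done

# Retry pass (the proposed -r behaviour): re-run just the failed tests
# instead of restarting the whole suite from scratch.
still_failed=""
for t in $failed; do
    run_test "$t" || still_failed="$still_failed $t"
done
rm -f "$marker"
```

The spurious t2 failure is absorbed by the retry pass without re-running t1 and t3, which is where the time saving comes from.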
Re: [Gluster-devel] Regression tests marked as SUCCESS when they shouldn't be
On 16 Apr 2015, at 03:56, Jeff Darcy wrote: >> Noticing several of the recent regression tests are being marked as SUCCESS >> in Jenkins (then Gerrit), when they're clearly failing. >> >> eg: >> >> http://build.gluster.org/job/rackspace-regression-2GB-triggered/6968/console >> http://build.gluster.org/job/rackspace-regression-2GB-triggered/6969/console >> http://build.gluster.org/job/rackspace-regression-2GB-triggered/6970/console >> http://build.gluster.org/job/rackspace-regression-2GB-triggered/6966/console >> >> Is this something to do with your patch to try and get failures finishing >> faster? > > Yes, I think it probably is. At line 221 we echo the status for debugging, > but that means the result of main() is the result of the echo (practically > always zero) rather than the real result. All we need to do is take out > that echo line. Well, if the echo is important for debugging, we can save the real result status earlier and then "return $REAL_STATUS" type of thing. + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
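The "save the status, then echo" suggestion might look something like this. main() here is a stand-in for the one in run-tests.sh, not the actual code:

```shell
# Capture main()'s real exit status *before* the debugging echo, so the
# echo's own (always zero) status doesn't become the script's result.
main() {
    return 1    # pretend the regression run failed
}

RET=0
main || RET=$?                   # save the real status first
echo "Regression result: $RET"   # debugging output; its own status is 0
# the script would then end with:  exit $RET
```

Without the saved status, the echo at the end would make the script exit 0 and Jenkins would mark a failed run as SUCCESS, which is exactly the bug in the thread.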
[Gluster-devel] Regression tests marked as SUCCESS when they shouldn't be
Noticing several of the recent regression tests are being marked as SUCCESS in Jenkins (then Gerrit), when they're clearly failing. eg: http://build.gluster.org/job/rackspace-regression-2GB-triggered/6968/console http://build.gluster.org/job/rackspace-regression-2GB-triggered/6969/console http://build.gluster.org/job/rackspace-regression-2GB-triggered/6970/console http://build.gluster.org/job/rackspace-regression-2GB-triggered/6966/console Is this something to do with your patch to try and get failures finishing faster? ;) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] The Gluster Forge -- new and improved version 2 with extra sprinkles
Hi everyone, The Gluster Forge is currently hosted using Gitorious. We're planning on migrating these projects to GitHub in the near future (next few days) - as that's where the majority of the Open Source Community is. The "forge.gluster.org" URL will then be a very simple website (2 pages!) that: * lists the projects (in categories, for easy finding) * shows the activity for the projects, across all of them (easy to obtain and update stats for hourly, using the GitHub API) So far I've knocked up :) some dodgy Python code to do the hourly stats collection + stick them in a SQLite database: https://github.com/gluster/forge (Pull Requests to make it less dodgy are welcome btw!) As we get projects into GitHub from the current Forge, the "config" file there needs to be updated to include them. If one of your Gluster Forge v1 projects is already on GitHub, please let me know. (or send a pull request adding it to the config file) Hopefully that's workable for people, and helps us get a bunch more contributors to the projects over time... :D Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
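The shape of the hourly collection loop might be roughly as below. The real code in the gluster/forge repo is Python and writes to SQLite; in this sketch the GitHub API call is faked and a TSV file stands in for the database, so every name here is illustrative:

```shell
# Hourly stats collection, heavily simplified.
statsfile="/tmp/forge_stats.$$.tsv"

fetch_stars() {
    # The real version would query the GitHub API, e.g. fetch
    # https://api.github.com/repos/$1 and parse stargazers_count.
    echo 42    # faked API response for the sketch
}

# One row per tracked project: project name, timestamp, stat value.
for repo in gluster/glusterfs gluster/forge; do
    printf '%s\t%s\t%s\n' "$repo" "$(date +%s)" "$(fetch_stars "$repo")" \
        >> "$statsfile"
done

rows=$(wc -l < "$statsfile")
rm -f "$statsfile"
```

Running something like this from cron once an hour, with the project list driven by the "config" file mentioned above, is the general idea.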
Re: [Gluster-devel] release-3.7 branch created [Was 3.7.0 update]
On 15 Apr 2015, at 12:54, Ravishankar N wrote: > On 04/15/2015 03:31 PM, Justin Clift wrote: >> On 15 Apr 2015, at 08:09, Ravishankar N wrote: >>> On 04/14/2015 11:57 PM, Vijay Bellur wrote: >>>> From here on, we would need patches to be explicitly sent on release-3.7 >>>> for the content to be included in a 3.7.x release. Please ensure that you >>>> send a backport for release-3.7 after the corresponding patch has been >>>> accepted on master. >>>> >>>> Thanks again to everyone who has helped us in getting here. Look forward >>>> to more fun and collaboration as we move towards 3.7.0 GA! >>>> >>> For the replication arbiter feature, I'm working on the changes that need >>> to be made on the AFR code. Once it gets merged in master, I will >>> back-port it to 3.7 (I'm targeting to get this done before the GA.). Apart >>> from that I don't think there are new regressions since 3.6 for AFR, so we >>> should be good to go. >> How about this one? >> >> * tests/basic/afr/sparse-file-self-heal.t >> (Wstat: 0 Tests: 64 Failed: 35) >> Failed tests: 1-6, 11, 20-30, 33-34, 36, 41, 50-61, 64 >> Happens in master (Mon 30th March - git commit id >> 3feaf1648528ff39e23748ac9004a77595460c9d) >> (hasn't yet been added to BZ) >> Being investigated by: ? >> >> As per: >> >> https://public.pad.fsfe.org/p/gluster-spurious-failures >> >> ;) > > Just tried the test case a couple of times on my laptop on today's master > and with the head at the above commit ID, passes every time. :-\ Sure. That's the nature of spurious failures. It's not likely to be trivial to track down... but it *is* important. ;) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] release-3.7 branch created [Was 3.7.0 update]
On 15 Apr 2015, at 08:09, Ravishankar N wrote: > On 04/14/2015 11:57 PM, Vijay Bellur wrote: >> From here on, we would need patches to be explicitly sent on release-3.7 for >> the content to be included in a 3.7.x release. Please ensure that you send a >> backport for release-3.7 after the corresponding patch has been accepted on >> master. >> >> Thanks again to everyone who has helped us in getting here. Look forward to >> more fun and collaboration as we move towards 3.7.0 GA! >> > > For the replication arbiter feature, I'm working on the changes that need to > be made on the AFR code. Once it gets merged in master, I will back-port it > to 3.7 (I'm targeting to get this done before the GA.). Apart from that I > don't think there are new regressions since 3.6 for AFR, so we should be good > to go. How about this one? * tests/basic/afr/sparse-file-self-heal.t (Wstat: 0 Tests: 64 Failed: 35) Failed tests: 1-6, 11, 20-30, 33-34, 36, 41, 50-61, 64 Happens in master (Mon 30th March - git commit id 3feaf1648528ff39e23748ac9004a77595460c9d) (hasn't yet been added to BZ) Being investigated by: ? As per: https://public.pad.fsfe.org/p/gluster-spurious-failures ;) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Possibly root cause for the Gluster regression test cores?
On 8 Apr 2015, at 14:13, Pranith Kumar Karampuri wrote: > On 04/08/2015 06:20 PM, Justin Clift wrote: >> Hagarth mentioned in the weekly IRC meeting that you have an >> idea what might be causing the regression tests to generate >> cores? >> >> Can you outline that quickly, as Jeff has some time and might >> be able to help narrow it down further. :) >> >> (and these core files are really annoying :/) > I feel it is a lot like https://bugzilla.redhat.com/show_bug.cgi?id=1184417. > The clear-locks command is not handled properly after we did the client_t > refactor. I believe that is the reason for the crashes but I could be wrong. > But after looking at the code I feel there is a high probability that this is > the issue. I didn't find it easy to fix. We will need to change the lock > structure list maintenance heavily. An easier thing would be to disable the > clear-locks functionality tests in the regression, as it is not something that > is used by the users IMO, and see if it indeed is the same issue. There are 2 > tests using this command: > 18:34:00 :) ⚡ git grep clear-locks tests > tests/bugs/disperse/bug-1179050.t:TEST $CLI volume clear-locks $V0 / kind all > inode > tests/bugs/glusterd/bug-824753-file-locker.c: "gluster volume clear-locks %s > /%s kind all posix 0,7-1 |" > > If even after disabling these two tests it fails then we will need to look > again. I think Jeff's patch which will find the test which triggered the core > should help here. Thanks Pranith. :) Is this other "problem when disconnecting" BZ possibly related, or is that a different thing? https://bugzilla.redhat.com/show_bug.cgi?id=1195415 + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Need volunteers for spurious failure fixing on master
Hi us, Although we've fixed a few of these spurious failures in our git master branch, we still need a few more people to help out with the rest. :) This is important because these spurious failures affect *everyone* doing any kind of development work on GlusterFS. With these fixed, our regression runs will be a _lot_ quicker, more predictable, and we'll be able to iterate on our code much faster. So... who's ok to help with some of the ungrabbed ones below? :) Need someone to investigate + create the fix * tests/bugs/disperse/bug-1161886.t Fails tests 13-16 because of missing inode.h when building (possibly unnecessary) helper C program 14:04 < JustinClift> Is that a missing dependency that should be installed on the regression test slaves? 14:05 < jdarcy> That's a really weird one. It doesn't happen every time. 14:06 < jdarcy> It's *our* inode.h, which should totally be present long before the test needs it, but somehow it fails to find it once in a while. * tests/bugs/snapshot/bug-1162498.t Need a volunteer to investigate this * tests/basic/quota-nfs.t Need a volunteer to investigate this * tests/performance/open-behind.t Need a volunteer to investigate this * tests/bugs/distribute/bug-1122443.t Need a volunteer to investigate this * tests/basic/afr/sparse-file-self-heal.t Need a volunteer to investigate this * tests/bugs/disperse/bug-1187474.t Need a volunteer to investigate this * tests/basic/fops-sanity.t Need a volunteer to investigate this Needing reviews *** * split-brain-resolution.t Anuradha has a proposed fix here: http://review.gluster.org/#/c/10134/ * /tests/features/ssl-authz.t Jeff has a proposed fix here: http://review.gluster.org/#/c/10075/ Ones under investigation * Core dumps by socket disconnect race Initial analysis: https://bugzilla.redhat.com/show_bug.cgi?id=1195415 Pranith and/or Jeff are looking into this? 
* Random regression test hang : bug-1113960.t Nithya is investigating: https://bugzilla.redhat.com/show_bug.cgi?id=1209340 The Etherpad for co-ordinating this *** https://public.pad.fsfe.org/p/gluster-spurious-failures Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Possibly root cause for the Gluster regression test cores?
Hi Pranith, Hagarth mentioned in the weekly IRC meeting that you have an idea what might be causing the regression tests to generate cores? Can you outline that quickly, as Jeff has some time and might be able to help narrow it down further. :) (and these core files are really annoying :/) Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] 3.7.0 update
On 7 Apr 2015, at 11:21, Vijay Bellur wrote: > 3. Spurious regression tests listed in [3] to be fixed. > To not impede the review & merge workflow on release-3.7/master, I plan to > drop those test units which still cause > spurious failures by the time we branch release-3.7. Thinking about this more... this feels like the wrong approach. The spurious failures seem to be caused reasonably often by race conditions in our code and similar. Dropping the unfixed spurious tests feels like sweeping the harder/trickier problems under the rug, which means they'll need to be found and fixed later anyway. That's kind of the opposite of "let's resolve all these spurious failures before proceeding" (which should mean "let's fix them finally"). ;) And yeah, this could delay the release a bit. Personally, I'm ok with that. It's not going to lower the quality of our release, and people not involved in spurious failure fixing are still able to do dev work on master. ? Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Shutting down Gerrit for a few minutes
On 7 Apr 2015, at 16:31, Justin Clift wrote: > On 7 Apr 2015, at 15:45, Justin Clift wrote: >> Just an FYI. Shutting down Gerrit for a few minutes, to move around >> some files on the Gerrit server (need to free up space urgently). >> >> Shouldn't be too long. (fingers crossed) :) > > ... and it hasn't returned from rebooting after yum update. :( > > We're investigating. > > Sorry for the longer-than-expected outage. :/ It's back up and running again. A bunch of space has been freed up on the filesystems for it, "git gc" has been run on each of the git repos, and the packages have all been updated via yum. (except Gerrit, which isn't yum installed) It _seems_ to be working ok now, for the initial git checkout I just tried. If something acts up though, please let us know. :) Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-infra] Shutting down Gerrit for a few minutes
On 7 Apr 2015, at 15:45, Justin Clift wrote: > Just an FYI. Shutting down Gerrit for a few minutes, to move around > some files on the Gerrit server (need to free up space urgently). > > Shouldn't be too long. (fingers crossed) :) ... and it hasn't returned from rebooting after yum update. :( We're investigating. Sorry for the longer-than-expected outage. :/ + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Shutting down Gerrit for a few minutes
Just an FYI. Shutting down Gerrit for a few minutes, to move around some files on the Gerrit server (need to free up space urgently). Shouldn't be too long. (fingers crossed) :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Gluster 3.6.2 On Xeon Phi
On 12 Feb 2015, at 08:54, Mohammed Rafi K C wrote: > On 02/12/2015 08:32 AM, Rudra Siva wrote: >> Rafi, >> >> I'm preparing the Phi RDMA patch for submission > > If you can send a patch to support iWARP, that will be a great addition > to gluster rdma. Clearing out older email... did this patch get submitted and merged? :) Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Coredump in master :/
On 2 Apr 2015, at 14:42, Jeff Darcy wrote: >> Is it ok to put slave23.cloud.gluster.org into general rotation, so it >> runs regression jobs along with the rest? > > Sounds OK to me. Do we have a place to store the core tarball, just in > case we decide we need to go back to it some day? Yep. They're now here: http://ded.ninja/gluster/slave23.cloud.gluster.org/ Should be safe for a couple of months at least. In theory. ;) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Security hardening RELRO & PIE flags
On 2 Apr 2015, at 14:08, Niels de Vos wrote: > On Thu, Apr 02, 2015 at 01:21:57PM +0100, Justin Clift wrote: >> On 31 Mar 2015, at 08:15, Niels de Vos wrote: >>> On Tue, Mar 31, 2015 at 12:20:19PM +0530, Kaushal M wrote: >>>> IMHO, doing hardening and security should be left the individual >>>> distributions and the package maintainers. Generally, each distribution has >>>> it's own policies with regards to hardening and security. We as an upstream >>>> project cannot decide on what a distribution should do. But we should be >>>> ready to fix bugs that could arise when distributions do hardened builds. >>>> >>>> So, I vote against having these hardening flags added to the base GlusterFS >>>> build. But we could add the flags the Fedora spec files which we carry with >>>> our source. >>> >>> Indeed, I agree that the compiler flags should be specified by the >>> distributions. At least Fedora and Debian do this already include >>> (probably different) options within their packaging scripts. We should >>> set the flags we need, but not more. It would be annoying to set default >>> flags that can conflict with others, or which are not (yet) available on >>> architectures that we normally do not test. >> >> First thoughts: :) >> >> * We provide our own packaging scripts + distribute rpms/deb's from our >>own site too. >> >>Should we investigate/try these flags out for the packages we build + >>supply? > > At least for the RPMs, we try to follow the Fedora guidelines and their > standard flags. With recent Fedora releases this includes additional > hardening flags. > >> * Are there changes in our code + debugging practises that would be needed >>for these security hardening flags to work? >> >>If there are, and we don't make these changes ourselves, doesn't that >>mean we're telling distributions they need to carry their own patch set >>in order to have a "more secure" GlusterFS? 
> > We have received several patches from the Debian maintainer that improve > the handling of these options. When maintainers for distributions build > GlusterFS and require changes, they either file bugs and/or send > patches. I think this works quite well. Thanks Niels. Sounds like we're already in good shape then. :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Coredump in master :/
On 1 Apr 2015, at 19:47, Jeff Darcy wrote: >> When doing an initial burn in test (regression run on master head >> of GlusterFS git), it coredumped on the new "slave23.cloud.gluster.org" VM. >> (yeah, I'm reusing VM names) >> >> http://build.gluster.org/job/regression-test-burn-in/16/console >> >> Does anyone have time to check the coredump, and see if this is >> the bug we already know about? > > This is *not* the same as others I've seen. There are no threads in the > usual connection-cleanup/list_del code. Rather, it looks like some are > in generic malloc code, possibly indicating some sort of arena corruption. Is it ok to put slave23.cloud.gluster.org into general rotation, so it runs regression jobs along with the rest? + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Adventures in building GlusterFS
On 2 Apr 2015, at 12:10, Vijay Bellur wrote: > On 04/02/2015 06:27 AM, Jeff Darcy wrote: >> My recommendations: >> >> (1) Apply the -Wno-error=cpp and -Wno-error=maybe-uninitialized >> changes wherever they need to be applied so that they're >> effective during normal regression builds > > Thanks, Jeff. > > Justin - would it be possible to do this change as well in build.sh? The regression builds seem to be running again at the moment without removing -Werror. So I'm not sure if this needs adjusting any more? + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Security hardening RELRO & PIE flags
On 31 Mar 2015, at 08:15, Niels de Vos wrote: > On Tue, Mar 31, 2015 at 12:20:19PM +0530, Kaushal M wrote: >> IMHO, doing hardening and security should be left to the individual >> distributions and the package maintainers. Generally, each distribution has >> its own policies with regards to hardening and security. We as an upstream >> project cannot decide on what a distribution should do. But we should be >> ready to fix bugs that could arise when distributions do hardened builds. >> >> So, I vote against having these hardening flags added to the base GlusterFS >> build. But we could add the flags to the Fedora spec files which we carry with >> our source. > > Indeed, I agree that the compiler flags should be specified by the > distributions. At least Fedora and Debian already include > (probably different) options within their packaging scripts. We should > set the flags we need, but not more. It would be annoying to set default > flags that can conflict with others, or which are not (yet) available on > architectures that we normally do not test. First thoughts: :) * We provide our own packaging scripts + distribute rpms/deb's from our own site too. Should we investigate/try these flags out for the packages we build + supply? * Are there changes in our code + debugging practises that would be needed for these security hardening flags to work? If there are, and we don't make these changes ourselves, doesn't that mean we're telling distributions they need to carry their own patch set in order to have a "more secure" GlusterFS? + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
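For concreteness, the RELRO and PIE flags under discussion are roughly these. The exact set is distribution policy (e.g. Fedora's hardening defaults), not something this sketch claims upstream should hard-code:

```shell
# Illustrative hardening flags: position-independent executables plus
# full RELRO (read-only relocations, with lazy PLT binding disabled).
HARDEN_CFLAGS="-fPIE"
HARDEN_LDFLAGS="-pie -Wl,-z,relro,-z,now"

# A distribution build might then do, roughly:
#   ./configure CFLAGS="$HARDEN_CFLAGS" LDFLAGS="$HARDEN_LDFLAGS"
echo "would build with: $HARDEN_CFLAGS $HARDEN_LDFLAGS"
```

On a finished binary, `readelf -d` showing a `BIND_NOW` entry and a `GNU_RELRO` program header is the usual way to confirm the flags took effect.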
Re: [Gluster-devel] Multiple verify in gerrit
On 2 Apr 2015, at 05:18, Emmanuel Dreyfus wrote: > Hi > > I am now convinced the solution to our multiple regression problem is to > introduce more "Gluster Build System" users: one for CentOS regression, > another one for NetBSD regression (and one for each smoke test, as > explained below). > > I just tested it on http://review.gluster.org/10052, and here is what > gerrit displays in the verified column > - if there are neither verified=+1 nor verified=-1 cast: nothing > - if there is at least one verified=+1 and no verified=-1: verified > - if there is at least one verified=-1: failed > > Therefore if CentOS regression uses bu...@review.gluster.org to report > results and NetBSD regression uses nb7bu...@review.gluster.org (the latter > user should be created), we achieve this outcome: > - gerrit will display a change as verified if one regression reported it > as verified and the other either also succeeded or failed to report > - gerrit will display a change as failed if one regression reported it > as failed, regardless of what the other reported. > > There is still one minor problem: if one regression does not report, or > reports late, we can have the feeling that a change is verified while it > should not be, and its status can change later. But this is a minor issue > compared to the current status. > > Other ideas: > - smoke builds should also report as different gerrit users, so that a > verified=+1 regression result does not override a verified=-1 smoke build > result > - when we get a regression failure, we could cast the verified vote to > gerrit and immediately schedule another regression run. That way we could > automatically work around spurious failures without the need for a > retrigger in Jenkins. You're probably right. :) I'll set up test / sandbox VM's today using last night's backup of our Gerrit setup, then we can try stuff out on it to make sure. Give me a few hours though. 
;) It needs to be able to communicate with stuff on the internet for OpenID to work, but be unable to affect our Jenkins box, Forge/GitHub/etc. The best way I've thought of for doing that (so far) is adding entries in /etc/hosts that point at bogus IP addresses, for the things we don't want it communicating with. The other option might be to just use the built-in iptables firewall to disallow all communications except for whitelisted addresses. Will figure it out in a few hours. ;) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Adventures in building GlusterFS
On 2 Apr 2015, at 12:10, Vijay Bellur wrote: > On 04/02/2015 06:27 AM, Jeff Darcy wrote: >> My recommendations: >> >> (1) Apply the -Wno-error=cpp and -Wno-error=maybe-uninitialized >> changes wherever they need to be applied so that they're >> effective during normal regression builds > > Thanks, Jeff. > > Justin - would it be possible to do this change as well in build.sh? Sure. What needs changing from here? https://github.com/justinclift/glusterfs_patch_acceptance_tests/blob/master/build.sh + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Adventures in building GlusterFS
On 2 Apr 2015, at 01:57, Jeff Darcy wrote: > (1) Apply the -Wno-error=cpp and -Wno-error=maybe-uninitialized > changes wherever they need to be applied so that they're > effective during normal regression builds The git repo which holds our CentOS build and regression testing scripts is here: https://review.gerrithub.io/#/admin/projects/justinclift/glusterfs_patch_acceptance_tests https://github.com/justinclift/glusterfs_patch_acceptance_tests It's being used as a test bunny to try out GerritHub. (May end in rabbit soup. I do not like rabbit soup. :/) The build bit in it is (bash script):

P=/build
./configure --prefix=$P/install --with-mountutildir=$P/install/sbin --with-initdir=$P/install/etc --localstatedir=/var --enable-bd-xlator=yes --enable-debug --silent
make install CFLAGS="-g -O0 -Wall -Werror" -j 4

With the -Werror added last night. Should we adjust it? Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Adventures in building GlusterFS
On 2 Apr 2015, at 01:57, Jeff Darcy wrote:
> As many of you have undoubtedly noticed, we're now in a situation where
> *all* regression builds are now failing, with something like this:
>
> -
> cc1: warnings being treated as errors
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:
> In function ‘glusterd_snap_quorum_check_for_create’:
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2615:
> error: passing argument 2 of ‘does_gd_meet_server_quorum’ from
> incompatible pointer type
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56:
> note: expected ‘struct list_head *’ but argument is of type ‘struct
> cds_list_head *’
> -
>
> The reason is that -Werror was turned on earlier today. I'm not quite
> sure how or where, because the version of build.sh that I thought builds
> would use doesn't seem to have changed since September 8, but then
> there's a lot about this system I don't understand. Vijay (who I
> believe made the change) knows it better than I ever will.

Ah. This was me. Noticed the lack of -Werror last night, and immediately
"fixed" it. Then hit the sack shortly after.

Umm... Sorry? :/

+ Justin
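[Editor's note: for anyone curious what actually tripped the builds, the
failing warning class is easy to reproduce outside the tree. The sketch
below is a hypothetical reduction, not the real glusterd code; the struct
and function names are invented. It defines two structurally identical
but distinct list-head types, mimicking the kernel-style `struct
list_head` vs liburcu's `struct cds_list_head`, and shows that passing
one where the other is expected is fatal under -Werror.]

```shell
#!/bin/sh
# Hypothetical reduction of the glusterd-snapshot-utils.c build break;
# all identifiers below are invented for illustration.
cat > /tmp/quorum_bad.c <<'EOF'
/* Two structurally identical but *distinct* list-head types. */
struct list_head     { struct list_head *next, *prev; };
struct cds_list_head { struct cds_list_head *next, *prev; };

static int does_meet_quorum(struct list_head *peers)
{
    return peers->next == peers;
}

int main(void)
{
    struct cds_list_head peers = { &peers, &peers };
    /* Wrong type: "passing argument 1 of 'does_meet_quorum' from
     * incompatible pointer type" -- fatal once -Werror is in CFLAGS. */
    return !does_meet_quorum(&peers);
}
EOF

gcc -Wall -Werror -c -o /dev/null /tmp/quorum_bad.c 2>/dev/null \
    && echo "built (unexpected)" || echo "rejected under -Werror"
```

Making the prototype and its callers agree on a single list type (which
is presumably what http://review.gluster.org/10105 does for the real
code) is the proper fix; -Wno-error only papers over it.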
Re: [Gluster-devel] Coredump in master :/
On 1 Apr 2015, at 20:22, Vijay Bellur wrote:
> On 04/02/2015 12:46 AM, Justin Clift wrote:
>> On 1 Apr 2015, at 20:09, Vijay Bellur wrote:
>>
>>> My sanity run got blown due to this as I use -Wall -Werror during
>>> compilation.
>>>
>>> Submitted http://review.gluster.org/10105 to correct this.
>>
>> Should we add -Wall -Werror to the compile options for our CentOS 6.x
>> regression runs?
>
> I would prefer doing that for CentOS 6.x at least.

k, that's been done.

All of the regression tests currently queued up for master and
release-3.6 are probably going to self destruct now though. (just
thought of that. oops)

+ Justin
Re: [Gluster-devel] Coredump in master :/
On 1 Apr 2015, at 20:09, Vijay Bellur wrote:
> My sanity run got blown due to this as I use -Wall -Werror during
> compilation.
>
> Submitted http://review.gluster.org/10105 to correct this.

Should we add -Wall -Werror to the compile options for our CentOS 6.x
regression runs?

+ Justin
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
On 1 Apr 2015, at 17:38, Emmanuel Dreyfus wrote:
> Justin Clift wrote:
>
>> We need some kind of solution.
>
> What about adding another nb7build user in gerrit? That way results will
> not conflict.

I'm not sure. However, Vijay's now added me as an admin in our production
Gerrit instance, and I have the process for restoring our backups in a
local VM (on my desktop) worked out now.

So... I can test this tomorrow morning and try it out. Then we'll know
for sure. :)

+ Justin
Re: [Gluster-devel] Coredump in master :/
On 1 Apr 2015, at 19:51, Shyam wrote:
> On 04/01/2015 02:47 PM, Jeff Darcy wrote:
>>> When doing an initial burn in test (regression run on master head
>>> of GlusterFS git), it coredumped on the new "slave23.cloud.gluster.org"
>>> VM. (yeah, I'm reusing VM names)
>>>
>>> http://build.gluster.org/job/regression-test-burn-in/16/console
>>>
>>> Does anyone have time to check the coredump, and see if this is
>>> the bug we already know about?
>>
>> This is *not* the same as others I've seen. There are no threads in the
>> usual connection-cleanup/list_del code. Rather, it looks like some are
>> in generic malloc code, possibly indicating some sort of arena corruption.
>
> This looks like the other core I saw yesterday, which was not the usual
> connection cleanup stuff. Adding this info here, as this brings this core
> count up to 2.
>
> One here, and the other in core.16937 : http://ded.ninja/gluster/blk0/

Oh, I just noticed there's a bunch of compile warnings at the top of the
regression run:

  libtool: install: warning: relinking `server.la'
  /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c: In function ‘glusterd_snap_quorum_check_for_create’:
  /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2615: warning: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
  /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
  /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c: In function ‘glusterd_snap_quorum_check’:
  /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2788: warning: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
  /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
  /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2803: warning: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
  /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
  /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.c: In function ‘glusterd_get_quorum_cluster_counts’:
  /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.c:230: warning: comparison of distinct pointer types lacks a cast
  /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.c:236: warning: comparison of distinct pointer types lacks a cast
  libtool: install: warning: relinking `glusterd.la'
  libtool: install: warning: relinking `posix-acl.la'

Related / smoking gun?

+ Justin
[Gluster-devel] Coredump in master :/
Hi us,

Adding some more CentOS 6.x regression testing VMs at the moment, to cope
with the current load.

When doing an initial burn in test (regression run on master head of
GlusterFS git), it coredumped on the new "slave23.cloud.gluster.org" VM.
(yeah, I'm reusing VM names)

  http://build.gluster.org/job/regression-test-burn-in/16/console

Does anyone have time to check the coredump, and see if this is the bug
we already know about?

Regards and best wishes,

Justin Clift
Re: [Gluster-devel] Extra overnight regression test run results
On 1 Apr 2015, at 03:48, Justin Clift wrote:
> On 31 Mar 2015, at 14:18, Shyam wrote:
>
>>> Also, most of the regression runs produced cores. Here are
>>> the first two:
>>>
>>> http://ded.ninja/gluster/blk0/
>>
>> There are 4 cores here, 3 pointing to the (by now hopefully) famous bug
>> #1195415. One of the cores exhibits a different stack etc. Need more
>> analysis to see what the issue could be here; core file: core.16937
>>
>>> http://ded.ninja/gluster/blk1/
>>
>> There is a single core here, pointing to the above bug again.
>
> Both the blk0 and blk1 VMs are still online and available,
> if that's helpful?
>
> If not, please let me know and I'll nuke them. :)

I'm OK to nuke both those VMs, yeah?

+ Justin
Re: [Gluster-devel] [Gluster-users] Got a slogan idea?
On 1 Apr 2015, at 18:19, Kaleb S. KEITHLEY wrote:
> On 04/01/2015 12:24 PM, Ravishankar N wrote:
>>
>> I found it easier to draw a picture of what I had in mind, especially
>> the arrow mark thingy. You can view it here:
>> https://github.com/itisravi/image/blob/master/gluster.jpg
>> So what the image is trying to convey (hopefully) is "Gluster: Software
>> Defined Storage. Redefined"
>
> That's clever. I like it.

Yeah, works for me too. :)

+ Justin