[Gluster-devel] Test email pls ignore
Ignore this, it's just a test for measuring a delay issue with mailman.

+ Justin

--
"My grandfather once told me that there are two kinds of people: those who
work and those who take the credit. He told me to try to be in the first
group; there was less competition there." - Indira Gandhi

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [gluster-devel] Documentation Tooling Review
On 23 Aug 2016, at 20:27, Justin Clift <jus...@postgresql.org> wrote:
> On 11 Aug 2016, at 21:23, Amye Scavarda wrote:
>> The Red Hat Gluster Storage documentation team and I had a conversation
>> about how we can make our upstream documentation more consistent and
>> improved for our users, and they're willing to work with us to find
>> where the major gaps are in our documentation. This is awesome! But
>> it's going to take some work on our side to make this a reality.
>>
>> One piece that's come up is that we should probably look towards
>> changing our current tooling for this. It turns out that our
>> ReadTheDocs instance search is failing because we're using Markdown,
>> and this is a known issue. It doesn't look like it's going to be fixed
>> anytime soon.
>>
>> Rather than continue to try to make RTD serve our needs, I'd like to
>> propose the following changes to where our documentation lives and in
>> what language: I'd much rather pattern after docs.openshift.org, move
>> to ASCIIdoc and use ASCIIbinder as our engine to power this. What that
>> does is give us control over our overall infrastructure underneath our
>> documentation, maintain our existing git workflow for adding to
>> documentation, and match with other communities that we work closely
>> with. I'm mindful that there's a burden of migration again, but we'll
>> be able to resolve a lot of the challenges we currently have with
>> documentation: more control over layout, the ability to change the
>> structure to make it more user friendly, and the freedom to use our
>> own search however we see fit.
>>
>> I'm happy to take comments on this proposal. Over the next week, I'll
>> be reviewing the level of effort it would take to migrate to ASCIIdocs
>> and ASCIIbinder, with the goal being to have this in place by the end
>> of September.
>>
>> Thoughts?
> It's probably worth considering GitBook instead:
>
>   https://www.gitbook.com
>
> Example here:
>
>   http://tutorial.djangogirls.org/en/index.html
>
> Pros:
>
> * Works with Markdown & ASCIIdoc
>
>   No need to convert the existing docs to a new format, and the
>   already-learned Markdown skills don't need relearning
>
> * Also fully Open Source
>
>   https://github.com/GitbookIO/gitbook/
>
> * Searching works very well
>
>   Try searching on the Django Girls tutorial above for "Python".
>   Correct results are returned in small fractions of a second.
>
> * Has well-developed plugins to enable things like inline videos,
>   interactive exercises (and more)
>
>   https://plugins.gitbook.com
>
> * Can be self hosted, or hosted on the GitBook infrastructure
>
> * Doesn't require Ruby, unlike ASCIIbinder which is written in it.

An extra "Pro" pointed out to me offline:

* You can log in with GitHub and post comments on each line

  Example here:

    https://docs.lacona.io/docs/basics/getting-started.html

  Note the green line there, with a helpful comment added to the side of
  it. Seems like a good way for people to review/revise docs, for
  polishing & tweaking.

> Cons:
>
> * It's written in Node.js instead
>
>   Not sure that's any better than Ruby
>
> It seems a better-polished solution than the one docs.openshift.org is
> using, and would probably require less effort for the Gluster docs to
> be adapted to.
>
> Thoughts? :)

+ Justin
Re: [Gluster-devel] [gluster-devel] Documentation Tooling Review
On 11 Aug 2016, at 21:23, Amye Scavarda wrote:
> The Red Hat Gluster Storage documentation team and I had a conversation
> about how we can make our upstream documentation more consistent and
> improved for our users, and they're willing to work with us to find
> where the major gaps are in our documentation. This is awesome! But
> it's going to take some work on our side to make this a reality.
>
> One piece that's come up is that we should probably look towards
> changing our current tooling for this. It turns out that our
> ReadTheDocs instance search is failing because we're using Markdown,
> and this is a known issue. It doesn't look like it's going to be fixed
> anytime soon.
>
> Rather than continue to try to make RTD serve our needs, I'd like to
> propose the following changes to where our documentation lives and in
> what language: I'd much rather pattern after docs.openshift.org, move
> to ASCIIdoc and use ASCIIbinder as our engine to power this. What that
> does is give us control over our overall infrastructure underneath our
> documentation, maintain our existing git workflow for adding to
> documentation, and match with other communities that we work closely
> with. I'm mindful that there's a burden of migration again, but we'll
> be able to resolve a lot of the challenges we currently have with
> documentation: more control over layout, the ability to change the
> structure to make it more user friendly, and the freedom to use our
> own search however we see fit.
>
> I'm happy to take comments on this proposal. Over the next week, I'll
> be reviewing the level of effort it would take to migrate to ASCIIdocs
> and ASCIIbinder, with the goal being to have this in place by the end
> of September.
>
> Thoughts?
It's probably worth considering GitBook instead:

  https://www.gitbook.com

Example here:

  http://tutorial.djangogirls.org/en/index.html

Pros:

* Works with Markdown & ASCIIdoc

  No need to convert the existing docs to a new format, and the
  already-learned Markdown skills don't need relearning

* Also fully Open Source

  https://github.com/GitbookIO/gitbook/

* Searching works very well

  Try searching on the Django Girls tutorial above for "Python".
  Correct results are returned in small fractions of a second.

* Has well-developed plugins to enable things like inline videos,
  interactive exercises (and more)

  https://plugins.gitbook.com

* Can be self hosted, or hosted on the GitBook infrastructure

* Doesn't require Ruby, unlike ASCIIbinder which is written in it.

Cons:

* It's written in Node.js instead

  Not sure that's any better than Ruby

It seems a better-polished solution than the one docs.openshift.org is
using, and would probably require less effort for the Gluster docs to
be adapted to.

Thoughts? :)

+ Justin
Re: [Gluster-devel] [Gluster-infra] NetBSD regressions not being triggered for patches
On 18 Jun 2015, at 16:57, Emmanuel Dreyfus <m...@netbsd.org> wrote:
> Niels de Vos <nde...@redhat.com> wrote:
>> I'm not sure what limitation you mean. Did we reach the limit of
>> slaves that Jenkins can reasonably address?
>
> No, I mean its inability to catch a new DNS record.

Priority wise, my suggestion would be to first get Gerrit and Jenkins
migrated to one of the two new servers (probably put them in separate
VMs). If the DNS problem does turn out to be the dodgy iWeb hardware
firewall, then this fixes the DNS issue. (If not... well, damn!)

Assuming that does work :), then getting the other server set up with
new VMs and such would be the next thing to do. That's my thinking
anyway.

For reference, these are the main hardware specs for the two boxes:

formicary.gluster.org -- for Gerrit/Jenkins/whatever:

 * 2 x Intel Xeon CPU E5-2640 v3 @ 2.60GHz (8 physical cores per CPU)
 * 32GB ECC RAM
 * 2 x ~560GB SAS HDDs
 * 1 x Intel 2P X520/2P I350 rNDC network card
   * Seems to be a 4-port 10GbE card. The mgmt console says two ports
     are up, and two down. Guessing this means only two ports are
     cabled up.

ci.gluster.org -- for VMs:

 * 2 x Intel Xeon E5-2650 v3 @ 2.30GHz (10 physical cores per CPU)
 * 96GB ECC RAM
 * 4 x ~560GB SAS HDDs
 * 1 x Intel 2P X520/2P I350 rNDC network card
   * Seems to be a 4-port 10GbE card. The mgmt console says two ports
     are up, and two down. Guessing this means only two ports are
     cabled up.

Hope this is useful info. ;)

+ Justin
--
GlusterFS - http://www.gluster.org
An open source, distributed file system scaling to several petabytes,
and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift
Re: [Gluster-devel] [Gluster-infra] NetBSD regressions not being triggered for patches
On 18 Jun 2015, at 09:19, Niels de Vos <nde...@redhat.com> wrote:
> On Thu, Jun 18, 2015 at 12:57:05AM +0100, Justin Clift wrote:
>> On 17 Jun 2015, at 20:14, Niels de Vos <nde...@redhat.com> wrote:
>>> On Wed, Jun 17, 2015 at 03:14:31PM +0200, Michael Scherer wrote:
>>>> Le mercredi 17 juin 2015 à 11:58 +0100, Justin Clift a écrit :
>>>>> On 17 Jun 2015, at 10:53, Michael Scherer <msche...@redhat.com> wrote:
>>>>>> Le mercredi 17 juin 2015 à 11:48 +0200, Michael Scherer a écrit :
>>>>>>> Le mercredi 17 juin 2015 à 08:20 +0200, Emmanuel Dreyfus a écrit :
>>>>>>>> Venky Shankar <yknev.shan...@gmail.com> wrote:
>>>>>>>>> If that's the case, then I'll vote for this even if it takes
>>>>>>>>> some time to get things in workable state.
>>>>>>>>
>>>>>>>> See my other mail about this: you enter a new slave VM in the
>>>>>>>> DNS and it does not resolve, or sometimes you get 20s delays.
>>>>>>>> I am convinced this is the reason why Jenkins bugs.
>>>>>>>
>>>>>>> But cloud.gluster.org is handled by Rackspace, not sure how much
>>>>>>> control we have for it (not sure even where to start there). So
>>>>>>> I cannot change the DNS destination.
>>>>>>
>>>>>> What I can do is to create a new DNS zone, and then we can
>>>>>> delegate as we want. And migrate some slaves and not others, and
>>>>>> see how it goes? Would slaves.gluster.org be ok for everybody?
>>>>>
>>>>> Try it out, and see if it works. :)
>>>>>
>>>>> On the scaling the infrastructure side of things, are the two OSAS
>>>>> servers for Gluster still available?
>>>>
>>>> They are online.
>>>>
>>>>   $ ssh r...@ci.gluster.org uptime
>>>>    09:13:37 up 33 days, 16:34, 0 users, load average: 0,00, 0,01, 0,05
>>>
>>> Can it run some Jenkins slave VMs too?
>>
>> There are two boxes. A pretty beefy one for running Jenkins slave VMs
>> (probably about 40 VMs simultaneously), and a slightly less beefy one
>> for running Jenkins/Gerrit/whatever.
>
> Good to know, but it would be much more helpful if someone could
> install VMs there and add them to the Jenkins instance... Who can do
> that, or who can guide someone else to get it done?

Misc has the keys. :)

+ Justin
Re: [Gluster-devel] Notes on brick multiplexing in 4.0
On 17 Jun 2015, at 07:21, Kaushal M <kshlms...@gmail.com> wrote:
> One more question. I keep hearing about QoS for volumes as a feature.
> How will we guarantee service quality for all the bricks from a single
> server? Even if we weren't doing QoS, we'd need to make sure that
> operations on one brick don't DoS the others. We already keep hearing
> from users about self-healing causing problems for the clients.

Any idea if there's a clear pattern of network vs disk traffic vs
something else causing that? (Excess network or disk traffic could
easily cause it in theory I guess, but practical data would be
useful. :)

> Self-healing and rebalance running simultaneously on multiple volumes
> in a multiplexed bricks environment would most likely be disastrous.

Not sure how that's different from now, with those operations already
able to run in the current approach. This is us having a chance to
think this stuff through and work out a solution now. :)

+ Justin
Re: [Gluster-devel] [Gluster-infra] NetBSD regressions not being triggered for patches
On 17 Jun 2015, at 10:53, Michael Scherer <msche...@redhat.com> wrote:
> Le mercredi 17 juin 2015 à 11:48 +0200, Michael Scherer a écrit :
>> Le mercredi 17 juin 2015 à 08:20 +0200, Emmanuel Dreyfus a écrit :
>>> Venky Shankar <yknev.shan...@gmail.com> wrote:
>>>> If that's the case, then I'll vote for this even if it takes some
>>>> time to get things in workable state.
>>>
>>> See my other mail about this: you enter a new slave VM in the DNS
>>> and it does not resolve, or sometimes you get 20s delays. I am
>>> convinced this is the reason why Jenkins bugs.
>>
>> But cloud.gluster.org is handled by Rackspace, not sure how much
>> control we have for it (not sure even where to start there). So I
>> cannot change the DNS destination.
>
> What I can do is to create a new DNS zone, and then we can delegate as
> we want. And migrate some slaves and not others, and see how it goes?
> Would slaves.gluster.org be ok for everybody?

Try it out, and see if it works. :)

On the scaling the infrastructure side of things, are the two OSAS
servers for Gluster still available? If so, we should get them online
ASAP, as that will give us ~40 new VMs + get us out of iWeb (which I
suspect is the problem).

Regards and best wishes,

Justin Clift
Re: [Gluster-devel] [Gluster-infra] NetBSD regressions not being triggered for patches
On 17 Jun 2015, at 07:29, Kaushal M <kshlms...@gmail.com> wrote:
> cloud.gluster.org is served by Rackspace Cloud DNS. AFAICT, there is
> no readily available option to do zone transfers from it. We might
> have to contact Rackspace support to find out if they can do it as a
> special request.

Contacting Rackspace support is very easy, and they're normally very
responsive. They have an online support ticket submission thing in the
Rackspace UI. Often they get back to us with meaningful responses in
less than 15-20 minutes. Please go ahead and submit a ticket. :)

(Btw, I suspect the DNS issue is likely related to the hardware
firewall in the iWeb infrastructure. It's probably acting up.)

Regards and best wishes,

Justin Clift
Re: [Gluster-devel] Gluster IPv6 bugfixes (Bug 1117886)
That'd be awesome. :)

+ Justin

On 15 Jun 2015, at 20:53, Richard Wareing <rware...@fb.com> wrote:
> Hey Nithin,
>
> We have IPv6 going as well (v3.4.x & v3.6.x), so I might be able to
> help out here and perhaps combine our efforts. We did something
> similar here, however we also tackled the NFS side of the house, which
> required a bunch of changes due to how port registration w/ portmapper
> changed in IPv6 vs IPv4. You effectively have to use libtirpc to do
> all the port registrations with IPv6.
>
> We can offer up our patches for this work, and hopefully things can be
> combined such that end-users can simply do "vol set volume
> transport-address-family inet|inet6" and voila, they have whatever
> support they desire. I'll see if we can get this posted to bug 1117886
> this week.
>
> Richard
>
> From: gluster-devel-boun...@gluster.org
>     [gluster-devel-boun...@gluster.org] on behalf of Nithin Kumar
>     Dabilpuram [nithind1...@yahoo.in]
> Sent: Saturday, June 13, 2015 9:12 PM
> To: gluster-devel@gluster.org
> Subject: [Gluster-devel] Gluster IPv6 bugfixes (Bug 1117886)
>
> Hi,
>
> Can I contribute to this bug fix? I've worked on Gluster IPv6
> functionality bugs in 3.3.2 in my past organization and was able to
> successfully bring up gluster on IPv6 link-local addresses as well.
>
> Please find my work-in-progress patch. I'll raise a Gerrit review once
> testing is done. I was successfully able to create volumes with 3
> peers and add bricks. I'll continue testing other basic functionality
> and see what needs to be modified. Any other suggestions?
>
> Brief info about the patch: Here I'm trying to use the
> transport.address-family option in the /etc/glusterfs/glusterd.vol
> file and then propagate the same to the server and client vol files
> and their translators. In this way, when a user mentions
> transport.address-family inet6 in their glusterd.vol file, all
> glusterd servers open AF_INET6 sockets, and then the same information
> is stored in glusterd_volinfo and used when generating vol config
> files.
>
> -thanks
> Nithin
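To make Nithin's description concrete, here is a rough sketch of what the glusterd.vol change could look like. This is illustrative only: the option name comes from the proposal in this thread, the surrounding volfile content is a typical stock glusterd.vol, and the sketch writes to a local demo file rather than /etc/glusterfs/glusterd.vol.

```shell
# Sketch of the transport.address-family change discussed above.
# Real deployments would edit /etc/glusterfs/glusterd.vol; we use a
# local demo file here so nothing on the system is touched.
cat > glusterd.vol.demo <<'EOF'
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket
    option transport.address-family inet6
end-volume
EOF

# Confirm the address-family option is present
grep 'address-family' glusterd.vol.demo
```

With a change like this in place, glusterd would (per the patch description) open AF_INET6 sockets and propagate the setting into generated vol config files.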
Re: [Gluster-devel] Notes on brick multiplexing in 4.0
On 15 Jun 2015, at 20:35, Jeff Darcy <jda...@redhat.com> wrote:
> I've written up some thoughts about how to have multiple bricks
> sharing a single process/port, since this is necessary to support
> other 4.0 features and is likely to be a bit tricky to implement.
> Comments welcome here:
>
>   https://goo.gl/27L9I5

Reading through that, it sounds like a well thought out approach.

Did you consider a super-lightweight version first, which only has a
process listening on one port for multiplexing traffic, and then passes
the traffic on to individual processes running on the server? e.g.
similar to how common IPv4 NAT works, but for Gluster traffic. :)

Regards and best wishes,

Justin Clift
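For anyone trying to picture the lightweight alternative: the front-end listener on the shared port would just look up which per-brick backend process should receive a connection and forward to it. A toy sketch of that lookup is below; the brick names and port numbers are invented for illustration, and this is not how brick multiplexing was actually implemented.

```shell
# Toy routing table for the NAT-style idea above: one shared listener,
# many per-brick backend processes on private ports.
# Brick names and ports are hypothetical.
route_brick() {
    case "$1" in
        vol0-brick0) echo 49152 ;;
        vol0-brick1) echo 49153 ;;
        *)           echo 0 ;;    # unknown brick: no backend
    esac
}

route_brick vol0-brick1    # prints 49153
```

The real trade-off Jeff's design avoids is exactly the extra copy/hop such a forwarder would add on every request.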
[Gluster-devel] A Year with Go
Potentially relevant to a GlusterD rewrite, since we've mentioned Go as
a possibility a few times:

  https://vagabond.github.io/rants/2015/06/05/a-year-with-go/
  https://news.ycombinator.com/item?id=9668302

+ Justin
Re: [Gluster-devel] [Gluster-infra] Regression failure in volume-snapshot-clone.t
There are two extra CentOS 6 VMs online for debugging stuff with, but
they're both in use at the moment:

 * slave0.cloud.gluster.org
 * slave1.cloud.gluster.org

Sachin Pandit, Raghavendra Bhat, and Krutika Dhananjay are using them.
Ping them, and organise with them when you can use them.

I was intending to turn them off today, but it sounds like they should
be left on for a while longer for people to investigate with.

Regards and best wishes,

Justin Clift

On 21 May 2015, at 14:22, Avra Sengupta <aseng...@redhat.com> wrote:
> Hi,
>
> Can I get access to a rackspace VM so that I can debug this particular
> testcase on it?
>
> Regards,
> Avra
>
> Forwarded Message
> Subject: Re: [Gluster-devel] Regression failure in volume-snapshot-clone.t
> Date: Thu, 21 May 2015 17:08:05 +0530
> From: Vijay Bellur <vbel...@redhat.com>
> To: Avra Sengupta <aseng...@redhat.com>, gluster Devel
>     <gluster-devel@gluster.org>, Atin Mukherjee <amukh...@redhat.com>,
>     Krishnan Parthasarathi <kpart...@redhat.com>, Rajesh Joseph
>     <rjos...@redhat.com>
>
> On 05/21/2015 02:44 PM, Avra Sengupta wrote:
>> Hi,
>>
>> I am not able to reproduce this failure in my set-up. I am aware that
>> Atin was able to do so successfully a few days back, and I tried
>> something similar with the following loop:
>>
>>   for i in {1..100}; do
>>       export DEBUG=1
>>       prove -r ./tests/basic/volume-snapshot-clone.t > 1
>>       lines=`less 1 | grep "All tests successful" | wc -l`
>>       if [ $lines != 1 ]; then
>>           echo "TESTCASE FAILED. BREAKING"
>>           break
>>       fi
>>   done
>>
>> I have been running this for about an hour and a half, and will
>> continue doing so. But till now I have not encountered a failure.
>> Could anyone please point out if I am missing something obvious here.
>
> Some tests fail more frequently in the rackspace VMs where we run
> regressions. Please drop a note on the gluster-infra ML if you want to
> offline one such VM from jenkins and run tests there.
>
> -Vijay
[Gluster-devel] Gluster Summit recordings
(Didn't see this mentioned elsewhere.)

The video recordings (using a tablet resting on the desk) for the
Gluster Summit sessions in Barcelona are here:

  https://www.youtube.com/channel/UCngUyL3KPYz8M2n7rDJWU0w

Thanks to Spot for providing the tablet for most of them, and uploading
them too. :)

Regards and best wishes,

Justin Clift
Re: [Gluster-devel] [Gluster-infra] Downtime for Jenkins
On 17 May 2015, at 13:36, Vijay Bellur <vbel...@redhat.com> wrote:
> On 05/17/2015 02:32 PM, Vijay Bellur wrote:
>> [Adding gluster-devel]
>>
>> On 05/16/2015 11:31 PM, Niels de Vos wrote:
>>> On Sat, May 16, 2015 at 06:32:00PM +0200, Niels de Vos wrote:
>>>> It seems that many failures of the regression tests (at least for
>>>> NetBSD) are caused by failing to reconnect to the slave. Jenkins
>>>> tries to keep a control connection open to the slaves, and
>>>> reconnects when the connection terminates. I do not know why the
>>>> connection is disrupted, but I can see that Jenkins is not able to
>>>> resolve the hostname of the slave.
>>>>
>>>> For example, from (well, you have to find the older logs, Jenkins
>>>> seems to have automatically reconnected)
>>>> http://build.gluster.org/computer/nbslave72.cloud.gluster.org-v2/log :
>>>>
>>>>   java.io.IOException: There was a problem while connecting to
>>>>       nbslave71.cloud.gluster.org:22
>>>>   ...
>>>>   Caused by: java.net.UnknownHostException:
>>>>       nbslave71.cloud.gluster.org: Name or service not known
>>>>
>>>> The error in the console log of the regression test is less
>>>> helpful, it only states the disconnection failure:
>>>> http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/5408/console
>>>
>>> In fact, this looks very much related to these reports:
>>>  - https://issues.jenkins-ci.org/browse/JENKINS-19619 (duplicate of 18879)
>>>  - https://issues.jenkins-ci.org/browse/JENKINS-18879
>>>
>>> This problem should be fixed in Jenkins 1.524 and newer. Time to
>>> upgrade Jenkins too?
>>
>> Yes, I have started an upgrade. Please expect a downtime for Jenkins
>> during the upgrade. I will update once the activity is complete.
>
> Upgrade to Jenkins v1.613 is now complete and Jenkins seems to be
> largely doing fine. Several Jenkins plugins have also been updated to
> their latest versions.
>
> During the course of the upgrade, I noticed that we were using the
> deprecated 'gerrit approve' interface to intimate the status of a
> smoke run. Have changed that to use 'gerrit review', and this seems to
> have addressed the problem of smoke tests not reporting status back to
> gerrit.
>
> There were a few instances of Jenkins not being able to launch slaves
> through ssh, but it was later successful upon automatic retries. We
> will need to watch this behavior to see if this problem persists and
> comes in the way of normal functioning.
>
> Manu - can you please verify and report back if the NetBSD slaves work
> better with the upgraded Jenkins master?
>
> All - please drop a note on gluster-infra if you happen to notice
> problems with Jenkins.

Good stuff. :)

+ Justin
Re: [Gluster-devel] [Gluster-users] Gluster 3.7.0 released
On 14 May 2015, at 10:19, Vijay Bellur <vbel...@redhat.com> wrote:
> Hi All,
>
> I am happy to announce that Gluster 3.7.0 is now generally available.
> 3.7.0 contains several new features and is one of our more
> feature-packed releases in recent times. The release notes [1] contain
> a description of new functionality added to 3.7.0. In addition to
> features, 3.7.0 also contains several bug fixes and minor
> improvements. It is highly recommended to test 3.7.0 thoroughly for
> your use cases before deploying in production.
>
> Gluster 3.7.0 can be downloaded from [2]. Upgrade instructions can be
> found at [3]. Packages for various distributions will be available
> shortly at the download site.

<snip>

3.7.0 won't be packaged into Ubuntu LTS or CentOS EPEL, will it? (I'm
meaning their official external repos, not download.gluster.org.)

If there's any chance it might be, can we get that blocked until 3.7.x,
so people on 3.6.3 aren't automatically upgraded via package update?

Regards and best wishes,

Justin Clift
Re: [Gluster-devel] good job on fixing heavy hitters in spurious regressions
On 8 May 2015, at 13:16, Jeff Darcy <jda...@redhat.com> wrote:
<snip>
> Perhaps the change that's needed is to make the fixing of
> likely-spurious test failures a higher priority than adding new
> features.

YES! A million times Yes.

We need to move this project to operating with _0 regression failures_
as the normal state of things for master and release branches.

Regression failures for CRs in development... sure, that's a normal
part of development. But any time a regression failure happens in
_master_ or a release branch should be a case of _get this fixed
pronto_.

+ Justin
Re: [Gluster-devel] Proposal for improving throughput for regression test
On 8 May 2015, at 16:19, Jeff Darcy <jda...@redhat.com> wrote:
<snip>
>> Proposal 2: Use IP address instead of host name, because it takes
>> some good amount of time to resolve from host name, and even
>> sometimes causes spurious failure.
>
> If resolution is taking a long time, that's probably fixable in the
> test machine configuration. Reading a few lines from /etc/hosts
> should take only a trivial amount of time.

Ahhh, I didn't know that DNS resolution was being a problem. Yeah, we
can hard-code entries into /etc/hosts for the slave machines if that
would help. (This is easy to do.)

+ Justin
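A sketch of the hard-coding idea mentioned above. The hostnames and IP addresses below are placeholders (the real slave IPs would come from the Rackspace console), and the sketch writes to a local demo file so it can be run safely; on a real slave you would target /etc/hosts instead.

```shell
# Pin slave hostnames to fixed IPs so test runs don't wait on DNS.
# IPs are invented and HOSTS_FILE is a local demo file; real use
# would append to /etc/hosts.
HOSTS_FILE=./hosts.demo
: > "$HOSTS_FILE"

add_host_entry() {
    # Append "ip hostname" only if the hostname isn't already pinned.
    grep -qw "$2" "$HOSTS_FILE" || echo "$1 $2" >> "$HOSTS_FILE"
}

add_host_entry 10.0.0.11 slave0.cloud.gluster.org
add_host_entry 10.0.0.12 slave1.cloud.gluster.org
add_host_entry 10.0.0.11 slave0.cloud.gluster.org   # no-op: already present

cat "$HOSTS_FILE"
```

Since glibc's resolver consults /etc/hosts before DNS (per the default `hosts: files dns` order in nsswitch.conf), entries like these sidestep the 20s lookup delays described earlier in the thread.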
Re: [Gluster-devel] good job on fixing heavy hitters in spurious regressions
On 8 May 2015, at 04:15, Pranith Kumar Karampuri <pkara...@redhat.com> wrote:
<snip>
> 2) If the same test fails on different patches more than 'x' number of
> times, we should do something drastic. Let us decide on 'x' and what
> the drastic measure is.

Sure. That number is 0. If it fails more than 0 times on different
patches, we have a problem that needs resolving as an immediate
priority.

<snip>

> Some good things I found this time around compared to the 3.6.0
> release:
>
> 1) Failing the regression on first failure is helping locate the
>    failure logs really fast
> 2) More people chipped in fixing the tests that are not at all their
>    responsibility, which is always great to see.

Cool. :)

+ Justin
Re: [Gluster-devel] Proposal for improving throughput for regression test
On 8 May 2015, at 10:02, Mohammed Rafi K C <rkavu...@redhat.com> wrote:
> Hi All,
>
> As we all know, our regression tests are killing us. On average, one
> regression run will take approximately two and a half hours to
> complete. So I guess this is the right time to think about enhancing
> our regression.
>
> Proposal 1: Create a new option for the daemons to specify that they
> are running in test mode, so we can skip the fsync calls used for
> data durability.
>
> Proposal 2: Use IP address instead of host name, because it takes some
> good amount of time to resolve from host name, and even sometimes
> causes spurious failure.
>
> Proposal 3: Each component has a lot of .t files and there is
> redundancy in tests. We can do a rework to reduce the .t files and
> make a smaller number of tests that covers unit testing for a
> component, and run full regression runs once a day (nightly).
>
> Please provide your inputs for the proposed ideas, and feel free to
> add a new idea.

Proposal 4: Break the regression tests into parts that can be run in
parallel. So, instead of the regression testing for a particular CR
going from the first test to the last in a serial sequence, we break it
up into a number of chunks (dir based?) and make each of these a task.

That won't reduce the overall number of tests, but it should get the
time down for the result to be finished.

Caveat: We're going to need more VMs, as once we get into things
queueing up it's not going to help. :/

+ Justin
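A rough sketch of the dir-based chunking in Proposal 4. Everything here is illustrative: the demo test tree stands in for the real ./tests directory, the 4-way split is arbitrary, and the runner command is just an example of what each parallel worker would do.

```shell
# Round-robin the .t files into CHUNKS lists, one list per parallel
# regression worker. demo-tests/ is a stand-in for the real tests tree.
CHUNKS=4
mkdir -p demo-tests/basic demo-tests/bugs
touch demo-tests/basic/a.t demo-tests/basic/b.t \
      demo-tests/bugs/c.t demo-tests/bugs/d.t \
      demo-tests/bugs/e.t demo-tests/bugs/f.t

rm -f chunk-*.list
i=0
for t in $(find demo-tests -name '*.t' | sort); do
    echo "$t" >> "chunk-$((i % CHUNKS)).list"
    i=$((i + 1))
done

# Worker N would then run something like:
#   prove $(cat chunk-N.list)
wc -l chunk-*.list
```

Splitting by directory instead of round-robin (one chunk per top-level tests subdir) would keep related tests together, at the cost of less even chunk sizes.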
Re: [Gluster-devel] gluster crashes in dht_getxattr_cbk() due to null pointer dereference.
Thanks Paul. That's for an ancient series of GlusterFS (3.4.x) we're not really looking to release further updates for. If that's the version you guys are running in your production environment, have you looked into moving to a newer release series? + Justin On 8 May 2015, at 10:55, Paul Guo bigpaul...@foxmail.com wrote: Hi, gdb debugging shows the root cause seems to be quite straightforward. The gluster version is 3.4.5 and the stack: #0 0x7eff735fe354 in dht_getxattr_cbk (frame=0x7eff775b6360, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=0, xattr=<value optimized out>, xdata=0x0) at dht-common.c:2043 2043 DHT_STACK_UNWIND (getxattr, frame, local->op_ret, op_errno, Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.9-33.el6.x86_64 libcom_err-1.41.12-12.el6.x86_64 libgcc-4.4.6-4.el6.x86_64 libselinux-2.0.94-5.3.el6.x86_64 openssl-1.0.1e-16.el6_5.14.x86_64 zlib-1.2.3-27.el6.x86_64 (gdb) bt #0 0x7eff735fe354 in dht_getxattr_cbk (frame=0x7eff775b6360, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=0, xattr=<value optimized out>, xdata=0x0) at dht-common.c:2043 #1 0x7eff7383c168 in afr_getxattr_cbk (frame=0x7eff7756ab58, cookie=<value optimized out>, this=<value optimized out>, op_ret=0, op_errno=0, dict=0x7eff76f21dc8, xdata=0x0) at afr-inode-read.c:618 #2 0x7eff73d8 in client3_3_getxattr_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7eff77554d4c) at client-rpc-fops.c:1115 #3 0x003de700d6f5 in rpc_clnt_handle_reply (clnt=0xc36ad0, pollin=0x14b21560) at rpc-clnt.c:771 #4 0x003de700ec6f in rpc_clnt_notify (trans=<value optimized out>, mydata=0xc36b00, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:891 #5 0x003de700a4e8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:497 #6
0x7eff74af6216 in socket_event_poll_in (this=0xc46530) at socket.c:2118 #7 0x7eff74af7c3d in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0xc46530, poll_in=1, poll_out=0, poll_err=0) at socket.c:2230 #8 0x003de785e907 in event_dispatch_epoll_handler (event_pool=0xb70e90) at event-epoll.c:384 #9 event_dispatch_epoll (event_pool=0xb70e90) at event-epoll.c:445 #10 0x00406818 in main (argc=4, argv=0x7fff24878238) at glusterfsd.c:1934 See dht_getxattr_cbk() (below). When frame->local is equal to 0, gluster jumps to the label "out", where it crashes when it accesses local->xattr (i.e. 0->xattr). Note that in DHT_STACK_UNWIND()->STACK_UNWIND_STRICT(), fn looks fine. (gdb) p __local $11 = (dht_local_t *) 0x0 (gdb) p frame->local $12 = (void *) 0x0 (gdb) p fn $1 = (fop_getxattr_cbk_t) 0x7eff7298c940 <mdc_readv_cbk> I did not read the dht code much, so I have no idea whether a zero frame->local is normal or not, but from the code's perspective this is an obvious bug, and it still exists in the latest glusterfs workspace. The following code change is a simple fix, but maybe there's a better one. -if (is_last_call (this_call_cnt)) { +if (is_last_call (this_call_cnt) && local != NULL) { Similar issues exist in other functions also, e.g. stripe_getxattr_cbk() (I did not check all the code). int dht_getxattr_cbk (call_frame_t *frame, void *cookie, xlator_t *this, int op_ret, int op_errno, dict_t *xattr, dict_t *xdata) { int this_call_cnt = 0; dht_local_t *local = NULL; VALIDATE_OR_GOTO (frame, out); VALIDATE_OR_GOTO (frame->local, out); .. out: if (is_last_call (this_call_cnt)) { DHT_STACK_UNWIND (getxattr, frame, local->op_ret, op_errno, local->xattr, NULL); } return 0; } ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. 
My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Proposal for improving throughput for regression test
On 8 May 2015, at 18:41, Pranith Kumar Karampuri pkara...@redhat.com wrote: snip Break the regression tests into parts that can be run in parallel. So, instead of the regression testing for a particular CR going from the first test to the last in a serial sequence, we break it up into a number of chunks (dir based?) and make each of these a task. That won't reduce the overall number of tests, but it should get the time down for the result to be finished. Caveat : We're going to need more VMs, as once we get into things queueing up it's not going to help. :/ Raghavendra Talur (CCed) did some work on this earlier by using more Docker instances on a single VM to get the running time under an hour. Interesting idea. Any idea if this Docker approach could be made to work on CentOS 6 for our existing VMs? + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
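The dir-based chunking idea above could be sketched as a small helper that deals the .t files out round-robin into per-worker lists. This is purely illustrative; the function name and file layout are assumptions, not part of the actual Gluster test harness:

```shell
# chunk_tests NCHUNKS OUTDIR: read test file names on stdin and deal them
# round-robin into OUTDIR/chunk.0 .. OUTDIR/chunk.(N-1), one list per
# worker (VM or Docker instance).
chunk_tests() {
    nchunks=$1
    outdir=$2
    mkdir -p "$outdir"
    i=0
    while read -r t; do
        echo "$t" >> "$outdir/chunk.$((i % nchunks))"
        i=$((i + 1))
    done
}

# Each chunk could then be handed to a separate worker, e.g.:
#   find tests -name '*.t' | chunk_tests 4 /tmp/chunks
#   for c in /tmp/chunks/chunk.*; do run_chunk_on_worker "$c" & done; wait
```

The overall wall-clock time then approaches the duration of the slowest chunk rather than the sum of all tests, which is the point of the proposal.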
[Gluster-devel] Jupyter notebook support on GitHub
This could be really useful for us: https://github.com/blog/1995-github-jupyter-notebooks-3 GitHub now supports Jupyter notebooks directly. Similar to how Markdown (.md) files are displayed in their rendered format, Jupyter notebook (.ipynb) files are now too. Should make for better docs for us, as we can do intro stuff and other technical concept bits with graphics now instead of just ascii art. :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Fwd: [sqlite] SQLite 3.8.10 enters testing
Fuzzy testing has been added to SQLite's standard testing strategy. Wonder if it'd be useful for us too... ? + Justin Begin forwarded message: From: Simon Slavin slav...@bigfraud.org Subject: Re: [sqlite] SQLite 3.8.10 enters testing Date: 4 May 2015 22:03:59 BST To: General Discussion of SQLite Database sqlite-us...@mailinglists.sqlite.org Reply-To: General Discussion of SQLite Database sqlite-us...@mailinglists.sqlite.org On 4 May 2015, at 8:23pm, Richard Hipp d...@sqlite.org wrote: A list of changes (still being revised and updated) is at (https://www.sqlite.org/draft/releaselog/3_8_10.html). Because of its past success, AFL became a standard part of the testing strategy for SQLite beginning with version 3.8.10. There is at least one instance of AFL running against SQLite continuously, 24/7/365, trying new randomly mutated inputs against SQLite at a rate of a few hundred to a few thousand per second. Billions of inputs have been tried, but AFL's instrumentation has narrowed them down to less than 20,000 test cases that cover all distinct behaviors. Newly discovered test cases are periodically captured and added to the TCL test suite. Heh. Mister Zalewski can be proud. Simon. ___ sqlite-users mailing list sqlite-us...@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Regression test failures - Call for Action
On 5 May 2015, at 03:40, Jeff Darcy jda...@redhat.com wrote: Jeff's patch failed again with same problem: http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/4531/console Wouldn't have expected anything different. This one looks like a problem in the Jenkins/Gerrit infrastructure. This kind of error message at the end of a failure log indicates the VM has self-disconnected from Jenkins and needs rebooting. Haven't found any other way to fix it. :/ Happens with both CentOS and NetBSD regression runs. [...] ^ FATAL: Unable to delete script file /var/tmp/hudson8377790745169807524.sh hudson.util.IOException2 : remote file operation failed: /var/tmp/hudson8377790745169807524.sh at hudson.remoting.Channel@2bae0315:nbslave72.cloud.gluster.org at hudson.FilePath.act(FilePath.java:900) at hudson.FilePath.act(FilePath.java:877) at hudson.FilePath.delete(FilePath.java:1262) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60) [...] + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Regression test failures - Call for Action
On 4 May 2015, at 08:06, Vijay Bellur vbel...@redhat.com wrote: Hi All, There has been a spate of regression test failures (due to broken tests or race conditions showing up) in the recent past [1] and I am inclined to block 3.7.0 GA along with acceptance of patches until we fix *all* regression test failures. We seem to have reached a point where this seems to be the only way to restore sanity to our regression runs. I plan to put this into effect 24 hours from now i.e. around 0700 UTC on 05/05. Thoughts? Please do this. :) + Justin Thanks, Vijay [1] https://public.pad.fsfe.org/p/gluster-spurious-failures ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Configuration Error during gerrit login
I'm hoping this is mostly due to bugs in the older version of Gerrit + GitHub plugin we're using. We'll upgrade in a few weeks, and see how it goes then... ;) + Justin On 1 May 2015, at 03:38, Gaurav Garg gg...@redhat.com wrote: Hi, I was also hitting the same problem many times; I fixed it in the following way: 1. Go to https://github.com/settings/applications and revoke the authorization for 'Gerrit Instance for Gluster Community' 2. Clean up all cookies for github and review.gluster.org 3. Go to https://review.gluster.org/ and sign in again. You'll be asked to sign in to GitHub again and provide authorization - Original Message - From: Vijay Bellur vbel...@redhat.com To: Gluster Devel gluster-devel@gluster.org Sent: Friday, May 1, 2015 12:31:38 AM Subject: [Gluster-devel] Configuration Error during gerrit login Ran into Configuration Error several times today. The error message states: The HTTP server did not provide the username in the GITHUB_USER header when it forwarded the request to Gerrit Code Review... Switching browsers was useful for me to overcome the problem. Annoying for sure, but we seem to have a workaround :). HTH, Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] netbsd regression logs
On 1 May 2015, at 16:08, Emmanuel Dreyfus m...@netbsd.org wrote: Pranith Kumar Karampuri pkara...@redhat.com wrote: I was not able to re-create glupy failure. I see that netbsd is not archiving logs like the linux regression. Do you mind adding that one? I think kaushal and Vijay did this for Linux regressions, so CC them. They are archived, in /archives/logs/ on the regressions VM. It's just that you have to get them through sftp. Is it easy to add web access for them? (eg nginx or whatever) We have the nginx rule for the CentOS ones around somewhere if it'd help? + Justin -- Emmanuel Dreyfus http://hcpnet.free.fr/pubz m...@netbsd.org ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] This looks like an interesting Gerrit option we could turn on
On 29 Apr 2015, at 08:05, Niels de Vos nde...@redhat.com wrote: On Wed, Apr 29, 2015 at 02:40:54AM -0400, Jeff Darcy wrote: label.Label-Name.copyAllScoresOnTrivialRebase If true, all scores for the label are copied forward when a new patch set is uploaded that is a trivial rebase. A new patch set is considered as trivial rebase if the commit message is the same as in the previous patch set and if it has the same code delta as the previous patch set. This is the case if the change was rebased onto a different parent. This can be used to enable sticky approvals, reducing turn-around for trivial rebases prior to submitting a change. Defaults to false. "Same code delta" is a bit slippery. It can't be determined from the patch itself, because at least line numbers and diff context will have changed and would need to be ignored to say something's the same. I think forwarding scores is valuable enough that I'm in favor of turning this option on, but we should maintain awareness that scores might get forwarded in some cases where perhaps they shouldn't. Indeed, and I would be in favour of copying the scores after a rebase for Code-Review only. We should still have Jenkins run the regression tests so that in the (rare) event a patch gets incorrectly applied, either building fails, or the change in behaviour gets detected. Yeah. Reading that section of the Gerrit manual more, it seems to be an option that gets turned on per label. So, we can turn it on for the Code Review label, but leave it off for the Verified label. That should make sure that even trivial rebases get retested by the smoke and regression tests. + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
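For reference, the per-label switch discussed above lives in the project's `project.config` (on the `refs/meta/config` branch). A sketch of the split being proposed, assuming the stock label names:

```ini
[label "Code-Review"]
    copyAllScoresOnTrivialRebase = true
[label "Verified"]
    copyAllScoresOnTrivialRebase = false
```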
[Gluster-devel] REMINDER: Weekly Gluster Community meeting is in 30 mins!
Reminder!!! The weekly Gluster Community meeting is in 30 mins, in #gluster-meeting on IRC. This is a completely public meeting, everyone is encouraged to attend and be a part of it. :) To add Agenda items *** Just add them to the main text of the Etherpad, and be at the meeting. :) https://public.pad.fsfe.org/p/gluster-community-meetings Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] REMINDER: Weekly Gluster Community meeting is in 30 mins!
On 29 Apr 2015, at 12:30, Justin Clift jus...@gluster.org wrote: Reminder!!! The weekly Gluster Community meeting is in 30 mins, in #gluster-meeting on IRC. This is a completely public meeting, everyone is encouraged to attend and be a part of it. :) Thanks to everyone for attending! * 3.6.3 has been released and announced (thanks raghu!) 3.6.4beta1 will be available in a week or so. * 3.7.0beta1 (tarball) has been released. We're still working on packages for it. ;) This will go into Fedora Rawhide too. * 3.5.4beta1 will likely be ready by the start of next week. * Tigert is working on some new GlusterFS website layout ideas. Preview here: https://glusternew-tigert.rhcloud.com He'll start a mailing list thread about it shortly. Meeting log: https://meetbot.fedoraproject.org/gluster-meeting/2015-04-29/gluster-meeting.2015-04-29-12.01.html Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd
Does this mean we're officially no longer supporting 32-bit architectures? (or is that just on x86?) + Justin On 28 Apr 2015, at 12:45, Kaushal M kshlms...@gmail.com wrote: Found the problem. The NetBSD slaves are running a 32-bit kernel and userspace. ``` nbslave7a# uname -p i386 ``` Because of this CAA_BITS_PER_LONG is set to 32 and the case for size 8 isn't compiled in uatomic_add_return. Even though the underlying (virtual) hardware has 64-bit support, and supports the required 8-byte wide instruction, it cannot be used because we are running on a 32-bit kernel with a 32-bit userspace. Manu, was there any particular reason why you chose 32-bit NetBSD? If there are none, can you please replace the VMs with 64-bit NetBSD. Until then you can keep mgmt_v3-locks.t disabled. ~kaushal On Tue, Apr 28, 2015 at 4:56 PM, Kaushal M kshlms...@gmail.com wrote: I seem to have found the issue. The uatomic_add_return function is defined in urcu/uatomic.h as ``` /* uatomic_add_return */ static inline __attribute__((always_inline)) unsigned long __uatomic_add_return(void *addr, unsigned long val, int len) { switch (len) { case 1: { unsigned char result = val; __asm__ __volatile__( "lock; xaddb %1, %0" : "+m"(*__hp(addr)), "+q" (result) : : "memory"); return result + (unsigned char)val; } case 2: { unsigned short result = val; __asm__ __volatile__( "lock; xaddw %1, %0" : "+m"(*__hp(addr)), "+r" (result) : : "memory"); return result + (unsigned short)val; } case 4: { unsigned int result = val; __asm__ __volatile__( "lock; xaddl %1, %0" : "+m"(*__hp(addr)), "+r" (result) : : "memory"); return result + (unsigned int)val; } #if (CAA_BITS_PER_LONG == 64) case 8: { unsigned long result = val; __asm__ __volatile__( "lock; xaddq %1, %0" : "+m"(*__hp(addr)), "+r" (result) : : "memory"); return result + (unsigned long)val; } #endif } /* * generate an illegal instruction. Cannot catch this with * linker tricks when optimizations are disabled. 
*/ __asm__ __volatile__("ud2"); return 0; } ``` As we can see, uatomic_add_return uses different assembly instructions to perform the add based on the size of the datatype of the value. If the size of the value doesn't exactly match one of the sizes in the switch case, it deliberately generates a SIGILL. The case for size 8 is conditionally compiled, as we can see above. From the backtrace Atin provided earlier, we see that the size of the value is indeed 8 (we use uint64_t). Because we had a SIGILL, we can conclude that the case for size 8 wasn't compiled. I don't know why this compilation didn't (or as this is in a header file, doesn't) happen on the NetBSD slaves, and this is something I'd like to find out. ~kaushal On Tue, Apr 28, 2015 at 1:50 PM, Anand Nekkunti anekk...@redhat.com wrote: On 04/28/2015 01:40 PM, Emmanuel Dreyfus wrote: On Tue, Apr 28, 2015 at 01:37:42PM +0530, Anand Nekkunti wrote: __asm__ is for writing assembly code in C (gcc); __volatile__ with a "memory" clobber is a compiler-level barrier that forces the compiler not to reorder the instructions (to avoid optimization). Sure, but the gory details should be of no interest to the developer engaged in debug: if it crashes this is probably because it is called with wrong arguments, hence the question: ccing gluster-devel new_peer->generation = uatomic_add_return (&conf->generation, 1); Are new_peer->generation and conf->generation sane? ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
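A quick way to spot this situation on a slave, before a test trips the SIGILL, is to check the reported userspace architecture up front. A hedged sketch; the helper name is made up, and the arch strings cover only the common x86 cases:

```shell
# userspace_bits ARCH: map `uname -m`/`uname -p` output to the userspace
# word size, mirroring the CAA_BITS_PER_LONG distinction above.
userspace_bits() {
    case "$1" in
        x86_64|amd64) echo 64 ;;
        i386|i486|i586|i686) echo 32 ;;   # 8-byte uatomic ops unavailable
        *) echo unknown ;;
    esac
}

# On the NetBSD slaves described above, `uname -p` reports i386,
# so: userspace_bits "$(uname -p)" would print 32.
```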
[Gluster-devel] This looks like an interesting Gerrit option we could turn on
This sounds like it might be useful for us: https://gerrit-documentation.storage.googleapis.com/Documentation/2.9.4/config-labels.html#label_copyAllScoresOnTrivialRebase Yes/no/? + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Just did a bunch of Gerrit account merging... pls let me know if anything goes wrong for you
Some people are still having trouble logging into Gerrit, so I've just gone through and cleaned up some duplicate entries, old data, and similar. If Gerrit suddenly starts misbehaving for you now, please let me know. (In theory it shouldn't... but I don't trust theory in this at all ;) Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Hung regression jobs
On 23 Apr 2015, at 01:18, Jeff Darcy jda...@redhat.com wrote: I just had to clean up a couple of these - 7327 and 7331. Fortunately, they both seem to have gone on their merry way instead of dying. Both were in the pre-mount stage of their setup, but did have mounts active and gsyncd processes running (in one case multiple of them). I suspect that this is related to the fact that the new geo-rep tests call exit directly instead of returning errors (see geo-rep-helpers.c:192) and don't use bash's trap ... EXIT functionality to ensure proper cleanup. Thus, whatever was mounted or running when they failed will remain mounted or running to trip up the next test. If one of your regression jobs seems to be hung, either log in to the slave machine yourself or contact someone who can, so the offending mounts/processes can be unmounted/killed. Ahhh yeah, this makes sense. The scripting in Jenkins for launching regression tests should probably be tweaked to also kill any leftover geo-rep stuff. I'm focused elsewhere atm, so won't be looking at this myself. But anyone with a Jenkins login is able to. Just muck around with the script here to add the geo-rep bits: http://build.gluster.org/job/rackspace-regression-2GB-triggered/ (remember to comment any changes, for traceability) :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
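A sketch of the kind of pre-test cleanup being suggested, split so the mount-detection part is testable on its own. The field layout assumes Linux-style `mount` output, and the gsyncd/umount commands in the comment are illustrative, not the actual Jenkins script:

```shell
# stale_gluster_mounts: given `mount` output on stdin, print the mount
# points of fuse.glusterfs mounts left behind by an earlier failed run.
stale_gluster_mounts() {
    awk '$4 == "type" && $5 == "fuse.glusterfs" { print $3 }'
}

# The Jenkins wrapper could then do something like this before each run:
#   pkill -f gsyncd || true
#   mount | stale_gluster_mounts | while read -r m; do umount -l "$m"; done
```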
Re: [Gluster-devel] Should we try alternating our Weekly Community meeting times?
On 23 Apr 2015, at 05:47, Joe Julian j...@julianfamily.org wrote: I suggested it. Some other people in North America besides just myself expressed an interest in being involved, but could not make early (or very early) morning meetings. Since the globe has this cool spherical feature I thought it might be a good idea to try to get involvement from the dark side. :-) Would you be ok to chair the first meeting of the Dark Side of Gluster Community Meetings? :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] REMINDER: Weekly Gluster Community meeting is in one hour!
On 22 Apr 2015, at 11:59, Justin Clift jus...@gluster.org wrote: Reminder!!! The weekly Gluster Community meeting is in 30 mins, in #gluster-meeting on IRC. This is a completely public meeting, everyone is encouraged to attend and be a part of it. :) Thanks everyone who attended. Quite a few attendees and we covered lots of useful stuff. :) * GlusterFS 3.6.3 should be released in the next few days. Yay! :) * GlusterFS 3.7.0beta1 *and* 3.6.4beta1 should be released by the end of this week. * GlusterFS 3.5.4beta1 should be released fairly soon. Hoping for the end of this week too, but we'll see. Meeting logs: https://meetbot.fedoraproject.org/gluster-meeting/2015-04-22/gluster-meeting.2015-04-22-12.01.html Thanks to everyone who attended + participated. :) Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Should we try alternating our Weekly Community meeting times?
Today's Weekly Community Meeting had an interesting suggestion: There are members of the other hemisphere that would like to be active in the community but cannot attend meetings at this hour. I propose alternating meetings by 12 hours, 0:00 and 12:00 UTC. It's a decent suggestion, and we're definitely willing to try it out if we know people will attend the other meeting. :) Do we have volunteers to be at a 0:00 UTC meeting for Gluster? We also need someone to volunteer to be the meeting chair to run it (at least the first time). Who's up for it? :) Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] geo-rep regression tests take *ages*?
Just noticed something a bit weird on the regression tests for CentOS 6.x: [13:28:44] ./tests/features/weighted-rebalance.t ... ok 23 s [13:46:50] ./tests/geo-rep/georep-rsync-changelog.t ok 1086 s [14:06:53] ./tests/geo-rep/georep-rsync-hybrid.t ... ok 1203 s [14:08:36] ./tests/geo-rep/georep-setup.t .. ok 103 s [14:26:35] ./tests/geo-rep/georep-tarssh-changelog.t ... ok 1079 s That's on: http://build.gluster.org/job/rackspace-regression-2GB-triggered/7285/console Those 3x 1000+ second regression tests are adding 56+ minutes to the total regression test time. That's not how it should be, is it? Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] auto retry of failed test fails again !!
On 22 Apr 2015, at 15:39, Jeff Darcy jda...@redhat.com wrote: As we know, we have a patch from Manu which re-triggers a given failed test. The idea was to reduce the burden of re-triggering the regression, but I've been noticing it is failing in the 2nd attempt as well, and I've seen this happening multiple times for patch [1]. I am not sure whether I am damn unlucky or we have a real problem here. Any thoughts? Many of the most common spurious failures seem timing-related. Since the timing on a particular node is unlikely to change between the first failure and the retry, neither is the result. Running the retries on another node might work, but would be very complex to implement. For the time being, I think our best bet is for the tests identified in this patch to be excluded/ignored on NetBSD as well: http://review.gluster.org/#/c/10322/ That might significantly cut down on the false negatives. When tests still fail, we're stuck with re-triggering manually. Jenkins has some kind of API (haven't looked at it), so we might be able to do something with the API to automatically add the failed CR to the regression queue again. That would have a reasonable chance of running it on a different node. + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
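Jenkins does expose a remote build-trigger endpoint (`/job/NAME/buildWithParameters`), so the automatic re-queue could look roughly like this. The job name matches the builds referenced on this list, but the parameter name and credentials are assumptions:

```shell
# build_retrigger_url BASE JOB REFSPEC: compose the URL for Jenkins'
# remote buildWithParameters endpoint, passing the Gerrit refspec to
# re-test (GERRIT_REFSPEC is an assumed parameter name).
build_retrigger_url() {
    echo "$1/job/$2/buildWithParameters?GERRIT_REFSPEC=$3"
}

# The actual re-trigger would then be something like (credentials elided):
#   curl -X POST --user user:apitoken \
#       "$(build_retrigger_url http://build.gluster.org \
#          rackspace-regression-2GB-triggered refs/changes/NN/NNNN/N)"
```

Since Jenkins picks any free slave for a newly queued build, a re-queued job has a reasonable chance of landing on a different node, which is the point made above.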
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
On 20 Apr 2015, at 04:43, Aravinda avish...@redhat.com wrote: Is it not possible to view the patches if not logged in? I think public access(read only) need to be enabled. It *does* seem to be possible after all. :) Our test instance for Gerrit (http://newgerritv2.cloud.gluster.org) is now running the very latest release of Gerrit + the GitHub auth plugin, and that allows anonymous read access. So, we might be upgrading shortly. ;) Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] REMINDER: Weekly Gluster Community meeting is in one hour!
Reminder!!! The weekly Gluster Community meeting is in 30 mins, in #gluster-meeting on IRC. This is a completely public meeting, everyone is encouraged to attend and be a part of it. :) To add Agenda items *** Just add them to the main text of the Etherpad, and be at the meeting. :) https://public.pad.fsfe.org/p/gluster-community-meetings Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
On 22 Apr 2015, at 09:24, Anoop C S achir...@redhat.com wrote: On 04/22/2015 12:46 PM, Justin Clift wrote: On 22 Apr 2015, at 07:42, Justin Clift jus...@gluster.org wrote: On 20 Apr 2015, at 04:43, Aravinda avish...@redhat.com wrote: Is it not possible to view the patches if not logged in? I think public access(read only) need to be enabled. It *does* seem to be possible after all. :) Our test instance for Gerrit (http://newgerritv2.cloud.gluster.org) is now running the very latest release of Gerrit + the GitHub auth plugin, and that allows anonymous read access. It turns out the settings to make this work were already present in the version of Gerrit we're using... so they've just been turned on. Anonymous read-only access should now be working. :) Signing out should now be working properly too. (yay) Anonymous read-only access and sign out works fine now.. :) Awesome. :) -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] regressions on release-3.7 ?
On 20 Apr 2015, at 14:14, Jeff Darcy jda...@redhat.com wrote: The same problems that affect mainline are affecting release-3.7 too. We need to get over this soon. I think it's time to start skipping (or even deleting) some tests. For example, volume-snapshot-clone.t alone is responsible for a huge number of spurious failures. It's for a feature that people don't even seem to know we have, and isn't sufficient for us to declare that feature supportable, so the only real effect of the test's existence is these spurious regression-test failures. Do you mean deleting the tests temporarily (to let other stuff pass without being held up by it), or permanently? In other cases (e.g. uss.t) bugs in the test and/or the feature itself must still be fixed before we can release 3.7, but that doesn't necessarily mean we need to run that test for every unrelated change. The purpose of a regression test is to catch unanticipated problems with a patch. A test that fails for its own unrelated reasons provides little or no information of that nature, and is therefore best treated as if no test for that feature/fix had ever existed. That's still bad and still worthy of correction, but at least it doesn't interfere with everyone else's work. That makes sense to me... as long as any temporarily deleted tests have their root cause(s) found and fixed before we release 3.7. :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
On 20 Apr 2015, at 08:11, Atin Mukherjee amukh...@redhat.com wrote: On 04/20/2015 08:35 AM, Vijay Bellur wrote: snip The procedure for migration from an admin perspective is quite involved and account migrations are better done in batches. Instead of mailing any of us directly, can you please update the gerrit migration etherpad [1] once you have signed in using github? This might be a slightly more optimal way of doing this migration :). We will pick up details from the etherpad at a regular frequency. There are three sets of problems that we noticed in the migration process: 1. Forbidden access when you try to sign in with github 2. Multiple accounts upon successful github sign-in 3. Unable to view files in patchsets - 404 error We have the fix for 1 & 2; please do mention in the etherpad [1] if you fall into any of these categories. Vijay is working on point 3 and will keep you posted once he finds a solution. Gerrit is up and running now (thanks hagarth and ndevos). Seems to be working decently too. :) I have the process to merge new GitHub userids into existing accounts fairly well optimised now too. So, if you need your account created either add yourself to the etherpad or email me to get it done. :) https://public.pad.fsfe.org/p/gluster-gerrit-migration We're still working through Jenkins stuff at the moment... so not a lot in the way of smoke or regression tests happening just yet. + Justin
Re: [Gluster-devel] regressions on release-3.7 ?
On 20 Apr 2015, at 20:02, Vijay Bellur vbel...@redhat.com wrote: On 04/21/2015 12:19 AM, Justin Clift wrote: On 20 Apr 2015, at 18:53, Jeff Darcy jda...@redhat.com wrote: I propose that we don't drop test units but provide an ack to patches that have known regression failures. IIRC maintainers have had permission to issue such overrides since a community meeting some months ago, but such overrides have remained rare. What should we do to ensure that currently failing Jenkins results are checked and (if necessary) overridden in a consistent and timely fashion, without putting all of that burden directly on your shoulders? Some sort of officer-of-the-day rotation? An Etherpad work queue? Something else? An Etherpad is probably a good basis for doing the listing. No preferences personally for how it gets attended to though. :) Another option would be to maintain a file with this list in the tests directory. run-tests.sh can look up this file to determine whether it should continue or bail out. Good thinking. :) + Justin
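As a rough illustration of Vijay's idea (a list file in the tests directory that the runner consults), something like this minimal sketch could work; the file name and helper function here are hypothetical, not the actual run-tests.sh code:

```shell
#!/bin/bash
# Hypothetical sketch: keep known-spurious tests in a list file and
# have the test runner skip (or bail out on) anything listed there.
# KNOWN_BAD and should_skip are illustrative assumptions.
KNOWN_BAD="${KNOWN_BAD:-tests/known-spurious-failures.list}"

should_skip() {
    # Exact, whole-line match against the list file.
    [ -f "$KNOWN_BAD" ] && grep -qxF "$1" "$KNOWN_BAD"
}

for t in "$@"; do
    if should_skip "$t"; then
        echo "SKIP $t (listed in $KNOWN_BAD)"
    else
        echo "RUN  $t"
    fi
done
```

Invoked as e.g. `./sketch.sh tests/basic/*.t`, it would announce which tests it skips, so the list stays visible in the logs rather than silently hiding failures.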
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
On 20 Apr 2015, at 04:43, Aravinda avish...@redhat.com wrote: Is it not possible to view the patches if not logged in? I think public access (read-only) needs to be enabled. In theory it's supposed to be. :) However looking at the etherpad there are lots of people getting Forbidden. I'm not sure why yet, but will start looking into it shortly (after coffee). :) + Justin ~aravinda On 04/20/2015 08:35 AM, Vijay Bellur wrote: On 04/20/2015 04:25 AM, Justin Clift wrote: The good news: 1) Gerrit is kind of :/ updated. The very very latest versions (released Friday) don't work properly for us. So, we're running on the slightly older v2.9.4 release of Gerrit. It's a lot newer than what we were running though. ;) 2) The GitHub integration seems to be working. When you next go to http://review.gluster.org, it'll get you to authenticate via GitHub. The bad news: 1) The first time you authenticate to GitHub it will create a brand new account for you, that doesn't have many useful permissions. You will need to email Vijay, Humble, or myself with the account number it creates for you + with your GitHub username. Your account number will probably be something like 10006xx. Mine was 1000668. This new account id needs to be merged into your existing one manually by a Gerrit admin. It's not hard, and only needs to be done once. :) The procedure for migration from an admin perspective is quite involved and account migrations are better done in batches. Instead of mailing any of us directly, can you please update the gerrit migration etherpad [1] once you have signed in using github? This might be a slightly more optimal way of doing this migration :). We will pick up details from the etherpad at a regular frequency. Thanks for taking the trouble & apologies for any inconvenience caused in advance!
Regards, Vijay [1] https://public.pad.fsfe.org/p/gluster-gerrit-migration
Re: [Gluster-devel] [Gluster-infra] Gerrit and Jenkins likely unavailable most of Sunday
The good news: 1) Gerrit is kind of :/ updated. The very very latest versions (released Friday) don't work properly for us. So, we're running on the slightly older v2.9.4 release of Gerrit. It's a lot newer than what we were running though. ;) 2) The GitHub integration seems to be working. When you next go to http://review.gluster.org, it'll get you to authenticate via GitHub. The bad news: 1) The first time you authenticate to GitHub it will create a brand new account for you, that doesn't have many useful permissions. You will need to email Vijay, Humble, or myself with the account number it creates for you + with your GitHub username. Your account number will probably be something like 10006xx. Mine was 1000668. This new account id needs to be merged into your existing one manually by a Gerrit admin. It's not hard, and only needs to be done once. :) 2) Jenkins... didn't even get close to looking at it. So the Jenkins server is out of action for now. :/ The version of Jenkins we're running *may* not be compatible with our new Gerrit version (unsure). Will find out in the morning (after sleep, which I'm really needing atm). + Justin On 19 Apr 2015, at 11:38, Justin Clift jus...@gluster.org wrote: Gerrit and Jenkins are going to be shutting off pretty soon. So, any job running in Jenkins will be aborted. ;) *Please don't* submit new CRs, or run any new Jenkins jobs from now until the upgrade is finished. Even if you see our Gerrit or Jenkins online, don't do stuff with it. ;) + Justin On 18 Apr 2015, at 19:30, Justin Clift jus...@gluster.org wrote: Our Gerrit and Jenkins instances will be getting updated tomorrow. (yay!) It's not very straightforward to do though, so I'll probably shut them down tomorrow morning and they _may_ be offline for a large part of the day. Note - They have to be kept offline from when I do the initial backup for updating, until it's ready. I wish there was a better way... but there doesn't seem to be. :/ Sorry in advance, etc.
Regards and best wishes, Justin Clift
Re: [Gluster-devel] How to search frequently asked questions in gluster mailing lists.
On 18 Apr 2015, at 07:49, Raghavendra Talur raghavendra.ta...@gmail.com wrote: snip Use the second search box, the one below Google search for Gluster. Works for me on both Chrome and Firefox on Android and Fedora 21. Please try again and let me know :) Errr, which second search box? :) http://ded.ninja/gluster/custom_google_search_screenshot.png + Justin
Re: [Gluster-devel] How to search frequently asked questions in gluster mailing lists.
On 18 Apr 2015, at 16:01, Niels de Vos nde...@redhat.com wrote: On Sat, Apr 18, 2015 at 03:14:41PM +0100, Justin Clift wrote: On 18 Apr 2015, at 07:49, Raghavendra Talur raghavendra.ta...@gmail.com wrote: snip Use the second search box, the one below Google search for Gluster. Works for me on both Chrome and Firefox on Android and Fedora 21. Please try again and let me know :) Errr, which second search box? :) http://ded.ninja/gluster/custom_google_search_screenshot.png This one, use Fedora and Firefox? http://i.imgur.com/cCGmOdK.png Are you guys not using ad blockers? + Justin
[Gluster-devel] Gerrit and Jenkins likely unavailable most of Sunday
Our Gerrit and Jenkins instances will be getting updated tomorrow. (yay!) It's not very straightforward to do though, so I'll probably shut them down tomorrow morning and they _may_ be offline for a large part of the day. Note - They have to be kept offline from when I do the initial backup for updating, until it's ready. I wish there was a better way... but there doesn't seem to be. :/ Sorry in advance, etc. Regards and best wishes, Justin Clift
[Gluster-devel] Google OpenID stops working this Sunday
Hi all, Using Google OpenID 2.0 for authentication will stop working from 20th April (this Monday). This is a bit of a problem for us, as many of our developers use it to authenticate with our Gerrit instance. We have some potential ways forward: * Switch to using GitHub OAuth instead This is currently working in a test VM. Seems ok so far. * Switch to using Google OpenID Connect instead Haven't yet gotten this working in our test VM, but am looking into it. Gerrit doesn't seem to let us use multiple authentication providers... it seems like we need to pick one OR the other here. :( (I could be wrong) Personally, I think we should choose the GitHub OAuth method, since we're all going to need GitHub accounts anyway for the Gluster Forge v2. So, it keeps things simple from that perspective. ;) It will probably be a bit messy the *first time* we all log in, as we'll need to merge our new GitHub account info with our existing accounts. After that though, we should be good. Is anyone really against this idea? Regards and best wishes, Justin Clift
Re: [Gluster-devel] Auto-retry failed tests
On 16 Apr 2015, at 05:28, Emmanuel Dreyfus m...@netbsd.org wrote: Hi We all know regression spurious failures are a problem. In order to minimize their impact, the NetBSD regression restarts the whole test suite in case of error, so that spurious failures do not cause an undeserved verified=-1 vote to be cast. This takes time, and as a consequence the netbsd7_regression backlog gets huge in the afternoon. I proposed this change to improve the situation: modify run-tests.sh to retry only the failed tests. That behavior is off by default and can be enabled using run-tests.sh -r http://review.gluster.org/10128/ Having that one merged would not change anything about the way the Linux regression is run right now, and it would let me make the NetBSD regression much faster (until the day spurious regressions are fixed). The concept sounds really useful. :) + Justin
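The retry-only-failures behaviour Emmanuel describes could be sketched roughly as below; this is an illustrative stand-in, not the actual run-tests.sh patch (run_test here fakes a test result via a marker file):

```shell
#!/bin/bash
# Hypothetical sketch of the "-r" idea: run the suite once, remember
# only the failures, then retry just those instead of restarting the
# whole suite. run_test is a stand-in for executing a .t file; here a
# test "passes" when a marker file for it exists.
PASS_DIR="${PASS_DIR:-/tmp}"

run_test() {
    [ -e "$PASS_DIR/pass-$(basename "$1")" ]
}

run_suite() {
    FAILED=()
    for t in "$@"; do
        run_test "$t" || FAILED+=("$t")
    done
}

retry_failed() {
    # Re-run only the previously failed tests; keep whatever still fails.
    local still=()
    for t in "${FAILED[@]}"; do
        run_test "$t" || still+=("$t")
    done
    FAILED=("${still[@]}")
}
```

A run would then be `run_suite tests/*.t` followed by a single `retry_failed` pass, voting -1 only on tests that failed both times.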
Re: [Gluster-devel] release-3.7 branch created [Was 3.7.0 update]
On 15 Apr 2015, at 12:54, Ravishankar N ravishan...@redhat.com wrote: On 04/15/2015 03:31 PM, Justin Clift wrote: On 15 Apr 2015, at 08:09, Ravishankar N ravishan...@redhat.com wrote: On 04/14/2015 11:57 PM, Vijay Bellur wrote: From here on, we would need patches to be explicitly sent on release-3.7 for the content to be included in a 3.7.x release. Please ensure that you send a backport for release-3.7 after the corresponding patch has been accepted on master. Thanks again to everyone who has helped us in getting here. Look forward to more fun and collaboration as we move towards 3.7.0 GA! For the replication arbiter feature, I'm working on the changes that need to be made in the AFR code. Once it gets merged in master, I will back-port it to 3.7 (I'm targeting to get this done before the GA). Apart from that I don't think there are new regressions since 3.6 for AFR, so we should be good to go. How about this one? * tests/basic/afr/sparse-file-self-heal.t (Wstat: 0 Tests: 64 Failed: 35) Failed tests: 1-6, 11, 20-30, 33-34, 36, 41, 50-61, 64 Happens in master (Mon 30th March - git commit id 3feaf1648528ff39e23748ac9004a77595460c9d) (hasn't yet been added to BZ) Being investigated by: ? As per: https://public.pad.fsfe.org/p/gluster-spurious-failures ;) Just tried the test case a couple of times on my laptop on today's master and with the head at the above commit ID; passes every time. :-\ Sure. That's the nature of spurious failures. It's not likely to be trivial to track down... but it *is* important. ;) + Justin
[Gluster-devel] The Gluster Forge -- new and improved version 2 with extra sprinkles
Hi everyone, The Gluster Forge is currently hosted using Gitorious. We're planning on migrating these projects to GitHub in the near future (next few days) - as that's where the majority of the open source community is. The forge.gluster.org URL will then be a very simple website (2 pages!) that: * lists the projects (in categories, for easy finding) * shows the activity for the projects, across all of them (easy to obtain and update stats for hourly, using the GitHub API) So far I've knocked up :) some dodgy Python code to do the hourly stats collection + stick them in a SQLite database: https://github.com/gluster/forge (Pull Requests to make it less dodgy are welcome btw!) As we get projects into GitHub from the current Forge, the config file there needs to be updated to include them. If one of your Gluster Forge v1 projects is already on GitHub, please let me know. (or send a pull request adding it to the config file) Hopefully that's workable for people, and helps us get a bunch more contributors to the projects over time... :D Regards and best wishes, Justin Clift
Re: [Gluster-devel] Regression tests marked as SUCCESS when they shouldn't be
On 16 Apr 2015, at 03:56, Jeff Darcy jda...@redhat.com wrote: Noticing several of the recent regression tests are being marked as SUCCESS in Jenkins (then Gerrit), when they're clearly failing. eg: http://build.gluster.org/job/rackspace-regression-2GB-triggered/6968/console http://build.gluster.org/job/rackspace-regression-2GB-triggered/6969/console http://build.gluster.org/job/rackspace-regression-2GB-triggered/6970/console http://build.gluster.org/job/rackspace-regression-2GB-triggered/6966/console Is this something to do with your patch to try and get failures finishing faster? Yes, I think it probably is. At line 221 we echo the status for debugging, but that means the result of main() is the result of the echo (practically always zero) rather than the real result. All we need to do is take out that echo line. Well, if the echo is important for debugging, we can save the real result status earlier and then return $REAL_STATUS, type of thing. + Justin
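The bug Jeff describes, and Justin's suggested fix, boil down to this tiny reproduction (function names here are illustrative, not the actual run-tests.sh code):

```shell
#!/bin/bash
# Minimal reproduction of the bug discussed above: ending a function
# with an echo makes the function's status the echo's status (0),
# hiding the real failure. Saving the status first, then returning it,
# keeps the debug output without clobbering the result.
broken_main() {
    false                          # the "real" result: failure
    echo "debug: status was $?"    # echo succeeds, so the function returns 0
}

fixed_main() {
    false
    local real_status=$?           # capture the real result first
    echo "debug: status was $real_status"
    return $real_status            # propagate it past the echo
}
```

With the broken version, Jenkins sees exit status 0 and marks the run SUCCESS; the fixed version still prints the debug line but returns the genuine failure status.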
[Gluster-devel] Regression tests marked as SUCCESS when they shouldn't be
Noticing several of the recent regression tests are being marked as SUCCESS in Jenkins (then Gerrit), when they're clearly failing. eg: http://build.gluster.org/job/rackspace-regression-2GB-triggered/6968/console http://build.gluster.org/job/rackspace-regression-2GB-triggered/6969/console http://build.gluster.org/job/rackspace-regression-2GB-triggered/6970/console http://build.gluster.org/job/rackspace-regression-2GB-triggered/6966/console Is this something to do with your patch to try and get failures finishing faster? ;) + Justin
Re: [Gluster-devel] release-3.7 branch created [Was 3.7.0 update]
On 15 Apr 2015, at 08:09, Ravishankar N ravishan...@redhat.com wrote: On 04/14/2015 11:57 PM, Vijay Bellur wrote: From here on, we would need patches to be explicitly sent on release-3.7 for the content to be included in a 3.7.x release. Please ensure that you send a backport for release-3.7 after the corresponding patch has been accepted on master. Thanks again to everyone who has helped us in getting here. Look forward to more fun and collaboration as we move towards 3.7.0 GA! For the replication arbiter feature, I'm working on the changes that need to be made in the AFR code. Once it gets merged in master, I will back-port it to 3.7 (I'm targeting to get this done before the GA). Apart from that I don't think there are new regressions since 3.6 for AFR, so we should be good to go. How about this one? * tests/basic/afr/sparse-file-self-heal.t (Wstat: 0 Tests: 64 Failed: 35) Failed tests: 1-6, 11, 20-30, 33-34, 36, 41, 50-61, 64 Happens in master (Mon 30th March - git commit id 3feaf1648528ff39e23748ac9004a77595460c9d) (hasn't yet been added to BZ) Being investigated by: ? As per: https://public.pad.fsfe.org/p/gluster-spurious-failures ;) + Justin
Re: [Gluster-devel] 3.7.0 update
On 7 Apr 2015, at 11:21, Vijay Bellur vbel...@redhat.com wrote: snip 3. Spurious regression tests listed in [3] to be fixed. To not impede the review & merge workflow on release-3.7/master, I plan to drop those test units which still cause spurious failures by the time we branch release-3.7. Thinking about this more... this feels like the wrong approach. The spurious failures seem to be caused reasonably often by race conditions in our code and similar. Dropping the unfixed spurious tests feels like sweeping the harder/trickier problems under the rug, which means they'll need to be found and fixed later anyway. That's kind of the opposite of "let's resolve all these spurious failures before proceeding" (which should mean "let's fix them, finally"). ;) And yeah, this could delay the release a bit. Personally, I'm ok with that. It's not going to make our release of lower quality, and people not involved in spurious failure fixing are still able to do dev work on master. ? Regards and best wishes, Justin Clift
[Gluster-devel] Possibly root cause for the Gluster regression test cores?
Hi Pranith, Hagarth mentioned in the weekly IRC meeting that you have an idea what might be causing the regression tests to generate cores? Can you outline that quickly, as Jeff has some time and might be able to help narrow it down further. :) (and these core files are really annoying :/) Regards and best wishes, Justin Clift
[Gluster-devel] Need volunteers for spurious failure fixing on master
Hi us, Although we've fixed a few of these spurious failures in our git master branch, we still need a few more people to help out with the rest. :) This is important because these spurious failures affect *everyone* doing any kind of development work on GlusterFS. With these fixed, our regression runs will be a _lot_ quicker, more predictable, and we'll be able to iterate on our code much faster. So... who's ok to help with some of the ungrabbed ones below? :) Need someone to investigate + create the fix * tests/bugs/disperse/bug-1161886.t Fails tests 13-16 because of missing inode.h when building (possibly unnecessary) helper C program 14:04 JustinClift Is that a missing dependency that should be installed on the regression test slaves? 14:05 jdarcy That's a really weird one. It doesn't happen every time. 14:06 jdarcy It's *our* inode.h, which should totally be present long before the test needs it, but somehow it fails to find it once in a while. * tests/bugs/snapshot/bug-1162498.t Need a volunteer to investigate this * tests/basic/quota-nfs.t Need a volunteer to investigate this * tests/performance/open-behind.t Need a volunteer to investigate this * tests/bugs/distribute/bug-1122443.t Need a volunteer to investigate this * tests/basic/afr/sparse-file-self-heal.t Need a volunteer to investigate this * tests/bugs/disperse/bug-1187474.t Need a volunteer to investigate this * tests/basic/fops-sanity.t Need a volunteer to investigate this Needing reviews *** * split-brain-resolution.t Anuradha has a proposed fix here: http://review.gluster.org/#/c/10134/ * /tests/features/ssl-authz.t Jeff has a proposed fix here: http://review.gluster.org/#/c/10075/ Ones under investigation * Core dumps by socket disconnect race Initial analysis: https://bugzilla.redhat.com/show_bug.cgi?id=1195415 Pranith and/or Jeff are looking into this? 
* Random regression test hang : bug-1113960.t Nithya is investigating: https://bugzilla.redhat.com/show_bug.cgi?id=1209340 The Etherpad for co-ordinating this *** https://public.pad.fsfe.org/p/gluster-spurious-failures Regards and best wishes, Justin Clift
Re: [Gluster-devel] Possibly root cause for the Gluster regression test cores?
On 8 Apr 2015, at 14:13, Pranith Kumar Karampuri pkara...@redhat.com wrote: On 04/08/2015 06:20 PM, Justin Clift wrote: snip Hagarth mentioned in the weekly IRC meeting that you have an idea what might be causing the regression tests to generate cores? Can you outline that quickly, as Jeff has some time and might be able to help narrow it down further. :) (and these core files are really annoying :/) I feel it is a lot like https://bugzilla.redhat.com/show_bug.cgi?id=1184417. The clear-locks command is not handled properly after we did the client_t refactor. I believe that is the reason for the crashes, but I could be wrong. After looking at the code, though, I feel there is a high probability that this is the issue. I didn't find it easy to fix. We will need to change the lock structure list maintenance heavily. An easier thing would be to disable the clear-locks functionality tests in the regression, as it is not something that is used by users IMO, and see if it indeed is the same issue. There are 2 tests using this command: 18:34:00 :) ⚡ git grep clear-locks tests tests/bugs/disperse/bug-1179050.t:TEST $CLI volume clear-locks $V0 / kind all inode tests/bugs/glusterd/bug-824753-file-locker.c: gluster volume clear-locks %s /%s kind all posix 0,7-1 | If even after disabling these two tests it fails then we will need to look again. I think Jeff's patch, which will find the test which triggered the core, should help here. Thanks Pranith. :) Is this other BZ (a crash when disconnecting) possibly related, or is that a different thing? https://bugzilla.redhat.com/show_bug.cgi?id=1195415 + Justin
[Gluster-devel] Shutting down Gerrit for a few minutes
Just an FYI. Shutting down Gerrit for a few minutes, to move around some files on the Gerrit server (need to free up space urgently). Shouldn't be too long. (fingers crossed) :) + Justin
Re: [Gluster-devel] [Gluster-infra] Shutting down Gerrit for a few minutes
On 7 Apr 2015, at 15:45, Justin Clift jus...@gluster.org wrote: Just an FYI. Shutting down Gerrit for a few minutes, to move around some files on the Gerrit server (need to free up space urgently). Shouldn't be too long. (fingers crossed) :) ... and it hasn't returned from rebooting after yum update. :( We're investigating. Sorry for the longer-than-expected outage. :/ + Justin
Re: [Gluster-devel] Shutting down Gerrit for a few minutes
On 7 Apr 2015, at 16:31, Justin Clift jus...@gluster.org wrote: On 7 Apr 2015, at 15:45, Justin Clift jus...@gluster.org wrote: Just an FYI. Shutting down Gerrit for a few minutes, to move around some files on the Gerrit server (need to free up space urgently). Shouldn't be too long. (fingers crossed) :) ... and it hasn't returned from rebooting after yum update. :( We're investigating. Sorry for the longer-than-expected outage. :/ It's back up and running again. A bunch of space has been freed up on the filesystems for it, git gc has been run on each of the git repos, and the packages have all been updated via yum. (except Gerrit, which isn't yum installed) It _seems_ to be working ok now, for the initial git checkout I just tried. If something acts up though, please let us know. :) Regards and best wishes, Justin Clift
Re: [Gluster-devel] Gluster 3.6.2 On Xeon Phi
On 12 Feb 2015, at 08:54, Mohammed Rafi K C rkavu...@redhat.com wrote: On 02/12/2015 08:32 AM, Rudra Siva wrote: Rafi, I'm preparing the Phi RDMA patch for submission If you can send a patch to support iWARP, that will be a great addition to gluster rdma. Clearing out older email... did this patch get submitted and merged? :) Regards and best wishes, Justin Clift
Re: [Gluster-devel] Security hardening RELRO PIE flags
On 31 Mar 2015, at 08:15, Niels de Vos nde...@redhat.com wrote: On Tue, Mar 31, 2015 at 12:20:19PM +0530, Kaushal M wrote: IMHO, doing hardening and security should be left to the individual distributions and the package maintainers. Generally, each distribution has its own policies with regards to hardening and security. We as an upstream project cannot decide on what a distribution should do. But we should be ready to fix bugs that could arise when distributions do hardened builds. So, I vote against having these hardening flags added to the base GlusterFS build. But we could add the flags to the Fedora spec files which we carry with our source. Indeed, I agree that the compiler flags should be specified by the distributions. At least Fedora and Debian already include (probably different) options within their packaging scripts. We should set the flags we need, but not more. It would be annoying to set default flags that can conflict with others, or which are not (yet) available on architectures that we normally do not test. First thoughts: :) * We provide our own packaging scripts + distribute rpms/debs from our own site too. Should we investigate/try these flags out for the packages we build + supply? * Are there changes in our code + debugging practises that would be needed for these security hardening flags to work? If there are, and we don't make these changes ourselves, doesn't that mean we're telling distributions they need to carry their own patch set in order to have a more secure GlusterFS? + Justin
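For context, the distro-style RELRO/PIE options under discussion typically look something like the following; this is an illustrative sketch of common GCC hardening flags (which vary between distributions), not an agreed set of GlusterFS build flags:

```shell
# Illustrative only: typical RELRO/PIE hardening flags a distribution
# might pass into a build; exact flags differ per distro and toolchain.
export CFLAGS="-O2 -g -fPIE -fstack-protector-strong -D_FORTIFY_SOURCE=2"
export LDFLAGS="-pie -Wl,-z,relro,-z,now"
./configure
```

`-Wl,-z,relro,-z,now` gives full RELRO (read-only GOT after startup), and `-fPIE`/`-pie` enables position-independent executables so ASLR applies to the main binary; these are exactly the sort of defaults Kaushal and Niels suggest leaving to the packaging layer.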
Re: [Gluster-devel] Adventures in building GlusterFS
On 2 Apr 2015, at 12:10, Vijay Bellur vbel...@redhat.com wrote: On 04/02/2015 06:27 AM, Jeff Darcy wrote: My recommendations: (1) Apply the -Wno-error=cpp and -Wno-error=maybe-uninitialized changes wherever they need to be applied so that they're effective during normal regression builds Thanks, Jeff. Justin - would it be possible to do this change as well in build.sh? The regression builds seem to be running again at the moment without removing -Werror. So I'm not sure if this needs adjusting any more? + Justin
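If the build.sh change were still wanted, Jeff's recommendation (1) would presumably amount to something like the following configure invocation; the exact flag placement in the real build.sh is an assumption here:

```shell
# Illustrative sketch: keep -Werror but downgrade the two warning
# classes that were breaking regression builds back to plain warnings.
# Where these flags actually belong in build.sh may differ.
./configure CFLAGS="-g -O0 -Werror -Wno-error=cpp -Wno-error=maybe-uninitialized"
```

The effect is that #warning pragmas and maybe-uninitialized diagnostics still print, but no longer fail the build the way other warnings do under -Werror.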
Re: [Gluster-devel] Adventures in building GlusterFS
On 2 Apr 2015, at 01:57, Jeff Darcy jda...@redhat.com wrote: As many of you have undoubtedly noticed, we're now in a situation where *all* regression builds are failing, with something like this:

cc1: warnings being treated as errors
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c: In function ‘glusterd_snap_quorum_check_for_create’:
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2615: error: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’

The reason is that -Werror was turned on earlier today. I'm not quite sure how or where, because the version of build.sh that I thought builds would use doesn't seem to have changed since September 8, but then there's a lot about this system I don't understand. Vijay (who I believe made the change) knows it better than I ever will. A. This was me. Noticed the lack of -Werror last night, and immediately fixed it. Then hit the sack shortly after. Umm... Sorry? :/ + Justin
Re: [Gluster-devel] Adventures in building GlusterFS
On 2 Apr 2015, at 12:10, Vijay Bellur vbel...@redhat.com wrote: On 04/02/2015 06:27 AM, Jeff Darcy wrote: My recommendations: (1) Apply the -Wno-error=cpp and -Wno-error=maybe-uninitialized changes wherever they need to be applied so that they're effective during normal regression builds Thanks, Jeff. Justin - would it be possible to do this change as well in build.sh? Sure. What needs changing from here? https://github.com/justinclift/glusterfs_patch_acceptance_tests/blob/master/build.sh + Justin
Re: [Gluster-devel] Adventures in building GlusterFS
On 2 Apr 2015, at 01:57, Jeff Darcy jda...@redhat.com wrote: snip (1) Apply the -Wno-error=cpp and -Wno-error=maybe-uninitialized changes wherever they need to be applied so that they're effective during normal regression builds The git repo which holds our CentOS build and regression testing scripts is here: https://review.gerrithub.io/#/admin/projects/justinclift/glusterfs_patch_acceptance_tests https://github.com/justinclift/glusterfs_patch_acceptance_tests It's being used as a test bunny to try out GerritHub. (May end in rabbit soup. I do not like rabbit soup. :/) The build bit in it is (bash script):

P=/build
./configure --prefix=$P/install --with-mountutildir=$P/install/sbin --with-initdir=$P/install/etc --localstatedir=/var --enable-bd-xlator=yes --enable-debug --silent
make install CFLAGS="-g -O0 -Wall -Werror" -j 4

With the -Werror added last night. Should we adjust it? Regards and best wishes, Justin Clift
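For reference, Jeff's recommendation (1) applied to the build snippet above would amount to something like this. Treat it as a sketch: exactly where the flags land in build.sh may differ, and whether the gcc on the CentOS 6 slaves accepts -Wno-error=maybe-uninitialized is an assumption to verify.

```shell
#!/bin/sh
# Sketch only: keep -Werror overall, but downgrade the two warning
# classes Jeff named back to plain warnings so they don't fail the build.
P=/build
WERROR_EXCEPTIONS="-Wno-error=cpp -Wno-error=maybe-uninitialized"
BUILD_CFLAGS="-g -O0 -Wall -Werror ${WERROR_EXCEPTIONS}"
# ./configure line unchanged from the snippet quoted above; then:
echo make install CFLAGS=\"${BUILD_CFLAGS}\" -j 4
```

Note the quoting around CFLAGS: without it, make only receives the first word and silently drops the rest of the flags.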
Re: [Gluster-devel] Multiple verify in gerrit
On 2 Apr 2015, at 05:18, Emmanuel Dreyfus m...@netbsd.org wrote: Hi, I am now convinced the solution to our multiple regression problem is to introduce more Gluster Build System users: one for CentOS regression, another one for NetBSD regression (and one for each smoke test, as explained below). I just tested it on http://review.gluster.org/10052, and here is what gerrit displays in the verified column:
- if there are neither verified=+1 nor verified=-1 votes cast: nothing
- if there is at least one verified=+1 and no verified=-1: verified
- if there is at least one verified=-1: failed
Therefore if CentOS regression uses bu...@review.gluster.org to report results and NetBSD regression uses nb7bu...@review.gluster.org (the latter user would need to be created), we achieve this outcome:
- gerrit will display a change as verified if one regression reported it as verified and the other either also succeeded or failed to report
- gerrit will display a change as failed if one regression reported it as failed, regardless of what the other reported.
There is still one minor problem: if one regression does not report, or reports late, we can have the feeling that a change is verified when it should not be, and its status can change later. But this is a minor issue compared to the current status. Other ideas:
- smoke builds should also report as different gerrit users, so that a verified=+1 regression result does not override a verified=-1 smoke build result
- when we get a regression failure, we could cast the verified vote to gerrit and immediately schedule another regression run. That way we could automatically work around spurious failures without the need for a retrigger in Jenkins.
You're probably right. :) I'll set up test / sandbox VM's today using last night's backup of our Gerrit setup, then we can try stuff out on it to make sure. Give me a few hours though. ;) It needs to be able to communicate with stuff on the internet for OpenID to work, but be unable to affect our Jenkins box, Forge/GitHub/etc.
Best way I've thought of for doing that (so far) is adding entries in /etc/hosts that point the things we don't want it communicating with at bogus IP addresses. The other option might be to just use the built-in iptables firewall to disallow all communications except for whitelisted addresses. Will figure it out in a few hours. ;) + Justin
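A minimal sketch of the /etc/hosts approach described above, shown in dry-run form. The hostnames are placeholders (the real Jenkins/Forge names are not given here); each blocked name is pointed at an unroutable address so OpenID traffic still works while the listed hosts become unreachable.

```shell
#!/bin/sh
# Placeholder hostnames; substitute the real infrastructure names.
BLOCKED_HOSTS="jenkins.example.org forge.example.org github.example.org"
# Build /etc/hosts entries that send each blocked host to a black hole.
HOSTS_ENTRIES=$(for h in $BLOCKED_HOSTS; do echo "0.0.0.0 $h"; done)
echo "$HOSTS_ENTRIES"
# On the sandbox VM these lines would then be appended, e.g.:
#   echo "$HOSTS_ENTRIES" >> /etc/hosts
```

One caveat with this approach: it only blocks lookups by hostname, so anything that connects by raw IP address would still get through; the iptables whitelist mentioned above is the tighter option.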
Re: [Gluster-devel] Coredump in master :/
On 1 Apr 2015, at 19:47, Jeff Darcy jda...@redhat.com wrote: When doing an initial burn in test (regression run on master head of GlusterFS git), it coredumped on the new slave23.cloud.gluster.org VM. (yeah, I'm reusing VM names) http://build.gluster.org/job/regression-test-burn-in/16/console Does anyone have time to check the coredump, and see if this is the bug we already know about? This is *not* the same as others I've seen. There are no threads in the usual connection-cleanup/list_del code. Rather, it looks like some are in generic malloc code, possibly indicating some sort of arena corruption. Is it ok to put slave23.cloud.gluster.org into general rotation, so it runs regression jobs along with the rest? + Justin
Re: [Gluster-devel] Coredump in master :/
On 2 Apr 2015, at 14:42, Jeff Darcy jda...@redhat.com wrote: Is it ok to put slave23.cloud.gluster.org into general rotation, so it runs regression jobs along with the rest? Sounds OK to me. Do we have a place to store the core tarball, just in case we decide we need to go back to it some day? Yep. They're now here: http://ded.ninja/gluster/slave23.cloud.gluster.org/ Should be safe for a couple of months at least. In theory. ;) + Justin
Re: [Gluster-devel] Security hardening RELRO PIE flags
On 2 Apr 2015, at 14:08, Niels de Vos nde...@redhat.com wrote: On Thu, Apr 02, 2015 at 01:21:57PM +0100, Justin Clift wrote: On 31 Mar 2015, at 08:15, Niels de Vos nde...@redhat.com wrote: On Tue, Mar 31, 2015 at 12:20:19PM +0530, Kaushal M wrote: IMHO, hardening and security should be left to the individual distributions and the package maintainers. Generally, each distribution has its own policies with regard to hardening and security. We as an upstream project cannot decide what a distribution should do, but we should be ready to fix bugs that could arise when distributions do hardened builds. So, I vote against having these hardening flags added to the base GlusterFS build. But we could add the flags to the Fedora spec files which we carry with our source. Indeed, I agree that the compiler flags should be specified by the distributions. At least Fedora and Debian already include (probably different) options within their packaging scripts. We should set the flags we need, but no more. It would be annoying to set default flags that can conflict with others, or which are not (yet) available on architectures that we normally do not test. First thoughts: :) * We provide our own packaging scripts + distribute rpms/debs from our own site too. Should we investigate/try these flags out for the packages we build + supply? At least for the RPMs, we try to follow the Fedora guidelines and their standard flags. With recent Fedora releases this includes additional hardening flags. * Are there changes in our code + debugging practises that would be needed for these security hardening flags to work? If there are, and we don't make these changes ourselves, doesn't that mean we're telling distributions they need to carry their own patch set in order to have a more secure GlusterFS? We have received several patches from the Debian maintainer that improve the handling of these options.
When maintainers for distributions build GlusterFS and require changes, they either file bugs and/or send patches. I think this works quite well. Thanks Niels. Sounds like we're already in good shape then. :) + Justin
Re: [Gluster-devel] Got a slogan idea?
On 1 Apr 2015, at 13:14, Tom Callaway tcall...@redhat.com wrote: Hello Gluster Ant People! Right now, if you go to gluster.org, you see our current slogan in giant text: Write once, read everywhere However, no one seems to be super-excited about that slogan. It doesn't really help differentiate gluster from a portable hard drive or a paperback book. I am going to work with Red Hat's branding geniuses to come up with some possibilities, but sometimes, the best ideas come from the people directly involved with a project. What I am saying is that if you have a slogan idea for Gluster, I want to hear it. You can reply on list or send it to me directly. I will collect all the proposals (yours and the ones that Red Hat comes up with) and circle back around for community discussion in about a month or so. Gluster: Scale out your data. Safely. :)
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
On 1 Apr 2015, at 17:20, Marcelo Barbosa fireman...@fedoraproject.org wrote: yep, we use the Gerrit Trigger plugin on Jenkins; it triggers on all requests for all repositories and all branches, running the Auto QA tests and voting with the verified ACL, for example: https://gerrit.ovirt.org/#/c/37886/ my intention is to put together a deployment with one complete example of using this, coming soon :D We already use the Gerrit Trigger plugin extensively. :) Now we need to get more advanced with it... In our existing setup, we have a bunch of fast tests that run automatically (triggered), and which vote. We use this for initial smoke testing, to locate (fail on) obvious problems quickly. We also have a much more in-depth CentOS 6.x regression test (~2 hours run time) that's triggered. It doesn't vote using the Gerrit trigger method. Instead it calls back via ssh to Gerrit, indicating SUCCESS or FAILURE. That status goes into a column in the Gerrit CR, to show whether it passed the regression test or not. Now... we have a NetBSD 7.x regression test (~2 hour run time) that's also triggered. We need to find a way for this to communicate back to Gerrit. We've tried using the same method used for our CentOS 6.x communication back, but the status results for the NetBSD tests conflict with the CentOS status results. We need some kind of solution. Hoping you have ideas? :) We're ok with pretty much anything that works and can be set up ASAP. Like today. :) + Justin
Re: [Gluster-devel] crypt xlator bug
On 1 Apr 2015, at 10:57, Emmanuel Dreyfus m...@netbsd.org wrote: Hi, crypt.t was recently broken in NetBSD regression. glusterfs returns a node with an invalid file type to FUSE, and that breaks the test. After running a git bisect, I found the offending commit after which this behavior appeared: 8a2e2b88fc21dc7879f838d18cd0413dd88023b7 mem-pool: invalidate memory on GF_FREE to aid debugging This means the bug has always been there, but this debugging aid made it reliably reproducible. Sounds like that commit is a good win then. :) Harsha/Pranith/Lala, your names are on the git blame for crypt.c... any ideas? :) + Justin
Re: [Gluster-devel] Extra overnight regression test run results
On 1 Apr 2015, at 03:48, Justin Clift jus...@gluster.org wrote: On 31 Mar 2015, at 14:18, Shyam srang...@redhat.com wrote: snip Also, most of the regression runs produced cores. Here are the first two: http://ded.ninja/gluster/blk0/ There are 4 cores here, 3 pointing to the (by now hopefully) famous bug #1195415. One of the cores exhibits a different stack etc. Need more analysis to see what the issue could be here; core file: core.16937 http://ded.ninja/gluster/blk1/ There is a single core here, pointing to the above bug again. Both the blk0 and blk1 VM's are still online and available, if that's helpful? If not, please let me know and I'll nuke them. :) I'm ok to nuke both those VM's, yeah? + Justin
[Gluster-devel] Coredump in master :/
Hi us, Adding some more CentOS 6.x regression testing VM's at the moment, to cope with the current load. When doing an initial burn in test (regression run on master head of GlusterFS git), it coredumped on the new slave23.cloud.gluster.org VM. (yeah, I'm reusing VM names) http://build.gluster.org/job/regression-test-burn-in/16/console Does anyone have time to check the coredump, and see if this is the bug we already know about? Regards and best wishes, Justin Clift
Re: [Gluster-devel] Coredump in master :/
On 1 Apr 2015, at 19:51, Shyam srang...@redhat.com wrote: On 04/01/2015 02:47 PM, Jeff Darcy wrote: When doing an initial burn in test (regression run on master head of GlusterFS git), it coredumped on the new slave23.cloud.gluster.org VM. (yeah, I'm reusing VM names) http://build.gluster.org/job/regression-test-burn-in/16/console Does anyone have time to check the coredump, and see if this is the bug we already know about? This is *not* the same as others I've seen. There are no threads in the usual connection-cleanup/list_del code. Rather, it looks like some are in generic malloc code, possibly indicating some sort of arena corruption. This looks like the other core I saw yesterday, which was not the usual connection cleanup stuff. Adding this info here, as this brings the core count up to 2. One here, and the other in core.16937 : http://ded.ninja/gluster/blk0/ Oh, I just noticed there's a bunch of compile warnings at the top of the regression run:

libtool: install: warning: relinking `server.la'
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c: In function ‘glusterd_snap_quorum_check_for_create’:
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2615: warning: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c: In function ‘glusterd_snap_quorum_check’:
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2788: warning: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2803: warning: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.c: In function ‘glusterd_get_quorum_cluster_counts’:
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.c:230: warning: comparison of distinct pointer types lacks a cast
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.c:236: warning: comparison of distinct pointer types lacks a cast
libtool: install: warning: relinking `glusterd.la'
libtool: install: warning: relinking `posix-acl.la'

Related / smoking-gun? + Justin
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
On 1 Apr 2015, at 17:38, Emmanuel Dreyfus m...@netbsd.org wrote: Justin Clift jus...@gluster.org wrote: We need some kind of solution. What about adding another nb7build user in gerrit? That way results will not conflict. I'm not sure. However, Vijay's now added me as an admin in our production Gerrit instance, and I have the process for restoring our backups in a local VM (on my desktop) worked out now. So... I can test this tomorrow morning and try it out. Then we'll know for sure. :) + Justin
Re: [Gluster-devel] Coredump in master :/
On 1 Apr 2015, at 20:09, Vijay Bellur vbel...@redhat.com wrote: snip My sanity run got blown due to this as I use -Wall -Werror during compilation. Submitted http://review.gluster.org/10105 to correct this. Should we add -Wall -Werror to the compile options for our CentOS 6.x regression runs? + Justin
Re: [Gluster-devel] Coredump in master :/
On 1 Apr 2015, at 20:22, Vijay Bellur vbel...@redhat.com wrote: On 04/02/2015 12:46 AM, Justin Clift wrote: On 1 Apr 2015, at 20:09, Vijay Bellur vbel...@redhat.com wrote: snip My sanity run got blown due to this as I use -Wall -Werror during compilation. Submitted http://review.gluster.org/10105 to correct this. Should we add -Wall -Werror to the compile options for our CentOS 6.x regression runs? I would prefer doing that for CentOS 6.x at least. k, that's been done. All of the regression tests currently queued up for master and release-3.6 are probably going to self destruct now though. (just thought of that. oops) + Justin
[Gluster-devel] Extra overnight regression test run results
Hi all, Ran 20 x regression test jobs on (severely resource constrained) 1GB Rackspace VM's last night (in addition to the 20 x normal-VM ones also run). The 1GB VM's have much, much slower disk, only one virtual CPU, and half the RAM of our standard 2GB testing VMs. These are the failure results:

* 20 x tests/basic/mount-nfs-auth.t Failed test: 40 100% fail rate ;)
* 20 x tests/basic/uss.t Failed tests: 149, 151-153, 157-159 100% fail rate
* 11 x tests/bugs/distribute/bug-1117851.t Failed test: 15 55% fail rate
* 2 x tests/performance/open-behind.t Failed test: 17 10% fail rate
* 1 x tests/basic/afr/self-heald.t Failed tests: 13-14, 16, 19-29, 32-50, 52-65, 67-75, 77, 79-81 5% fail rate
* 1 x tests/basic/afr/entry-self-heal.t Failed tests: 127-128 5% fail rate
* 1 x tests/features/trash.t Failed test: 57 5% fail rate

Wouldn't surprise me if some/many of the failures are due to timeouts of various sorts in the tests. Very slow VMs. ;) Also, most of the regression runs produced cores. Here are the first two: http://ded.ninja/gluster/blk0/ http://ded.ninja/gluster/blk1/ Hoping someone has some time to check those quickly and see if there's anything useful in them or not. (the hosts are all still online atm, shortly to be nuked) Regards and best wishes, Justin Clift
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
On 1 Apr 2015, at 03:03, Emmanuel Dreyfus m...@netbsd.org wrote: Jeff Darcy jda...@redhat.com wrote: That's fine. I left a note for you in the script, regarding what I think it needs to do at that point. Here is the comment:

# We shouldn't be touching CR at all. For V, we should set V+1 iff this
# test succeeded *and* the value was already 0 or 1, V-1 otherwise. I
# don't know how to do that, but the various smoke tests must be doing
# something similar/equivalent. It's also possible that this part should
# be done as a post-build action instead.

The problem is indeed that we do not know how to retrieve the previous V value. I guess gerrit is the place where V combinations should be correctly handled. What is the plan for NetBSD regression now? It will fail anything which has not been rebased after recent fixes were merged, but apart from that the thing is in rather good shape right now. It sounds like we need a solution that has both the NetBSD and CentOS regressions run, and only gives the +1 when both of them have successfully finished. If either of them fails, then it gets a -1. Research time. ;) + Justin
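The combination rule being asked for can be stated precisely as a tiny decision function. This is a sketch of the desired policy only, not of any actual Gerrit/Jenkins wiring, and it simplifies by treating anything other than SUCCESS (including "not yet reported") as a failure:

```shell
#!/bin/sh
# Desired policy from the discussion above: Verified gets +1 only when
# both regressions succeeded, and -1 as soon as either one fails.
combine_verified() {
    centos="$1"
    netbsd="$2"
    if [ "$centos" = "SUCCESS" ] && [ "$netbsd" = "SUCCESS" ]; then
        echo "+1"
    else
        echo "-1"
    fi
}
combine_verified SUCCESS SUCCESS   # prints +1
combine_verified SUCCESS FAILURE   # prints -1
```

The hard part, as the thread notes, is that implementing this from a Jenkins job requires reading back the previous Verified value from Gerrit, which is exactly what nobody knew how to do at this point.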
Re: [Gluster-devel] Extra overnight regression test run results
On 31 Mar 2015, at 14:18, Shyam srang...@redhat.com wrote: snip Also, most of the regression runs produced cores. Here are the first two: http://ded.ninja/gluster/blk0/ There are 4 cores here, 3 pointing to the (by now hopefully) famous bug #1195415. One of the cores exhibits a different stack etc. Need more analysis to see what the issue could be here; core file: core.16937 http://ded.ninja/gluster/blk1/ There is a single core here, pointing to the above bug again. Both the blk0 and blk1 VM's are still online and available, if that's helpful? If not, please let me know and I'll nuke them. :) + Justin
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
On 1 Apr 2015, at 04:07, Emmanuel Dreyfus m...@netbsd.org wrote: Justin Clift jus...@gluster.org wrote: It sounds like we need a solution to have both the NetBSD and CentOS regressions run, and only give the +1 when both of them have successfully finished. If either of them fail, then it gets a -1. That, or perhaps we could have two verified fields? Sure. Whichever works. :) Personally, I'm not sure how to do either yet. + Justin
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
On 1 Apr 2015, at 05:04, Emmanuel Dreyfus m...@netbsd.org wrote: Justin Clift jus...@gluster.org wrote: That, or perhaps we could have two verified fields? Sure. Whichever works. :) Personally, I'm not sure how to do either yet. In http://build.gluster.org/gerrit-trigger/ you have Verdict categories with CRVW (code review) and VRIF (verified), and there is an add verdict category, which suggests this is something that can be done. Of course the Gerrit side will need some configuration too, but if Jenkins can deal with more Gerrit fields, there must be a way to add fields in Gerrit. Interesting. Marcelo, this sounds like something you'd know about. Any ideas? :) We're trying to add an extra Verified column to our Gerrit + Jenkins setup. We have an existing one for Gluster Build System (which is our CentOS regression testing). Now we want to add one for our NetBSD regression testing. Regards and best wishes, Justin Clift
Re: [Gluster-devel] Extra overnight regression test run results
On 31 Mar 2015, at 17:43, Nithya Balachandran nbala...@redhat.com wrote: snip * 11 x tests/bugs/distribute/bug-1117851.t Failed test: 15 55% fail rate Is the test output for the bug-1117851.t failure available anywhere? Not at the moment. It would be really easy to set up a new VM with a failure of this, and give you access to it, if that would help? + Justin
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
On 1 Apr 2015, at 00:48, Jeff Darcy jda...@redhat.com wrote: The following Gerrit patchsets were affected:
http://review.gluster.org/#/c/9557/ (Nandaja Varma) changelog: Fixing buffer overrun coverity issues
http://review.gluster.org/#/c/9981/ (Pranith Kumar Karampuri) cluster/ec: Refactor inode-writev
http://review.gluster.org/#/c/9970/ (Kotresh HR) extras: Fix stop-all-gluster-processes.sh script
http://review.gluster.org/#/c/10075/ (Jeff Darcy) socket: use OpenSSL multi-threading interfaces (this one nuked a CR+1 (from Kaleb) as well as V+1)
In the absence of any other obvious way to fix this up, I'll start new jobs for these momentarily. Found another one: http://review.gluster.org/#/c/9859/ (Raghavendra Talur) libglusterfs/syncop: Add xdata to all syncop calls Started a new job for that one too.

If you have a build.gluster.org login, fixing this is pretty simple. It doesn't need the job to be re-run. ;) All you need to do is change to the jenkins user (on build.gluster.org), then run the command that's at the bottom of the regression test run. For example, looking at the regression run for the first issue in your list: http://build.gluster.org/job/rackspace-regression-2GB-triggered/6198/consoleFull At the very end of the regression run, it shows this:

ssh bu...@review.gluster.org gerrit review --message ''\''http://build.gluster.org/job/rackspace-regression-2GB-triggered/6198/consoleFull : SUCCESS'\''' --project=glusterfs --verified=+1 --code-review=0 ab9bdb54f89a6f8080f8b338b32b23698e9de515

Running that command from the jenkins user on build.gluster.org resends the SUCCESS message to Gerrit:

[homepc]$ ssh build.gluster.org
[justin@build]$ sudo su - jenkins
[jenkins@build]$ ssh bu...@review.gluster.org gerrit review --message ''\''http://build.gluster.org/job/rackspace-regression-2GB-triggered/6198/consoleFull : SUCCESS'\''' --project=glusterfs --verified=+1 --code-review=0 ab9bdb54f89a6f8080f8b338b32b23698e9de515
[jenkins@build]$

And it's done.
;) I've done the first one. I'll leave the others for you, so you can pick up the process. :) Regards and best wishes, Justin Clift
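The manual steps above lend themselves to a small helper. This is a hypothetical wrapper (the function name and structure are mine, not part of the actual setup), shown in dry-run form so it only prints the command it would run; the obfuscated bu...@review.gluster.org address is kept exactly as it appears in the archive.

```shell
#!/bin/sh
# Hypothetical helper: re-send a regression SUCCESS vote to Gerrit for a
# given job URL and commit SHA. Dry-run: it echoes the command instead
# of running it. To execute for real, drop the leading echo and run as
# the jenkins user on build.gluster.org.
resend_success() {
    job_url="$1"
    sha="$2"
    echo ssh bu...@review.gluster.org gerrit review \
        --message "'${job_url} : SUCCESS'" \
        --project=glusterfs --verified=+1 --code-review=0 "$sha"
}
RESULT=$(resend_success \
    "http://build.gluster.org/job/rackspace-regression-2GB-triggered/6198/consoleFull" \
    ab9bdb54f89a6f8080f8b338b32b23698e9de515)
echo "$RESULT"
```

The option names (--message, --project, --verified, --code-review) mirror the command quoted in the message above; only the wrapping is new.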
Re: [Gluster-devel] Spurious Failures in regression runs
On 30 Mar 2015, at 18:54, Vijay Bellur vbel...@redhat.com wrote:
> Hi All,
>
> We are attempting to capture all known spurious regression failures from the jenkins instance on build.gluster.org at [1]. The issues listed in the etherpad impede our patch merging workflow and need to be sorted out before we branch release-3.7.
>
> If you happen to be the owner of one or more issues in the etherpad, can you please look into the failures and have them addressed soon?

To help show up more regression failures, we ran 20 new VMs in Rackspace, each doing a full regression test of the master branch head:

 * Two hung regression tests on tests/bugs/posix/bug-1113960.t
   * Still hung, in case anyone wants to check them out:
     * 162.242.167.96
     * 162.242.167.132
   * Both allow remote root login, using our jenkins slave password as their root pw

 * 2 x failures on ./tests/basic/afr/sparse-file-self-heal.t
   Failed tests: 1-6, 11, 20-30, 33-34, 36, 41, 50-61, 64
   Added to etherpad

 * 1 x failure on ./tests/bugs/disperse/bug-1187474.t
   Failed tests: 11-12
   Added to etherpad

 * 1 x failure on ./tests/basic/uss.t
   Failed test: 153
   Already on etherpad

Looks like our general failure rate is improving. :) The hangs are a bit worrying though. :(

Regards and best wishes,

Justin Clift
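For reference, a quick way to reproduce this kind of spurious-failure hunt on a single box is just to loop a test and count failures. A rough sketch follows: the helper name is made up, and it invokes the .t file with plain sh rather than the prove-based run-tests.sh harness the project actually uses, so treat it as an illustration of the counting loop only:

```shell
# Run one test script N times and report how often it fails.
# Made-up helper; real regression runs go through run-tests.sh / prove.
count_failures() {
    script=$1
    runs=$2
    fails=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # A nonzero exit status counts as one failed run.
        if ! sh "$script" >/dev/null 2>&1; then
            fails=$((fails + 1))
        fi
        i=$((i + 1))
    done
    echo "$fails/$runs failed"
}

# Usage (path is an example):
#   count_failures ./tests/basic/uss.t 20
```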
Re: [Gluster-devel] NetBSD regression recovered (with patches)
On 29 Mar 2015, at 21:31, Emmanuel Dreyfus m...@netbsd.org wrote:
> Hi,
>
> We now have the patches to recover NetBSD regression. These need to be reviewed and merged (it would be nice if that could happen before release-3.7 branching):
> http://review.gluster.org/10030
> http://review.gluster.org/10032
> http://review.gluster.org/10033
> http://review.gluster.org/10034
> http://review.gluster.org/9831
> http://review.gluster.org/9944
>
> I also have the following pending change that may deserve a look before branching:
> http://review.gluster.org/10017
>
> Once all that is merged, we will still have a few failing cases for which I need some help (see details below):
> tests/basic/afr/split-brain-resolution.t
> tests/basic/ec/
> tests/basic/tier/tier.t
> tests/encryption/crypt.t
> tests/features/trash.t
>
> I will disable them so that the NetBSD regression vote can be useful, but it would be nice if people could help fix them. Here are the details:
>
> 1) tests/basic/afr/split-brain-resolution.t
>    100% reliable failure; I posted an analysis of it in 1m21hnh.yiu9vbyqrhg9m%m...@netbsd.org
>
> 2) tests/basic/ec tests have rare spurious failures; Xavier Hernandez said he will look at it when time allows.
>
> 3) tests/basic/tier/tier.t has a 100% reliable failure on nbslave70:
>    [19:19:07] ./tests/basic/tier/tier.t .. 20/32 md5: /d/backends/patchy1/d1/data2.txt: No such file or directory
>    [19:19:07] ./tests/basic/tier/tier.t .. 23/32 not ok 23 Got 1 instead of 0
>    md5: /d/backends/patchy1/d1/data3.txt: No such file or directory
>    [19:19:07] ./tests/basic/tier/tier.t .. 24/32 not ok 24 Got 1 instead of 0
>    umount: /mnt/glusterfs/0: Device busy
>    [19:19:07] ./tests/basic/tier/tier.t ..
>    Failed 2/32 subtests
>    ./tests/basic/tier/tier.t (Wstat: 0 Tests: 32 Failed: 2)
>    Failed tests: 23-24
>
>    Note that tier.t needs a portability patch not yet submitted:
>
>    diff --git a/tests/basic/tier/tier.t b/tests/basic/tier/tier.t
>    index 383d470..9cc754a 100755
>    --- a/tests/basic/tier/tier.t
>    +++ b/tests/basic/tier/tier.t
>    @@ -96,7 +96,7 @@
>     sleep 12
>     uuidgen > d1/data2.txt
>     # Check promotion on read to slow tier
>    -echo 3 > /proc/sys/vm/drop_caches
>    +( cd $M0 && umount $M0 )
>     cat d1/data3.txt
>     sleep 5
>     EXPECT_WITHIN $PROMOTE_TIMEOUT 0 file_on_fast_tier d1/data2.txt
>
> 4) tests/encryption/crypt.t has a 100% reliable error on nbslave70. This is annoying because that one passed before:
>    [19:21:21] ./tests/encryption/crypt.t .. 19/39 ln: /mnt/glusterfs/0/testfile: Protocol error
>    not ok 20
>    mv: rename /mnt/glusterfs/0/testfile to /mnt/glusterfs/0/testfile-renamed: Protocol error
>    not ok 21
>    diff: /mnt/glusterfs/0/testfile-symlink: No such file or directory
>    not ok 26
>    [19:21:21] ./tests/encryption/crypt.t .. Failed 3/39 subtests
>    ./tests/encryption/crypt.t (Wstat: 0 Tests: 39 Failed: 3)
>    Failed tests: 20-21, 26
>
> 5) tests/features/trash.t is completely broken. Note that http://review.gluster.org/10033 patches trash.t for NetBSD compatibility:
>    [19:24:33] ./tests/features/trash.t .. 1/65 No volumes present
>    [19:24:33] ./tests/features/trash.t .. 15/65 not ok 15
>    [19:24:33] ./tests/features/trash.t .. 18/65 not ok 18
>    [19:24:33] ./tests/features/trash.t .. 30/65 not ok 30
>    (...)
>    ./tests/features/trash.t (Wstat: 0 Tests: 65 Failed: 16)
>    Failed tests: 15, 18, 30, 33-34, 38-39, 57-65

Awesome. :)

+ Justin
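The drop_caches line in the diff above is the portability problem: /proc/sys/vm/drop_caches is Linux-only, so on NetBSD the cache has to be flushed another way. A hedged sketch of the idea follows; the helper name is invented (the real patch just replaces one line with the other), and the fallback branch assumes that attempting an unmount is enough to make the client flush its cache:

```shell
# Flush cached file data before checking tier promotion.
# On Linux, drop the kernel page cache directly; on systems without
# /proc/sys/vm/drop_caches (e.g. NetBSD), attempt an unmount of the mount
# point, which makes the FUSE client flush its cache even though the
# unmount itself fails while the volume is busy.
drop_client_cache() {
    mountpoint=$1
    if [ -w /proc/sys/vm/drop_caches ]; then
        echo 3 > /proc/sys/vm/drop_caches
    else
        ( cd "$mountpoint" && umount "$mountpoint" ) 2>/dev/null || true
    fi
}
```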
Re: [Gluster-devel] Responsibilities and expectations of our maintainers
On 28 Mar 2015, at 13:12, Emmanuel Dreyfus m...@netbsd.org wrote:
> Pranith Kumar Karampuri pkara...@redhat.com wrote:
>> By which time some more problems may creep in; it will be a chicken and egg problem. Force a -2. Everybody will work just on NetBSD for a while, but after that things should be just similar to Linux.
>
> It would probably be a good idea to decide a date on which this forcing would happen. I will submit a batch of fixes within the next hours. Once they are merged we will have a sane situation again. Except that http://review.gluster.org/10019 causes a few spurious bugs to become reliable. Once that one is merged, NetBSD regression will be broken again.

Atin has an initial fix for the breakage here:

  http://review.gluster.org/#/c/10032/

Does that help? :)

+ Justin
Re: [Gluster-devel] Regarding Scheduler Translator
On 27 Mar 2015, at 08:16, Shyam Deshmukh shyamdeshmukh...@gmail.com wrote:
> I am Shyam Deshmukh. I have just started understanding the GlusterFS architecture. I couldn't find the scheduler translator code in glusterfs-3.5.1. Please help me find the source code of the scheduler. I am reading about translators at the following link:
> http://www.gluster.org/community/documentation/index.php/Translators/cluster/unify#GlusterFS_Schedulers

Hi Shyam,

Welcome to the Gluster Community. :)

As a first thought, are you definitely looking for stuff in version 3.5.1 of GlusterFS? 3.5.1 is an older version; our latest one in the 3.5.x series is 3.5.3. If it helps, our very latest release of GlusterFS is 3.6.2, and we're working on version 3.7.0 presently as well.

What are you wanting to do with GlusterFS, btw, if you're ok to describe it? :)

Regards and best wishes,

Justin Clift
Re: [Gluster-devel] Regarding Scheduler Translator
On 27 Mar 2015, at 13:33, Shyam Deshmukh shyamdeshmukh...@gmail.com wrote:
> Well, first of all, thank you sir. I am working on an ME/MTech research project, "Load Balancing for Distributed File Systems". I have done a literature survey, and my guide and I are proposing a distributed load balancing mechanism for DFS. I was searching for the best DFS platform to run my algorithm on, and fortunately I found GlusterFS. I tried to deploy it in our lab: I created some distributed-replicated volumes and stored some files on them, and our college has approved me to go ahead. Since GlusterFS has translators, the existing architecture can be extended. I was looking to see whether that scheduler translator still exists; I found the documentation but not the source code, which is why I posted the question.

That makes sense. :)

Looking here, I can't see it either:

  https://github.com/gluster/glusterfs/tree/release-3.5/xlators

(Note - I'm not a GlusterFS coder, I work on other bits :)

I suspect the page you're looking at on the wiki is old and needs updating. :/

Pranith, does the Scheduler translator still exist, as described on the wiki? Is there a better place for Shyam to read up on the current translators? :)

Regards and best wishes,

Justin Clift
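As an aside for anyone else hunting for translators: the volfiles that glusterd generates list exactly which xlators a volume loads, so grepping them is a quick way to see what a running setup actually uses. A small illustrative helper (the function name is mine; on a typical install the volfiles live under /var/lib/glusterd/vols/<volname>/):

```shell
# List the unique translator types declared in a GlusterFS volfile.
# Volfile "type" lines look like:  type cluster/distribute
list_translators() {
    grep '^[[:space:]]*type[[:space:]]' "$1" | awk '{print $2}' | sort -u
}

# Usage (path is an example):
#   list_translators /var/lib/glusterd/vols/testvol/testvol.tcp-fuse.vol
```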
Re: [Gluster-devel] GlusterFS 3.4.7beta4 is now available for testing
Note for everyone - this is likely to be the last ever GlusterFS 3.4.x release. If you're using 3.4.x, you'll want to make sure this works. For any 3.4.x bugs after this one... please upgrade your GlusterFS. ;)

Regards and best wishes,

Justin Clift

On 27 Mar 2015, at 14:43, Kaleb S. KEITHLEY kkeit...@redhat.com wrote:
> Many thanks to all our users who have reported bugs against the 3.4 version of GlusterFS! glusterfs-3.4.7beta4 has been made available for testing.
>
> N.B. glusterfs-3.4.7beta3 was released, but a late-arriving patch necessitated quickly doing beta4.
>
> If you filed a bug against 3.4.x and it is listed as fixed in the Release Notes, please test it to confirm that it is fixed. Please update the bug report as soon as possible if you find that it has not been fixed. If any assistance is needed, do not hesitate to send a request to the Gluster Users mailing list (gluster-us...@gluster.org) or start a discussion in the #gluster channel on Freenode IRC.
>
> The release notes can be found at:
> http://blog.gluster.org/2015/03/glusterfs-3-4-7beta4-is-now-available-for-testing/
>
> Packages for selected distributions can be found on the main download server at:
> http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.4.7beta4/
>
> Thank you in advance for testing,
>
> --
> your friendly GlusterFS-3.4 release wrangler
[Gluster-devel] Reducing the number of Rackspace regression VM's
FYI. Now that we're past the mad rush of patch submissions for the 3.7 feature freeze, the number of Rackspace regression VMs is being reduced, trying to get us into the ballpark of our budget... ;)

+ Justin
[Gluster-devel] Updated best way to retrigger jobs in Jenkins
Hi us :),

Just added a page to the wiki, showing how to retrigger a failed job in Jenkins:

  http://www.gluster.org/community/documentation/index.php/Retrigger_jobs_in_Jenkins

Personally, I used to use a different, less convenient method. This new way on the wiki seems to be much easier and just as effective. Better suggestions welcome, of course. :)

Regards and best wishes,

Justin Clift
Re: [Gluster-devel] Updated best way to retrigger jobs in Jenkins
On 26 Mar 2015, at 17:28, Vijay Bellur vbel...@redhat.com wrote:
> On 03/26/2015 10:48 PM, Justin Clift wrote:
>> Hi us :),
>> Just added a page to the wiki, showing how to retrigger a failed job in Jenkins:
>> http://www.gluster.org/community/documentation/index.php/Retrigger_jobs_in_Jenkins
>> Personally I used to use a different, less convenient method. This new way on the wiki seems to be much easier and as effective. Better suggestions welcome of course. :)
>
> This is how I normally retrigger failed jobs. Thanks for creating this descriptive document on how to do that! :)

Welcome. :D

+ Justin
Re: [Gluster-devel] Revamping the GlusterFS Documentation...
On 23 Mar 2015, at 07:01, Shravan Chandrashekar schan...@redhat.com wrote:
> Hi All,
>
> The Gluster Filesystem documentation is not user friendly and is fragmented - this is the feedback we have been receiving. We went back to our drawing board and blueprints and realized that the content was scattered across various places. These include:
>
> [Static HTML] http://www.gluster.org/documentation/
> [Mediawiki] http://www.gluster.org/community/documentation/
> [In-source] https://github.com/gluster/glusterfs/tree/master/doc
> [Markdown] https://github.com/GlusterFS/Notes
>
> and so on...
>
> Hence, we started by curating content from various sources, including the gluster.org static HTML documentation, the glusterfs GitHub repository, various blog posts, and the community wiki. We also felt the need to improve community members' experience with the Gluster documentation, which led us to put some thought into the user interface. As a result we came up with a page which links all content from a single landing page:
>
> http://www.gluster.org/community/documentation/index.php/Staged_Docs
>
> This is just our first step to improve our community docs and enhance community contribution towards documentation. I would like to thank Humble Chirammal and Anjana Sriram for the suggestions and directions during the entire process. I am sure there is a lot of scope for improvement, so I request you all to review the content and provide your suggestions.

Looks like a good effort. Is the general concept for this to become the front/landing page for the main wiki?

Also, some initial thoughts:

 * Gluster Ant Logo image - the first letter REALLY looks like a C (to me), not a G. It reads as "Cluster" for me... That aside, it looks really good. :)
 * Getting Started section - move it up, maybe before the Terminology / Architecture / Additional Resources bit, to make it more obvious for new people.
 * "Terminologies" should probably be "Terminology", as "Terminology" is kind of both singular and plural.
 * "All that Developers need to know" → "Everything Developers need to know"

They're my first thoughts anyway. :)

+ Justin