Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
As mentioned on HADOOP-12111, there is now an incubator-style proposal:
http://wiki.apache.org/incubator/YetusProposal

On Wed, Jun 24, 2015 at 9:41 AM, Sean Busbey wrote:

> Hi Folks!
>
> Work in a feature branch is now being tracked by HADOOP-12111.
>
> On Thu, Jun 18, 2015 at 10:07 PM, Sean Busbey wrote:
>
>> It looks like we have consensus.
>>
>> I'll start drafting up a proposal for the next board meeting (July 15th).
>> Once we work out the name I'll submit a PODLINGNAMESEARCH jira to track
>> that we did due diligence on whatever we pick.
>>
>> In the meantime, Hadoop PMC, would y'all be willing to host us in a
>> branch so that we can start prepping things now? We would want branch
>> commit rights for the proposed new PMC.
>>
>> -Sean
>>
>> On Mon, Jun 15, 2015 at 6:47 PM, Sean Busbey wrote:
>>
>>> Oof. I had meant to push on this again but life got in the way and now
>>> the June board meeting is upon us. Sorry everyone. In the event that this
>>> ends up contentious, hopefully one of the copied communities can give us a
>>> branch to work in.
>>>
>>> I know everyone is busy, so here's the short version of this email: I'd
>>> like to move some of the code currently in Hadoop (test-patch) into a new
>>> TLP focused on QA tooling. I'm not sure what the best format for priming
>>> this conversation is. ORC filled in the incubator project proposal
>>> template, but I'm not sure how much that confused the issue. So to start,
>>> I'll just write what I'm hoping we can accomplish in general terms here.
>>>
>>> All software development projects that are community based (that is,
>>> accepting outside contributions) face a common QA problem for vetting
>>> incoming contributions. Hadoop is fortunate enough to be sufficiently
>>> popular that the weight of the problem drove tool development (i.e.
>>> test-patch). That tool is generalizable enough that a bunch of other TLPs
>>> have adopted their own forks. Unfortunately, in most projects this kind of
>>> QA work is an enabler rather than a primary concern, so often the tooling
>>> is worked on ad-hoc and few shared improvements happen across projects.
>>> Since the tooling itself is never a primary concern, any improvements
>>> made are rarely reused outside of ASF projects.
>>>
>>> Over the last couple months a few of us have been working on
>>> generalizing the tooling present in the Hadoop code base (because it was
>>> the most mature out of all those in the various projects) and it's reached
>>> a point where we think we can start bringing on other downstream users.
>>> This means we need to start establishing things like a release cadence and
>>> to grow the new contributors we have to handle more project responsibility.
>>> Personally, I think that means it's time to move out from under Hadoop to
>>> drive things as our own community. Eventually, I hope the community can
>>> help draw in a group of folks traditionally underrepresented in ASF
>>> projects, namely QA and operations folks.
>>>
>>> I think test-patch by itself has enough scope to justify a project.
>>> Having a solid set of build tools that are customizable to fit the norms of
>>> different software communities is a bunch of work. Making it work well in
>>> both the context of automated test systems like Jenkins and for individual
>>> developers is even more work. We could easily also take over maintenance of
>>> things like shelldocs, since test-patch is the primary consumer of that
>>> currently but it's generally useful tooling.
>>>
>>> In addition to test-patch, I think the proposed project has some future
>>> growth potential. Given some adoption of test-patch to prove utility, the
>>> project could build on the ties it makes to start building tools to help
>>> projects do their own longer-run testing. Note that I'm talking about the
>>> tools to build QA processes and not a particular set of tested components.
>>> Specifically, I think the ChaosMonkey work that's in HBase should be
>>> generalizable as a fault injection framework (either based on that code or
>>> something like it). Doing this for arbitrary software is obviously very
>>> difficult, and a part of easing that will be to make (and then favor)
>>> tooling to allow projects to have operational glue that looks the same.
>>> Namely, the shell work that's been done in hadoop-functions.sh would be a
>>> great foundational layer that could bring good daemon handling practices to
>>> a whole slew of software projects. In the event that these frameworks and
>>> tools get adopted by parts of the Hadoop ecosystem, that could make the job
>>> of, e.g., Bigtop substantially easier.
>>>
>>> I've reached out to a few folks who have been involved in the current
>>> test-patch work or expressed interest in helping out on getting it used in
>>> other projects. Right now, the proposed PMC would be (alphabetical by last
>>> name):
>>>
>>> * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc,
>>> jclouds pmc, sqoop pmc, all
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
Hi Folks!

Work in a feature branch is now being tracked by HADOOP-12111.

On Thu, Jun 18, 2015 at 10:07 PM, Sean Busbey wrote:

> It looks like we have consensus.
>
> I'll start drafting up a proposal for the next board meeting (July 15th).
> Once we work out the name I'll submit a PODLINGNAMESEARCH jira to track
> that we did due diligence on whatever we pick.
>
> In the meantime, Hadoop PMC, would y'all be willing to host us in a branch
> so that we can start prepping things now? We would want branch commit
> rights for the proposed new PMC.
>
> -Sean
>
> On Mon, Jun 15, 2015 at 6:47 PM, Sean Busbey wrote:
>
>> Oof. I had meant to push on this again but life got in the way and now
>> the June board meeting is upon us. Sorry everyone. In the event that this
>> ends up contentious, hopefully one of the copied communities can give us a
>> branch to work in.
>>
>> I know everyone is busy, so here's the short version of this email: I'd
>> like to move some of the code currently in Hadoop (test-patch) into a new
>> TLP focused on QA tooling. I'm not sure what the best format for priming
>> this conversation is. ORC filled in the incubator project proposal
>> template, but I'm not sure how much that confused the issue. So to start,
>> I'll just write what I'm hoping we can accomplish in general terms here.
>>
>> All software development projects that are community based (that is,
>> accepting outside contributions) face a common QA problem for vetting
>> incoming contributions. Hadoop is fortunate enough to be sufficiently
>> popular that the weight of the problem drove tool development (i.e.
>> test-patch). That tool is generalizable enough that a bunch of other TLPs
>> have adopted their own forks. Unfortunately, in most projects this kind of
>> QA work is an enabler rather than a primary concern, so often the tooling
>> is worked on ad-hoc and few shared improvements happen across projects.
>> Since the tooling itself is never a primary concern, any improvements
>> made are rarely reused outside of ASF projects.
>>
>> Over the last couple months a few of us have been working on generalizing
>> the tooling present in the Hadoop code base (because it was the most mature
>> out of all those in the various projects) and it's reached a point where we
>> think we can start bringing on other downstream users. This means we need
>> to start establishing things like a release cadence and to grow the new
>> contributors we have to handle more project responsibility. Personally, I
>> think that means it's time to move out from under Hadoop to drive things as
>> our own community. Eventually, I hope the community can help draw in a
>> group of folks traditionally underrepresented in ASF projects, namely QA
>> and operations folks.
>>
>> I think test-patch by itself has enough scope to justify a project.
>> Having a solid set of build tools that are customizable to fit the norms of
>> different software communities is a bunch of work. Making it work well in
>> both the context of automated test systems like Jenkins and for individual
>> developers is even more work. We could easily also take over maintenance of
>> things like shelldocs, since test-patch is the primary consumer of that
>> currently but it's generally useful tooling.
>>
>> In addition to test-patch, I think the proposed project has some future
>> growth potential. Given some adoption of test-patch to prove utility, the
>> project could build on the ties it makes to start building tools to help
>> projects do their own longer-run testing. Note that I'm talking about the
>> tools to build QA processes and not a particular set of tested components.
>> Specifically, I think the ChaosMonkey work that's in HBase should be
>> generalizable as a fault injection framework (either based on that code or
>> something like it). Doing this for arbitrary software is obviously very
>> difficult, and a part of easing that will be to make (and then favor)
>> tooling to allow projects to have operational glue that looks the same.
>> Namely, the shell work that's been done in hadoop-functions.sh would be a
>> great foundational layer that could bring good daemon handling practices to
>> a whole slew of software projects. In the event that these frameworks and
>> tools get adopted by parts of the Hadoop ecosystem, that could make the job
>> of, e.g., Bigtop substantially easier.
>>
>> I've reached out to a few folks who have been involved in the current
>> test-patch work or expressed interest in helping out on getting it used in
>> other projects. Right now, the proposed PMC would be (alphabetical by last
>> name):
>>
>> * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds
>> pmc, sqoop pmc, all around Jenkins expert)
>> * Sean Busbey (ASF member, accumulo pmc, hbase pmc)
>> * Nick Dimiduk (hbase pmc, phoenix pmc)
>> * Chris Nauroth (ASF member, incubator pmc, hadoop pmc)
>> * Andrew Purtell (ASF member, incubator pmc, bigtop pmc, hbase pmc,
>> phoenix pm
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
+1 for a separate project and going directly to TLP if possible (as Hadoop
itself did when split out of Nutch)

+1 for having language discussions once it's a TLP :-)

Cheers,
Nigel

> On Jun 22, 2015, at 1:55 PM, Andrew Purtell wrote:
>
>> On Mon, Jun 22, 2015 at 1:03 PM, Nick Dimiduk wrote:
>>
>> On Mon, Jun 22, 2015 at 12:43 PM, Colin P. McCabe wrote:
>>
>>> You mentioned that "most of our project will be focused on shell
>>> scripts" I guess based on the existing test-patch code. Allen did a
>>> lot of good work in this area recently. I am curious if you evaluated
>>> languages such as Python or Node.js for this use-case. Shell scripts
>>> can get a little... tricky beyond a certain size. On the other hand,
>>> if we are standardizing on shell, which shell and which version?
>>> Perhaps bash 3.5+?
>>
>> I'll also add that shell is not helpful for a cross-platform set of
>> tooling. I recently added a daemon to Apache Phoenix; an explicit
>> requirement was Windows support. I ended up implementing a solution in
>> Python because that environment is platform-agnostic and still systems-y
>> enough. I think this is something this project should seriously consider.
>
> In my opinion, historically, test-patch hasn't needed to be cross platform
> because the only first class development environment for Hadoop has been
> Linux. Growing beyond this could absolutely be one focus of Yetus should
> that be a consensus goal of the community. The seed of the project, though,
> is today's test-patch, which is implemented in bash. That's where we are
> today. Language "discussions" (smile) can and should be forward looking.
>
>> -n
>>
>> On Tue, Jun 16, 2015 at 7:55 PM, Sean Busbey wrote:
>>
>>>> I'm going to try responding to several things at once here, so
>>>> apologies if I miss anyone and sorry for the long email. :)
>>>>
>>>> On Tue, Jun 16, 2015 at 3:44 PM, Steve Loughran
>>>> <ste...@hortonworks.com> wrote:
>>>>
>>>>> I think it's good to have a general build/test process projects can
>>>>> share, so +1 to pulling it out. You should get help from others.
>>>>>
>>>>> regarding incubation, it is a lot of work, especially for something
>>>>> that's more of an in-house tool than an artifact to release and
>>>>> redistribute.
>>>>>
>>>>> You can't just use apache labs or the build project's repo to work on
>>>>> this?
>>>>>
>>>>> if you do want to incubate, we may want to nominate the hadoop project
>>>>> as the monitoring PMC, rather than incubator@.
>>>>>
>>>>> -steve
>>>>
>>>> Important note: we're proposing a board resolution that would directly
>>>> pull this code base out into a new TLP; there'd be no incubator, we'd
>>>> just continue building community and start making releases.
>>>>
>>>> The proposed PMC believes the tooling we're talking about has direct
>>>> applicability to projects well outside of the ASF. Lots of other open
>>>> source projects run on community contributions and have a general need
>>>> for better QA tools. Given that problem set and the presence of a
>>>> community working to solve it, there's no reason this needs to be
>>>> treated as an in-house build project. We certainly want to be useful to
>>>> ASF projects and getting them on-board given our current optimization
>>>> for ASF infra will certainly be easier, but we're not limited to that
>>>> (and our current prerequisites, a CI tool and jira or github, are
>>>> pretty broadly available).
>>>>
>>>> On Tue, Jun 16, 2015 at 10:13 AM, Nick Dimiduk wrote:
>>>>
>>>>> Since we're tossing out names, how about Apache Bootstrap? It's a
>>>>> meta-project to help other projects get off the ground, after all.
>>>>
>>>> There's already a web development framework named Bootstrap[1]. It's
>>>> also used by several ASF projects, so I think it best to avoid the
>>>> confusion.
>>>>
>>>> The name is, of course, up to the proposed PMC. As a bit of background,
>>>> the current name Yetus fulfills Allen's desire to ha
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
On Mon, Jun 22, 2015 at 1:03 PM, Nick Dimiduk wrote:

> On Mon, Jun 22, 2015 at 12:43 PM, Colin P. McCabe wrote:
>
> > You mentioned that "most of our project will be focused on shell
> > scripts" I guess based on the existing test-patch code. Allen did a
> > lot of good work in this area recently. I am curious if you evaluated
> > languages such as Python or Node.js for this use-case. Shell scripts
> > can get a little... tricky beyond a certain size. On the other hand,
> > if we are standardizing on shell, which shell and which version?
> > Perhaps bash 3.5+?
>
> I'll also add that shell is not helpful for a cross-platform set of
> tooling. I recently added a daemon to Apache Phoenix; an explicit
> requirement was Windows support. I ended up implementing a solution in
> Python because that environment is platform-agnostic and still systems-y
> enough. I think this is something this project should seriously consider.

In my opinion, historically, test-patch hasn't needed to be cross platform
because the only first class development environment for Hadoop has been
Linux. Growing beyond this could absolutely be one focus of Yetus should
that be a consensus goal of the community. The seed of the project, though,
is today's test-patch, which is implemented in bash. That's where we are
today. Language "discussions" (smile) can and should be forward looking.

> -n
>
> On Tue, Jun 16, 2015 at 7:55 PM, Sean Busbey wrote:
>
> > > I'm going to try responding to several things at once here, so
> > > apologies if I miss anyone and sorry for the long email. :)
> > >
> > > On Tue, Jun 16, 2015 at 3:44 PM, Steve Loughran
> > > <ste...@hortonworks.com> wrote:
> > >
> > >> I think it's good to have a general build/test process projects can
> > >> share, so +1 to pulling it out. You should get help from others.
> > >>
> > >> regarding incubation, it is a lot of work, especially for something
> > >> that's more of an in-house tool than an artifact to release and
> > >> redistribute.
> > >>
> > >> You can't just use apache labs or the build project's repo to work on
> > >> this?
> > >>
> > >> if you do want to incubate, we may want to nominate the hadoop project
> > >> as the monitoring PMC, rather than incubator@.
> > >>
> > >> -steve
> > >
> > > Important note: we're proposing a board resolution that would directly
> > > pull this code base out into a new TLP; there'd be no incubator, we'd
> > > just continue building community and start making releases.
> > >
> > > The proposed PMC believes the tooling we're talking about has direct
> > > applicability to projects well outside of the ASF. Lots of other open
> > > source projects run on community contributions and have a general need
> > > for better QA tools. Given that problem set and the presence of a
> > > community working to solve it, there's no reason this needs to be
> > > treated as an in-house build project. We certainly want to be useful to
> > > ASF projects and getting them on-board given our current optimization
> > > for ASF infra will certainly be easier, but we're not limited to that
> > > (and our current prerequisites, a CI tool and jira or github, are
> > > pretty broadly available).
> > >
> > > On Tue, Jun 16, 2015 at 10:13 AM, Nick Dimiduk wrote:
> > >
> > >> Since we're tossing out names, how about Apache Bootstrap? It's a
> > >> meta-project to help other projects get off the ground, after all.
> > >
> > > There's already a web development framework named Bootstrap[1]. It's
> > > also used by several ASF projects, so I think it best to avoid the
> > > confusion.
> > >
> > > The name is, of course, up to the proposed PMC. As a bit of background,
> > > the current name Yetus fulfills Allen's desire to have something shell
> > > related and my desire to have a project that starts with Y (there are
> > > currently no ASF projects that start with Y). The universe of names
> > > that fill in these two is very small, AFAICT. I did a brie
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
On Mon, Jun 22, 2015 at 12:43 PM, Colin P. McCabe wrote:

> You mentioned that "most of our project will be focused on shell
> scripts" I guess based on the existing test-patch code. Allen did a
> lot of good work in this area recently. I am curious if you evaluated
> languages such as Python or Node.js for this use-case. Shell scripts
> can get a little... tricky beyond a certain size. On the other hand,
> if we are standardizing on shell, which shell and which version?
> Perhaps bash 3.5+?

I'll also add that shell is not helpful for a cross-platform set of
tooling. I recently added a daemon to Apache Phoenix; an explicit
requirement was Windows support. I ended up implementing a solution in
Python because that environment is platform-agnostic and still systems-y
enough. I think this is something this project should seriously consider.

-n

On Tue, Jun 16, 2015 at 7:55 PM, Sean Busbey wrote:

> > I'm going to try responding to several things at once here, so apologies
> > if I miss anyone and sorry for the long email. :)
> >
> > On Tue, Jun 16, 2015 at 3:44 PM, Steve Loughran wrote:
> >
> >> I think it's good to have a general build/test process projects can
> >> share, so +1 to pulling it out. You should get help from others.
> >>
> >> regarding incubation, it is a lot of work, especially for something
> >> that's more of an in-house tool than an artifact to release and
> >> redistribute.
> >>
> >> You can't just use apache labs or the build project's repo to work on
> >> this?
> >>
> >> if you do want to incubate, we may want to nominate the hadoop project
> >> as the monitoring PMC, rather than incubator@.
> >>
> >> -steve
> >
> > Important note: we're proposing a board resolution that would directly
> > pull this code base out into a new TLP; there'd be no incubator, we'd just
> > continue building community and start making releases.
> >
> > The proposed PMC believes the tooling we're talking about has direct
> > applicability to projects well outside of the ASF. Lots of other open
> > source projects run on community contributions and have a general need
> > for better QA tools. Given that problem set and the presence of a
> > community working to solve it, there's no reason this needs to be treated
> > as an in-house build project. We certainly want to be useful to ASF
> > projects and getting them on-board given our current optimization for ASF
> > infra will certainly be easier, but we're not limited to that (and our
> > current prerequisites, a CI tool and jira or github, are pretty broadly
> > available).
> >
> > On Tue, Jun 16, 2015 at 10:13 AM, Nick Dimiduk wrote:
> >
> >> Since we're tossing out names, how about Apache Bootstrap? It's a
> >> meta-project to help other projects get off the ground, after all.
> >
> > There's already a web development framework named Bootstrap[1]. It's also
> > used by several ASF projects, so I think it best to avoid the confusion.
> >
> > The name is, of course, up to the proposed PMC. As a bit of background,
> > the current name Yetus fulfills Allen's desire to have something shell
> > related and my desire to have a project that starts with Y (there are
> > currently no ASF projects that start with Y). The universe of names that
> > fill in these two is very small, AFAICT. I did a brief suitability search
> > and didn't find any blockers.
> >
> > On Tue, Jun 16, 2015 at 11:59 AM, Allen Wittenauer wrote:
> >
> >> Since a couple of people have brought it up:
> >>
> >> I think the release question is probably one of the big question
> >> marks. Other than tarballs, how does something like this actually get
> >> used downstream?
> >>
> >> For test-patch, in particular, I have a few thoughts on this:
> >>
> >> Short term:
> >>
> >> * Projects that want to move RIGHT NOW would modify their Jenkins
> >> jobs to check out from the Yetus repo (preferably at a well known tag or
> >> branch) in one directory and their project repo in another directory.
> >> Then it’s just a matter of passing the correct flags to test-patch. This
> >> is pretty much how I’ve been personally running test-patch for about 6
> >> months now. Under Jenkins, we’ve seen this work with NiFi (incubating)
> >> already.
> >>
> >> * Create a stub version of test-patch that projects could check
> >> into their repo, replacing the existing test-patch. This stub version
> >> would git clone from either ASF or github and then execute test-patch
> >> accordingly on demand. With the correct smarts, it could make sure it
> >> has a cached version to prevent continual clones.
> >>
> >> Longer term:
> >>
> >> * I’ve been toying with the idea of (ab)using Java repos and
> >> packaging as a transportation layer, either in addition to or in
> >> combination with something like a maven plugin. Something like this
> >> would clearly be better for offline usage and/or to lower the network tra
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
+1 for making this a separate project. We've always struggled with a
lot of forks of the test-patch code and perhaps this project can help
create something that works well for multiple projects.

Bypassing the incubator seems kind of weird (I didn't know that was an
option) but I will let other people with more experience in the ASF
comment on that.

You mentioned that "most of our project will be focused on shell
scripts" I guess based on the existing test-patch code. Allen did a
lot of good work in this area recently. I am curious if you evaluated
languages such as Python or Node.js for this use-case. Shell scripts
can get a little... tricky beyond a certain size. On the other hand,
if we are standardizing on shell, which shell and which version?
Perhaps bash 3.5+?

Also, what will be the mechanism for customizing this for each
project? Ideally the customizations needed would be small so we could
share the most code.

cheers,
Colin

On Tue, Jun 16, 2015 at 7:55 PM, Sean Busbey wrote:

> I'm going to try responding to several things at once here, so apologies if
> I miss anyone and sorry for the long email. :)
>
> On Tue, Jun 16, 2015 at 3:44 PM, Steve Loughran wrote:
>
>> I think it's good to have a general build/test process projects can share,
>> so +1 to pulling it out. You should get help from others.
>>
>> regarding incubation, it is a lot of work, especially for something that's
>> more of an in-house tool than an artifact to release and redistribute.
>>
>> You can't just use apache labs or the build project's repo to work on this?
>>
>> if you do want to incubate, we may want to nominate the hadoop project as
>> the monitoring PMC, rather than incubator@.
>>
>> -steve
>
> Important note: we're proposing a board resolution that would directly pull
> this code base out into a new TLP; there'd be no incubator, we'd just
> continue building community and start making releases.
>
> The proposed PMC believes the tooling we're talking about has direct
> applicability to projects well outside of the ASF. Lots of other open
> source projects run on community contributions and have a general need for
> better QA tools. Given that problem set and the presence of a community
> working to solve it, there's no reason this needs to be treated as an
> in-house build project. We certainly want to be useful to ASF projects and
> getting them on-board given our current optimization for ASF infra will
> certainly be easier, but we're not limited to that (and our current
> prerequisites, a CI tool and jira or github, are pretty broadly available).
>
> On Tue, Jun 16, 2015 at 10:13 AM, Nick Dimiduk wrote:
>
>> Since we're tossing out names, how about Apache Bootstrap? It's a
>> meta-project to help other projects get off the ground, after all.
>
> There's already a web development framework named Bootstrap[1]. It's also
> used by several ASF projects, so I think it best to avoid the confusion.
>
> The name is, of course, up to the proposed PMC. As a bit of background, the
> current name Yetus fulfills Allen's desire to have something shell related
> and my desire to have a project that starts with Y (there are currently no
> ASF projects that start with Y). The universe of names that fill in these
> two is very small, AFAICT. I did a brief suitability search and didn't find
> any blockers.
>
> On Tue, Jun 16, 2015 at 11:59 AM, Allen Wittenauer wrote:
>
>> Since a couple of people have brought it up:
>>
>> I think the release question is probably one of the big question
>> marks. Other than tarballs, how does something like this actually get
>> used downstream?
>>
>> For test-patch, in particular, I have a few thoughts on this:
>>
>> Short term:
>>
>> * Projects that want to move RIGHT NOW would modify their Jenkins
>> jobs to check out from the Yetus repo (preferably at a well known tag or
>> branch) in one directory and their project repo in another directory. Then
>> it’s just a matter of passing the correct flags to test-patch. This is
>> pretty much how I’ve been personally running test-patch for about 6 months
>> now. Under Jenkins, we’ve seen this work with NiFi (incubating) already.
>>
>> * Create a stub version of test-patch that projects could check
>> into their repo, replacing the existing test-patch. This stub version
>> would git clone from either ASF or github and then execute test-patch
>> accordingly on demand. With the correct smarts, it could make sure it has
>> a cached version to prevent continual clones.
>>
>> Longer term:
>>
>> * I’ve been toying with the idea of (ab)using Java repos and
>> packaging as a transportation layer, either in addition to or in
>> combination with something like a maven plugin. Something like this would
>> clearly be better for offline usage and/or to lower the network traffic.
>
> It's important that the project follow ASF guidelines on publishing
> releases[2]. So long as we publish rel
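
The stub idea quoted above (clone the tooling on demand, but reuse a cached copy so that repeated runs do not clone again) could be sketched roughly as follows. This is only an illustration in Python under stated assumptions, not the project's implementation: the helper names, the cache layout, and the script name passed to `build_command` are hypothetical, and any test-patch flags are passed through untouched because the thread does not name them.

```python
import hashlib
import subprocess
from pathlib import Path


def cache_dir_for(repo_url, ref, cache_root):
    """Map a (repo, ref) pair to a stable directory under cache_root."""
    key = hashlib.sha256(f"{repo_url}@{ref}".encode("utf-8")).hexdigest()[:16]
    return Path(cache_root) / key


def ensure_checkout(repo_url, ref, cache_root, run=subprocess.run):
    """Return a checkout of repo_url at ref, cloning only on a cache miss."""
    target = cache_dir_for(repo_url, ref, cache_root)
    if not (target / ".git").is_dir():
        target.parent.mkdir(parents=True, exist_ok=True)
        # `git clone --branch` accepts a branch or a tag name;
        # `--depth 1` keeps the download small.
        run(
            ["git", "clone", "--depth", "1", "--branch", ref, repo_url, str(target)],
            check=True,
        )
    return target


def build_command(checkout, script, passthrough_args):
    """Build the argv for the cached script; flags are passed through as given."""
    return [str(Path(checkout) / script), *passthrough_args]
```

A project-local stub would call `ensure_checkout` with its chosen repository URL and a well known tag, then hand the result of `build_command` to `subprocess.call`. Keying the cache on both URL and ref means that moving to a new tag triggers exactly one fresh clone, while the `run` parameter exists only so the clone step can be replaced in tests.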
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
It looks like we have consensus.

I'll start drafting up a proposal for the next board meeting (July 15th).
Once we work out the name I'll submit a PODLINGNAMESEARCH jira to track
that we did due diligence on whatever we pick.

In the meantime, Hadoop PMC, would y'all be willing to host us in a branch
so that we can start prepping things now? We would want branch commit
rights for the proposed new PMC.

-Sean

On Mon, Jun 15, 2015 at 6:47 PM, Sean Busbey wrote:

> Oof. I had meant to push on this again but life got in the way and now the
> June board meeting is upon us. Sorry everyone. In the event that this ends
> up contentious, hopefully one of the copied communities can give us a
> branch to work in.
>
> I know everyone is busy, so here's the short version of this email: I'd
> like to move some of the code currently in Hadoop (test-patch) into a new
> TLP focused on QA tooling. I'm not sure what the best format for priming
> this conversation is. ORC filled in the incubator project proposal
> template, but I'm not sure how much that confused the issue. So to start,
> I'll just write what I'm hoping we can accomplish in general terms here.
>
> All software development projects that are community based (that is,
> accepting outside contributions) face a common QA problem for vetting
> incoming contributions. Hadoop is fortunate enough to be sufficiently
> popular that the weight of the problem drove tool development (i.e.
> test-patch). That tool is generalizable enough that a bunch of other TLPs
> have adopted their own forks. Unfortunately, in most projects this kind of
> QA work is an enabler rather than a primary concern, so often the tooling
> is worked on ad-hoc and few shared improvements happen across projects.
> Since the tooling itself is never a primary concern, any improvements
> made are rarely reused outside of ASF projects.
>
> Over the last couple months a few of us have been working on generalizing
> the tooling present in the Hadoop code base (because it was the most mature
> out of all those in the various projects) and it's reached a point where we
> think we can start bringing on other downstream users. This means we need
> to start establishing things like a release cadence and to grow the new
> contributors we have to handle more project responsibility. Personally, I
> think that means it's time to move out from under Hadoop to drive things as
> our own community. Eventually, I hope the community can help draw in a
> group of folks traditionally underrepresented in ASF projects, namely QA
> and operations folks.
>
> I think test-patch by itself has enough scope to justify a project. Having
> a solid set of build tools that are customizable to fit the norms of
> different software communities is a bunch of work. Making it work well in
> both the context of automated test systems like Jenkins and for individual
> developers is even more work. We could easily also take over maintenance of
> things like shelldocs, since test-patch is the primary consumer of that
> currently but it's generally useful tooling.
>
> In addition to test-patch, I think the proposed project has some future
> growth potential. Given some adoption of test-patch to prove utility, the
> project could build on the ties it makes to start building tools to help
> projects do their own longer-run testing. Note that I'm talking about the
> tools to build QA processes and not a particular set of tested components.
> Specifically, I think the ChaosMonkey work that's in HBase should be
> generalizable as a fault injection framework (either based on that code or
> something like it). Doing this for arbitrary software is obviously very
> difficult, and a part of easing that will be to make (and then favor)
> tooling to allow projects to have operational glue that looks the same.
> Namely, the shell work that's been done in hadoop-functions.sh would be a
> great foundational layer that could bring good daemon handling practices to
> a whole slew of software projects. In the event that these frameworks and
> tools get adopted by parts of the Hadoop ecosystem, that could make the job
> of, e.g., Bigtop substantially easier.
>
> I've reached out to a few folks who have been involved in the current
> test-patch work or expressed interest in helping out on getting it used in
> other projects. Right now, the proposed PMC would be (alphabetical by last
> name):
>
> * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds
> pmc, sqoop pmc, all around Jenkins expert)
> * Sean Busbey (ASF member, accumulo pmc, hbase pmc)
> * Nick Dimiduk (hbase pmc, phoenix pmc)
> * Chris Nauroth (ASF member, incubator pmc, hadoop pmc)
> * Andrew Purtell (ASF member, incubator pmc, bigtop pmc, hbase pmc,
> phoenix pmc)
> * Allen Wittenauer (hadoop committer)
>
> That PMC gives us several members and a bunch of folks familiar with the
> ASF. Combined with the code already existing in Apache spaces, I think that
> gives us sufficient justif
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
For words beginning with "Y"

Yale: Mythical animal elephant/boar. After some quick Googling, it's apparently originally documented by "Pliny the Elder", so it's got a beer-related connotation too. Downside, might get confused with the university of the same name.
Yare: adj for quick/agile/lively
Yucca: noun for a plant that an elephant *could* eat
Yair: Scottish term for a fish trap. Also a Hebrew name for "He will enlighten"

-Ray

On Wed, Jun 17, 2015 at 5:03 AM, Steve Loughran wrote:

> > On 17 Jun 2015, at 03:55, Sean Busbey wrote:
> >
> > The name is, of course, up to the proposed PMC. As a bit of background, the
> > current name Yetus fulfills Allen's desire to have something shell related
> > and my desire to have a project that starts with Y (there are currently no
> > ASF projects that start with Y). The universe of names that fill in these
> > two is very small, AFAICT. I did a brief suitability search and didn't find
> > any blockers.
>
> Apache YouBrokeTheBuild?
>
> I'd thought of "yeti", but there's a couple of software projects/products
> called that already.
>
> Here's a complete list of things that live alongside elephants in
> Tanzania; nothing beginning with Y
>
> http://www.serengeti.org/animals.html
>
> if you pick one from that list I may have a photo for your slides
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
> On 17 Jun 2015, at 03:55, Sean Busbey wrote:
>
> The name is, of course, up to the proposed PMC. As a bit of background, the
> current name Yetus fulfills Allen's desire to have something shell related
> and my desire to have a project that starts with Y (there are currently no
> ASF projects that start with Y). The universe of names that fill in these
> two is very small, AFAICT. I did a brief suitability search and didn't find
> any blockers.

Apache YouBrokeTheBuild?

I'd thought of "yeti", but there's a couple of software projects/products called that already.

Here's a complete list of things that live alongside elephants in Tanzania; nothing beginning with Y

http://www.serengeti.org/animals.html

if you pick one from that list I may have a photo for your slides
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
I'm going to try responding to several things at once here, so apologies if I miss anyone and sorry for the long email. :)

On Tue, Jun 16, 2015 at 3:44 PM, Steve Loughran wrote:

> I think it's good to have a general build/test process projects can share,
> so +1 to pulling it out. You should get help from others.
>
> regarding incubation, it is a lot of work, especially for something that's
> more of an in-house tool than an artifact to release and redistribute.
>
> You can't just use apache labs or the build project's repo to work on this?
>
> if you do want to incubate, we may want to nominate the hadoop project as
> the monitoring PMC, rather than incubator@.
>
> -steve

Important note: we're proposing a board resolution that would directly pull this code base out into a new TLP; there'd be no incubator, we'd just continue building community and start making releases.

The proposed PMC believes the tooling we're talking about has direct applicability to projects well outside of the ASF. Lots of other open source projects run on community contributions and have a general need for better QA tools. Given that problem set and the presence of a community working to solve it, there's no reason this needs to be treated as an in-house build project. We certainly want to be useful to ASF projects and getting them on-board given our current optimization for ASF infra will certainly be easier, but we're not limited to that (and our current prerequisites, a CI tool and jira or github, are pretty broadly available).

On Tue, Jun 16, 2015 at 10:13 AM, Nick Dimiduk wrote:

> Since we're tossing out names, how about Apache Bootstrap? It's a
> meta-project to help other projects get off the ground, after all.

There's already a web development framework named Bootstrap[1]. It's also used by several ASF projects, so I think it best to avoid the confusion.

The name is, of course, up to the proposed PMC.
As a bit of background, the current name Yetus fulfills Allen's desire to have something shell related and my desire to have a project that starts with Y (there are currently no ASF projects that start with Y). The universe of names that fill in these two is very small, AFAICT. I did a brief suitability search and didn't find any blockers.

On Tue, Jun 16, 2015 at 11:59 AM, Allen Wittenauer wrote:

> Since a couple of people have brought it up:
>
> I think the release question is probably one of the big question
> marks. Other than tar balls, how does something like this actually get
> used downstream?
>
> For test-patch, in particular, I have a few thoughts on this:
>
> Short term:
>
> * Projects that want to move RIGHT NOW would modify their Jenkins
> jobs to checkout from the Yetus repo (preferably at a well known tag or
> branch) in one directory and their project repo in another directory. Then
> it’s just a matter of passing the correct flags to test-patch. This is
> pretty much how I’ve been personally running test-patch for about 6 months
> now. Under Jenkins, we’ve seen this work with NiFi (incubating) already.
>
> * Create a stub version of test-patch that projects could check
> into their repo, replacing the existing test-patch. This stub version
> would git clone from either ASF or github and then execute test-patch
> accordingly on demand. With the correct smarts, it could make sure it has
> a cached version to prevent continual clones.
>
> Longer term:
>
> * I’ve been toying with the idea of (ab)using Java repos and
> packaging as a transportation layer, either in addition or in combination
> with something like a maven plugin. Something like this would clearly be
> better for offline usage and/or to lower the network traffic.

It's important that the project follow ASF guidelines on publishing releases[2]. So long as we publish releases to the distribution directory I think we'd be fine having folks work off of the corresponding tag.
I'm not sure there's much reason to do that, however. A Jenkins job can just as easily grab a release tarball as a git tag and we're not talking about a large amount of stuff. The kind of build setup that Chris N mentioned is also totally doable now that there's a build description DSL for Jenkins[3].

For individual developers, I don't see any reason we can't package things up as a tool, similar to how findbugs or shellcheck work. We can make OS packages (or homebrew for OS X) if we want to make standalone installation on developer machines really easy. Those same packages could be installed on the ASF build machines, provided some ASF project wanted to make use of Yetus.

Having releases will incur some turnaround time for when folks want to see fixes, but that's a trade-off around release cadence we can work out longer term. I would like to have one or two projects that can work off of the bleeding edge repo, but we'd have to get that to mesh with foundation policy. My gut tells me we should be able to com
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
I think it's good to have a general build/test process projects can share, so +1 to pulling it out. You should get help from others. regarding incubation, it is a lot of work, especially for something that's more of an in-house tool than an artifact to release and redistribute. You can't just use apache labs or the build project's repo to work on this? if you do want to incubate, we may want to nominate the hadoop project as the monitoring PMC, rather than incubator@. -steve > On 16 Jun 2015, at 17:59, Allen Wittenauer wrote: > > > Since a couple of people have brought it up: > > I think the release question is probably one of the big question marks. > Other than tar balls, how does something like this actually get used > downstream? > > For test-patch, in particular, I have a few thoughts on this: > > Short term: > > * Projects that want to move RIGHT NOW would modify their Jenkins jobs > to checkout from the Yetus repo (preferably at a well known tag or branch) in > one directory and their project repo in another directory. Then it’s just a > matter of passing the correct flags to test-patch. This is pretty much how > I’ve been personally running test-patch for about 6 months now. Under > Jenkins, we’ve seen this work with NiFi (incubating) already. > > * Create a stub version of test-patch that projects could check into > their repo, replacing the existing test-patch. This stub version would git > clone from either ASF or github and then execute test-patch accordingly on > demand. With the correct smarts, it could make sure it has a cached version > to prevent continual clones. > > Longer term: > > * I’ve been toying with the idea of (ab)using Java repos and packaging > as a transportation layer, either in addition or in combination with > something like a maven plugin. Something like this would clearly be better > for offline usage and/or to lower the network traffic. 
> > > It’s probably worth pointing out that plugins can get sucked in from > outside the Yetus dir structure, so project specific bits can remain in those > projects. This would mean that, e.g., if ambari decides they want to change > the dependency ordering such that ambari-metrics always gets built first, > that’s completely doable without the Yetus project getting involved. This is > particularly relevant for things like the Dockerfile where projects would > almost certainly want to dictate their build and test time dependencies.
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
Since a couple of people have brought it up:

I think the release question is probably one of the big question marks. Other than tar balls, how does something like this actually get used downstream?

For test-patch, in particular, I have a few thoughts on this:

Short term:

* Projects that want to move RIGHT NOW would modify their Jenkins jobs to checkout from the Yetus repo (preferably at a well known tag or branch) in one directory and their project repo in another directory. Then it’s just a matter of passing the correct flags to test-patch. This is pretty much how I’ve been personally running test-patch for about 6 months now. Under Jenkins, we’ve seen this work with NiFi (incubating) already.

* Create a stub version of test-patch that projects could check into their repo, replacing the existing test-patch. This stub version would git clone from either ASF or github and then execute test-patch accordingly on demand. With the correct smarts, it could make sure it has a cached version to prevent continual clones.

Longer term:

* I’ve been toying with the idea of (ab)using Java repos and packaging as a transportation layer, either in addition or in combination with something like a maven plugin. Something like this would clearly be better for offline usage and/or to lower the network traffic.

It’s probably worth pointing out that plugins can get sucked in from outside the Yetus dir structure, so project specific bits can remain in those projects. This would mean that, e.g., if ambari decides they want to change the dependency ordering such that ambari-metrics always gets built first, that’s completely doable without the Yetus project getting involved. This is particularly relevant for things like the Dockerfile where projects would almost certainly want to dictate their build and test time dependencies.
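For illustration only, the stub-with-a-cache idea from the short-term list above might look roughly like the sketch below. Everything specific in it is an assumption made for the example: the repository URL is a placeholder, the YETUS_* variable names and the cache location are made up, and the path of test-patch inside the checkout is a guess rather than an actual Yetus layout. Arguments are forwarded untouched, so no test-patch flags are assumed either.

```shell
# Hypothetical stub a project could check in instead of its own test-patch.
# All names, paths, and the URL below are illustrative assumptions.

YETUS_REPO="${YETUS_REPO:-https://example.invalid/yetus.git}"  # placeholder URL
YETUS_REF="${YETUS_REF:-master}"             # ideally a well-known tag or branch
YETUS_CACHE="${YETUS_CACHE:-${HOME}/.cache/yetus-stub}"
YETUS_TEST_PATCH="${YETUS_TEST_PATCH:-test-patch.sh}"  # assumed path in the checkout

# Print the directory holding the cached checkout for the requested ref.
# Slashes in the ref are replaced so the ref maps to a single directory name.
cache_dir() {
  printf '%s/%s\n' "${YETUS_CACHE}" "$(printf '%s' "${YETUS_REF}" | tr '/' '_')"
}

# Succeed when a cached checkout already exists, so no clone is needed.
have_cache() {
  [ -d "$(cache_dir)/.git" ]
}

# Clone once; later runs reuse the cached copy instead of cloning again.
ensure_checkout() {
  if ! have_cache; then
    mkdir -p "${YETUS_CACHE}" || return 1
    git clone --quiet --depth 1 --branch "${YETUS_REF}" \
      "${YETUS_REPO}" "$(cache_dir)" || return 1
  fi
}

# Run test-patch from the cached checkout, forwarding every argument as-is.
run_test_patch() {
  ensure_checkout || return 1
  "$(cache_dir)/${YETUS_TEST_PATCH}" "$@"
}
```

A project's stub would end with a single call such as `run_test_patch "$@"`. Keying the cache on the ref means that pinning a different tag gets its own checkout, while repeated runs against the same tag never touch the network.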
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
+1 on the idea. It would be great if tests covering dependency management, multiple branches, and distributed environments could be done in the project.

One discussion point is how Hadoop depends on Yetus, including the development cycles. It's a good time to rethink what can be done to make Hadoop better.

Thanks,
- Tsuyoshi

On Tue, Jun 16, 2015 at 8:47 AM, Sean Busbey wrote:

> Oof. I had meant to push on this again but life got in the way and now the
> June board meeting is upon us. Sorry everyone. In the event that this ends
> up contentious, hopefully one of the copied communities can give us a
> branch to work in.
>
> I know everyone is busy, so here's the short version of this email: I'd
> like to move some of the code currently in Hadoop (test-patch) into a new
> TLP focused on QA tooling. I'm not sure what the best format for priming
> this conversation is. ORC filled in the incubator project proposal
> template, but I'm not sure how much that confused the issue. So to start,
> I'll just write what I'm hoping we can accomplish in general terms here.
>
> All software development projects that are community based (that is,
> accepting outside contributions) face a common QA problem for vetting
> in-coming contributions. Hadoop is fortunate enough to be sufficiently
> popular that the weight of the problem drove tool development (i.e.
> test-patch). That tool is generalizable enough that a bunch of other TLPs
> have adopted their own forks. Unfortunately, in most projects this kind of
> QA work is an enabler rather than a primary concern, so often the tooling
> is worked on ad-hoc and little shared improvements happen across
> projects. Since
> the tooling itself is never a primary concern, any made is rarely reused
> outside of ASF projects.
> > Over the last couple months a few of us have been working on generalizing > the tooling present in the Hadoop code base (because it was the most mature > out of all those in the various projects) and it's reached a point where we > think we can start bringing on other downstream users. This means we need > to start establishing things like a release cadence and to grow the new > contributors we have to handle more project responsibility. Personally, I > think that means it's time to move out from under Hadoop to drive things as > our own community. Eventually, I hope the community can help draw in a > group of folks traditionally underrepresented in ASF projects, namely QA > and operations folks. > > I think test-patch by itself has enough scope to justify a project. Having > a solid set of build tools that are customizable to fit the norms of > different software communities is a bunch of work. Making it work well in > both the context of automated test systems like Jenkins and for individual > developers is even more work. We could easily also take over maintenance of > things like shelldocs, since test-patch is the primary consumer of that > currently but it's generally useful tooling. > > In addition to test-patch, I think the proposed project has some future > growth potential. Given some adoption of test-patch to prove utility, the > project could build on the ties it makes to start building tools to help > projects do their own longer-run testing. Note that I'm talking about the > tools to build QA processes and not a particular set of tested components. > Specifically, I think the ChaosMonkey work that's in HBase should be > generalizable as a fault injection framework (either based on that code or > something like it). Doing this for arbitrary software is obviously very > difficult, and a part of easing that will be to make (and then favor) > tooling to allow projects to have operational glue that looks the same. 
> Namely, the shell work that's been done in hadoop-functions.sh would be a > great foundational layer that could bring good daemon handling practices to > a whole slew of software projects. In the event that these frameworks and > tools get adopted by parts of the Hadoop ecosystem, that could make the job > of i.e. Bigtop substantially easier. > > I've reached out to a few folks who have been involved in the current > test-patch work or expressed interest in helping out on getting it used in > other projects. Right now, the proposed PMC would be (alphabetical by last > name): > > * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds > pmc, sqoop pmc, all around Jenkins expert) > * Sean Busbey (ASF member, accumulo pmc, hbase pmc) > * Nick Dimiduk (hbase pmc, phoenix pmc) > * Chris Nauroth (ASF member, incubator pmc, hadoop pmc) > * Andrew Purtell (ASF member, incubator pmc, bigtop pmc, hbase pmc, > phoenix pmc) > * Allen Wittenauer (hadoop committer) > > That PMC gives us several members and a bunch of folks familiar with the > ASF. Combined with the code already existing in Apache spaces, I think that > gives us sufficient justification for a direct board proposal. > > The planned project name is "Apache Yetus". It's
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
I think this is a great idea! Having just gone through the process of getting Phoenix up to speed with precommits, it would be really nice to have a place to go other than "fork/hack someone else's work". For the same project, I recently integrated its first daemon service. This meant adding a bunch of servicy Python code (multi platform support is required) which I only sort of trust. Again, would be great to have an explicit resource for this kind of thing in the ecosystem. I expect Calcite and Kylin will be following along shortly. Since we're tossing out names, how about Apache Bootstrap? It's a meta-project to help other projects get off the ground, after all. -n On Monday, June 15, 2015, Sean Busbey wrote: > Oof. I had meant to push on this again but life got in the way and now the > June board meeting is upon us. Sorry everyone. In the event that this ends > up contentious, hopefully one of the copied communities can give us a > branch to work in. > > I know everyone is busy, so here's the short version of this email: I'd > like to move some of the code currently in Hadoop (test-patch) into a new > TLP focused on QA tooling. I'm not sure what the best format for priming > this conversation is. ORC filled in the incubator project proposal > template, but I'm not sure how much that confused the issue. So to start, > I'll just write what I'm hoping we can accomplish in general terms here. > > All software development projects that are community based (that is, > accepting outside contributions) face a common QA problem for vetting > in-coming contributions. Hadoop is fortunate enough to be sufficiently > popular that the weight of the problem drove tool development (i.e. > test-patch). That tool is generalizable enough that a bunch of other TLPs > have adopted their own forks. 
Unfortunately, in most projects this kind of > QA work is an enabler rather than a primary concern, so often the tooling > is worked on ad-hoc and little shared improvements happen across > projects. Since > the tooling itself is never a primary concern, any made is rarely reused > outside of ASF projects. > > Over the last couple months a few of us have been working on generalizing > the tooling present in the Hadoop code base (because it was the most mature > out of all those in the various projects) and it's reached a point where we > think we can start bringing on other downstream users. This means we need > to start establishing things like a release cadence and to grow the new > contributors we have to handle more project responsibility. Personally, I > think that means it's time to move out from under Hadoop to drive things as > our own community. Eventually, I hope the community can help draw in a > group of folks traditionally underrepresented in ASF projects, namely QA > and operations folks. > > I think test-patch by itself has enough scope to justify a project. Having > a solid set of build tools that are customizable to fit the norms of > different software communities is a bunch of work. Making it work well in > both the context of automated test systems like Jenkins and for individual > developers is even more work. We could easily also take over maintenance of > things like shelldocs, since test-patch is the primary consumer of that > currently but it's generally useful tooling. > > In addition to test-patch, I think the proposed project has some future > growth potential. Given some adoption of test-patch to prove utility, the > project could build on the ties it makes to start building tools to help > projects do their own longer-run testing. Note that I'm talking about the > tools to build QA processes and not a particular set of tested components. 
> Specifically, I think the ChaosMonkey work that's in HBase should be > generalizable as a fault injection framework (either based on that code or > something like it). Doing this for arbitrary software is obviously very > difficult, and a part of easing that will be to make (and then favor) > tooling to allow projects to have operational glue that looks the same. > Namely, the shell work that's been done in hadoop-functions.sh would be a > great foundational layer that could bring good daemon handling practices to > a whole slew of software projects. In the event that these frameworks and > tools get adopted by parts of the Hadoop ecosystem, that could make the job > of i.e. Bigtop substantially easier. > > I've reached out to a few folks who have been involved in the current > test-patch work or expressed interest in helping out on getting it used in > other projects. Right now, the proposed PMC would be (alphabetical by last > name): > > * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds > pmc, sqoop pmc, all around Jenkins expert) > * Sean Busbey (ASF member, accumulo pmc, hbase pmc) > * Nick Dimiduk (hbase pmc, phoenix pmc) > * Chris Nauroth (ASF member, incubator pmc, hadoop pmc) > * Andrew Purtell (ASF member, incubator pmc, bigto
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
+1. From its user's viewpoint, recent improvements on test-patch made my work really efficient. For example, quick feedback due to avoiding unnecessary tests, automated build environment setup due to Docker support, automated patch download from JIRA, automated shellcheck and whitespace checker, etc. I believe it is worth spreading these ideas as a TLP over other projects having the same problems such as a long QA process. 2015-06-16 15:08 GMT+09:00 Chris Douglas : > +1 A separate project sounds great. It'd be great to have more > standard tooling across the ecosystem. > > As a practical matter, how should projects consume releases? -C > > On Mon, Jun 15, 2015 at 4:47 PM, Sean Busbey wrote: > > Oof. I had meant to push on this again but life got in the way and now > the > > June board meeting is upon us. Sorry everyone. In the event that this > ends > > up contentious, hopefully one of the copied communities can give us a > > branch to work in. > > > > I know everyone is busy, so here's the short version of this email: I'd > > like to move some of the code currently in Hadoop (test-patch) into a new > > TLP focused on QA tooling. I'm not sure what the best format for priming > > this conversation is. ORC filled in the incubator project proposal > > template, but I'm not sure how much that confused the issue. So to start, > > I'll just write what I'm hoping we can accomplish in general terms here. > > > > All software development projects that are community based (that is, > > accepting outside contributions) face a common QA problem for vetting > > in-coming contributions. Hadoop is fortunate enough to be sufficiently > > popular that the weight of the problem drove tool development (i.e. > > test-patch). That tool is generalizable enough that a bunch of other TLPs > > have adopted their own forks. 
Unfortunately, in most projects this kind > of > > QA work is an enabler rather than a primary concern, so often the tooling > > is worked on ad-hoc and little shared improvements happen across > > projects. Since > > the tooling itself is never a primary concern, any made is rarely reused > > outside of ASF projects. > > > > Over the last couple months a few of us have been working on generalizing > > the tooling present in the Hadoop code base (because it was the most > mature > > out of all those in the various projects) and it's reached a point where > we > > think we can start bringing on other downstream users. This means we need > > to start establishing things like a release cadence and to grow the new > > contributors we have to handle more project responsibility. Personally, I > > think that means it's time to move out from under Hadoop to drive things > as > > our own community. Eventually, I hope the community can help draw in a > > group of folks traditionally underrepresented in ASF projects, namely QA > > and operations folks. > > > > I think test-patch by itself has enough scope to justify a project. > Having > > a solid set of build tools that are customizable to fit the norms of > > different software communities is a bunch of work. Making it work well in > > both the context of automated test systems like Jenkins and for > individual > > developers is even more work. We could easily also take over maintenance > of > > things like shelldocs, since test-patch is the primary consumer of that > > currently but it's generally useful tooling. > > > > In addition to test-patch, I think the proposed project has some future > > growth potential. Given some adoption of test-patch to prove utility, the > > project could build on the ties it makes to start building tools to help > > projects do their own longer-run testing. Note that I'm talking about the > > tools to build QA processes and not a particular set of tested > components. 
> > Specifically, I think the ChaosMonkey work that's in HBase should be > > generalizable as a fault injection framework (either based on that code > or > > something like it). Doing this for arbitrary software is obviously very > > difficult, and a part of easing that will be to make (and then favor) > > tooling to allow projects to have operational glue that looks the same. > > Namely, the shell work that's been done in hadoop-functions.sh would be a > > great foundational layer that could bring good daemon handling practices > to > > a whole slew of software projects. In the event that these frameworks and > > tools get adopted by parts of the Hadoop ecosystem, that could make the > job > > of i.e. Bigtop substantially easier. > > > > I've reached out to a few folks who have been involved in the current > > test-patch work or expressed interest in helping out on getting it used > in > > other projects. Right now, the proposed PMC would be (alphabetical by > last > > name): > > > > * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds > > pmc, sqoop pmc, all around Jenkins expert) > > * Sean Busbey (ASF member, accumulo pmc, hbase pmc) > > * Nick Di
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
+1 A separate project sounds great. It'd be great to have more standard tooling across the ecosystem. As a practical matter, how should projects consume releases? -C On Mon, Jun 15, 2015 at 4:47 PM, Sean Busbey wrote: > Oof. I had meant to push on this again but life got in the way and now the > June board meeting is upon us. Sorry everyone. In the event that this ends > up contentious, hopefully one of the copied communities can give us a > branch to work in. > > I know everyone is busy, so here's the short version of this email: I'd > like to move some of the code currently in Hadoop (test-patch) into a new > TLP focused on QA tooling. I'm not sure what the best format for priming > this conversation is. ORC filled in the incubator project proposal > template, but I'm not sure how much that confused the issue. So to start, > I'll just write what I'm hoping we can accomplish in general terms here. > > All software development projects that are community based (that is, > accepting outside contributions) face a common QA problem for vetting > in-coming contributions. Hadoop is fortunate enough to be sufficiently > popular that the weight of the problem drove tool development (i.e. > test-patch). That tool is generalizable enough that a bunch of other TLPs > have adopted their own forks. Unfortunately, in most projects this kind of > QA work is an enabler rather than a primary concern, so often the tooling > is worked on ad-hoc and little shared improvements happen across > projects. Since > the tooling itself is never a primary concern, any made is rarely reused > outside of ASF projects. > > Over the last couple months a few of us have been working on generalizing > the tooling present in the Hadoop code base (because it was the most mature > out of all those in the various projects) and it's reached a point where we > think we can start bringing on other downstream users. 
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
+1 ZooKeeper is another project that has expressed interest in improving its pre-commit process lately. I understand Allen has had some success applying this to the ZooKeeper build too, with some small caveats around quirks in the build.xml that I think we can resolve. I'm interested in defining how the release model works for a project like this. The current model of forking and checking it in directly to multiple projects leads to the fragmentation and bugs described earlier in the thread. Another possible model is something more dynamic, like a bootstrap script capable of checking out a release from a git tag before launching pre-commit. I'm interested to hear from various projects on how they'd like to integrate. --Chris Nauroth On 6/15/15, 8:57 PM, "Josh Elser" wrote: >+1 > >(Have been talking to Sean in private on the subject -- seems >appropriate to voice some public support) > >I'd be interested in this for Accumulo and Slider. For Accumulo, we've >come a far way without a pre-commit build, primarily due to a CTR >process. We have seen the repeated questions of "how do I run the tests" >which a more automated workflow would help with, IMO. I think Slider >could benefit with the same reasons. > >I'd also be giddy to see the recent improvements in Hadoop trickle down >into the other projects that Allen already mentioned. > >Take this as record that I'd be happy to try to help out where possible. > >Sean Busbey wrote: >> thank you for making a more digestible version Allen. 
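The bootstrap model raised in the message above (check out a release from a git tag, then launch pre-commit) could look roughly like the sketch below. Nothing in it comes from the thread: the repository URL, the tag name, the cache directory, and the `test-patch.sh` entry-point path are all placeholder assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical bootstrap: fetch a tagged release of the shared QA tooling,
# then hand off to its pre-commit entry point.
# TOOL_REPO, TOOL_TAG, TOOL_CACHE and the entry-point path are assumptions.

TOOL_REPO="${TOOL_REPO:-https://example.invalid/qa-tooling.git}"
TOOL_TAG="${TOOL_TAG:-rel/0.1.0}"
TOOL_CACHE="${TOOL_CACHE:-${HOME}/.cache/qa-tooling}"

# Print the checkout directory for a tag. Slashes become underscores so
# every release gets its own flat directory under the cache.
checkout_dir() {
  printf '%s/%s\n' "${TOOL_CACHE}" "$(printf '%s' "$1" | tr '/' '_')"
}

# Clone the tag once; later runs reuse the cached checkout.
# git's --branch option also accepts a tag name.
fetch_release() {
  fr_dir="$(checkout_dir "$1")"
  if [ ! -d "${fr_dir}/.git" ]; then
    git clone --quiet --depth 1 --branch "$1" "${TOOL_REPO}" "${fr_dir}" >&2 || return 1
  fi
  printf '%s\n' "${fr_dir}"
}

# Fetch the pinned release and replace this process with the tester.
bootstrap() {
  b_dir="$(fetch_release "${TOOL_TAG}")" || {
    echo "bootstrap: could not fetch ${TOOL_TAG} from ${TOOL_REPO}" >&2
    return 1
  }
  exec "${b_dir}/test-patch.sh" "$@"
}

# Launch only when arguments (for example an issue id or a patch file) are
# given, so the file can also be sourced for its functions.
if [ "$#" -gt 0 ]; then
  bootstrap "$@"
fi
```

With a wrapper like this, a project's Jenkins job would pass its usual pre-commit arguments through, and moving to a new release would mean changing only `TOOL_TAG`. Whether projects would prefer that over checking a release into their own tree is the open question the message asks about.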
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
+1 (Have been talking to Sean in private on the subject -- seems appropriate to voice some public support) I'd be interested in this for Accumulo and Slider. For Accumulo, we've come a long way without a pre-commit build, primarily due to a CTR (commit-then-review) process. We have seen the repeated questions of "how do I run the tests" which a more automated workflow would help with, IMO. I think Slider could benefit for the same reasons. I'd also be giddy to see the recent improvements in Hadoop trickle down into the other projects that Allen already mentioned. Take this as record that I'd be happy to try to help out where possible. Sean Busbey wrote: thank you for making a more digestible version Allen. :) If you're interested in soliciting feedback from other projects, I created ASF short links to this thread in common-dev and hbase: * http://s.apache.org/yetus-discuss-hadoop * http://s.apache.org/yetus-discuss-hbase While I agree that it's important to get feedback from ASF projects that might find this useful, I can say that recently I've been involved in the non-ASF project YCSB and both the pretest and better shell stuff would be immensely useful over there. On Mon, Jun 15, 2015 at 10:36 PM, Allen Wittenauer wrote: I'm clearly +1 on this idea. As part of the rewrite in Hadoop of test-patch, it was amazing to see how far and wide this bit of code has spread. So I see consolidating everyone's efforts as a huge win for a large number of projects (esp. considering how many I saw suffering from a variety of identified bugs!). But… I think it's important for people involved in those other projects to speak up and voice an opinion as to whether this is useful. To summarize: In the short term, a single location to get/use a precommit patch tester rather than everyone building/supporting their own in their spare time. FWIW, we've already got the code base modified to be pluggable. We've written some basic/simple plugins that support Hadoop, HBase, Tajo, Tez, Pig, and Flink.
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
thank you for making a more digestible version Allen. :) If you're interested in soliciting feedback from other projects, I created ASF short links to this thread in common-dev and hbase: * http://s.apache.org/yetus-discuss-hadoop * http://s.apache.org/yetus-discuss-hbase While I agree that it's important to get feedback from ASF projects that might find this useful, I can say that recently I've been involved in the non-ASF project YCSB and both the pretest and better shell stuff would be immensely useful over there. On Mon, Jun 15, 2015 at 10:36 PM, Allen Wittenauer wrote: > > I'm clearly +1 on this idea. As part of the rewrite in Hadoop of > test-patch, it was amazing to see how far and wide this bit of code as > spread. So I see consolidating everyone's efforts as a huge win for a > large number of projects. (esp considering how many I saw suffering from a > variety of identified bugs! ) > > But…. > > I think it's important for people involved in those other projects > to speak up and voice an opinion as to whether this is useful. > > To summarize: > > In the short term, a single location to get/use a precommit patch > tester rather than everyone building/supporting their own in their spare > time. > > FWIW, we've already got the code base modified to be pluggable. > We've written some basic/simple plugins that support Hadoop, HBase, Tajo, > Tez, Pig, and Flink. For HBase and Flink, this does include their custom > checks. Adding support for other project shouldn't be hard. Simple > projects take almost no time after seeing the basic pattern. > > I think it's worthwhile highlighting that means support for both > JIRA and GitHub as well as Ant and Maven from the same code base. > > Longer term: > > Well, we clearly have ideas of things that we want to do. Adding > more features to test-patch (review board? gradle?) is obvious. But what > about teasing apart and generalizing some of the other shell bits from > projects? 
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
I'm clearly +1 on this idea. As part of the rewrite in Hadoop of test-patch, it was amazing to see how far and wide this bit of code has spread. So I see consolidating everyone's efforts as a huge win for a large number of projects (esp. considering how many I saw suffering from a variety of identified bugs!). But… I think it's important for people involved in those other projects to speak up and voice an opinion as to whether this is useful. To summarize: In the short term, a single location to get/use a precommit patch tester rather than everyone building/supporting their own in their spare time. FWIW, we've already got the code base modified to be pluggable. We've written some basic/simple plugins that support Hadoop, HBase, Tajo, Tez, Pig, and Flink. For HBase and Flink, this does include their custom checks. Adding support for other projects shouldn't be hard. Simple projects take almost no time after seeing the basic pattern. I think it's worthwhile highlighting that this means support for both JIRA and GitHub as well as Ant and Maven from the same code base. Longer term: Well, we clearly have ideas of things that we want to do. Adding more features to test-patch (review board? gradle?) is obvious. But what about teasing apart and generalizing some of the other shell bits from projects? A common library for building CLI tools to fault injection to release documentation creation tools to … I'd even like to see us get as advanced as a "run this program to auto-generate daemon stop/start bits". I had a few chats with people about this idea at Hadoop Summit. What's truly exciting are the ideas that people had once they realized what kinds of problems we're trying to solve. It's always amazing the problems that projects have that could be solved by these types of solutions. Let's stop hiding our cool toys in this area. So, what feedback and ideas do you have in this area? Are you a yay or a nay? On Jun 15, 2015, at 4:47 PM, Sean Busbey wrote: > Oof.
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
Oof. I had meant to push on this again but life got in the way and now the June board meeting is upon us. Sorry everyone. In the event that this ends up contentious, hopefully one of the copied communities can give us a branch to work in. I know everyone is busy, so here's the short version of this email: I'd like to move some of the code currently in Hadoop (test-patch) into a new TLP focused on QA tooling. I'm not sure what the best format for priming this conversation is. ORC filled in the incubator project proposal template, but I'm not sure how much that confused the issue. So to start, I'll just write what I'm hoping we can accomplish in general terms here. All software development projects that are community based (that is, accepting outside contributions) face a common QA problem for vetting incoming contributions. Hadoop is fortunate enough to be sufficiently popular that the weight of the problem drove tool development (i.e. test-patch). That tool is generalizable enough that a bunch of other TLPs have adopted their own forks. Unfortunately, in most projects this kind of QA work is an enabler rather than a primary concern, so the tooling is often worked on ad hoc and few shared improvements happen across projects. Since the tooling itself is never a primary concern, any improvement made is rarely reused outside of ASF projects. Over the last couple of months a few of us have been working on generalizing the tooling present in the Hadoop code base (because it was the most mature out of all those in the various projects) and it's reached a point where we think we can start bringing on other downstream users. This means we need to start establishing things like a release cadence and to grow the new contributors we have so they can handle more project responsibility. Personally, I think that means it's time to move out from under Hadoop to drive things as our own community.
Eventually, I hope the community can help draw in a group of folks traditionally underrepresented in ASF projects, namely QA and operations folks. I think test-patch by itself has enough scope to justify a project. Having a solid set of build tools that are customizable to fit the norms of different software communities is a bunch of work. Making it work well in both the context of automated test systems like Jenkins and for individual developers is even more work. We could easily also take over maintenance of things like shelldocs, since test-patch is currently its primary consumer, though it's generally useful tooling. In addition to test-patch, I think the proposed project has some future growth potential. Given some adoption of test-patch to prove utility, the project could build on the ties it makes to start building tools to help projects do their own longer-run testing. Note that I'm talking about the tools to build QA processes and not a particular set of tested components. Specifically, I think the ChaosMonkey work that's in HBase should be generalizable as a fault injection framework (either based on that code or something like it). Doing this for arbitrary software is obviously very difficult, and a part of easing that will be to make (and then favor) tooling to allow projects to have operational glue that looks the same. Namely, the shell work that's been done in hadoop-functions.sh would be a great foundational layer that could bring good daemon handling practices to a whole slew of software projects. In the event that these frameworks and tools get adopted by parts of the Hadoop ecosystem, that could make the job of, e.g., Bigtop substantially easier. I've reached out to a few folks who have been involved in the current test-patch work or expressed interest in helping out on getting it used in other projects.
Right now, the proposed PMC would be (alphabetical by last name): * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds pmc, sqoop pmc, all around Jenkins expert) * Sean Busbey (ASF member, accumulo pmc, hbase pmc) * Nick Dimiduk (hbase pmc, phoenix pmc) * Chris Nauroth (ASF member, incubator pmc, hadoop pmc) * Andrew Purtell (ASF member, incubator pmc, bigtop pmc, hbase pmc, phoenix pmc) * Allen Wittenauer (hadoop committer) That PMC gives us several members and a bunch of folks familiar with the ASF. Combined with the code already existing in Apache spaces, I think that gives us sufficient justification for a direct board proposal. The planned project name is "Apache Yetus". It's an archaic genus of sea snail and most of our project will be focused on shell scripts. N.b.: this does not mean that the Hadoop community would _have_ to rely on the new TLP, but I hope that once we have a release that can be evaluated there'd be enough benefit to strongly encourage it. This has mostly been focused on scope and community issues, and I'd love to talk through any feedback on that. Additionally, are there any other points folks want to make sure are covered before we have a resolution? On Sat, Jun 6, 2015
[DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
Sorry for the resend. I figured this deserves a [DISCUSS] flag. On Sat, Jun 6, 2015 at 10:39 PM, Sean Busbey wrote: > Hi Folks! > > After working on test-patch with other folks for the last few months, I > think we've reached the point where we can make the fastest progress > towards the goal of a general use pre-commit patch tester by spinning > things into a project focused on just that. I think we have a mature enough > code base and a sufficient fledgling community, so I'm going to put > together a TLP proposal. > > Thanks for the feedback thus far from use within Hadoop. I hope we can > continue to make things more useful. > > -Sean > > On Wed, Mar 11, 2015 at 5:16 PM, Sean Busbey wrote: > >> HBase's dev-support folder is where the scripts and support files live. >> We've only recently started adding anything to the maven builds that's >> specific to jenkins[1]; so far it's diagnostic stuff, but that's where I'd >> add in more if we ran into the same permissions problems y'all are having. >> >> There's also our precommit job itself, though it isn't large[2]. AFAIK, >> we don't properly back this up anywhere, we just notify each other of >> changes on a particular mail thread[3]. >> >> [1]: https://github.com/apache/hbase/blob/master/pom.xml#L1687 >> [2]: https://builds.apache.org/job/PreCommit-HBASE-Build/ (they're all >> red because I just finished fixing "mvn site" running out of permgen) >> [3]: http://s.apache.org/NT0 >> >> >> On Wed, Mar 11, 2015 at 4:51 PM, Chris Nauroth >> wrote: >> >>> Sure, thanks Sean! Do we just look in the dev-support folder in the >>> HBase >>> repo? Is there any additional context we need to be aware of? >>> >>> Chris Nauroth >>> Hortonworks >>> http://hortonworks.com/ >>> >>> >>> >>> >>> >>> >>> On 3/11/15, 2:44 PM, "Sean Busbey" wrote: >>> >>> >+dev@hbase >>> > >>> >HBase has recently been cleaning up our precommit jenkins jobs to make >>> >them >>> >more robust.
From what I can tell our stuff started off as an earlier >>> >version of what Hadoop uses for testing. >>> > >>> >Folks on either side open to an experiment of combining our precommit >>> >check >>> >tooling? In principle we should be looking for the same kinds of things. >>> > >>> >Naturally we'll still need different jenkins jobs to handle different >>> >resource needs and we'd need to figure out where stuff eventually lives, >>> >but that could come later. >>> > >>> >On Wed, Mar 11, 2015 at 4:34 PM, Chris Nauroth < >>> cnaur...@hortonworks.com> >>> >wrote: >>> > >>> >> The only thing I'm aware of is the failOnError option: >>> >> >>> >> >>> >> >>> http://maven.apache.org/plugins/maven-clean-plugin/examples/ignoring-errors.html >>> >> >>> >> >>> >> I prefer that we don't disable this, because ignoring different kinds >>> of >>> >> failures could leave our build directories in an indeterminate state. >>> >>For >>> >> example, we could end up with an old class file on the classpath for >>> >>test >>> >> runs that was supposedly deleted. >>> >> >>> >> I think it's worth exploring Eddy's suggestion to try simulating >>> failure >>> >> by placing a file where the code expects to see a directory. That >>> might >>> >> even let us enable some of these tests that are skipped on Windows, >>> >> because Windows allows access for the owner even after permissions >>> have >>> >> been stripped. >>> >> >>> >> Chris Nauroth >>> >> Hortonworks >>> >> http://hortonworks.com/ >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> On 3/11/15, 2:10 PM, "Colin McCabe" wrote: >>> >> >>> >> >Is there a maven plugin or setting we can use to simply remove >>> >> >directories that have no executable permissions on them? Clearly we >>> >> >have the permission to do this from a technical point of view (since >>> >> >we created the directories as the jenkins user), it's simply that the >>> >> >code refuses to do it. >>> >> > >>> >> >Otherwise I guess we can just fix those tests...
>>> >> > >>> >> >Colin >>> >> > >>> >> >On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu wrote: >>> >> >> Thanks a lot for looking into HDFS-7722, Chris. >>> >> >> >>> >> >> In HDFS-7722: >>> >> >> TestDataNodeVolumeFailureXXX tests reset data dir permissions in >>> >> >>TearDown(). >>> >> >> TestDataNodeHotSwapVolumes resets permissions in a finally clause. >>> >> >> >>> >> >> Also I ran mvn test several times on my machine and all tests >>> passed. >>> >> >> >>> >> >> However, since in DiskChecker#checkDirAccess(): >>> >> >> >>> >> >> private static void checkDirAccess(File dir) throws >>> >>DiskErrorException { >>> >> >> if (!dir.isDirectory()) { >>> >> >> throw new DiskErrorException("Not a directory: " >>> >> >> + dir.toString()); >>> >> >> } >>> >> >> >>> >> >> checkAccessByFileMethods(dir); >>> >> >> } >>> >> >> >>> >> >> One potentially safer alternative is replacing the data dir with a >>> >>regular >>> >> >> file to simulate disk failures. >>> >> >> >>> >> >> On Tue, Mar 10, 2015 at 2:19 PM, Chris