On Tue, Dec 01, 2015 at 05:10:57PM -0800, Devananda van der Veen wrote:
> On Tue, Dec 1, 2015 at 3:22 AM, Steven Hardy <[email protected]> wrote:
> > On Mon, Nov 30, 2015 at 03:35:13PM -0800, Devananda van der Veen wrote:
> >   On Mon, Nov 30, 2015 at 3:07 PM, Zane Bitter <[email protected]> wrote:
> >
> >    On 30/11/15 12:51, Ruby Loo wrote:
> >
> >     On 30 November 2015 at 10:19, Derek Higgins <[email protected]> wrote:
> >
> >       Hi All,
> >
> >       A few months ago tripleo switched from its devtest based CI to
> >       one that was based on instack. Before doing this we anticipated
> >       disruption in the ci jobs and removed them from non tripleo
> >       projects.
> >
> >       We'd like to investigate adding it back to heat and ironic as
> >       these are the two projects where we find our ci provides the
> >       most value. But we can only do this if the results from the job
> >       are treated as voting.
> >
> >     What does this mean? That the tripleo job could vote and do a -1
> >     and block ironic's gate?
> >
> >       In the past most of the non tripleo projects tended to ignore
> >       the results from the tripleo job as it wasn't unusual for the
> >       job to be broken for days at a time. The thing is, ignoring the
> >       results of the job is the reason (the majority of the time) it
> >       was broken in the first place.
> >
> >       To decrease the number of breakages we are now no longer
> >       running master code for everything (for the non tripleo
> >       projects we bump the versions we use periodically if they are
> >       working).
> >       I believe with this model the CI jobs we run have become a lot
> >       more reliable, there are still breakages but far less
> >       frequently.
> >
> >       What I am proposing is we add at least one of our tripleo jobs
> >       back to both heat and ironic (and other projects associated
> >       with them e.g. clients, ironicinspector etc.), tripleo will
> >       switch to running latest master of those repositories and the
> >       cores approving on those projects should wait for a passing CI
> >       job before hitting approve.
> >
> >       So how do people feel about doing this? can we give it a go? A
> >       couple of people have already expressed an interest in doing
> >       this but I'd like to make sure we're all in agreement before
> >       switching it on.
> >
> >     This seems to indicate that the tripleo jobs are non-voting, or
> >     at least won't block the gate -- so I'm fine with adding tripleo
> >     jobs to ironic. But if you want cores to wait/make sure they
> >     pass, then shouldn't they be voting? (Guess I'm a bit confused.)
> >
> >    +1
> >
> >    I don't think it hurts to turn it on, but tbh I'm uncomfortable
> >    with the mental overhead of a non-voting job that I have to
> >    manually treat as a voting job. If it's stable enough to make it
> >    a voting job, I'd prefer we just make it voting. And if it's not
> >    then I'd like to see it be made stable enough to be a voting job
> >    and then make it voting.
> >
> >   This is roughly where I sit as well -- if it's non-voting,
> >   experience tells me that it will largely be ignored, and as such,
> >   isn't a good use of resources.
> >
> > I'm sure you can appreciate it's something of a chicken/egg problem
> > though - if everyone always ignores non-voting jobs, they never become
> > voting.
> >
> > That effect is magnified with TripleO though, because it consumes so
> > many OpenStack projects, any one of which has the capability to break
> > our CI, so in an ideal world we'd have voting feedback on
> > all-the-things, but that's not where we are right now due in large
> > part to the steady stream of regressions (from Heat, Ironic and other
> > projects).
> >
> >   I haven't looked at tripleo or tripleoci in a while, so I won't
> >   assume that my recollection of the CI jobs bears any resemblance to
> >   what exists today. Could you explain what areas of ironic (or its
> >   subprojects) will be covered by these tests? If they are already
> >   covered by existing tests, then I don't see the benefit of adding
> >   another job; conversely, if this is testing areas we don't cover
> >   today, then there's probably value in running tripleoci in a voting
> >   fashion for now and then moving that coverage into ironic's project
> >   testing.
> >
> > I like to think of TripleO as a trunk-chasing "power user", and as
> > such it gives very valuable "user" feedback, including breaking things
> > in exciting ways you hadn't anticipated in your project integration
> > tests.
> >
> > This has, in the case of Heat at least, made TripleO an extremely
> > effective "kitchen sink" stress test, and has uncovered numerous
> > issues we failed to find with our internal tests (obviously we do add
> > coverage when we find them).
> >
> > In the case of Ironic, I think the usage is somewhat less demanding,
> > but no less "real world" - here's a good example for you:
> >
> > https://bugs.launchpad.net/ironic/+bug/1507738
> >
> > In this case, Ironic landed a change to master, which broke all
> > existing deployments using CentOS/RHEL derived distributions, so
> > master Ironic has been broken for folks using those distros for over
> > 6 weeks.
> >
> > I know in that case, the problem was a really old ipxe image in the
> > distro, and yes there were several possible workarounds, but as a
> > developer who cares about users, I personally would rather get gate
> > feedback than angry users on IRC/email when I unwittingly break the
> > world for them ;)
> >
> > (note, I'm not assigning any blame above, it's one of *many* examples
> > of unexpected breakage due to insufficient gate feedback of real usage
> > across many projects).
>
> Great example, Steve, and I agree that more and faster feedback from
> users into patches is a good thing. I'm also sad that it was broken for
> that long and no one raised the issue in our meeting until this week.
>
> This particular bug highlights a gap in Ironic's test coverage which I
> would be delighted if someone wants to close -- that we aren't testing
> support for RH-based distros. Closing that gap doesn't require TripleoCI
> at all; we should simply add a dsvm job for Ironic on Fedora, using a
> Fedora-based ramdisk. That will help prevent similar regressions in the
> future.
>
> Anyway, I have big reservations about putting TripleoCI on a path to
> ever gating Ironic patches. I started to bikeshed on that and then
> deleted it ... tl;dr: I believe it is important for this job to vote in
> a non-gating way. As a reviewer, I'm unlikely to pay attention to it if
> it doesn't vote, and there's a good reason for this:
>
> Non-voting jobs are used for experimentation. A non-voting job is a job
> that we want to vote, but which we don't trust enough yet. It has been
> promoted from the experimental pipeline to the check pipeline so that it
> gets a lot more runs and so that we can stabilize it enough to make it
> voting.

Ah, I think all we have here is a terminology mismatch around "non voting"
vs "non gating". AFAIK what is being proposed is to reinstate the TripleO
jobs so they *do* vote on any change (+1/-1), but they do not block the
gate, so we won't get in the way if occasional outages happen.

> I was going to suggest that tripleoci vote as a third party CI system (I
> know, it's not actually a third-party CI system, but I'd like to vote like
> one). And then I noticed that it used to do just that. [0] If I'm
> interpreting it correctly, the "gate-tripleo-ironic*" jobs voted from a
> separate account, left an informative -1, but did not block the gate.
> That's exactly what I would like in this case.

+1, I think that's what's being proposed, so we're in agreement! :)

Steve

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
