On Tue, Aug 19, 2014 at 7:15 PM, David Nalley <da...@gnsa.us> wrote:
> >
> > IMHO we should not even release 4.5 until we have agreed upon:
> >
> > -what our issues are and why we released 4.4 and 4.3 late.
> > -taken action to resolve those issues
> > -guarantees that 4.5 will be on time
> >
> > If we don't do that, I don't even know why we are putting ourselves
> > through the pain of a release schedule.
> >
>
> So I've been trying to give this some thought. Here's my current line
> of thinking.
>
> The issues with late releases are not a function of our release
> process per se, but are instead a function of our development process.
> CloudStack is a relatively large codebase. It has a lot of points
> that interact with each other, and it's moderately complex.
> Development moves forward and at least happy-path testing is done for
> new features, but the range of options is so large that testing
> everything is a bit difficult. When someone makes a merge request, I
> suspect few people do much looking. Understandable, it's a boring
> task, and really looking doesn't tell us much except for style and
> egregious errors. We've rarely done mandatory testing of feature
> branches before they are merged in. If you want to ship on time, you
> must ensure that we are vociferously guarding the quality of the
> master and release branches; that we can verify programmatically that
> a commit or merge doesn't break things. We must insist on automated
> testing being added.
>
> So I've said all of that to say that I think that ship has sailed for
> 4.5. We are well past feature freeze, and we didn't really have any
> gating functionality. We frankly have very little idea of the quality
> of what's in master right now. It's certainly worse than 4.4. So now
> we'll enter code freeze, we'll try and play catch up and fix all of the
> things we discover that are broken. And invariably, we'll be late
> again.
>
> If you want to solve this problem, my personal belief is that it
> really is tied to CI. Efforts around Travis are interesting and
> perhaps are a piece of that puzzle. Discussions around running CI are
> important as well, but I truly believe that we need a gating function
> that prohibits commits that increase our level of untested code or
> code that fails to pass testing. I've seen some other projects using
> pull requests in GitHub, and then using the GitHub pull request
> builder[1] for Jenkins to verify that every PR works. I know we've
> talked about Gerrit previously, and perhaps that will work as well.
>
> [1] https://wiki.cloudbees.com/bin/view/DEV/Github+Pull+Request+Validation
>
A lot of valid points. One of my personal beliefs is that we need a better CI solution/system, rather than depending on the simulator. There seems to be some sort of consensus/habit of trusting the simulator output, but both 4.3 and 4.4 have/had some _serious_ issues that the simulator didn't/doesn't catch. Sometimes the problem lies in the systemvm, sometimes it could be a quirk of the hypervisor. It doesn't really matter which, when you don't actually deploy the essential parts of an ACS installation to test them.

Unfortunately I have no good solution; I am far too new to the whole ACS ecosystem for that.

--
Erik