> -----Original Message-----
> From: David Nalley [mailto:da...@gnsa.us]
> Sent: Thursday, April 18, 2013 6:41 PM
> To: dev@cloudstack.apache.org
> Subject: Re: [ASFCS42] Proposed schedule for our next release
>
> On Thu, Apr 18, 2013 at 6:26 PM, Will Chan <will.c...@citrix.com> wrote:
> >
> > > -----Original Message-----
> > > From: Chip Childers [mailto:chip.child...@sungard.com]
> > > Sent: Monday, April 15, 2013 7:22 AM
> > > To: dev@cloudstack.apache.org
> > > Cc: cloudstack-...@incubator.apache.org
> > > Subject: Re: [ASFCS42] Proposed schedule for our next release
> > >
> > > On Thu, Apr 11, 2013 at 02:50:02PM -0700, Animesh Chaturvedi wrote:
> > > >
> > > > I want to call out my concern on technical debt we have
> > > > accumulated so far.
> > > >
> > > > I did an analysis on JIRA bugs yesterday night PST on "Affects
> > > > Version = 4.1" and created since Dec 2012
> > > >
> > > > Total records : 429
> > > > Resolution Type (Invalid, Duplicate, Cannot reproduce etc.) : 87
> > > >   (30 Blockers, 27 Critical, 27 Major, 4 Minor)
> > > > Valid Defects : 429-87= 342
> > > > Fixed : 246 (60 Blockers, 70 Critical, 99 Majors) out of which
> > > >   217 were fixed since Feb
> > > > Unresolved : 96 (1 Blocker, 8 Critical, 64 Major)
> > > >
> > > > With this data it looks like we have fixed 2/3 of valid defects in
> > > > little over 2 months and pretty much deferring around 1/3 rd of
> > > > issues for future release.
> > > >
> > > > I also looked at overall backlog of bugs (Critical, Major and
> > > > Blockers only) as of 4/10/2013 - 10:0PM PST.
> > > >
> > > > 284 open (18 Blocker, 38 Critical, 228 Major) ; By Fix version
> > > > - Release 4.0.x and prior: 13
> > > > - 4.1: 70
> > > > - 4.2 : 97
> > > > - Future: 8
> > > > - No version: 107
> > > >
> > > > Looking at that we fixed 217 bugs in roughly 2 months during 4.1
> > > > cycle, fixing the backlog of bug will probably take us 2 months.
> > > > Should we extend the 4.2 test cycle by 2 months [Original
> > > > Schedule: 6/1 - 7/22, Extended Schedule: 6/1-9/22] to reduce the
> > > > technical debt significantly? I would like to hear how community
> > > > wants to address technical debt. Based on the input and consensus
> > > > I will publish the agreed schedule next week.
> > > >
> > >
> > > I don't think that an extension of time changes bug counts really.
> > > IMO, we need to pull together to have some bug-fix focused effort
> > > applied to the code-base. It's also another reason that I'm so big
> > > on making sure that automated tests come in with the new features.
> > > That doesn't address test scenarios that human testers can come up
> > > with, but if a developer spends the time to think about testing the
> > > basic feature and codifies that, we should at least avoid the "this
> > > actually doesn't work at all" types of bugs.
> > >
> > > There's a school of thought that says, don't build another feature
> > > until you have sorted out the known bugs in the current features. I
> > > don't think we could really pull that off, but perhaps a different
> > > thread to rally people around the bug backlog is in order?
> > >
> > > -chip
> >
> > Sorry to chime in so late to this thread as I've been offsite for the
> > better part of this week. I was one of the original 4 month release
> > crowd but after the recent two releases of ACS, I'm starting to wonder
> > if we shouldn't start moving this to a 6 month cycle instead of two.
> > Here are some high level observations based on the previous two
> > releases:
> >
> > 1. It doesn't seem like we are on a true 4 month time based release
> > schedule. Both 4.0 and 4.1 were delayed more than several weeks past
> > the original proposed GA date. 4.0 was released 11/6 and let's assume
> > that 4.1 will ship within a week or two. That's almost a 6 month
> > release cycle.
>
> So both 4.0 and 4.1 strike me as extraordinary.
> 4.0 was our first release - and we had lots of issues to resolve. 4.1
> introduced a ton of packaging and name changes that I also consider to
> be hopefully one time. Really - we've only been through our release
> cycle once, so I am not ready to declare it perpetually behind schedule.
>
> > Every release incurs a fixed cost of release notes, upgrade testing,
> > etc. that I suspect at least eats a month worth of time depending on
> > people's schedule. That's 3 months out of the year rather than two if
> > we can get a 6 months cycle. We can use that extra month for other
> > purposes if need be. I suppose if we want to continue to release past
> > the proposed hard GA date, then I guess it doesn't matter if it's 4 or
> > 6 months. It's basically a release when the release mgmt. team feels
> > it's right to release based on current bugs, etc.
> >
>
> Having seen the point releases twice now, which still need upgrade
> testing, release notes, etc I don't get the feeling that the 'overhead'
> referred to above is the problem. Joe may disagree with me.
>
> > 2. As more and more features/development go in, it just means more
> > destabilization of the code. 4.0 was delayed and the majority of that
> > work was licensing files. 4.1 got just a bit more complicated with new
> > feature development and the delay is now much longer. Not all features
> > are created equal in terms of testing. Some may require more time to
> > develop but may not impact the entire system like for example, adding
> > a new hypervisor.
> > However, work like refactoring vm sync or other more internal code
> > could affect the entire stack and require more QA time. We need extra
> > time for new code to settle in.
> >
>
> I wonder why we would merge feature that we can't prove doesn't break
> the entire stack and prove that it works. Some of this is the missing
> automation you talk about below.
> Essentially we have no way, sometimes until months after the merge, to
> tell if something works or not because we rely on manual QA to test it.
>
> > 3. ACS is still dependent largely on manual QA. Let's face it, our
> > automated testing/unit testing isn't mature enough quite yet and we
> > cannot always expect manual QA to be there and on ACS schedule.
> > CloudStack releases have some type of quality expectations as well as
> > support for upgrades. Upgrades and migration scripts aren't that
> > easily automatable. Chip and others have been very diligent on
> > ensuring that code check in has the appropriate tests but it's not
> > there yet.
> >
> > 4. ACS development is based on volunteer work and many of us have a
> > $dayjob and may not be able to assist with fixing bugs in ACS
> > schedule. Having only a couple of months to fix bugs and expect others
> > to follow our ACS schedule seems a bit rushed. Wearing my Citrix hat
> > now, I can tell you that 2 months of QA and bug fixing is not enough
> > to release quality GA release. And that is with me breathing down the
> > necks of many of the engineers to get them fixed on time. ACS does not
> > have this type of culture and nor should it. Given that, we should be
> > a bit more flexible in terms of allowing people eventually to act on
> > issues.
> >
>
> So a couple of other comments.
> We have folks clamoring for the awesome new features. To the point they
> are creating derivative works (which tells me we are doing some things
> right as folks are finding it easy enough to do)
>
> What I gathered from reading the above doesn't really have anything to
> do with schedule:
> * New development destabilizes our code base, and is a threat to quality
>   and the release schedule
> * We can not depend on the current level of manual QA to be present
>   going forward.
>
> This brings me to conclusion that as a community we should seriously
> temper our inclusion of new features and make our focus automated
> testing until such time as pushing a release out is less months of
> manual QA processes and more of a decision. This makes me want to raise
> the barrier for merges even higher. Perhaps running the entire Marvin
> suite with the proposed merge is what we need to begin mandating.
>
> --David "who wishes he had kept working on Automated QA tasks" Nalley :)

One of the reasons for the 4 month cycle, if I remember correctly, is to release features faster? I think by now we have enough evidence that automated QA is not quite there yet and, according to Joe, there is definitely a "fixed" cost of time to every major release. I'm not against a 4 month release cycle, but I am stating that perhaps we are at a stage where we can benefit more from a longer release with longer QA cycles.

I'm not proposing that with a 6 month cycle we push the code freeze date so we can incorporate new features. I'm proposing that we give the community more time to QA, fix bugs, and add unit/Marvin test cases with the bug fixes so we can have a stable base. As Chip said, the more stable the base, the faster we can start adding more features. Perhaps by that time, we can go back to a 4 month release cycle if the community wants that.

Will