RE: [ASFCS42] Proposed schedule for our next release

Will Chan Fri, 19 Apr 2013 14:59:26 -0700


> -----Original Message-----
> From: David Nalley [mailto:da...@gnsa.us]
> Sent: Thursday, April 18, 2013 6:41 PM
> To: dev@cloudstack.apache.org
> Subject: Re: [ASFCS42] Proposed schedule for our next release
> 
> On Thu, Apr 18, 2013 at 6:26 PM, Will Chan <will.c...@citrix.com> wrote:
> >
> > > -----Original Message-----
> > > From: Chip Childers [mailto:chip.child...@sungard.com]
> > > Sent: Monday, April 15, 2013 7:22 AM
> > > To: dev@cloudstack.apache.org
> > > Cc: cloudstack-...@incubator.apache.org
> > > Subject: Re: [ASFCS42] Proposed schedule for our next release
> > >
> > > On Thu, Apr 11, 2013 at 02:50:02PM -0700, Animesh Chaturvedi wrote:
> > > >
> > > > > I want to call out my concern on technical debt we have
> > > > > accumulated so far.
> > > >
> > > >  I did an analysis on JIRA bugs yesterday night PST on "Affects
> > > > Version = 4.1" and created since Dec 2012
> > > >
> > > > > Total records : 429
> > > > > Resolution Type (Invalid, Duplicate, Cannot reproduce etc.) : 87
> > > > > (30 Blockers, 27 Critical, 27 Major, 4 Minor)
> > > > > Valid Defects : 429-87 = 342
> > > > > Fixed : 246 (60 Blockers, 70 Critical, 99 Majors), out of which 217
> > > > > were fixed since Feb
> > > > > Unresolved : 96 (1 Blocker, 8 Critical, 64 Major)
> > > >
> > > > > With this data it looks like we have fixed 2/3 of valid defects in
> > > > > little over 2 months and are pretty much deferring around 1/3 of the
> > > > > issues to a future release.
> > > >
> > > > > I also looked at overall backlog of bugs (Critical, Major and
> > > > > Blockers only) as of 4/10/2013 - 10:0PM PST.
> > > >
> > > > 284 open (18 Blocker, 38 Critical, 228 Major) ; By Fix version
> > > >     -  Release 4.0.x and prior: 13
> > > >     -  4.1: 70
> > > >     -  4.2 : 97
> > > >     -  Future: 8
> > > >     -  No version: 107
> > > >
> > > > > Given that we fixed 217 bugs in roughly 2 months during the 4.1
> > > > > cycle, fixing the backlog of bugs will probably take us 2 months.
> > > > > Should we extend the 4.2 test cycle by 2 months [Original Schedule:
> > > > > 6/1 - 7/22, Extended Schedule: 6/1 - 9/22] to reduce the technical
> > > > > debt significantly? I would like to hear how the community wants to
> > > > > address technical debt. Based on the input and consensus I will
> > > > > publish the agreed schedule next week.
> > > >
> > > >
> > >
> > > I don't think that an extension of time changes bug counts really.
> > > IMO, we need to pull together to have some bug-fix focused effort
> > > applied to the code-base.  It's also another reason that I'm so big
> > > on making sure that automated tests come in with the new features.
> > > That doesn't address test scenarios that human testers can come up
> > > with, but if a developer spends the time to think about testing the
> > > > basic feature and codifies that, we should at least avoid the "this
> > > > actually doesn't work at all" types of bugs.
> > >
> > > There's a school of thought that says, don't build another feature
> > > until you have sorted out the known bugs in the current features.  I
> > > don't think we could really pull that off, but perhaps a different
> > > thread to rally people around the bug backlog is in order?
> > >
> > > -chip
> >
> > Sorry to chime in so late to this thread as I've been offsite for the
> > better part of this week.  I was one of the original 4 month release
> > crowd but after the recent two releases of ACS, I'm starting to wonder
> > if we shouldn't start moving this to a 6 month cycle instead of two.
> > Here are some high level observations based on the previous two releases:
> >
> > 1. It doesn't seem like we are on a true 4 month time based release
> > schedule.  Both 4.0 and 4.1 were delayed more than several weeks past
> > the original proposed GA date.  4.0 was released 11/6 and let's assume
> > that 4.1 will ship within a week or two.  That's almost a 6 month
> > release cycle.
> 
> So both 4.0 and 4.1 strike me as extraordinary. 4.0 was our first release -
> and we had lots of issues to resolve. 4.1 introduced a ton of packaging and
> name changes that I also consider to be hopefully one time. Really - we've
> only been through our release cycle once, so I am not ready to declare it
> perpetually behind schedule.
> 
> 
> > Every release incurs a fixed cost of release notes, upgrade testing,
> > etc. that I suspect eats at least a month's worth of time depending on
> > people's schedules.  On a 4 month cycle that's 3 months out of the year,
> > rather than two if we can get to a 6 month cycle.  We can use that extra
> > month for other purposes if need be.  I suppose if we want to continue
> > to release past the proposed hard GA date, then I guess it doesn't
> > matter if it's 4 or 6 months.  It's basically a release when the release
> > mgmt. team feels it's right to release based on current bugs, etc.
> >
> 
> Having seen the point releases twice now, which still need upgrade testing,
> release notes, etc., I don't get the feeling that the 'overhead' referred to
> above is the problem. Joe may disagree with me.
> 
> > 2. As more and more features/development go in, it just means more
> > destabilization of the code.  4.0 was delayed and the majority of that
> > work was licensing files.  4.1 got just a bit more complicated with new
> > feature development and the delay is now much longer.  Not all features
> > are created equal in terms of testing.  Some, like adding a new
> > hypervisor, may require more time to develop but may not impact the
> > entire system.  However, work like refactoring vm sync or other more
> > internal code could affect the entire stack and require more QA time.
> > We need extra time for new code to settle in.
> >
> 
> I wonder why we would merge a feature when we can't prove that it works
> and doesn't break the entire stack. Some of this is the missing
> automation you talk about below. Essentially we have no way, sometimes
> until months after the merge, to tell if something works or not because we
> rely on manual QA to test it.
> 
> > 3. ACS is still dependent largely on manual QA.  Let's face it, our
> > automated testing/unit testing isn't mature enough quite yet and we
> > cannot always expect manual QA to be there and on the ACS schedule.
> > CloudStack releases have some type of quality expectations as well as
> > support for upgrades.  Upgrades and migration scripts aren't that easily
> > automatable.  Chip and others have been very diligent on ensuring that
> > code check in has the appropriate tests but it's not there yet.
> >
> > 4. ACS development is based on volunteer work and many of us have a
> > $dayjob and may not be able to assist with fixing bugs on the ACS
> > schedule.  Having only a couple of months to fix bugs and expecting
> > others to follow our ACS schedule seems a bit rushed.  Wearing my Citrix
> > hat now, I can tell you that 2 months of QA and bug fixing is not enough
> > to release a quality GA release.  And that is with me breathing down the
> > necks of many of the engineers to get them fixed on time.  ACS does not
> > have this type of culture and nor should it.  Given that, we should be a
> > bit more flexible in terms of allowing people eventually to act on issues.
> >
> 
> 
> So a couple of other comments.
> We have folks clamoring for the awesome new features. To the point they
> are creating derivative works (which tells me we are doing some things right
> as folks are finding it easy enough to do)
> 
> What I gathered from reading the above doesn't really have anything to do
> with schedule:
> * New development destabilizes our code base, and is a threat to quality
> and the release schedule
> * We can not depend on the current level of manual QA to be present going
> forward.
> 
> This brings me to the conclusion that as a community we should seriously
> temper our inclusion of new features and make our focus automated
> testing until such time as pushing a release out is less months of manual QA
> processes and more of a decision. This makes me want to raise the barrier
> for merges even higher. Perhaps running the entire Marvin suite with the
> proposed merge is what we need to begin mandating.
> 
> --David "who wishes he had kept working on Automated QA tasks" Nalley :)


One of the reasons for the 4 month cycle, if I remember correctly, is to
release features faster.  I think by now we have enough evidence that
automated QA is not quite there yet and, according to Joe, there is definitely
a "fixed" cost of time to every major release.  I'm not against a 4 month
release cycle, but I am stating that perhaps we are at a stage where we can
benefit more from a longer release with longer QA cycles.  I'm not proposing
that with a 6 month cycle we push the code freeze date so we can incorporate
new features.  I'm proposing that we give the community more time to QA, fix
bugs, and add unit/Marvin test cases with the bug fixes so we can have a
stable base.  As Chip said, the more stable the base, the faster we can start
adding more features.  Perhaps by that time, we can go back to a 4 month
release cycle if the community wants that.
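
For anyone who wants to check the arithmetic behind this discussion, here is a rough sketch in Python. The bug counts are the ones quoted from Animesh's JIRA analysis earlier in the thread. The constant fix rate and the one-month fixed cost per release are assumptions taken from the estimates made in this thread, not measured values, and the variable and function names are purely illustrative.

```python
# Figures quoted from the JIRA analysis earlier in this thread.
total_records = 429
invalid_or_duplicate = 87
fixed = 246
fixed_since_feb = 217
open_backlog = 284  # open Blocker + Critical + Major bugs as of 4/10/2013

valid_defects = total_records - invalid_or_duplicate  # 342
unresolved = valid_defects - fixed                    # 96

fixed_share = fixed / valid_defects
unresolved_share = unresolved / valid_defects

# Assumption: the 4.1 fix rate (217 bugs in roughly 2 months) holds steady.
fix_rate_per_month = fixed_since_feb / 2
months_to_clear_backlog = open_backlog / fix_rate_per_month


def overhead_months_per_year(cycle_months, overhead_per_release=1.0):
    """Months per year spent on fixed release cost (notes, upgrade testing).

    Assumption: every major release costs about one month, as suggested
    in this thread; the real cost is not known.
    """
    return (12 / cycle_months) * overhead_per_release


print(f"Fixed share of valid defects:      {fixed_share:.0%}")
print(f"Unresolved share of valid defects: {unresolved_share:.0%}")
print(f"Months to clear the open backlog:  {months_to_clear_backlog:.1f}")
print(f"Overhead on a 4 month cycle: {overhead_months_per_year(4):.0f} months/year")
print(f"Overhead on a 6 month cycle: {overhead_months_per_year(6):.0f} months/year")
```

Under these assumptions the backlog estimate comes out somewhat above the "2 months" quoted above, because the open backlog (284) is larger than the number of bugs fixed during the 4.1 cycle (217).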

Will
