What: Proposal for a new release methodology for the Open MPI Project.

Why: We have [at least] 2 competing forces in Open MPI:
  - desire to release new features quickly.  Fast is good.
  - desire to release based on production quality.  Slow is good.

  The competition between these two forces has created both some
  tension in the Open MPI community and a Very Long release cycle for
  OMPI v1.3 (yes, it was our specific and deliberate choice to be
  feature driven -- but it was still verrrrry loooong).

How: Take some ideas from other well-established release paradigms, such as:
  - Linux kernel "odd/even" version number release methodology
  - Red Hat/Fedora stable vs. feature releases
  - Agile development models

When: For all releases after the v1.3 series (i.e., this proposal does
not include any releases in the v1.3 series).

--> Ralph and I will talk through all the details and answer any
    questions on tomorrow's teleconference (Tue, 17 Feb 2009).

=========================================================================

Details:

In v1.3, we let a lot of really good features sit in development for a
long, long time.  Yes, we chose to do this and there were good reasons
for doing so, but the fact remains that we had some really good stuff
done and stable for long periods of time, but it wasn't generally
available to users who wanted to use it.  Even for users who are
willing to be out on the bleeding edge, trunk tarballs are just too
scary.

Given the two competing forces mentioned above (feature/fast releases
+ stable/slow releases), it seems like we really want two different --
but overlapping -- release mechanisms.

Taking inspiration from other well-established paradigms, Ralph and I
propose the following for all releases starting with v1.4.0:

- Have two concurrent release series:
  1. "Super stable": for production users who care about stability
     above all else.  They're willing to wait long periods of time
     before updating to a new version of Open MPI.
  2. "Feature driven": for users who are willing to take a few chances
     to get new OMPI features -- but cannot endure the chaos of
     nightly trunk tarballs.

- The general idea is that a feature driven release is developed for a
  while in an RM-regulated yet somewhat agile development style.  When
  specific criteria are met (e.g., feature complete, schedule driven,
  etc.), the feature release series is morphed into a super stable
  state and released.  At this point, all development stops on that
  release series; only bug fixes are allowed.
- RMs therefore become responsible for *two* release series: a
  feature driven series and the corresponding super stable series that
  emerges from it.

***KEY POINT*** This "two release" methodology allows for the release
(and real-world testing) of new features in a much more timely fashion
than our current release methodology.

Here's a crude ASCII art representation of how branches will work
using this proposal in SVN:

          v1.3 series/super stable
             v1.3.0      v1.3.2                               v1.6.0
        /----|---|-------|-----------|---- >                /-|---|---|->
trunk  /         v1.3.1              v1.3.3                /
------------------------------------------------------------------------>
          \   v1.4.0  v1.4.2  v1.4.4  ...               v1.5.0   v1.5.1
           \--|---|---|---|---|---|---|---|---|---------|--------|------>
                  v1.4.1  v1.4.3      ...            now becomes
             v1.4/feature driven                     v1.5/super stable

Here's how a typical release cycle works:

- Assume that a "super stable" version exists; a release series that
  has an odd minor number: v1.3, v1.5, v1.7, ...etc.
- For this example, let's assume that the super stable is v1.3.
- Only bug fixes go into the "super stable" series.

- Plans for the next "super stable" are drawn up (v1.5 in this
  example), including a list of goals, new features, a timeline, etc.

- A new feature release series is created shortly after the first
  super stable release with a minor version number that is even (e.g.,
  v1.4, v1.6, v1.8, ...etc.).
- In this example, the feature release series will be v1.4.
- The v1.4 branch is taken to a point with at least some degree of
  stability and released as v1.4.0.
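
The odd/even numbering rule described so far can be sketched in a few
lines of Python.  This is only an illustration of the convention
proposed here, not an existing Open MPI tool; the function names
series_kind and stable_successor are made up for this example.

```python
def series_kind(series):
    """Classify a release series such as "v1.4" by its minor number.

    Under this proposal, odd minor numbers are "super stable" and even
    minor numbers are "feature driven".  (Hypothetical helper.)
    """
    major, minor = series.lstrip("v").split(".")[:2]
    return "super stable" if int(minor) % 2 == 1 else "feature driven"


def stable_successor(series):
    """Return the super stable name that a feature series is renamed to.

    For example, the v1.4 feature series eventually becomes v1.5.
    """
    if series_kind(series) != "feature driven":
        raise ValueError(series + " is not a feature driven series")
    major, minor = series.lstrip("v").split(".")[:2]
    return "v{}.{}".format(major, int(minor) + 1)


for s in ("v1.3", "v1.4", "v1.5", "v1.6"):
    print(s, "->", series_kind(s))
print("v1.4 is renamed to", stable_successor("v1.4"))
```

A full version string such as "v1.4.2" is classified by its series, so
it also counts as feature driven.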

- Development on the SVN trunk continues.

- According to a public schedule (probably tied to our teleconference
  schedule), the RMs will approve the moving of features and bug
  fixes to the feature release.
  - Rather than submitting CMRs for the Gatekeeper to move, the
    "owner" of a particular feature/bug fix will be assigned a
    specific time to move the item to the feature branch.
  - For example, George will have from Tues-Fri to move his cool new
    feature X to the v1.4 branch.
  - Friday night, new 1.4 tarballs are cut and everyone's MTT tries
    them out.
  - Iterate for the next week or so to get the v1.4 branch stable.
  - Rinse, repeat.

- Once the feature series meets certain criteria (e.g., feature
  complete, timeline is met, etc.), it undergoes a period of intense
  testing and debugging to achieve "super stable" status.  Once "super
  stable" has been reached, the branch is renamed to be "v1.5" and we
  start the whole cycle again (with v1.6/v1.7).
  - CMRs and Gatekeepers are used on the super stable series.
  - The older super stable series (v1.3) then becomes either
    unsupported or "less supported."

***KEY POINT*** The schedule of moving features and bug fixes to
the release branch is somewhat fluid.  If George doesn't have time to
move feature X in his appointed week, the RMs shuffle him back further
in the schedule and take the next item off the list.  This shuffling
allows for rapid response to dynamic resource availability at each
organization.
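
As a rough sketch of that shuffling, assume the schedule is a simple
ordered list of (owner, item) pairs and that an owner who has no time
this week is moved to the end of the list.  Both the data layout and
the "move to the end" policy are assumptions made for illustration;
the proposal only says that the RMs shuffle the owner further back.
The names next_item_to_move, "owner B", "owner C", and features Y and
Z are hypothetical.

```python
from collections import deque


def next_item_to_move(schedule, is_ready):
    """Pick the next scheduled item whose owner has time this week.

    `schedule` is a deque of (owner, item) pairs in planned order.
    Items whose owners are not ready are shuffled to the back of the
    schedule rather than dropped.  Returns None if nobody is ready.
    """
    for _ in range(len(schedule)):
        owner, item = schedule.popleft()
        if is_ready(owner):
            return owner, item
        schedule.append((owner, item))  # shuffle back; try the next item
    return None


schedule = deque([
    ("George", "feature X"),
    ("owner B", "feature Y"),
    ("owner C", "feature Z"),
])
busy = {"George"}  # George has no time in his appointed week

picked = next_item_to_move(schedule, lambda owner: owner not in busy)
print("moving this week:", picked)
print("remaining schedule:", list(schedule))
```

Here feature Y moves to the release branch this week, and feature X
stays on the schedule behind feature Z instead of being lost.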

***KEY POINT*** One of the goals of this proposal is to remove the
stigma of not getting into a given release -- because the feature
branch will have somewhat frequent releases (probably one every 1-3
months; see below).  Hence, we want to try to avoid the
tendency of OMPI developers to pack in a million features right before
a release, fearing that if feature X is not included in this release,
it'll sit on the SVN trunk for a year before release.

To ground all of the above discussion in a concrete proposal:

  1. Ralph and I will be responsible for the v1.4 and v1.5 series.
  2. Immediately start treating the v1.3 series as "super stable"
     (although the v1.3 series is grandfathered -- George and Brad
     are still the RMs of v1.3 and are not bound by this proposal).
  3. (more or less) Immediately create the v1.4 branch from the SVN
     trunk.  Start working toward v1.4.0.
  4. Ralph and I will draw up a public list of desired features for
     the next "super stable" series -- v1.5.  This will include what
     has already happened on the trunk (and will therefore be in
     v1.4.0).
  5. Ralph and I will also make up a public schedule of when each
     feature will move from the trunk to the v1.4 branch.  As
     mentioned above, this schedule is meant to be a living document;
     we fully expect that scheduled items will move around as
     time/resources/features shift.
  6. We'll periodically release 1.4.x versions with clear delineations
     of what new features are available in each.  A SWAG for release
     frequency will be a release every 1-3 months.  It might be easier
     to say that our initial intent is to release at least once a
     quarter; the specific frequency will likely be determined on a
     case-by-case basis.
  7. Once all the v1.5 features are in the v1.4 branch (or if we run
     out of time, or ...), rename it to v1.5, conduct a concerted
     community effort to stabilize v1.5 to "super stable" status, and
     release it.
  8. Start the whole cycle again with v1.6/v1.7.

Ralph and I feel that this proposal is well-suited to the development
style of the Open MPI community.  We'll describe this in detail and
answer any questions on tomorrow's teleconference.

--
Jeff Squyres
Cisco Systems
