Hi Druid devs!

I've been thinking about our release process and would love to get your
thoughts on how we manage new features.

When a new feature is added, is it first marked as experimental?
How do users know which features are experimental?
How do we ensure that features do not break with each new release?
Should the release manager manually check that each feature works as part
of the release process? This doesn't seem like it can scale.
Should integration tests always be required if the feature is being added
to core?

To address these issues, I'd like to propose a feature lifecycle for all
features so that we can set expectations for users appropriately, whether
in the docs, the product, or both. Something like this:
* Alpha - Known major bugs / performance issues. Incomplete functionality.
Disabled by default.
* Beta - Feature is not yet battle tested in production. API and
compatibility may change in the future. May not be forward / backward
compatible.
* GA - Feature has appropriate user-facing documentation and testing so
that it won't regress with a version upgrade. Will be forward / backward
compatible for x releases (maybe 4, roughly 1 year).
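
To make the stages concrete, here is a rough sketch of how they might be
encoded so the docs or the product could surface them. It's in Python only
for brevity; Stage, EXPECTATIONS and docs_label are hypothetical names, not
anything that exists in Druid today.

```python
from enum import Enum


class Stage(Enum):
    """Lifecycle stages from the proposal (hypothetical encoding)."""
    ALPHA = "Alpha"
    BETA = "Beta"
    GA = "GA"


# Expectation text per stage, paraphrasing the list above.
EXPECTATIONS = {
    Stage.ALPHA: "Known major bugs or performance issues; "
                 "incomplete functionality; disabled by default.",
    Stage.BETA: "Not yet battle tested in production; "
                "API and compatibility may change.",
    Stage.GA: "Documented and tested against regressions; "
              "compatible across a fixed number of releases.",
}


def docs_label(feature: str, stage: Stage) -> str:
    """Build the one-line notice a docs page could show for a feature."""
    return f"[{stage.value}] {feature}: {EXPECTATIONS[stage]}"


if __name__ == "__main__":
    # "example-feature" is a placeholder, not a real Druid feature name.
    print(docs_label("example-feature", Stage.ALPHA))
```

Keeping the expectation text next to the stage means the docs and the
product would read the same wording from one place.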

I think a model like this will allow us to continue to ship features
quickly while keeping the release quality bar high so that our users can
continue to rely on Druid without worrying about upgrade issues.
I understand that adding integration tests may not always make sense for
early / experimental features when we're uncertain of the API or the
broader use case we're trying to solve. This model would make it clear to
our users which features are still a work in progress, and which ones they
can expect to remain stable for a longer time.

Below is an example of how I think this model can be applied to a new
feature:

This PR adds support for a new feature -
https://github.com/apache/druid/pull/9449

While it has been tested locally, changes may enter Druid before the 0.19
release that break this feature or, more likely, a refactoring after 0.19
may break something in it. In this example, I think the feature should be
marked as Alpha, since further changes to the functionality are expected.
At this stage, integration tests are not expected. Once the feature is
complete, there should be happy path integration tests for it, and it can
graduate to Beta. After it has been running in production for a while, the
feature can graduate to GA once we've added enough integration tests that
we feel confident it will continue to work as long as those tests pass.
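
The graduation path in this example could be written down as a simple
check. This is only a sketch in Python; FeatureStatus and its field names
are made up for illustration, and what counts as "a while" in production
or "enough" integration tests is still a judgment call that the code does
not settle.

```python
from dataclasses import dataclass


@dataclass
class FeatureStatus:
    """Hypothetical checklist for one feature; all fields are assumptions."""
    functionality_complete: bool = False
    happy_path_integration_tests: bool = False
    run_in_production: bool = False
    confident_integration_coverage: bool = False


def lifecycle_stage(status: FeatureStatus) -> str:
    """Return the highest stage whose criteria the feature meets."""
    # Alpha: still changing, integration tests not yet expected.
    if not (status.functionality_complete
            and status.happy_path_integration_tests):
        return "Alpha"
    # Beta: complete with happy path tests, but not yet proven in
    # production or covered well enough to trust a green test run.
    if not (status.run_in_production
            and status.confident_integration_coverage):
        return "Beta"
    return "GA"


if __name__ == "__main__":
    print(lifecycle_stage(FeatureStatus()))
    print(lifecycle_stage(FeatureStatus(functionality_complete=True,
                                        happy_path_integration_tests=True)))
```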

I know this is a very long email, but I look forward to hearing your
thoughts on this.
Suneet
