Hi! Thanks for the suggestions and feedback, Gian and Surekha! I've not forgotten about this thread :)
I've been a little busy recently, and haven't had time to fully process the
suggestions and write up responses / next steps. I'll follow up on this next
week.

On Mon, Jun 15, 2020 at 11:55 PM Surekha Saharan <surekha.saha...@imply.io>
wrote:

> Thanks for starting this discussion; regression testing is essential for
> maintaining code quality and happy users. Introducing a feature lifecycle
> sounds good to me. I have some questions and comments:
>
> - Right now we have features categorized into "experimental" and "GA". Are
> you suggesting we further divide the experimental features into "alpha"
> and "beta"?
>
> - We should clearly define the meaning of "alpha", "beta", and "GA"
> features to set expectations for both contributors and users. Your "alpha"
> definition says "known major bugs", which doesn't sound right to me; I
> think alpha features can have "known limitations or unknown major bugs". I
> agree with the idea of undocumented "alpha" features and documented "beta"
> features.
>
> - For GA features, you suggested forward/backward compatibility for x
> number of releases. I think 2 might be a good number, as I often see folks
> skipping a version when upgrading; or, if we decide to do major releases
> less often in the future as suggested, 1 is fine too.
>
> - For the existing GA features which do not meet the testing criteria,
> instead of putting the burden on the developer who fixes a bug in that
> area in the future, we could create GitHub issues for the major features,
> and someone from the community wanting to understand a feature could help
> write integration tests for it. Of course, if a contributor writes those
> tests as part of a bug fix or enhancement, that's welcome too.
>
> - Not all, but I think a lot of regressions can be prevented by writing
> good unit tests for changes related to APIs, serde, etc., while
> integration tests are necessary for end-to-end testing and for catching
> dependency changes.
>
> By the way, is there a page where we list all the experimental features,
> or do we add a note about a feature's experimental nature on its own
> documentation page, or in the release notes? Thanks, Gian, for classifying
> the current features into "alpha", "beta", and "GA"; it helps in
> understanding the definitions.
>
> -Surekha
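(For illustration, here is a minimal sketch of the kind of serde round-trip
unit test Surekha describes above, using plain Jackson and JUnit 4. The
ExampleFilter class and its fields are hypothetical stand-ins for a real
query/filter type, not actual Druid classes.)

    import com.fasterxml.jackson.annotation.JsonCreator;
    import com.fasterxml.jackson.annotation.JsonProperty;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import java.util.Objects;
    import org.junit.Assert;
    import org.junit.Test;

    public class ExampleFilterSerdeTest
    {
      // Hypothetical Jackson-serialized type standing in for a real
      // query/filter class.
      static class ExampleFilter
      {
        private final String dimension;
        private final String value;

        @JsonCreator
        ExampleFilter(
            @JsonProperty("dimension") String dimension,
            @JsonProperty("value") String value
        )
        {
          this.dimension = dimension;
          this.value = value;
        }

        @JsonProperty
        public String getDimension()
        {
          return dimension;
        }

        @JsonProperty
        public String getValue()
        {
          return value;
        }

        @Override
        public boolean equals(Object o)
        {
          if (this == o) {
            return true;
          }
          if (!(o instanceof ExampleFilter)) {
            return false;
          }
          ExampleFilter that = (ExampleFilter) o;
          return Objects.equals(dimension, that.dimension)
                 && Objects.equals(value, that.value);
        }

        @Override
        public int hashCode()
        {
          return Objects.hash(dimension, value);
        }
      }

      @Test
      public void testSerdeRoundTrip() throws Exception
      {
        ObjectMapper mapper = new ObjectMapper();
        ExampleFilter original = new ExampleFilter("country", "US");

        // Serialize to JSON and back. Note that equals() must cover every
        // serialized field, or a silently dropped field can still pass
        // this assertion.
        ExampleFilter fromJson = mapper.readValue(
            mapper.writeValueAsString(original),
            ExampleFilter.class
        );
        Assert.assertEquals(original, fromJson);
      }
    }

(A test like this catches the common regression where a new field is added
to the constructor but not wired into the JSON properties, which tests of
the feature's logic alone would miss.)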
> On Mon, Jun 15, 2020 at 8:40 PM Gian Merlino <g...@apache.org> wrote:
>
> > IMO the alpha / beta / GA terminology makes sense, and makes things
> > clearer to users, which is good.
> >
> > Some thoughts on the specifics of your proposal:
> >
> > - You're suggesting we commit to a specific number of releases that a
> > GA feature will be forward / backward compatible for. IMO, our current
> > commitment (one major release) is okay, but it would be good to strive
> > to break compatibility as infrequently as possible. In the future, we
> > may decide to do major releases less often, which will naturally
> > lengthen the commitment times.
> >
> > - I like the idea of phasing in the testing bar as features move from
> > alpha -> beta -> GA. I think it'd be good to point to examples of
> > features where the testing is done "right" for each stage. It should
> > help contributors know what to shoot for.
> >
> > - Plenty of GA features today do not meet the testing bar you've
> > mentioned, including some "day 1" features. This is fine — it is a
> > natural consequence of raising the testing bar over time — but we
> > should have an idea of what we want to do about this. One possible
> > approach is to require that tests be added to meet the bar whenever
> > fixes or changes are made to the feature. But this leads to situations
> > where a small change can't be made without adding a mountain of tests.
> > IMO it'd be good to do an amount of new testing commensurate with the
> > scope of the change. A big refactor of a feature that doesn't have much
> > testing should involve adding a mountain of tests to it. But we don't
> > necessarily need to require that for a small bug fix or enhancement
> > (though it would be great, of course!).
> >
> > - For "beta", the definition you suggest is all negative ("not battle
> > tested", "may change", "may not be compatible"). We should include
> > something positive as well, to illustrate what makes beta better than
> > alpha. How about "no major known issues" or "no major API changes
> > planned"?
> >
> > - I would suggest moving the "appropriate user-facing documentation"
> > requirement to beta rather than GA. In order to have a useful beta
> > testing period, we need good user-facing docs so people can try the
> > feature out.
> >
> > - I think we might want to leave some alpha features undocumented, if
> > their quality or stability level is so low that they won't be useful to
> > people who aren't developers. The goal would be to avoid clogging up
> > the user-facing docs with a bunch of half-baked stuff. Too much of that
> > lowers the perceived quality level of the project.
> >
> > Now, thinking about specific features, I suggest we classify the
> > current experimental features in the following way:
> >
> > - Java 11 support: Beta or GA (depending on how good the test coverage
> > is)
> > - HTTP remote task runner: Alpha (there aren't integration tests yet)
> > - Router process: GA
> > - Indexer process: Alpha or Beta (also depending on how good the test
> > coverage is)
> > - Segment locking / minor compaction: Alpha
> > - Approximate histograms: GA, but deprecated (they are stable and have
> > plenty of tests, but users should consider switching to DataSketches
> > quantiles)
> > - Lookups: Beta
> > - Kinesis ingestion: GA (now that there are integration tests:
> > https://github.com/apache/druid/pull/9724)
> > - Materialized view extension: Alpha
> > - Moments sketch extension: Alpha
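(Purely for illustration: if we wanted the lifecycle stage to be visible in
the codebase as well as in the docs, a marker annotation along these lines
could work. This is only a sketch of the idea; no such annotation exists in
Druid today, and all names here are made up.)

    import java.lang.annotation.Documented;
    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    /**
     * Hypothetical marker recording a feature's lifecycle stage, so that
     * the alpha/beta/GA classification above is visible during code
     * review as well as in the docs.
     */
    @Documented
    @Retention(RetentionPolicy.SOURCE)
    @Target({ElementType.TYPE, ElementType.METHOD})
    public @interface FeatureLifecycle
    {
      enum Stage
      {
        ALPHA, BETA, GA
      }

      Stage value();
    }

    // Usage, following the classification above (illustrative only):
    //
    //   @FeatureLifecycle(FeatureLifecycle.Stage.ALPHA)
    //   public class HttpRemoteTaskRunner { ... }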
> > On Mon, Jun 8, 2020 at 1:49 PM Suneet Saldanha <suneet.salda...@imply.io>
> > wrote:
> >
> > > Hi Druid devs!
> > >
> > > I've been thinking about our release process and would love to get
> > > your thoughts on how we manage new features.
> > >
> > > When a new feature is added, is it first marked as experimental?
> > > How do users know which features are experimental?
> > > How do we ensure that features do not break with each new release?
> > > Should the release manager manually check that each feature works as
> > > part of the release process? That doesn't seem like it can scale.
> > > Should integration tests always be required if the feature is being
> > > added to core?
> > >
> > > To address these issues, I'd like to propose we introduce a feature
> > > lifecycle for all features so that we can set expectations for users
> > > appropriately - either in the docs, the product, or both. I'd like to
> > > propose something like this:
> > >
> > > * Alpha - Known major bugs / performance issues. Incomplete
> > > functionality. Disabled by default.
> > > * Beta - Feature is not yet battle tested in production. API and
> > > compatibility may change in the future. May not be forward / backward
> > > compatible.
> > > * GA - Feature has appropriate user-facing documentation and testing,
> > > so that it won't regress with a version upgrade. Will be forward /
> > > backward compatible for x releases (maybe 4? ~1 year).
> > >
> > > I think a model like this will allow us to continue to ship features
> > > quickly while keeping the release quality bar high, so that our users
> > > can continue to rely on Druid without worrying about upgrade issues.
> > > I understand that adding integration tests may not always make sense
> > > for early / experimental features, when we're uncertain of the API or
> > > the broader use case we're trying to solve. This model would make it
> > > clear to our users which features are still in progress, and which
> > > ones they can expect to remain stable for a longer time.
> > >
> > > Below is an example of how I think this model can be applied to a new
> > > feature:
> > >
> > > This PR adds support for a new feature:
> > > https://github.com/apache/druid/pull/9449
> > >
> > > While it has been tested locally, there may be changes that enter
> > > Druid before the 0.19 release that break this feature, or more
> > > likely, a refactoring after 0.19 that breaks something in it. In this
> > > example, I think the feature should be marked as alpha, since future
> > > changes to the functionality are expected. At this stage, integration
> > > tests are not expected either. Once the feature is complete, there
> > > should be happy-path integration tests for it, and it can graduate to
> > > Beta. After it has been running in production for a while, the
> > > feature can graduate to GA, once we've added enough integration tests
> > > that we feel confident it will continue to work as long as those
> > > tests run successfully.
> > >
> > > I know this is a very long email, but I look forward to hearing your
> > > thoughts on this.
> > > Suneet
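(One last illustration: the "disabled by default" requirement for alpha
features in the proposal above could be as simple as an opt-in runtime
property, along these lines. The property name and class are hypothetical,
not actual Druid configuration.)

    import java.util.Properties;

    public class AlphaFeatureGate
    {
      // Hypothetical opt-in property; an alpha feature defaults to
      // disabled, so users must enable it explicitly.
      private static final String PROPERTY =
          "druid.feature.myAlphaFeature.enabled";

      private final boolean enabled;

      public AlphaFeatureGate(Properties runtimeProperties)
      {
        this.enabled = Boolean.parseBoolean(
            runtimeProperties.getProperty(PROPERTY, "false")
        );
      }

      public boolean isEnabled()
      {
        return enabled;
      }
    }

(Graduating the feature to beta or GA would then just mean flipping the
default value, without touching any call sites.)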