alamb opened a new issue, #16622: URL: https://github.com/apache/datafusion/issues/16622
### Is your feature request related to a problem or challenge?

One of the dreams of the composable data ecosystem is to quickly assemble a system from various components (DataFusion, data formats, etc.).

DataFusion still releases once a month, which allows code to flow quickly but also causes at least two challenges:

1. Upgrading downstream projects takes non-trivial work, as mentioned in https://github.com/apache/datafusion/issues/5269
2. Upgrading and using downstream third-party extensions is hard

Third-party extensions like delta-rs and iceberg provide `TableProviders` for DataFusion, which is really nice. However, to use those packages, the versions of DataFusion must match exactly. This means an application that relies on multiple downstream packages must wait until **ALL** of them have upgraded to the new version before it can upgrade DataFusion. Any delay in a downstream library's update delays the application's upgrade too.

For example, for an application that wants to use delta-rs, iceberg, and the `table-providers` crate, there is a race after each DataFusion upgrade. Let's take an example release timeline:

1. +0 days: DataFusion version `X` is released
2. +7 days: a new delta-rs release upgrades to DataFusion `X`
3. +11 days: a new iceberg crate release upgrades to DataFusion `X`
4. +12 days: a new table-providers version is released
5. +13-30 days: the end-user app can upgrade DataFusion, delta-rs, and iceberg
6. +31 days: a new DataFusion version is released again

### Describe the solution you'd like

I would like downstream libraries to have more time and schedule flexibility when upgrading DataFusion and other dependent crates, so that it is easier to construct a system from different components.

### Describe alternatives you've considered

## Option 1: Switch to major/minor release cadence

We could follow the model of arrow-rs, which releases monthly but makes breaking releases only quarterly.
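As a rough illustration of why a major/minor cadence could ease the exact-version-match problem described above, here is a minimal sketch of the compatibility rule Cargo applies to default (caret) version requirements for versions with major >= 1. It assumes downstream crates declare their DataFusion dependency with such a requirement; the `Version` type, the `satisfies_caret` helper, and the version numbers are hypothetical and only for illustration, not Cargo's actual implementation.

```rust
/// A simplified semantic version (major.minor.patch). Hypothetical helper type.
/// The derived ordering compares major, then minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

impl Version {
    const fn new(major: u64, minor: u64, patch: u64) -> Self {
        Self { major, minor, patch }
    }
}

/// Simplified caret rule for versions with major >= 1:
/// `candidate` satisfies `^required` when both share a major version
/// and `candidate` is not older than `required`.
fn satisfies_caret(required: Version, candidate: Version) -> bool {
    assert!(required.major >= 1, "this sketch only covers major >= 1");
    candidate.major == required.major && candidate >= required
}

fn main() {
    // Assumption: a downstream crate was built against a (made-up) DataFusion 48.0.0.
    let downstream_requires = Version::new(48, 0, 0);

    // A non-breaking minor release still satisfies the downstream requirement,
    // so the application can pick it up without waiting for a downstream release.
    let minor_release = Version::new(48, 1, 0);
    println!(
        "48.1.0 usable with downstream: {}",
        satisfies_caret(downstream_requires, minor_release)
    );

    // A breaking major release does not, so every downstream crate must upgrade first.
    let major_release = Version::new(49, 0, 0);
    println!(
        "49.0.0 usable with downstream: {}",
        satisfies_caret(downstream_requires, major_release)
    );
}
```

Under these assumptions, only the major releases would trigger the "wait for every downstream crate" race, while minor releases in between could be adopted independently.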
Here is how it works in arrow-rs: https://github.com/apache/arrow-rs?tab=readme-ov-file#release-versioning-and-schedule

The major cost here is that maintainers and contributors would have to be diligent about not merging breaking API changes until a major release.

This is possible to automate somewhat:
- https://github.com/apache/datafusion/pull/16078 from @logan-keede
- https://github.com/apache/datafusion/pull/16541 from @lic

## Option 2: LTS and feature branch

Keep (at least) two branches going: LTS and main, as proposed by @andygrove in https://github.com/apache/datafusion/issues/5269

In this model we would likely backport changes to the LTS branch and make releases from there. The downside of this approach is the extra work needed to backport changes to LTS.

### Additional context

_No response_

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org