I also think it sounds like a good process to get all the various packages released in a timely manner. Thank you for taking point on this issue.
Andrew

On Sun, Aug 1, 2021 at 11:24 PM Andy Grove <andygrov...@gmail.com> wrote:

> Thanks QP. This seems reasonable to me.
>
> On Sun, Aug 1, 2021, 3:24 PM QP Hou <houqp....@gmail.com> wrote:
>
> > Summarizing the discussed proposal in our Github issue [1] for broader
> > discussion and review on the dev list.
> >
> > The current arrow-datafusion repo contains the following high level
> > subprojects: datafusion, datafusion python binding and ballista.
> >
> > In order to be able to release ballista and datafusion python binding
> > with semantic versioning, I propose we decouple subproject versions
> > from each other. As a result, we will be able to release a breaking
> > change in datafusion without forcing a major version bump in ballista
> > or python binding if that breaking change is not visible to their
> > consumers.
> >
> > To reduce release overhead, we will still vote on the whole
> > arrow-datafusion repo on every release. From the same release tarball,
> > we can then release these sub-projects to their language specific
> > registries (crates.io and pypi) with their own versions.
> >
> > Take the upcoming datafusion 5.0.0 release as an example. Within the
> > same source release, we also have the code for ballista-0.5.0 and
> > datafusion-python-0.3.0. We only need to vote on a signed
> > apache-arrow-datafusion-5.0.0.tar.gz tarball.
> >
> > Consequence of this process is every time we need to release a new
> > version of the python binding or ballista, we need to trigger a new
> > datafusion release as well. However, datafusion release won't require
> > a new release from the other two subprojects. For example, datafusion
> > 5.1.0 release can just include a datafusion python release 0.4.0
> > without a ballista release. In that case, we will just skip crates.io
> > publish for ballista.
> >
> > Here is what the release process will look like:
> >
> > * Send a PR with the following changes to prepare the source tree for
> >   a new release:
> >   - Update versions in Cargo.toml files
> >   - Run automation script to generate
> >     {datafusion,python,ballista}/CHANGELOG.md
> > * After PR gets merged, push git tag x.y.z to Github
> > * Run dev/release/create-tarball.sh to create and upload a signed
> >   tarball for voting in the dev list
> > * After vote passed, run ./dev/release/release-tarball.sh to move
> >   approved tarball to the release location in SVN
> > * Unpack released tarball and release subproject to language specific
> >   registries:
> >   - run `cargo publish` in datafusion to release datafusion to crates.io
> >   - if there is a new ballista release
> >     - run `cargo publish` in
> >       ballista/rust/{client,core,executor,scheduler} folders to release
> >       ballista to crates.io
> >     - push `ballista-x.y.z` tag to Github
> >   - if there is a new datafusion python release
> >     - run `maturin publish` in python folder to release datafusion
> >       python binding to pypi
> >     - release python documentation
> >     - push `python-x.y.z` tag to Github
> >
> > I would like to get some feedback on this proposal since it is a
> > little bit different from other Arrow projects. But I do think this
> > will provide a better dependency pinning experience and changelog
> > tracking for those sub-projects' downstream consumers.
> >
> > [1]: https://github.com/apache/arrow-datafusion/issues/771
> >
> >
> > On Tue, Jul 27, 2021 at 4:18 PM Andrew Lamb <al...@influxdata.com> wrote:
> > >
> > > Thanks to you both -- this sounds great.
> > >
> > > On Tue, Jul 27, 2021 at 8:37 AM Jiayu Liu <jimex...@gmail.com> wrote:
> > >
> > > > Not sure it's necessarily bundled together but I believe a Python,
> > > > documentation, etc. release can also be helpful. I can volunteer to
> > > > help if somehow these works can be parallelized.
> > > >
> > > > On Tue, Jul 27, 2021 at 3:29 PM QP Hou <houqp....@gmail.com> wrote:
> > > >
> > > > > Following up on this, since delta-rs could really benefit from this
> > > > > release, I have started some initial work with
> > > > > https://github.com/apache/arrow-datafusion/pull/780 to move things
> > > > > forward. Others are welcome to join the party.
> > > > >
> > > > > On Fri, Jul 23, 2021 at 12:58 PM Andrew Lamb <al...@influxdata.com> wrote:
> > > > > >
> > > > > > Does anyone want to make a DataFusion / Ballista official release
> > > > > > (and then subsequent release to crates.io)? There is now a ticket
> > > > > > [1] to track this work. I think it would be great to do if someone
> > > > > > has time. There are all sorts of great features that have gone in
> > > > > > since 4.0.0
> > > > > >
> > > > > > I don't have much time to devote to the release management of
> > > > > > DataFusion / Ballista in the near term (as my project uses
> > > > > > DataFusion master and my release management budget is already
> > > > > > spent on managing arrow-rs releases).
> > > > > >
> > > > > > Andrew
> > > > > >
> > > > > > [1] https://github.com/apache/arrow-datafusion/issues/771
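
For anyone trying to picture how the decoupled versions in QP's proposal play out, here is a rough Python sketch. It is not part of the repo's tooling: the `Release` class and the `tarball_name` and `post_vote_steps` helpers are hypothetical names made up for illustration. Only the tarball naming, the folder names, the tag patterns and the `cargo publish` / `maturin publish` commands come from the proposal itself. It models one voted source release that always carries a datafusion version and optionally a ballista and a python binding version, then lists the post-vote publish steps.

```python
from dataclasses import dataclass
from typing import List, Optional

# Crate folders listed in the proposal, kept in the order given there
# (not necessarily the order crates.io dependencies would require).
BALLISTA_CRATES = ["client", "core", "executor", "scheduler"]


@dataclass
class Release:
    """Hypothetical description of one voted arrow-datafusion source release."""
    datafusion: str                 # always present, e.g. "5.0.0"
    ballista: Optional[str] = None  # None: no ballista release this time
    python: Optional[str] = None    # None: no python binding release this time


def tarball_name(release: Release) -> str:
    """The single signed artifact that is voted on, named after datafusion."""
    return f"apache-arrow-datafusion-{release.datafusion}.tar.gz"


def post_vote_steps(release: Release) -> List[str]:
    """Describe the registry publishes to run from the unpacked tarball."""
    steps = ["run `cargo publish` in datafusion"]
    if release.ballista is not None:
        steps += [f"run `cargo publish` in ballista/rust/{c}" for c in BALLISTA_CRATES]
        steps.append(f"push tag ballista-{release.ballista}")
    if release.python is not None:
        steps.append("run `maturin publish` in python")
        steps.append("release python documentation")
        steps.append(f"push tag python-{release.python}")
    return steps


if __name__ == "__main__":
    # The 5.1.0 example from the proposal: python 0.4.0, no ballista release.
    for step in post_vote_steps(Release("5.1.0", python="0.4.0")):
        print(step)
```

With `Release("5.0.0", ballista="0.5.0", python="0.3.0")` the same function lists all three subprojects, while the vote still covers only the one `apache-arrow-datafusion-5.0.0.tar.gz` tarball.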