Thanks QP. This seems reasonable to me.

On Sun, Aug 1, 2021, 3:24 PM QP Hou <houqp....@gmail.com> wrote:

> Summarizing the proposal discussed in our GitHub issue [1] for broader
> discussion and review on the dev list.
>
> The current arrow-datafusion repo contains the following high level
> subprojects: datafusion, datafusion python binding and ballista.
>
> In order to be able to release ballista and datafusion python binding
> with semantic versioning, I propose we decouple subproject versions
> from each other. As a result, we will be able to release a breaking
> change in datafusion without forcing a major version bump in ballista
> or python binding if that breaking change is not visible to their
> consumers.
>
> To reduce release overhead, we will still vote on the whole
> arrow-datafusion repo on every release. From the same release tarball,
> we can then release these sub-projects to their language specific
> registries (crates.io and pypi) with their own versions.
>
> Take the upcoming datafusion 5.0.0 release as an example. Within the
> same source release, we also have the code for ballista-0.5.0 and
> datafusion-python-0.3.0. We only need to vote on a signed
> apache-arrow-datafusion-5.0.0.tar.gz tarball.
>
> A consequence of this process is that every time we need to release a
> new version of the python binding or ballista, we need to trigger a new
> datafusion release as well. However, a datafusion release won't require
> a new release from the other two subprojects. For example, a datafusion
> 5.1.0 release can include a datafusion python 0.4.0 release without a
> ballista release. In that case, we will just skip the crates.io
> publish for ballista.
>
> Here is what the release process will look like:
>
> * Send a PR with the following changes to prepare the source tree for
>   a new release:
>     - Update versions in Cargo.toml files
>     - Run the automation script to generate
>       {datafusion,python,ballista}/CHANGELOG.md
> * After the PR is merged, push git tag x.y.z to GitHub
> * Run dev/release/create-tarball.sh to create and upload a signed
>   tarball for voting on the dev list
> * After the vote passes, run ./dev/release/release-tarball.sh to move
>   the approved tarball to the release location in SVN
> * Unpack the released tarball and release subprojects to their
>   language-specific registries:
>     - run `cargo publish` in datafusion to release datafusion to crates.io
>     - if there is a new ballista release
>         - run `cargo publish` in the
>           ballista/rust/{client,core,executor,scheduler} folders to
>           release ballista to crates.io
>         - push the `ballista-x.y.z` tag to GitHub
>     - if there is a new datafusion python release
>         - run `maturin publish` in the python folder to release the
>           datafusion python binding to PyPI
>         - release python documentation
>         - push the `python-x.y.z` tag to GitHub
>
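
The conditional publish steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Release` class, the `publish_steps` function and the step strings are hypothetical and not part of the arrow-datafusion repo, and the ballista crates are listed in the order the proposal names them, which may not be the order their dependencies require.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical helper for illustration only; nothing here exists in the repo.
# Crate folders as named in the proposal (ballista/rust/{...}).
BALLISTA_CRATES = ["client", "core", "executor", "scheduler"]


@dataclass
class Release:
    datafusion: str                 # every source release carries a datafusion version
    ballista: Optional[str] = None  # None: no ballista release in this round
    python: Optional[str] = None    # None: no python binding release in this round


def publish_steps(release: Release) -> List[str]:
    """Return the registry publish steps for one approved source release."""
    # datafusion is always published from the approved tarball.
    steps = ["run `cargo publish` in datafusion"]

    if release.ballista is not None:
        for crate in BALLISTA_CRATES:
            steps.append(f"run `cargo publish` in ballista/rust/{crate}")
        steps.append(f"push tag ballista-{release.ballista} to GitHub")

    if release.python is not None:
        steps.append("run `maturin publish` in python")
        steps.append("release python documentation")
        steps.append(f"push tag python-{release.python} to GitHub")

    return steps
```

With the 5.1.0 example from the proposal (a python 0.4.0 release and no ballista release), `publish_steps(Release("5.1.0", python="0.4.0"))` yields the datafusion step followed by the three python steps, and nothing for ballista.
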
> I would like to get some feedback on this proposal since it is a
> little bit different from other Arrow projects. But I do think this
> will provide a better dependency pinning experience and changelog
> tracking for those sub-projects' downstream consumers.
>
> [1]: https://github.com/apache/arrow-datafusion/issues/771
>
>
> On Tue, Jul 27, 2021 at 4:18 PM Andrew Lamb <al...@influxdata.com> wrote:
> >
> > Thanks to you both -- this sounds great.
> >
> > On Tue, Jul 27, 2021 at 8:37 AM Jiayu Liu <jimex...@gmail.com> wrote:
> >
> > > Not sure it necessarily needs to be bundled together, but I believe a
> > > Python, documentation, etc. release would also be helpful. I can
> > > volunteer to help if this work can be parallelized.
> > >
> > > On Tue, Jul 27, 2021 at 3:29 PM QP Hou <houqp....@gmail.com> wrote:
> > >
> > > > Following up on this, since delta-rs could really benefit from this
> > > > release, I have started some initial work with
> > > > https://github.com/apache/arrow-datafusion/pull/780 to move things
> > > > forward. Others are welcome to join the party.
> > > >
> > > > On Fri, Jul 23, 2021 at 12:58 PM Andrew Lamb <al...@influxdata.com> wrote:
> > > > >
> > > > > Does anyone want to make a DataFusion / Ballista official release
> > > > > (and then subsequent release to crates.io)?  There is now a ticket
> > > > > [1] to track this work. I think it would be great to do if someone
> > > > > has time. There are all sorts of great features that have gone in
> > > > > since 4.0.0
> > > > >
> > > > > I don't have much time to devote to the release management of
> > > > > DataFusion / Ballista in the near term (as my project uses
> > > > > DataFusion master and my release management budget is already
> > > > > spent on managing arrow-rs releases).
> > > > >
> > > > > Andrew
> > > > >
> > > > > [1] https://github.com/apache/arrow-datafusion/issues/771
> > > >
> > >
>
