Thanks for the eager discussion, great to see that we are aligned on the broad strokes! In general, I think this is not something we need or want to implement from 0 to 100 in one go. Incrementally evolving and evaluating our process is key for such a core change.

> I think C#, JS, [Java], and Go are the most obvious candidates to decouple.

+1 (though I am not sure that Java fits, due to the JNI parts?)

> Even then, we should probably only separate these candidates if they have willing release managers.

Ideally we would have an active release manager for each component, but I don't quite see that yet, and onboarding/changing the technical situation will take some time. We currently handle verification and packaging for all implementations already, and given that we will keep to 'batched' releases for the foreseeable future, I think determining the proper version for each component (which would be on the committers + a conventional-commit-like framework + automation?) should be the first step, taken in parallel with refactoring the technical release infrastructure to handle different versions and partial releases.

> We could simply release C++, R, Python and C/GLib together.

While, as mentioned, R is backwards compatible down to 13.0.0 and could be versioned separately (as its development pace does not keep up with libarrow/pyarrow at the moment), this could still be a good step in the right direction. A proper semver scheme for this group/libarrow would still be a valuable addition.

The Matlab implementation is currently establishing its release process and could be considered a binding like pyarrow, as far as I understand MLTBX.

Thanks,
Jacob

On Mon, Apr 8, 2024 at 15:56, Weston Pace <[email protected]> wrote:

> > Probably major versions should match between C++ and PyArrow, but I guess
> > we could have diverging minor and patch versions. Or at least patch
> > versions given that a new minor version is usually cut for bug fixes too.
>
> I believe even this would be difficult. Stable ABIs are very finicky in
> C++. If the public API surface changes in any way then it can lead to
> subtle bugs if pyarrow were to link against an older version.
> I also am not sure there is much advantage in trying to separate pyarrow
> from arrow-cpp since they are almost always changing in lockstep (e.g. any
> change to arrow-cpp enables functionality in pyarrow).
>
> I think we should maybe focus on a few more obvious cases.
>
> I think C#, JS, Java, and Go are the most obvious candidates to decouple.
> Even then, we should probably only separate these candidates if they have
> willing release managers.
>
> C/GLib, python, and ruby are all tightly coupled to C++ at the moment and
> should not be a first priority. I would have guessed that R is also in
> this list but Jacob reported in the original email that they are already
> somewhat decoupled?
>
> I don't know anything about swift or matlab.
>
> On Mon, Apr 8, 2024 at 6:23 AM Alessandro Molina
> <[email protected]> wrote:
>
> > On Sun, Apr 7, 2024 at 3:06 PM Andrew Lamb <[email protected]> wrote:
> >
> > > We have had separate releases / votes for Arrow Rust (and Arrow
> > > DataFusion) and it has served us quite well. The version schemes have
> > > diverged substantially from the monorepo (we are on version 51.0.0 in
> > > arrow-rs, for example) and it doesn't seem to have caused any large
> > > confusion with users
> >
> > I think that versioning will require additional thinking for libraries
> > like PyArrow, Java etc...
> > For rust this is a non problem because there is no link to the C++
> > library,
> >
> > PyArrow instead is based on what the C++ library provides,
> > so there is a direct link between the features provided by C++ in a
> > specific version
> > and the features provided in PyArrow at a specific version.
> >
> > More or less PyArrow 20 should have the same bug fixes that C++ 20 has,
> > and diverging the two versions would lead to confusion easily.
> > Probably major versions should match between C++ and PyArrow, but I guess
> > we could have diverging minor and patch versions. Or at least patch
> > versions given that a new minor version is usually cut for bug fixes too.
