Thanks for the lively discussion, great to see that we are aligned on the
broad strokes!
In general, I think this is something we neither need nor want to
implement from 0 to 100 in one go.
Incrementally evolving and evaluating our process is key for such a core
change.

> I think C#, JS, [Java], and Go are the most obvious candidates to
> decouple.
+1 (though I am not sure that Java fits due to the JNI parts?)

> Even then, we should probably only separate these candidates if they have
> willing release managers.
Ideally we would have an active release manager for each component, but I
don't quite see that yet, and onboarding them and changing the technical
setup will take some time. We currently handle verification and packaging
for all implementations already, and given that we will keep 'batched'
releases for the foreseeable future, I think determining the proper version
for each component (which would be on the committers, plus a
conventional-commit-like framework and automation?) should be the first
step, taken in parallel with refactoring the technical release
infrastructure to handle different versions and partial releases.
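
To make the "conventional commit + automation" idea a bit more concrete,
here is a minimal sketch in Python of how a per-component semver bump could
be derived from commit messages. Everything in it is an assumption for
illustration only: the function names (bump_for, next_versions), the idea
that the commit scope names the component, and the component names and
version numbers in the usage note are all hypothetical, not an existing
Arrow tool or convention.

```python
import re

# Bump levels ordered by severity, so the strongest one per component wins.
_ORDER = {"none": 0, "patch": 1, "minor": 2, "major": 3}

# Conventional-commit header: type(scope)!: description
_HEADER = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]+)\))?(?P<bang>!)?: .+"
)


def bump_for(message):
    """Return (scope, level) for one commit message, or None if the
    header does not follow the conventional-commit shape."""
    lines = message.splitlines()
    if not lines:
        return None
    match = _HEADER.match(lines[0])
    if match is None:
        return None
    breaking = match["bang"] or any(
        line.startswith("BREAKING CHANGE:") for line in lines[1:]
    )
    if breaking:
        level = "major"
    elif match["type"] == "feat":
        level = "minor"
    elif match["type"] == "fix":
        level = "patch"
    else:
        level = "none"  # docs, chore, ci, ... do not change the version
    return match["scope"], level


def next_versions(current, messages):
    """Compute the next version of every component in `current`.

    Assumption: the commit scope is the component name. Commits without
    a scope, or with an unknown scope, are ignored in this sketch.
    """
    bumps = {}
    for message in messages:
        parsed = bump_for(message)
        if parsed is None:
            continue
        scope, level = parsed
        if scope not in current:
            continue
        if _ORDER[level] > _ORDER[bumps.get(scope, "none")]:
            bumps[scope] = level

    result = {}
    for component, version in current.items():
        major, minor, patch = (int(part) for part in version.split("."))
        level = bumps.get(component, "none")
        if level == "major":
            major, minor, patch = major + 1, 0, 0
        elif level == "minor":
            minor, patch = minor + 1, 0
        elif level == "patch":
            patch += 1
        result[component] = f"{major}.{minor}.{patch}"
    return result
```

With hypothetical inputs such as {"go": "16.0.0", "js": "16.0.0"} and the
commits "feat(go): ..." and "feat(js)!: ...", this would bump go to a new
minor and js to a new major version. A component with no relevant commits
keeps its version, which is roughly what a partial release would need.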

> We could simply release C++, R, Python and C/GLib together.
While, as mentioned, R is backwards compatible down to 13.0.0 and could be
versioned separately (as its development pace does not keep up with
libarrow/pyarrow at the moment), this could still be a good step in the
right direction.
A proper semver scheme for this group/libarrow would still be a valuable
addition.

The Matlab implementation is currently establishing its release process and
could be considered a binding like pyarrow, as far as I understand MLTBX.

Thanks,
Jacob



On Mon, Apr 8, 2024 at 15:56, Weston Pace <weston.p...@gmail.com> wrote:

> > Probably major versions should match between C++ and PyArrow, but I guess
> > we could have diverging minor and patch versions. Or at least patch
> > versions given that
> > a new minor version is usually cut for bug fixes too.
>
> I believe even this would be difficult.  Stable ABIs are very finicky in
> C++.  If the public API surface changes in any way then it can lead to
> subtle bugs if pyarrow were to link against an older version.  I also am
> not sure there is much advantage in trying to separate pyarrow from
> arrow-cpp since they are almost always changing in lockstep (e.g. any
> change to arrow-cpp enables functionality in pyarrow).
>
> I think we should maybe focus on a few more obvious cases.
>
> I think C#, JS, Java, and Go are the most obvious candidates to decouple.
> Even then, we should probably only separate these candidates if they have
> willing release managers.
>
> C/GLib, python, and ruby are all tightly coupled to C++ at the moment and
> should not be a first priority.  I would have guessed that R is also in
> this list but Jacob reported in the original email that they are already
> somewhat decoupled?
>
> I don't know anything about swift or matlab.
>
> On Mon, Apr 8, 2024 at 6:23 AM Alessandro Molina
> <alessan...@voltrondata.com.invalid> wrote:
>
> > On Sun, Apr 7, 2024 at 3:06 PM Andrew Lamb <al...@influxdata.com> wrote:
> >
> > >
> > > We have had separate releases / votes for Arrow Rust (and Arrow
> > > DataFusion) and it has served us quite well. The version schemes have
> > > diverged substantially from the monorepo (we are on version 51.0.0 in
> > > arrow-rs, for example) and it doesn't seem to have caused any large
> > > confusion with users
> > >
> > >
> > I think that versioning will require additional thinking for libraries
> > like PyArrow, Java etc...
> > For rust this is a non problem because there is no link to the C++
> > library,
> >
> > PyArrow instead is based on what the C++ library provides,
> > so there is a direct link between the features provided by C++ in a
> > specific version
> > and the features provided in PyArrow at a specific version.
> >
> > More or less PyArrow 20 should have the same bug fixes that C++ 20 has,
> > and diverging the two versions would lead to confusion easily.
> > Probably major versions should match between C++ and PyArrow, but I guess
> > we could have diverging minor and patch versions. Or at least patch
> > versions given that
> > a new minor version is usually cut for bug fixes too.
> >
>
