Hello everyone!

I would like to resurface the discussion of separate versioning/releases/voting for monorepo components. We have previously touched on this topic mostly in the community meetings and spread across multiple, only tangentially related threads. I think a focused discussion can be a bit more results-oriented, especially now that we almost regularly deviate from the quarterly release cadence with minor releases. My hope is that discussing this and adapting our process can lower the amount of work required and ease the pressure on our release managers (thank you Raúl and Kou!).
I think the base of the topic is separate versioning for components, as otherwise separate releases have only limited value. From a technical perspective, standalone implementations like Go or JS are the easiest to handle in that regard: they can just follow their ecosystem standards, which users have already requested (major releases in Go require manual editing across a code base, as dependencies are usually pinned to a major version).

For Arrow C++ bindings like Arrow R and PyArrow, having distinct versions would require additional work both to enable the use of different versions and to ensure that version compatibility is monitored and updated if needed. For Arrow R we have already implemented these changes for different reasons and have backwards compatibility with libarrow >= 13.0.0. From the standpoint of a PyArrow user this is likely irrelevant, as most users get binary wheels from PyPI. A user who regularly builds PyArrow from source is also capable of managing potentially different libarrow version requirements, as this is already necessary to build the package, just with an exact version match.

A more meta question is about the messaging that different versioning schemes carry, as it might no longer be obvious at first glance which versions are compatible or have the newest features. Though I would argue that this is a marginal concern at best, as there is no guarantee of feature parity between different components with the same version. Breaking that implicit expectation with separate versions could be seen as clearer. If a component only receives dependency bumps or minor bug fixes, releasing this component with a patch version aligns much better with expectations than a major version bump. In addition, there are already several differently versioned libraries in the apache/arrow-* ecosystem that are released outside of the monorepo release process.
A proper support policy for each component would also be required but could just default to 'current major release' as it is now.

From an ASF perspective there is no requirement to release the entire repository at once, as the actual release artifact is the source tarball. As long as that is verified and voted on by the PMC, it is an official release.

This brings me to the release process and voting. I think it is pretty clear that completely decoupling all components and their release processes isn't feasible at the moment, mainly from a technical perspective (crossbow), and would likely also lead to vote fatigue. We have made efforts to ease the verification required for the vote and will continue these efforts. Though I can see some of the components managing their own releases (e.g. R, as we already do with post-release tasks due to CRAN), a continued quarterly 'batch release' seems like a more appealing solution and would still allow us to use separate versions. Voting in one thread on all components, or on a subset of components per voter, and the surrounding technicalities are something I would like to hear some opinions on.

In my opinion, being stricter with release requirements for components might lead to smaller/less active components not releasing. This seems like a bad thing at first glance, but it might also spur the user community to get involved when the reassuring, regular releases dry up, and it would reflect the reality of the development situation of the component.

I am eager to hear your thoughts!

Best
Jacob