Hi,

That makes a lot of sense. I am sorry that I did not understand that from
the versioning document and the discussion on this thread.

Best,
Jorge

On Tue, Jul 28, 2020 at 8:30 PM Wes McKinney <wesmck...@gmail.com> wrote:

> On Tue, Jul 28, 2020 at 8:49 AM Jorge Cardoso Leitão
> <jorgecarlei...@gmail.com> wrote:
> >
> > Thanks for the summary,
> >
> > So, someone discloses a 0 day vulnerability on a dependency from arrow/js,
> > and the maintainers release a new backward-compatible fix, but they do a
> > major release instead (1.9.3 to 2.0.0). Since npm uses semver, we must bump
> > it on our package.json (i.e. ^1.9.3 to ^2.0.0). This requires a major
> > release of Arrow libraries. So, we are all now under pressure during a 0
> > day incident to release a new version of arrow.
> > We finally release it, and arrow-js is backward incompatible. Now everyone
> > depending on arrow-js will also have to bump arrow on their own
> > `package.json` (e.g. ^1.0.0 -> ^2.0.0). Since our release is backward
> > incompatible, they will have to perform a code migration, and so now they
> > are the ones under pressure.
>
> The project can make patch releases out of maintenance branches,
> including patch releases that only affect a single component. If this
> occurs, I am certain that we can fashion a solution that does not
> negatively impact the other groups of developers.
>
> The main point is that it isn't currently practical from a process
> standpoint to be cutting many different major/minor releases from
> different parts of the project. Worth emphasizing that this is not an
> artifact of the monorepo setup by any means and purely a function of
> maintainer / release manager bandwidth, of which there is not a lot at
> the moment. In fact, back when the JS project got going there was a
> desire to make more frequent NPM releases, but then the project fell
> behind and it was easier for JS to be a part of the monorelease. We
> can't have an individual committer releasing to NPM arbitrarily from
> the command line by running `git tag ... ; npm publish` -- we have to
> have votes on the mailing list so that the community has an
> opportunity to inspect the release candidate and ensure that it meets
> the community's standards.
>
> > Fortunately for the community, only arrow releases security patches on top
> > of backward incompatible changes. However, the moment other projects start
> > doing this, this process grows exponentially throughout the dependency
> > tree. Also note that this is not an issue of js; it happens on any
> > programming language that arrow maintains whose package manager uses semver
> > for dependency resolution (npm, pip, cargo, etc), more dramatically, we are
> > connecting the dependency tree of cargo with the dependency tree from pip
> > and npm by aligning all our libraries under the same version.
> >
> > If we had not released backward incompatible code along with our security
> > fix, our dependencies only needed to run `npm audit fix` to update their
> > package.lock (or requirements.txt, or whatever).
> >
> > From all of this, I conclude that our versioning strategy implies that:
> >
> > 1. we do not have stable library releases: every release is potentially
> > backward incompatible, including security patches.
> >
> > 2. we get and cause significant pressure in the release process of a 0 day
> > vulnerability security patch, either affecting arrow directly or through
> > some of its dependencies on _any_ of its language-specific libraries.
>
> Per above I do not agree. If there is a security fix that necessitates
> a patch release then we can make a $MAJOR.0.$PATCH release that
> incorporates the fix, and only publish the relevant artifacts (e.g.
> for NPM) that are needed for that patch release.
>
> > Anyway, there is a consensus, so you likely thought this through more than
> > I and weighed it in the decision. Thus, thank you for the clarification and
> > great work on this awesome project.
> >
> > Best,
> > Jorge
> >
> > On Mon, Jul 27, 2020 at 11:35 PM Wes McKinney <wesmck...@gmail.com> wrote:
> >
> > > Yes, the TL;DR is that we do not at this time intend to make minor
> > > LIBRARY releases in SemVer parlance, even if there are no backwards
> > > incompatible changes. Either we will make Major releases or Patch
> > > releases of the libraries. We will likely make minor releases of the
> > > columnar protocol, though.
> > >
> > > The other questions are handled in the Versioning document, we are now
> > > observing a dual-versioning scheme with FORMAT version being separate
> > > from LIBRARY version. Each version of the libraries will have a
> > > corresponding FORMAT version, and the format version will change more
> > > slowly than the libraries. So LIBRARY version 2.0.0 may use FORMAT
> > > version 1.0.0 unless new features are added in which case the format
> > > version may be 1.1.0
> > >
> > > On Mon, Jul 27, 2020 at 11:54 AM Neal Richardson
> > > <neal.p.richard...@gmail.com> wrote:
> > > >
> > > > https://arrow.apache.org/docs/format/Versioning.html is the statement that
> > > > came from the resolution of the previous discussion. IIRC the discussion
> > > > came between the 0.15 and 0.16 releases, if you want to search the mailing
> > > > list archives.
> > > >
> > > > I wouldn't want to speak for everyone, but I believe there are a few things
> > > > at play:
> > > >
> > > > * Release logistics: I believe the community has decided that it wants to
> > > > continue releasing all components at the same time, in which case having a
> > > > single release number greatly simplifies things.
> > > > * Compatibility of libraries: it's a lot easier to know that two libraries
> > > > in different languages are compatible because they have the same number.
> > > > * Version numbers are cheap, and (IMO) there's little useful information in
> > > > version numbers other than "higher means newer" (unless you're in Python
> > > > and have parallel major releases for years ;)
> > > >
> > > > While I might also question whether the next release for the library I'm
> > > > working on "should" have a major or minor version bump, I'm skeptical that
> > > > having that autonomy is worth the maintenance cost.
> > > >
> > > > Neal
> > > >
> > > >
> > > > On Mon, Jul 27, 2020 at 9:37 AM Jorge Cardoso Leitão <
> > > > jorgecarlei...@gmail.com> wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > First off, congrats for the 1.0.0 release!
> > > > >
> > > > > I am writing because I am trying to understand the versioning schema we
> > > > > will use going onwards.
> > > > >
> > > > > AFAI understand, 1.0.0 was assigned to all subcomponents of arrow. I.e. I
> > > > > can now use pyarrow and assign something like >=1,<2 on a setup.py.
> > > > >
> > > > > However, looking at other parts of the project, I get the feeling that
> > > > > these components are less mature / more recent, and likely need more
> > > > > backward incompatible changes until a stable API is achieved. In other
> > > > > words, within arrow, I get the feeling that different parts are at
> > > > > significantly different stages of their development lifetime.
> > > > >
> > > > > How are we planning to reconcile this fact? E.g. I can see pyarrow not
> > > > > wanting to bump from 1 to 2 since no backward incompatible change was
> > > > > introduced, while other components have backward incompatible changes.
> > > > >
> > > > > A related question: what exactly are we versioning with this 1.0.0? The
> > > > > protocol? The individual APIs? The project as a whole?
> > > > >
> > > > > In my view, there is a case here to _not_ align the versions of the
> > > > > different components, exactly to avoid having one component's version (e.g.
> > > > > pyarrow) be dependent on other's code (e.g. rust arrow). However, I suspect
> > > > > that this discussion has already taken place and I have been unable to find
> > > > > a summary of it.
> > > > >
> > > > > Best,
> > > > > Jorge
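
For readers following the thread, here is a minimal sketch of the caret-range rule that the `^1.9.3` to `^2.0.0` example relies on. It is an illustration only, not npm's resolver: `parse_version` and `satisfies_caret` are hypothetical helper names, the code assumes plain MAJOR.MINOR.PATCH strings, and it deliberately rejects 0.x bases, which npm treats differently.

```python
def parse_version(text):
    """Split a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def satisfies_caret(version, base):
    """Return True if `version` lies in the caret range ^base.

    Assumption: for MAJOR >= 1, ^base means ">= base and < (MAJOR + 1).0.0".
    Bases with MAJOR == 0 follow different rules and are rejected here.
    """
    v = parse_version(version)
    b = parse_version(base)
    if b[0] == 0:
        raise ValueError("this sketch only handles caret ranges with MAJOR >= 1")
    # Same major version, and not older than the base.
    return v[0] == b[0] and v >= b


# A dependency pinned as ^1.9.3 accepts later 1.x releases but not 2.0.0,
# which is why a fix shipped only in 2.0.0 forces an edit to package.json.
for candidate in ("1.9.3", "1.10.0", "2.0.0"):
    print(candidate, satisfies_caret(candidate, "1.9.3"))

# A $MAJOR.0.$PATCH release such as 1.0.1 stays inside ^1.0.0, so downstream
# users would pick it up without changing their declared range.
print("1.0.1", satisfies_caret("1.0.1", "1.0.0"))
```

Under these assumptions the sketch shows both sides of the exchange: a fix published only as a new major version falls outside existing caret ranges, while a patch release cut from a maintenance branch stays inside them.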