Hi,

That makes a lot of sense. I am sorry that I did not understand that from
the versioning document and the discussion on this thread.

Best,
Jorge



On Tue, Jul 28, 2020 at 8:30 PM Wes McKinney <wesmck...@gmail.com> wrote:

> On Tue, Jul 28, 2020 at 8:49 AM Jorge Cardoso Leitão
> <jorgecarlei...@gmail.com> wrote:
> >
> > Thanks for the summary,
> >
> > So, suppose someone discloses a 0-day vulnerability in a dependency of
> > arrow/js, and its maintainers ship the backward-compatible fix as a
> > major release (1.9.3 to 2.0.0). Since npm uses semver, we must bump it
> > in our package.json (i.e. ^1.9.3 to ^2.0.0). This requires a major
> > release of the Arrow libraries, so we are all now under pressure during
> > a 0-day incident to release a new version of arrow.
> > We finally release it, and arrow-js is backward incompatible. Now
> > everyone depending on arrow-js will also have to bump arrow in their own
> > `package.json` (e.g. ^1.0.0 -> ^2.0.0). Since our release is backward
> > incompatible, they will have to perform a code migration, and so now
> > they are the ones under pressure.
>
> The project can make patch releases out of maintenance branches,
> including patch releases that only affect a single component. If this
> occurs, I am certain that we can fashion a solution that does not
> negatively impact the other groups of developers.
>
> The main point is that it isn't currently practical from a process
> standpoint to be cutting many different major/minor releases from
> different parts of the project. Worth emphasizing that this is not an
> artifact of the monorepo setup by any means and purely a function of
> maintainer / release manager bandwidth, of which there is not a lot at
> the moment. In fact, back when the JS project got going there was a
> desire to make more frequent NPM releases, but then the project fell
> behind and it was easier for JS to be a part of the monorelease. We
> can't have an individual committer releasing to NPM arbitrarily from
> the command line by running `git tag ... ; npm publish` -- we have to
> have votes on the mailing list so that the community has an
> opportunity to inspect the release candidate and ensure that it meets
> the community's standards.
>
> > Fortunately for the community, only arrow releases security patches on
> > top of backward incompatible changes. However, the moment other projects
> > start doing this, this process grows exponentially throughout the
> > dependency tree. Also note that this is not an issue specific to JS; it
> > happens in any programming language that arrow maintains whose package
> > manager uses semver for dependency resolution (npm, pip, cargo, etc).
> > More dramatically, we are connecting the dependency tree of cargo with
> > the dependency trees of pip and npm by aligning all our libraries under
> > the same version.
> >
> > If we had not released backward incompatible code along with our security
> > fix, our dependents would only have needed to run `npm audit fix` to
> > update their package-lock.json (or requirements.txt, or whatever).
> >
> > From all of this, I conclude that our versioning strategy implies that:
> >
> > 1. we do not have stable library releases: every release is potentially
> > backward incompatible, including security patches.
> >
> > 2. we get and cause significant pressure in the release process of a
> > 0-day vulnerability security patch, either affecting arrow directly or
> > through some of its dependencies in _any_ of its language-specific
> > libraries.
>
> Per above I do not agree. If there is a security fix that necessitates
> a patch release then we can make a $MAJOR.0.$PATCH release that
> incorporates the fix, and only publish the relevant artifacts (e.g.
> for NPM) that are needed for that patch release.
>
> > Anyway, there is a consensus, so you have likely thought this through
> > more than I have and weighed it in the decision. Thus, thank you for the
> > clarification and great work on this awesome project.
> >
> > Best,
> > Jorge
> >
> >
> >
> > On Mon, Jul 27, 2020 at 11:35 PM Wes McKinney <wesmck...@gmail.com> wrote:
> >
> > > Yes, the TL;DR is that we do not at this time intend to make minor
> > > LIBRARY releases in SemVer parlance, even if there are no backwards
> > > incompatible changes. Either we will make Major releases or Patch
> > > releases of the libraries. We will likely make minor releases of the
> > > columnar protocol, though.
> > >
> > > The other questions are handled in the Versioning document: we are now
> > > observing a dual-versioning scheme with the FORMAT version being
> > > separate from the LIBRARY version. Each version of the libraries will
> > > have a corresponding FORMAT version, and the format version will change
> > > more slowly than the libraries. So LIBRARY version 2.0.0 may use FORMAT
> > > version 1.0.0, unless new features are added, in which case the format
> > > version may be 1.1.0.
> > >
> > > On Mon, Jul 27, 2020 at 11:54 AM Neal Richardson
> > > <neal.p.richard...@gmail.com> wrote:
> > > >
> > > > https://arrow.apache.org/docs/format/Versioning.html is the statement
> > > > that came from the resolution of the previous discussion. IIRC the
> > > > discussion came between the 0.15 and 0.16 releases, if you want to
> > > > search the mailing list archives.
> > > >
> > > > I wouldn't want to speak for everyone, but I believe there are a few
> > > > things at play:
> > > >
> > > > * Release logistics: I believe the community has decided that it wants
> > > > to continue releasing all components at the same time, in which case
> > > > having a single release number greatly simplifies things.
> > > > * Compatibility of libraries: it's a lot easier to know that two
> > > > libraries in different languages are compatible because they have the
> > > > same number.
> > > > * Version numbers are cheap, and (IMO) there's little useful
> > > > information in version numbers other than "higher means newer" (unless
> > > > you're in Python and have parallel major releases for years ;)
> > > >
> > > > While I might also question whether the next release for the library
> > > > I'm working on "should" have a major or minor version bump, I'm
> > > > skeptical that having that autonomy is worth the maintenance cost.
> > > >
> > > > Neal
> > > >
> > > >
> > > > On Mon, Jul 27, 2020 at 9:37 AM Jorge Cardoso Leitão
> > > > <jorgecarlei...@gmail.com> wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > First off, congrats on the 1.0.0 release!
> > > > >
> > > > > I am writing because I am trying to understand the versioning scheme
> > > > > we will use going forward.
> > > > >
> > > > > As far as I understand, 1.0.0 was assigned to all subcomponents of
> > > > > arrow, i.e. I can now use pyarrow and assign something like >=1,<2 in
> > > > > a setup.py.
> > > > >
> > > > > However, looking at other parts of the project, I get the feeling
> > > > > that these components are less mature / more recent, and likely need
> > > > > more backward incompatible changes until a stable API is achieved. In
> > > > > other words, within arrow, I get the feeling that different parts are
> > > > > at significantly different stages of their development lifetime.
> > > > >
> > > > > How are we planning to reconcile this fact? E.g. I can see pyarrow
> > > > > not wanting to bump from 1 to 2 since no backward incompatible change
> > > > > was introduced, while other components have backward incompatible
> > > > > changes.
> > > > >
> > > > > A related question: what exactly are we versioning with this 1.0.0?
> > > > > The protocol? The individual APIs? The project as a whole?
> > > > >
> > > > > In my view, there is a case here to _not_ align the versions of the
> > > > > different components, exactly to avoid having one component's version
> > > > > (e.g. pyarrow) be dependent on another's code (e.g. rust arrow).
> > > > > However, I suspect that this discussion has already taken place and I
> > > > > have been unable to find a summary of it.
> > > > >
> > > > > Best,
> > > > > Jorge
> > > > >
> > >
>
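
For reference, the caret-range behaviour this thread revolves around can be
sketched in a few lines of Python. This is a simplified illustration under
stated assumptions, not npm's actual implementation: the helper names `parse`
and `caret_allows` are hypothetical, and pre-release tags and build metadata
are ignored.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse a plain MAJOR.MINOR.PATCH string (no pre-release or build tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def caret_allows(base: str, candidate: str) -> bool:
    """Return True if `candidate` satisfies the caret range `^base`.

    A caret range accepts any version that is at least `base` and that does
    not change the left-most non-zero component of `base`.
    """
    lo = parse(base)
    cand = parse(candidate)
    if lo[0] > 0:
        hi = (lo[0] + 1, 0, 0)      # ^1.9.3 means >=1.9.3 and <2.0.0
    elif lo[1] > 0:
        hi = (0, lo[1] + 1, 0)      # ^0.2.3 means >=0.2.3 and <0.3.0
    else:
        hi = (0, 0, lo[2] + 1)      # ^0.0.3 means >=0.0.3 and <0.0.4
    return lo <= cand < hi


if __name__ == "__main__":
    # A fix shipped as a new major version is not picked up automatically.
    print(caret_allows("1.9.3", "2.0.0"))  # False
    # A $MAJOR.0.$PATCH release stays inside an existing ^1.0.0 range.
    print(caret_allows("1.0.0", "1.0.1"))  # True
    # A new major library release requires dependents to edit their range.
    print(caret_allows("1.0.0", "2.0.0"))  # False
```

Under these assumptions, a patch release such as 1.0.1 reaches users who
declared ^1.0.0 without any change on their side, whereas 2.0.0 does not.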
