For argument's sake, I might suggest that the process you described in
your initial note would probably work best in another repo: you would
be able to iterate faster and release/version at your own pace. The
flexibility you get from moving to a separate repo comes at the cost
of extra responsibility: you have to set up your own CI, manage your
own issues, and set up your own release verification scripts + release
votes on the mailing list. Because you bind Arrow C++, you would have
to take sufficient steps to ensure that the Arrow C++ developers are
made aware of changes that break the MATLAB bindings and vice versa
(e.g., by testing against dev Arrow C++ in a CI job).

Setting up that infrastructure for apache/arrow-nanoarrow took about a
week of development time, and it now takes about half a day to release
a new version (the first few releases took longer, and a MATLAB release
would be considerably more complex). Probably the biggest
barrier to releasing from another repo is that you have to ensure a
critical mass of PMC members can/will run your release verification
script and vote.

I happen to feel that it's the PMC's/wider community's responsibility
to help language binding contributors adopt a workflow that suits
their needs. If active MATLAB contributors agree that they want to
release version 0.1 from another repo, (I feel that) we're here to
help you do that. If the active contributors want to stay in
apache/arrow, there is less flexibility about what you release and
when; however, the release process is well-defined.

On Tue, Nov 7, 2023 at 8:43 PM Sutou Kouhei <k...@clear-code.com> wrote:
>
> Hi,
>
> > As a point of reference, we noticed that PyArrow is on
> > version 14.0.0, but it feels "misleading" to say that the
> > MATLAB interface is at version 14.0.0 when we haven't yet
> > implemented or stabilized all core Arrow APIs.
>
> I can understand this but I suggest that we use the same
> version as other packages in apache/arrow. Because:
>
> * Using isolated version increases release complexity.
> * Using isolated version may introduce another
>   "misleading"/"confusion": For example, "the MATLAB
>   interface 1.0.0 uses Apache Arrow C++ 20.0.0" may be
>   misleading/confusing:
>   * The MATLAB interface 1.0.0 doesn't use Apache Arrow C++
>     1.0.0.
>   * It may be difficult to find the corresponding
>     Apache Arrow C++ version from the MATLAB interface
>     version.
>
> Can we just mention "This is not stable yet!!!" in the
> documentation instead of using isolated version?
>
> We may want to use the status page for it:
> https://arrow.apache.org/docs/status.html
>
> > 1. Manually build the MATLAB interface on Windows, macOS, and Linux
>
> It's better that we use CI for this like other binary
> packages such as .deb/.rpm/.wheel/.jar/...
>
> If we release the MATLAB interface separately, which Apache
> Arrow C++ version is used? If we release the MATLAB
> interface right now, is Apache Arrow C++ 14.0.0 (the latest
> release) used or is Apache Arrow C++ main (not released yet)
> used? The MATLAB interface on main will depend on Apache
> Arrow C++ main, we may not be able to use the latest release
> for the MATLAB interface on main.
>
> > 2. Combine all of the cross platform build artifacts into
> >    a single MLTBX file [1] for distribution
>
> Does the MLTBX file include Apache Arrow C++ binaries too
> like .wheel/.jar?
>
> > 3. Host the MLTBX somewhere that is easily accessible for download
>
> MATLAB doesn't provide the official package repository such
> as PyPI for Python and https://rubygems.org/ for Ruby, right?
>
> > 1. Is there a recommended location where we can host the MLTBX file? e.g. 
> > GitHub Releases [2], JFrog [3], etc.?
>
> If the official package repository for MATLAB doesn't exist,
> JFrog is better because the MLTBX file will be large (Apache
> Arrow C++ binaries are large).
>
> > 2. Is there a recommended location for hosting release notes?
>
> How about creating https://arrow.apache.org/docs/matlab/ ?
> We can use Sphinx like the Python docs
> https://arrow.apache.org/docs/python/ or another
> documentation tools like the R docs
> https://arrow.apache.org/docs/r/ .
> If we use Sphinx, we can create
> https://github.com/apache/arrow/tree/main/docs/source/matlab/
> .
>
> > 3. Is there a recommended cadence for incremental point releases?
>
> I suggest avoiding separated release as above.
>
> > 4. Are there any notable ASF procedures [4] [5] (e.g. voting on a new 
> > release proposal) that we should be aware of as we consider creating an 
> > initial release?
>
> We don't need additional task for an initial release.
>
> > 5. How should the Arrow project release (i.e. 14.0.0)
> >    relate to the MATLAB interface version (i.e. 0.1)? As a
> >    point of reference, we noticed that PyArrow is on
> >    version 14.0.0, but it feels "misleading" to say that
> >    the MATLAB interface is at version 14.0.0 when we
> >    haven't yet implemented or stabilized all core Arrow
> >    APIs. Is there any precedent for using independent
> >    release versions for language bindings which are not
> >    fully stabilized and are also part of the main
> >    apache/arrow repository?
>
> We don't have any precedent for using independent release
> versions for language bindings. All language bindings used
> the same version.
>
> Apache Arrow JavaScript isn't a language bindings but it
> used separated release and isolated versions before
> 0.4.1. It joined apache/arrow release after 0.4.1. (The next
> version of Apache Arrow JavaScript 0.4.1 is 13.0.0.)
>
> > We've noticed that Arrow-related projects which are not
> > part of the main apache/arrow GitHub repository
> > (e.g. DataFusion) follow a mailing list-based voting and
> > release process. However, it's not clear whether it makes
> > sense to follow this process for the MATLAB interface
> > since it is part of the main apache/arrow repository.
>
> If we want to use separated release for the MATLAB
> interface, we should follow the same release process as
> apache/arrow and other apache/arrow-* because it's the
> standard ASF release process.
>
>
> Thanks,
> --
> kou
>
> In 
> <mn2pr05mb649619998eae9579cceba692ae...@mn2pr05mb6496.namprd05.prod.outlook.com>
>   "[DISCUSS][MATLAB] Proposal for incremental point releases of the MATLAB 
> interface" on Tue, 7 Nov 2023 20:31:31 +0000,
>   Kevin Gurney <kgur...@mathworks.com.INVALID> wrote:
>
> > Hi All,
> >
> > A considerable amount of new functionality has been added to the MATLAB 
> > interface over the last few months. We appreciate all the community's 
> > support in making this possible and are happy to see all the progress that 
> > is being made.
> >
> > At this point, we would like to create an initial "0.1" release of the 
> > MATLAB interface. Incremental point releases will enable MATLAB users to 
> > provide early feedback. In addition, learning how to navigate the release 
> > process is an important step towards eventually releasing a stable 1.0 
> > version of the MATLAB interface.
> >
> > Our proposed approach to creating an initial release would be to:
> >
> > 1. Manually build the MATLAB interface on Windows, macOS, and Linux
> > 2. Combine all of the cross platform build artifacts into a single MLTBX 
> > file [1] for distribution
> > 3. Host the MLTBX somewhere that is easily accessible for download
> >
> > For reference - MLTBX is a standard packaging format for MATLAB which 
> > enables simple "one-click" installation - analogous to a Python pip package 
> > or a Ruby gem.
> >
> > Creating an MLTBX file manually should be relatively low effort. However, 
> > in the long term, we would love to enable semi-automated "push button" 
> > releases via GitHub Actions (and possibly even "nightly builds").
> >
> > Since this is our first time creating a release of the MATLAB interface, we 
> > wanted to draw on the community's expertise to answer a few questions:
> >
> > 1. Is there a recommended location where we can host the MLTBX file? e.g. 
> > GitHub Releases [2], JFrog [3], etc.?
> > 2. Is there a recommended location for hosting release notes?
> > 3. Is there a recommended cadence for incremental point releases?
> > 4. Are there any notable ASF procedures [4] [5] (e.g. voting on a new 
> > release proposal) that we should be aware of as we consider creating an 
> > initial release?
> > 5. How should the Arrow project release (i.e. 14.0.0) relate to the MATLAB 
> > interface version (i.e. 0.1)? As a point of reference, we noticed that 
> > PyArrow is on version 14.0.0, but it feels "misleading" to say that the 
> > MATLAB interface is at version 14.0.0 when we haven't yet implemented or 
> > stabilized all core Arrow APIs. Is there any precedent for using 
> > independent release versions for language bindings which are not fully 
> > stabilized and are also part of the main apache/arrow repository?
> >
> > We've noticed that Arrow-related projects which are not part of the main 
> > apache/arrow GitHub repository (e.g. DataFusion) follow a mailing 
> > list-based voting and release process. However, it's not clear whether it 
> > makes sense to follow this process for the MATLAB interface since it is 
> > part of the main apache/arrow repository.
> >
> > We sincerely appreciate the community's help and guidance on this topic!
> >
> > Please let us know if you have any questions.
> >
> > [1] 
> > https://www.mathworks.com/help/matlab/creating-help.html?s_tid=CRUX_lftnav
> > [2] https://github.com/apache/arrow/releases
> > [3] https://apache.jfrog.io/ui/native/arrow/
> > [4] https://www.apache.org/foundation/voting.html
> > [5] https://www.apache.org/legal/release-policy.html#release-approval
> >
> > Best Regards,
> >
> > Kevin Gurney
