This is an automated email from the ASF dual-hosted git repository.
kevingurney pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new ce11e561d3 GH-38659: [CI][MATLAB][Packaging] Add MATLAB `packaging`
task to crossbow `tasks.yml` (#38660)
ce11e561d3 is described below
commit ce11e561d37db3cdbc8c55e000ca46256f504dc1
Author: Kevin Gurney <[email protected]>
AuthorDate: Fri Mar 29 16:57:39 2024 -0400
GH-38659: [CI][MATLAB][Packaging] Add MATLAB `packaging` task to crossbow
`tasks.yml` (#38660)
### Rationale for this change
Per the following mailing list discussion:
https://lists.apache.org/thread/0xyow40h7b1bptsppb0rxd4g9r1xpmh6
to integrate the MATLAB interface code with the existing Arrow release
tooling, we first need to add a task to the [`packaging`
group](https://github.com/apache/arrow/blob/1fd11d33cb56fd7eff4dce05edaba1c9d8a1dccd/dev/tasks/tasks.yml#L55)
to crossbow. This packaging task will automatically create an [MLTBX
file](https://www.mathworks.com/help/matlab/creating-help.html?s_tid=CRUX_lftnav)
(the MATLAB equivalent to a Python binary wheel or Ruby gem) that can be
installed via a "one-click" [...]
### Licensing
For more information about licensing of the MLTBX file contents, please
refer to the mailing list discussion and ASF Legal ticket linked below:
1. https://lists.apache.org/thread/zlpnncgvo6l4cvkxfxn7zt4q7qhptotw
2. https://issues.apache.org/jira/browse/LEGAL-665
### What changes are included in this PR?
1. Added a `matlab` task to the [`packaging`
group](https://github.com/apache/arrow/blob/1fd11d33cb56fd7eff4dce05edaba1c9d8a1dccd/dev/tasks/tasks.yml#L55)
in `dev/tasks/tasks.yml`.
4. Added a new GitHub Actions workflow called
`dev/tasks/matlab/github.yml` which builds the MATLAB interface code on all
platforms (Windows, macOS, and Ubuntu 20.04) and packages the generated build
artifacts into a single MLTBX file using
[`matlab.addons.toolbox.packageToolbox`](https://www.mathworks.com/help/matlab/ref/matlab.addons.toolbox.packagetoolbox.html).
5. Changed the GitHub-hosted runner to `ubuntu-20.04` from `ubuntu-latest`
for the MATLAB CI check (i.e. `.github/workflows/matlab.yml`). The rationale
for this change is that we primarily develop and qualify against Debian 11
locally, but the CI check has been building against `ubuntu-latest` (i.e.
`ubuntu-22.04`). There are two issues with using `ubuntu-22.04`. The first is
that the version of `GLIBC` shipped with `ubuntu-22.04` is not fully compatible
with the version of `GLIBC` sh [...]
### Are these changes tested?
Yes.
1. Successfully submitted a crossbow `packaging` job for the MATLAB
interface by commenting `@ github-actions crossbow submit matlab`. Example of a
successful packaging job:
https://github.com/ursacomputing/crossbow/actions/runs/6893506432/job/18753227453.
2. Manually installed the resulting MLTBX file on macOS, Windows, Debian
11, and Ubuntu 20.04. Ran all tests under `matlab/test` using `runtests .
IncludeSubFolders 1`.
### Are there any user-facing changes?
No.
### Notes
1. While qualifying, we discovered that [MATLAB's programmatic packaging
interface](https://www.mathworks.com/help/matlab/ref/matlab.addons.toolbox.packagetoolbox.html)
does not properly include symbolic link files in the packaged MLTBX file.
We've reported this bug to the relevant MathWorks development team. As a
temporary workaround, we included a step to change the expected name of the
Arrow C++ libraries (using `patchelf`/`install_name_tool`) which
`libarrowproxy.so`/`libarrowprox [...]
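The Linux form of this workaround can be sketched roughly as follows. This is an illustrative Python rendering of the shell step in `dev/tasks/matlab/github.yml`, not the code the workflow runs. The helper names `find_arrow_lib_names` and `patchelf_command` are hypothetical, and the sketch assumes the install folder holds exactly one versioned symlink (e.g. `libarrow.so.1500`) and one regular file (e.g. `libarrow.so.1500.0.0`). It only builds the `patchelf --replace-needed` command line and does not execute it.

```python
import tempfile
from pathlib import Path


def find_arrow_lib_names(lib_dir):
    """Return (symlink_name, regular_name) for the versioned libarrow files.

    Assumes exactly one symlink such as libarrow.so.1500 and exactly one
    regular file such as libarrow.so.1500.0.0 exist in lib_dir.
    """
    candidates = sorted(Path(lib_dir).glob("libarrow.so.*"))
    symlinks = [p.name for p in candidates if p.is_symlink()]
    regulars = [p.name for p in candidates if p.is_file() and not p.is_symlink()]
    if len(symlinks) != 1 or len(regulars) != 1:
        raise RuntimeError(
            "unexpected library layout: symlinks=%r regulars=%r" % (symlinks, regulars)
        )
    return symlinks[0], regulars[0]


def patchelf_command(lib_dir, dependent="libarrowproxy.so"):
    """Build (but do not run) the patchelf call that rewrites the dependency name."""
    symlink_name, regular_name = find_arrow_lib_names(lib_dir)
    return ["patchelf", "--replace-needed", symlink_name, regular_name, dependent]


if __name__ == "__main__":
    # Hypothetical layout, created in a temporary directory for illustration only.
    with tempfile.TemporaryDirectory() as tmp:
        real = Path(tmp) / "libarrow.so.1500.0.0"
        real.write_bytes(b"")
        (Path(tmp) / "libarrow.so.1500").symlink_to(real.name)
        print(" ".join(patchelf_command(tmp)))
```

After the rewrite, the proxy library depends on the regular file directly, so the symbolic link no longer needs to survive packaging.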
### Future Directions
1. Add tooling to upload release candidate (RC) MLTBX files to
apache/arrow's GitHub Releases area and mark them as "Prerelease". In other
words, modify
https://github.com/apache/arrow/blob/main/dev/release/05-binary-upload.sh.
2. Add a post-release script to upload release MLTBX files to
apache/arrow's GitHub Releases area (similar to how
https://github.com/apache/arrow/blob/main/dev/release/post-09-python.sh works).
4. Enable nightly builds for the MATLAB interface.
6. Document how to qualify a MATLAB Arrow interface release.
7. Enable building and testing the MATLAB Arrow interface on multiple
Ubuntu distributions simultaneously (e.g. 20.04 *and* 22.04).
* Closes: #38659
* GitHub Issue: #38659
Lead-authored-by: Sarah Gilmore <[email protected]>
Co-authored-by: Kevin Gurney <[email protected]>
Signed-off-by: Kevin Gurney <[email protected]>
---
.github/workflows/matlab.yml | 28 +++---
dev/tasks/matlab/github.yml | 162 ++++++++++++++++++++++++++++++++++
dev/tasks/tasks.yml | 9 ++
matlab/CMakeLists.txt | 17 ----
matlab/tools/packageMatlabInterface.m | 84 ++++++++++++++++++
5 files changed, 273 insertions(+), 27 deletions(-)
diff --git a/.github/workflows/matlab.yml b/.github/workflows/matlab.yml
index eceeb551a0..dfc734e043 100644
--- a/.github/workflows/matlab.yml
+++ b/.github/workflows/matlab.yml
@@ -42,7 +42,23 @@ jobs:
ubuntu:
name: AMD64 Ubuntu 20.04 MATLAB
- runs-on: ubuntu-latest
+ # Explicitly pin the Ubuntu version to 20.04 for the time being because:
+ #
+ # 1. The version of GLIBCXX shipped with Ubuntu 22.04 is not binary compatible
+ # with the GLIBCXX bundled with MATLAB R2023a. This is a relatively common
+ # issue.
+ #
+ # For example, see:
+ #
+ # https://www.mathworks.com/matlabcentral/answers/1907290-how-to-manually-select-the-libstdc-library-to-use-to-resolve-a-version-glibcxx_-not-found
+ #
+ # 2. The version of GLIBCXX shipped with Ubuntu 22.04 is not binary compatible with
+ # the version of GLIBCXX shipped with Debian 11. Several of the Arrow community
+ # members who work on the MATLAB bindings use Debian 11 locally for qualification.
+ # Using Ubuntu 20.04 eases development workflows for these community members.
+ #
+ # In the future, we can investigate adding support for building against more Linux (e.g. `ubuntu-22.04`) and MATLAB versions (e.g. R2023b).
+ runs-on: ubuntu-20.04
if: ${{ !contains(github.event.pull_request.title, 'WIP') }}
steps:
- name: Check out repository
@@ -74,14 +90,6 @@ jobs:
run: ci/scripts/matlab_build.sh $(pwd)
- name: Run MATLAB Tests
env:
- # libarrow.so requires a more recent version of libstdc++.so
- # than is bundled with MATLAB under <matlabroot>/sys/os/glnxa64.
- # Therefore, if a MEX function that depends on libarrow.so
- # is executed within the MATLAB address space, runtime linking
- # errors will occur. To work around this issue, we can explicitly
- # force MATLAB to use the system libstdc++.so via LD_PRELOAD.
- LD_PRELOAD: /usr/lib/x86_64-linux-gnu/libstdc++.so.6
-
# Add the installation directory to the MATLAB Search Path by
# setting the MATLABPATH environment variable.
MATLABPATH: matlab/install/arrow_matlab
@@ -89,7 +97,7 @@ jobs:
with:
select-by-folder: matlab/test
macos:
- name: AMD64 macOS 11 MATLAB
+ name: AMD64 macOS 12 MATLAB
runs-on: macos-latest
if: ${{ !contains(github.event.pull_request.title, 'WIP') }}
steps:
diff --git a/dev/tasks/matlab/github.yml b/dev/tasks/matlab/github.yml
new file mode 100644
index 0000000000..1cd3949efb
--- /dev/null
+++ b/dev/tasks/matlab/github.yml
@@ -0,0 +1,162 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+{% import 'macros.jinja' as macros with context %}
+
+{{ macros.github_header() }}
+
+jobs:
+
+ ubuntu:
+ name: AMD64 Ubuntu 20.04 MATLAB
+ runs-on: ubuntu-20.04
+ steps:
+ {{ macros.github_checkout_arrow()|indent }}
+ - name: Install ninja-build
+ run: sudo apt-get update && sudo apt-get install ninja-build
+ - name: Install MATLAB
+ uses: matlab-actions/setup-matlab@v1
+ with:
+ release: R2023a
+ - name: Build MATLAB Interface
+ env:
+ {{ macros.github_set_sccache_envvars()|indent(8) }}
+ run: arrow/ci/scripts/matlab_build.sh $(pwd)/arrow
+ - name: Change shared library dependency name
+ # MATLAB's programmatic packaging interface does not properly
+ # include symbolic link files in the package MLTBX - this is a
+ # bug. As a temporary workaround, change the expected name of the
+ # Arrow C++ library which libarrowproxy.so depends on. For example,
+ # change libarrow.so.1500 to libarrow.so.1500.0.0.
+ run: |
+ pushd arrow/matlab/install/arrow_matlab/+libmexclass/+proxy/
+ SYMLINK_ARROW_LIB="$(find . -name 'libarrow.so.*' -type l | xargs basename)"
+ REGULAR_ARROW_LIB="$(echo libarrow.so.*.*)"
+ echo "SYMLINK_ARROW_LIB = ${SYMLINK_ARROW_LIB}"
+ echo "REGULAR_ARROW_LIB = ${REGULAR_ARROW_LIB}"
+ patchelf --replace-needed $SYMLINK_ARROW_LIB $REGULAR_ARROW_LIB libarrowproxy.so
+ popd
+ - name: Compress into single artifact
+ run: tar -cvzf matlab-arrow-ubuntu.tar.gz arrow/matlab/install/arrow_matlab
+ - name: Upload artifacts
+ uses: actions/upload-artifact@v4
+ with:
+ name: matlab-arrow-ubuntu.tar.gz
+ path: matlab-arrow-ubuntu.tar.gz
+
+ macos:
+ name: AMD64 macOS 12 MATLAB
+ runs-on: macos-latest
+ steps:
+ {{ macros.github_checkout_arrow()|indent }}
+ - name: Install ninja-build
+ run: brew install ninja
+ - name: Install MATLAB
+ uses: matlab-actions/setup-matlab@v1
+ with:
+ release: R2023a
+ - name: Build MATLAB Interface
+ env:
+ {{ macros.github_set_sccache_envvars()|indent(8) }}
+ run: arrow/ci/scripts/matlab_build.sh $(pwd)/arrow
+ - name: Change shared library dependency name
+ # MATLAB's programmatic packaging interface does not properly
+ # include symbolic link files in the package MLTBX - this is a
+ # bug. As a temporary workaround, change the expected name of the
+ # Arrow C++ library which libarrowproxy.dylib depends on.
+ # For example, change libarrow.1500.dylib to libarrow.1500.0.0.dylib.
+ run: |
+ pushd arrow/matlab/install/arrow_matlab/+libmexclass/+proxy
+ SYMLINK_ARROW_LIB="$(find . -name 'libarrow.*.dylib' -type l | xargs basename)"
+ REGULAR_ARROW_LIB="$(echo libarrow.*.*.dylib)"
+ echo "SYMLINK_ARROW_LIB = ${SYMLINK_ARROW_LIB}"
+ echo "REGULAR_ARROW_LIB = ${REGULAR_ARROW_LIB}"
+ install_name_tool -change @rpath/$SYMLINK_ARROW_LIB @rpath/$REGULAR_ARROW_LIB libarrowproxy.dylib
+ popd
+ - name: Compress into single artifact
+ run: tar -cvzf matlab-arrow-macos.tar.gz arrow/matlab/install/arrow_matlab
+ - name: Upload artifacts
+ uses: actions/upload-artifact@v4
+ with:
+ name: matlab-arrow-macos.tar.gz
+ path: matlab-arrow-macos.tar.gz
+
+ windows:
+ name: AMD64 Windows 2022 MATLAB
+ runs-on: windows-2022
+ steps:
+ {{ macros.github_checkout_arrow()|indent }}
+ - name: Install MATLAB
+ uses: matlab-actions/setup-matlab@v1
+ with:
+ release: R2023a
+ - name: Install sccache
+ shell: bash
+ run: arrow/ci/scripts/install_sccache.sh pc-windows-msvc $(pwd)/sccache
+ - name: Build MATLAB Interface
+ shell: cmd
+ env:
+ {{ macros.github_set_sccache_envvars()|indent(8) }}
+ run: |
+ call "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" x64
+ bash -c "arrow/ci/scripts/matlab_build.sh $(pwd)/arrow"
+ - name: Compress into single artifact
+ shell: bash
+ run: tar -cvzf matlab-arrow-windows.tar.gz arrow/matlab/install/arrow_matlab
+ - name: Upload artifacts
+ uses: actions/upload-artifact@v4
+ with:
+ name: matlab-arrow-windows.tar.gz
+ path: matlab-arrow-windows.tar.gz
+
+ package-mltbx:
+ name: Package MATLAB Toolbox (MLTBX) Files
+ runs-on: ubuntu-latest
+ needs:
+ - ubuntu
+ - macos
+ - windows
+ steps:
+ {{ macros.github_checkout_arrow(fetch_depth=0)|indent }}
+ - name: Download Artifacts
+ uses: actions/download-artifact@v4
+ with:
+ path: artifacts-downloaded
+ - name: Decompress Artifacts
+ run: |
+ mv artifacts-downloaded/*/*.tar.gz .
+ tar -xzvf matlab-arrow-ubuntu.tar.gz
+ tar -xzvf matlab-arrow-macos.tar.gz
+ tar -xzvf matlab-arrow-windows.tar.gz
+ - name: Copy LICENSE.txt and NOTICE.txt for packaging
+ run: |
+ cp arrow/LICENSE.txt arrow/matlab/install/arrow_matlab/LICENSE.txt
+ cp arrow/NOTICE.txt arrow/matlab/install/arrow_matlab/NOTICE.txt
+ - name: Install MATLAB
+ uses: matlab-actions/setup-matlab@v1
+ with:
+ release: R2023a
+ - name: Run commands
+ env:
+ MATLABPATH: arrow/matlab/tools
+ ARROW_MATLAB_TOOLBOX_FOLDER: arrow/matlab/install/arrow_matlab
+ ARROW_MATLAB_TOOLBOX_OUTPUT_FOLDER: artifacts/matlab-dist
+ ARROW_MATLAB_TOOLBOX_VERSION: {{ arrow.no_rc_version }}
+ uses: matlab-actions/run-command@v1
+ with:
+ command: packageMatlabInterface
+ {{ macros.github_upload_releases(["artifacts/matlab-dist/*.mltbx"])|indent }}
diff --git a/dev/tasks/tasks.yml b/dev/tasks/tasks.yml
index 2abfbc1517..5e1ef8d13b 100644
--- a/dev/tasks/tasks.yml
+++ b/dev/tasks/tasks.yml
@@ -59,6 +59,7 @@ groups:
- conan-*
- debian-*
- java-jars
+ - matlab
- nuget
- python-sdist
- r-binary-packages
@@ -665,6 +666,14 @@ tasks:
params:
formula: apache-arrow.rb
+ ############################## MATLAB Packages ################################
+
+ matlab:
+ ci: github
+ template: matlab/github.yml
+ artifacts:
+ - matlab-arrow-{no_rc_version}.mltbx
+
############################## Arrow JAR's ##################################
java-jars:
diff --git a/matlab/CMakeLists.txt b/matlab/CMakeLists.txt
index 206ecb318b..b85f782d2d 100644
--- a/matlab/CMakeLists.txt
+++ b/matlab/CMakeLists.txt
@@ -201,9 +201,6 @@ get_filename_component(ARROW_SHARED_LIB_DIR ${ARROW_SHARED_LIB} DIRECTORY)
get_filename_component(ARROW_SHARED_LIB_FILENAME ${ARROW_SHARED_LIB} NAME_WE)
if(NOT Arrow_FOUND)
- # If Arrow_FOUND is false, Arrow is built by the arrow_shared target and needs
- # to be copied to CMAKE_PACKAGED_INSTALL_DIR.
-
if(APPLE)
# Install libarrow.dylib (symlink) and the real files it points to.
# on macOS, we need to match these files: libarrow.dylib
@@ -226,20 +223,6 @@ if(NOT Arrow_FOUND)
set(SHARED_LIBRARY_VERSION_REGEX ${ARROW_SHARED_LIB_FILENAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
endif()
-
- # The subfolders cmake and pkgconfig are excluded as they will be empty.
- # Note: The following CMake Issue suggests enabling an option to exclude all
- # folders that would be empty after installation:
- # https://gitlab.kitware.com/cmake/cmake/-/issues/17122
-
- set(CMAKE_PACKAGED_INSTALL_DIR "${CMAKE_INSTALL_DIR}/+arrow")
-
- install(DIRECTORY "${ARROW_SHARED_LIB_DIR}/"
- DESTINATION ${CMAKE_PACKAGED_INSTALL_DIR}
- FILES_MATCHING
- REGEX ${SHARED_LIBRARY_VERSION_REGEX}
- PATTERN "cmake" EXCLUDE
- PATTERN "pkgconfig" EXCLUDE)
endif()
# MATLAB_ADD_INSTALL_DIR_TO_STARTUP_FILE toggles whether an addpath command to add the install
diff --git a/matlab/tools/packageMatlabInterface.m b/matlab/tools/packageMatlabInterface.m
new file mode 100644
index 0000000000..55b4d4241a
--- /dev/null
+++ b/matlab/tools/packageMatlabInterface.m
@@ -0,0 +1,84 @@
+% Licensed to the Apache Software Foundation (ASF) under one
+% or more contributor license agreements. See the NOTICE file
+% distributed with this work for additional information
+% regarding copyright ownership. The ASF licenses this file
+% to you under the Apache License, Version 2.0 (the
+% "License"); you may not use this file except in compliance
+% with the License. You may obtain a copy of the License at
+%
+% http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing,
+% software distributed under the License is distributed on an
+% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+% KIND, either express or implied. See the License for the
+% specific language governing permissions and limitations
+% under the License.
+
+toolboxFolder = string(getenv("ARROW_MATLAB_TOOLBOX_FOLDER"));
+outputFolder = string(getenv("ARROW_MATLAB_TOOLBOX_OUTPUT_FOLDER"));
+toolboxVersionRaw = string(getenv("ARROW_MATLAB_TOOLBOX_VERSION"));
+
+appendLicenseText(fullfile(toolboxFolder, "LICENSE.txt"));
+appendNoticeText(fullfile(toolboxFolder, "NOTICE.txt"));
+
+% Output folder must exist.
+mkdir(outputFolder);
+
+disp("Toolbox Folder: " + toolboxFolder);
+disp("Output Folder: " + outputFolder);
+disp("Toolbox Version Raw: " + toolboxVersionRaw);
+
+
+% Note: This string processing heuristic may not be robust to future
+% changes in the Arrow versioning scheme.
+dotIdx = strfind(toolboxVersionRaw, ".");
+numDots = numel(dotIdx);
+if numDots >= 3
+ toolboxVersion = extractBefore(toolboxVersionRaw, dotIdx(3));
+else
+ toolboxVersion = toolboxVersionRaw;
+end
+
+disp("Toolbox Version:" + toolboxVersion);
+
+identifier = "ad1d0fe6-22d1-4969-9e6f-0ab5d0f12ce3";
+opts = matlab.addons.toolbox.ToolboxOptions(toolboxFolder, identifier);
+opts.ToolboxName = "MATLAB Arrow Interface";
+opts.ToolboxVersion = toolboxVersion;
+opts.AuthorName = "The Apache Software Foundation";
+opts.AuthorEmail = "[email protected]";
+
+% Set the SupportedPlatforms
+opts.SupportedPlatforms.Win64 = true;
+opts.SupportedPlatforms.Maci64 = true;
+opts.SupportedPlatforms.Glnxa64 = true;
+opts.SupportedPlatforms.MatlabOnline = true;
+
+% Interface is only qualified against R2023a at the moment
+opts.MinimumMatlabRelease = "R2023a";
+opts.MaximumMatlabRelease = "R2023a";
+
+ opts.OutputFile = fullfile(outputFolder, compose("matlab-arrow-%s.mltbx", toolboxVersionRaw));
+disp("Output File: " + opts.OutputFile);
+matlab.addons.toolbox.packageToolbox(opts);
+
+function appendLicenseText(filename)
+ licenseText = [ ...
+ newline + "--------------------------------------------------------------------------------" + newline
+ "3rdparty dependency mathworks/libmexclass is redistributed as a dynamically"
+ "linked shared library in certain binary distributions, like the MATLAB"
+ "distribution." + newline
+ "Copyright: 2022-2024 The MathWorks, Inc. All rights reserved."
+ "Homepage: https://github.com/mathworks/libmexclass"
+ "License: 3-clause BSD" ];
+ writelines(licenseText, filename, WriteMode="append");
+end
+
+function appendNoticeText(filename)
+ noticeText = [ ...
+ newline + "---------------------------------------------------------------------------------" + newline
+ "This product includes software from The MathWorks, Inc. (Apache 2.0)"
+ " * Copyright (C) 2024 The MathWorks, Inc."];
+ writelines(noticeText, filename, WriteMode="append");
+end
\ No newline at end of file
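
The version-trimming heuristic in `packageMatlabInterface.m` keeps everything before the third dot of `ARROW_MATLAB_TOOLBOX_VERSION` and leaves shorter versions untouched. A minimal Python sketch of the same rule is shown below for readers less familiar with MATLAB string functions. The function name `trim_toolbox_version` and the sample version strings are hypothetical and chosen only for illustration; they are not taken from the commit.

```python
def trim_toolbox_version(raw_version):
    """Keep at most the first three dot-separated components.

    Equivalent to the MATLAB logic: if the string contains three or more
    dots, return the text before the third dot; otherwise return it as-is.
    """
    return ".".join(raw_version.split(".")[:3])


if __name__ == "__main__":
    # Hypothetical version strings, used only to exercise both branches.
    for raw in ("16.0.0", "16.0.0.dev123", "16.0"):
        print(raw, "->", trim_toolbox_version(raw))
```

As the comment in the MATLAB script notes, a rule like this may not survive future changes to the Arrow versioning scheme.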