This is an automated email from the ASF dual-hosted git repository.

alamb pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-rs.git


The following commit(s) were added to refs/heads/main by this push:
     new 58b897b5f9 Move arrow-pyarrow tests that require `pyarrow` to be 
installed into `arrow-pyarrow-testing` crate (#7742)
58b897b5f9 is described below

commit 58b897b5f975ddb331adcb438a3611a6a9776a3a
Author: Andrew Lamb <[email protected]>
AuthorDate: Mon Jul 7 11:36:48 2025 -0400

    Move arrow-pyarrow tests that require `pyarrow` to be installed into 
`arrow-pyarrow-testing` crate (#7742)
    
    # Which issue does this PR close?
    
    - Related to https://github.com/apache/arrow-rs/issues/7394
    - Closes https://github.com/apache/arrow-rs/issues/7736
    
    # Rationale for this change
    
    At its core, if someone isn't using / modifying the pyarrow integration
    for arrow-rs they shouldn't have to install / configure python to get
    the tests working in `arrow-rs`
    
    After the change in https://github.com/apache/arrow-rs/pull/7694,
    running `cargo test --workspace` also runs tests that require Python
    to be set up and the `pyarrow` module to be installed. This is
    problematic because:
    1. Some people may not have that environment set up
    2. Apparently you cannot use virtualenvs with PyO3 on macOS due to
    https://github.com/PyO3/pyo3/issues/1741
    
    The second item was very confusing for me while I tried to debug what
    was going on, as I kept getting an error about pyarrow not being
    installed, even though it was installed in my `venv`:
    
    ```
    thread 'test_to_pyarrow' panicked at arrow-pyarrow/tests/pyarrow.rs:43:6:
    called `Result::unwrap()` on an `Err` value: PyErr { type: <class 
'ModuleNotFoundError'>, value: ModuleNotFoundError("No module named 
'pyarrow'"), traceback: None }
    ```
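    
    A quick way to see which interpreter can actually import `pyarrow` is
    a small std-only check like the sketch below. `pyarrow_importable` is
    a hypothetical helper, not part of arrow-rs, and it only probes the
    interpreter found on `PATH`, which may not be the one PyO3 embeds.
    
    ```rust
    use std::process::{Command, Stdio};
    
    /// Returns true if `interpreter -c "import pyarrow"` exits successfully.
    /// Hypothetical helper for illustration only; not part of arrow-rs.
    fn pyarrow_importable(interpreter: &str) -> bool {
        Command::new(interpreter)
            .args(["-c", "import pyarrow"])
            // Silence the interpreter's own output; only the exit status matters.
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .status()
            .map(|status| status.success())
            // A missing interpreter counts as "not importable".
            .unwrap_or(false)
    }
    
    fn main() {
        // Interpreter names are assumptions; adjust for the local setup.
        for interpreter in ["python3", "python"] {
            println!(
                "{interpreter}: pyarrow importable = {}",
                pyarrow_importable(interpreter)
            );
        }
    }
    ```
    
    If this reports `true` inside the `venv` while the test still panics
    with `ModuleNotFoundError`, that mismatch would be consistent with the
    PyO3 issue linked above.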
    
    # What changes are included in this PR?
    
    1. Move the tests that require pyarrow to be installed into
    `arrow-pyarrow-testing`, which is not part of the workspace and thus not
    run with `cargo test --all`
    2. Remove `cargo test --exclude arrow-pyarrow`
    3. Add documentation on rationale and hints about running the test
    
    # Frequently Asked Questions
    
    ## Why not add ` --exclude arrow-pyarrow` to
    `verify_release_candidate.sh`?
    
    While the minimal fix would be to add ` --exclude arrow-pyarrow` to
    `verify_release_candidate.sh`, this would require all users of arrow to
    remember to add `--exclude arrow-pyarrow` to their tests even if they
    don't care about Python.
    
    ## Why not in `pyarrow-arrow-integration-testing` ?
    I did not put this test in `pyarrow-arrow-integration-testing` because
    that module doesn't compile for me with the stock python install
    
    Somehow Python needs to be installed with the ability to build dynamic
    libraries, which I haven't figured out and don't really want to. It seems
    maybe related to https://pyo3.rs/v0.18.1/getting_started#python (thanks
    to @Xuanwo for the pointer in https://github.com/PyO3/pyo3/issues/2136 /
    https://github.com/apache/opendal/issues/1675)
    
    
    ```
    (venv) root@5e8d0406fabe:/arrow-rs/arrow-pyarrow-integration-testing# cargo 
test --test pyarrow
    warning: `/arrow-rs/arrow-pyarrow-integration-testing/.cargo/config` is 
deprecated in favor of `config.toml`
    note: if you need to support cargo 1.38 or earlier, you can symlink 
`config` to `config.toml`
       Compiling target-lexicon v0.13.2
       Compiling flatbuffers v25.2.10
       Compiling pyo3-build-config v0.24.2
       Compiling arrow-ipc v55.2.0 (/arrow-rs/arrow-ipc)
       Compiling pyo3-macros-backend v0.24.2
       Compiling pyo3-ffi v0.24.2
       Compiling pyo3 v0.24.2
       Compiling pyo3-macros v0.24.2
       Compiling arrow-pyarrow v55.2.0 (/arrow-rs/arrow-pyarrow)
       Compiling arrow v55.2.0 (/arrow-rs/arrow)
       Compiling arrow-pyarrow-integration-testing v0.1.0 
(/arrow-rs/arrow-pyarrow-integration-testing)
    error: linking with `cc` failed: exit status: 1
      |
      = note:  "cc" "/tmp/rustc0jx15I/symbols.o" "<43 object files omitted>" 
"-Wl,--as-needed" "-Wl,-Bstatic" 
"<sysroot>/lib/rustlib/aarch64-unknown-linux-gnu/lib/{libtest-*,libgetopts-*,libunicode_width-*,librustc_std_workspace_std-*}.rlib"
 
"/arrow-rs/arrow-pyarrow-integration-testing/target/debug/deps/{libarrow-7996898a6777f964.rlib,libarrow_row-63508de6e52f4d4d.rlib,libarrow_pyarrow-8b510eeadc952ad2.rlib,libpyo3-c463c3a2243eeab9.rlib,libmemoffset-836dc1ddd866c614.rlib,libpyo3_ffi-fbf18
 [...]
      = note: some arguments are omitted. use `--verbose` to show all linker 
arguments
      = note: /usr/bin/ld: 
/arrow-rs/arrow-pyarrow-integration-testing/target/debug/deps/libarrow_pyarrow-8b510eeadc952ad2.rlib(arrow_pyarrow-8b510eeadc952ad2.8xxa5xo5oql7wlj24034o033n.rcgu.o):
 in function `<pyo3::instance::Borrowed<pyo3::types::tuple::PyTuple> as 
pyo3::call::PyCallArgs>::call_positional':
              
/root/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/pyo3-0.24.2/src/call.rs:213:
 undefined reference to `PyObject_Call'
    ```
    
    
    
    # Are there any user-facing changes?
    
    If there are user-facing changes then we may require documentation to be
    updated before approving the PR.
    
    If there are any breaking changes to public APIs, please call them out.
    
    ---------
    
    Co-authored-by: Copilot <[email protected]>
---
 .github/workflows/integration.yml                  |  5 ++-
 .github/workflows/rust.yml                         |  5 +--
 Cargo.toml                                         |  3 ++
 arrow-pyarrow-testing/Cargo.toml                   | 51 ++++++++++++++++++++++
 arrow-pyarrow-testing/src/lib.rs                   | 20 +++++++++
 .../tests/pyarrow.rs                               | 22 ++++++++++
 6 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/.github/workflows/integration.yml 
b/.github/workflows/integration.yml
index 1b6eeb15dc..0971171929 100644
--- a/.github/workflows/integration.yml
+++ b/.github/workflows/integration.yml
@@ -165,8 +165,9 @@ jobs:
       - name: Run Rust tests
         run: |
           source venv/bin/activate
-          cargo test -p arrow-pyarrow
-      - name: Run tests
+          cd arrow-pyarrow-testing
+          cargo test
+      - name: Run Python tests
         run: |
           source venv/bin/activate
           cd arrow-pyarrow-integration-testing
diff --git a/.github/workflows/rust.yml b/.github/workflows/rust.yml
index a20575391b..e4ffb10a11 100644
--- a/.github/workflows/rust.yml
+++ b/.github/workflows/rust.yml
@@ -52,7 +52,7 @@ jobs:
           # do not produce debug symbols to keep memory usage down
           export RUSTFLAGS="-C debuginfo=0"
           # PyArrow tests happen in integration.yml.
-          cargo test --workspace --exclude arrow-pyarrow
+          cargo test --workspace
 
 
   # Check workspace wide compile and test with default features for
@@ -84,8 +84,7 @@ jobs:
           # do not produce debug symbols to keep memory usage down
           export RUSTFLAGS="-C debuginfo=0"
           export PATH=$PATH:/d/protoc/bin
-          # PyArrow tests happen in integration.yml.
-          cargo test --workspace --exclude arrow-pyarrow
+          cargo test --workspace 
 
 
   # Run cargo fmt for all crates
diff --git a/Cargo.toml b/Cargo.toml
index a9b00f9537..1083c9444c 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -55,6 +55,9 @@ members = [
 resolver = "2"
 
 exclude = [
+    # arrow-pyarrow-testing is excluded because it requires a Python 
interpreter with the pyarrow package installed,
+    # which makes running `cargo test --all` fail if the appropriate Python 
environment is not set up.
+    "arrow-pyarrow-testing",
     # arrow-pyarrow-integration-testing is excluded because it requires 
different compilation flags, thereby
     # significantly changing how it is compiled within the workspace, causing 
the whole workspace to be compiled from
     # scratch this way, this is a stand-alone package that compiles 
independently of the others.
diff --git a/arrow-pyarrow-testing/Cargo.toml b/arrow-pyarrow-testing/Cargo.toml
new file mode 100644
index 0000000000..96c20d31bb
--- /dev/null
+++ b/arrow-pyarrow-testing/Cargo.toml
@@ -0,0 +1,51 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Note this package is not published to crates.io, it is only used for testing
+# the arrow-pyarrow crate in the arrow-rs repository.
+#
+# It is not part of the workspace so that `cargo test --all` does not require
+# a Python interpreter or the pyarrow package to be installed.
+#
+# It is used to run tests that require a Python interpreter and the pyarrow
+# package installed. It is not intended to be used as a library or a standalone
+# application.
+#
+# It is different from `arrow-pyarrow-integration-testing` in that it works
+# with a standard pyarrow installation, rather than building a dynamic library
+# that can be loaded by Python (which requires additional configuraton of the
+# Python environment).
+
+[package]
+name = "arrow-pyarrow-testing"
+description = "Tests for arrow-pyarrow that require only a Python interpreter 
and pyarrow installed"
+version = "0.1.0"
+homepage = "https://github.com/apache/arrow-rs"
+repository = "https://github.com/apache/arrow-rs"
+authors = ["Apache Arrow <[email protected]>"]
+license = "Apache-2.0"
+keywords = [ "arrow" ]
+edition = "2021"
+rust-version = "1.81"
+publish = false
+
+
+[dependencies]
+# Note no dependency on arrow, to ensure arrow-pyarrow can be used by itself
+arrow-array = { path = "../arrow-array" }
+arrow-pyarrow = { path = "../arrow-pyarrow" }
+pyo3 = { version = "0.25", default-features = false }
diff --git a/arrow-pyarrow-testing/src/lib.rs b/arrow-pyarrow-testing/src/lib.rs
new file mode 100644
index 0000000000..31b805c573
--- /dev/null
+++ b/arrow-pyarrow-testing/src/lib.rs
@@ -0,0 +1,20 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! This crate exists to provide a test environment for the `arrow-pyarrow` 
crate.
+//! It is not intended to be used by itself. See comments in Cargo.toml for 
more
+//! details.
\ No newline at end of file
diff --git a/arrow-pyarrow/tests/pyarrow.rs 
b/arrow-pyarrow-testing/tests/pyarrow.rs
similarity index 83%
rename from arrow-pyarrow/tests/pyarrow.rs
rename to arrow-pyarrow-testing/tests/pyarrow.rs
index 12e2f97abf..3d3c30cf21 100644
--- a/arrow-pyarrow/tests/pyarrow.rs
+++ b/arrow-pyarrow-testing/tests/pyarrow.rs
@@ -15,6 +15,28 @@
 // specific language governing permissions and limitations
 // under the License.
 
+//! Tests pyarrow bindings
+//!
+//! This test requires installing the `pyarrow` python package. If you do not
+//! have this package installed, you will see an error such as the following:
+//!
+//! ```text
+//! PyErr { type: <class 'ModuleNotFoundError'>, value: 
ModuleNotFoundError("No module named 'pyarrow'"), traceback: None }
+//! ```
+//!
+//! # Notes
+//!
+//! You can not use a virtual environment to run these tests on MacOS, as it 
will
+//! fail to find the pyarrow module due to 
<https://github.com/PyO3/pyo3/issues/1741>
+//!
+//! One way to run them is to install the `pyarrow` package in the system 
Python,
+//! which might break other packages, so use with caution:
+//!
+//! ```shell
+//! brew install pipx
+//! pip3 install --break-system-packages pyarrow
+//! ```
+
 use arrow_array::builder::{BinaryViewBuilder, StringViewBuilder};
 use arrow_array::{
     Array, ArrayRef, BinaryViewArray, Int32Array, RecordBatch, StringArray, 
StringViewArray,
