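I think you need to set the following before running setup.py. Your cmake
command enables ARROW_DATASET, but the Python build only sees the two
PARQUET flags you exported, so the dataset extension is never built: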

    export PYARROW_WITH_DATASET=1
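
For reference, here is a minimal sketch of the full set of flags for this build, assuming setup.py only builds an optional extension module when the matching PYARROW_WITH_* variable is set. The two PARQUET flags come from the commands quoted below. The final line is just a sanity check I added to list what is currently exported.

```shell
# Component flags read by the Python build; each should match a component
# that was enabled in the C++ build (DATASET, PARQUET, PARQUET encryption).
export PYARROW_WITH_DATASET=1
export PYARROW_WITH_PARQUET=1
export PYARROW_WITH_PARQUET_ENCRYPTION=1

# Sanity check: list the PYARROW_WITH_* flags visible in this shell.
env | grep '^PYARROW_WITH_' | sort
```

With these set, re-running "python setup.py build_ext --inplace" from arrow/python and then "import pyarrow._dataset" should show whether the module now gets built.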

On Tue, May 10, 2022 at 7:07 AM Yaron Gvili <rt...@hotmail.com> wrote:
>
> Hello,
>
> I ran into a problem with running PyArrow that I locally built. The build 
> worked fine (or so it seems) but then the testing procedure had a failure due 
> to not being able to load pyarrow._dataset, which I manually confirmed. I'd 
> appreciate any guidance on how to fix this error.
>
> Below are the commands I used to build and test along with the failure 
> console-output (other console-output, for successful commands, is not 
> included), followed by my manual confirmation:
>
> $ conda activate pyarrow-dev
> $ mkdir -p arrow/cpp/build/pyarrow-release
> $ pushd arrow/cpp/build/pyarrow-release
> $ cmake -GNinja -DCMAKE_INSTALL_PREFIX=$ARROW_HOME -DCMAKE_INSTALL_LIBDIR=lib 
> $(for a in COMPUTE DATASET ENGINE FILESYSTEM IPC PARQUET PYTHON WITH_BZ2 
> WITH_ZLIB WITH_ZSTD WITH_LZ4 WITH_SNAPPY WITH_BROTLI BUILD_TESTS; do echo 
> "-DARROW_${a}=ON"; done) -DPARQUET_REQUIRE_ENCRYPTION=ON ../..
> $ ninja -j 6
> $ cmake --build . --target install
> $ popd
> $ pushd arrow/python
> $ export PYARROW_WITH_PARQUET=1
> $ export PYARROW_WITH_PARQUET_ENCRYPTION=1
> $ python setup.py build_ext --inplace
> $ python -m pytest pyarrow/
> ...
> FAILED pyarrow/tests/parquet/test_dataset.py::test_partitioned_dataset[True] 
> - ModuleNotFoundError: No module named 'pyarrow._dataset'
> ...
> $ python
> Python 3.9.12 | packaged by conda-forge | (main, Mar 24 2022, 23:25:59)
> [GCC 10.3.0] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import pyarrow._dataset
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ModuleNotFoundError: No module named 'pyarrow._dataset'
>
>
> Cheers,
> Yaron.
