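I think you need to add: export PYARROW_WITH_DATASET=1

Set it alongside the other PYARROW_WITH_* exports, before running "python setup.py build_ext --inplace". Building with -DARROW_DATASET=ON on the C++ side is not enough on its own; without the environment variable, the pyarrow._dataset extension module does not get built.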
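
In case it helps with checking the result after rebuilding, here is a minimal sketch that reports which optional extension modules cannot be found in the package. The helper name missing_submodules is made up for illustration, and apart from _dataset (taken from your traceback) the module names you pass in are assumptions on my part:

```python
import importlib.util


def missing_submodules(package, names):
    """Return the entries of `names` that cannot be located under `package`.

    Hypothetical helper for illustration; it is not part of PyArrow.
    """
    # If the package itself cannot be found, every submodule is missing.
    # Checking this first also avoids the ModuleNotFoundError that
    # find_spec raises for a dotted name whose parent does not exist.
    if importlib.util.find_spec(package) is None:
        return list(names)

    missing = []
    for name in names:
        # find_spec imports the parent package but not the submodule itself.
        if importlib.util.find_spec(f"{package}.{name}") is None:
            missing.append(name)
    return missing
```

Run from arrow/python after the in-place build, calling missing_submodules("pyarrow", ["_dataset", "_parquet"]) should return an empty list once the dataset module has been built.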

On Tue, May 10, 2022 at 7:07 AM Yaron Gvili <rt...@hotmail.com> wrote:
>
> Hello,
>
> I ran into a problem with running PyArrow that I locally built. The build
> worked fine (or so it seems) but then the testing procedure had a failure due
> to not being able to load pyarrow._dataset, which I manually confirmed. I'd
> appreciate any guidance on how to fix this error.
>
> Below are the commands I used to build and test along with the failure
> console-output (other console-output, for successful commands, is not
> included), followed by my manual confirmation:
>
> $ conda activate pyarrow-dev
> $ mkdir -p arrow/cpp/build/pyarrow-release
> $ pushd arrow/cpp/build/pyarrow-release
> $ cmake -GNinja -DCMAKE_INSTALL_PREFIX=$ARROW_HOME -DCMAKE_INSTALL_LIBDIR=lib $(for a in COMPUTE DATASET ENGINE FILESYSTEM IPC PARQUET PYTHON WITH_BZ2 WITH_ZLIB WITH_ZSTD WITH_LZ4 WITH_SNAPPY WITH_BROTLI BUILD_TESTS; do echo "-DARROW_${a}=ON"; done) -DPARQUET_REQUIRE_ENCRYPTION=ON ../..
> $ ninja -j 6
> $ cmake --build . --target install
> $ popd
> $ pushd arrow/python
> $ export PYARROW_WITH_PARQUET=1
> $ export PYARROW_WITH_PARQUET_ENCRYPTION=1
> $ python setup.py build_ext --inplace
> $ python -m pytest pyarrow/
> ...
> FAILED pyarrow/tests/parquet/test_dataset.py::test_partitioned_dataset[True]
> - ModuleNotFoundError: No module named 'pyarrow._dataset'
> ...
> $ python
> Python 3.9.12 | packaged by conda-forge | (main, Mar 24 2022, 23:25:59)
> [GCC 10.3.0] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import pyarrow._dataset
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ModuleNotFoundError: No module named 'pyarrow._dataset'
>
>
> Cheers,
> Yaron.