Todd Farmer created ARROW-17076:
-----------------------------------

Summary: [Python][Docs] Enable building documentation with pyarrow nightly builds
Key: ARROW-17076
URL: https://issues.apache.org/jira/browse/ARROW-17076
Project: Apache Arrow
Issue Type: Improvement
Components: C++, Documentation, Python
Reporter: Todd Farmer

The [instructions for building documentation|https://arrow.apache.org/docs/developers/documentation.html] state that pyarrow is needed to build the docs. They also note that certain optional pyarrow features must be enabled for the build to succeed:

{code:java}
Note that building the documentation may fail if your build of pyarrow is not sufficiently comprehensive. Portions of the Python API documentation will also not build without CUDA support having been built.
{code}

"Sufficiently comprehensive" is ambiguous, leaving users to repeat the following sequence of steps until every required feature has been identified and enabled:

* Build C++
* Build Python
* Attempt to build docs
* Evaluate missing features based on error messages

This adds significant overhead to simply building the docs, which makes it harder for less experienced users to contribute docs improvements.

Rather than follow the steps above, I tried to use a nightly pyarrow build to satisfy the docs build requirements. This did not work, because nightly builds are not built with the options needed to build the docs:

{code:java}
(base) todd@pop-os:~/arrow$ pushd docs
make html
popd
~/arrow/docs ~/arrow
sphinx-build -b html -d _build/doctrees -j8 source _build/html
Running Sphinx v5.0.2
WARNING: Invalid configuration value found: 'language = None'. Update your configuration to a valid langauge code. Falling back to 'en' (English).
making output directory... done
[autosummary] generating autosummary for: c_glib/index.rst, cpp/api.rst, cpp/api/array.rst, cpp/api/async.rst, cpp/api/builder.rst, cpp/api/c_abi.rst, cpp/api/compute.rst, cpp/api/cuda.rst, cpp/api/dataset.rst, cpp/api/datatype.rst, ..., python/json.rst, python/memory.rst, python/numpy.rst, python/orc.rst, python/pandas.rst, python/parquet.rst, python/plasma.rst, python/timestamps.rst, r/index.rst, status.rst
WARNING: [autosummary] failed to import pyarrow.compute.CumulativeSumOptions.
Possible hints:
* ModuleNotFoundError: No module named 'pyarrow.compute.CumulativeSumOptions'; 'pyarrow.compute' is not a package
* AttributeError: module 'pyarrow.compute' has no attribute 'CumulativeSumOptions'
* ImportError:
WARNING: [autosummary] failed to import pyarrow.compute.cumulative_sum.
Possible hints:
* ModuleNotFoundError: No module named 'pyarrow.compute.cumulative_sum'; 'pyarrow.compute' is not a package
* ImportError:
* AttributeError: module 'pyarrow.compute' has no attribute 'cumulative_sum'
WARNING: [autosummary] failed to import pyarrow.compute.cumulative_sum_checked.
Possible hints:
* ImportError:
* AttributeError: module 'pyarrow.compute' has no attribute 'cumulative_sum_checked'
* ModuleNotFoundError: No module named 'pyarrow.compute.cumulative_sum_checked'; 'pyarrow.compute' is not a package
WARNING: [autosummary] failed to import pyarrow.dataset.WrittenFile.
Possible hints:
* ModuleNotFoundError: No module named 'pyarrow.dataset.WrittenFile'; 'pyarrow.dataset' is not a package
* ImportError:
* AttributeError: module 'pyarrow.dataset' has no attribute 'WrittenFile'

Extension error (sphinx.ext.autosummary):
Handler <function process_generate_options at 0x7f6f49ebe820> for event 'builder-inited' threw an exception (exception: no module named pyarrow.parquet.encryption)
make: *** [Makefile:81: html] Error 2
~/arrow
{code}

Nightly builds should be made sufficient to build documentation.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
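
Until nightly builds cover the docs requirements, the "evaluate missing features based on error messages" step could perhaps be shortened with a pre-flight check run before `make html`. The following is only a minimal sketch: the helper names (`is_available`, `find_missing`) and the `DOCS_REQUIRED` list are hypothetical, the list contains only the names that failed in the log above, and it is not a complete statement of what the docs build needs (CUDA support, for example, is not covered).

```python
import importlib

# Assumption: this list is copied from the failures in the Sphinx log in this
# report. It is illustrative, not an exhaustive list of docs requirements.
DOCS_REQUIRED = [
    "pyarrow.compute.CumulativeSumOptions",
    "pyarrow.compute.cumulative_sum",
    "pyarrow.compute.cumulative_sum_checked",
    "pyarrow.dataset.WrittenFile",
    "pyarrow.parquet.encryption",
]


def is_available(dotted_name):
    """Return True if dotted_name resolves to a module or to an attribute of one."""
    parts = dotted_name.split(".")
    # Try the longest importable module prefix first, then walk the
    # remaining components as attributes of that module.
    for i in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:i]))
        except ImportError:  # also covers ModuleNotFoundError
            continue
        for attr in parts[i:]:
            if not hasattr(obj, attr):
                return False
            obj = getattr(obj, attr)
        return True
    return False


def find_missing(names):
    """Return the subset of names that cannot be resolved in this environment."""
    return [name for name in names if not is_available(name)]
```

Calling `find_missing(DOCS_REQUIRED)` in the environment used for the docs build returns the names that cannot be resolved; an empty list means only that these particular names are present, not that the docs build will succeed.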