[
https://issues.apache.org/jira/browse/ARROW-12526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533471#comment-17533471
]
Kevin Crouse edited comment on ARROW-12526 at 5/8/22 2:18 PM:
--------------------------------------------------------------
I was pointed to this issue because I was similarly interested in allowing
Pythonic documentation to be written for functions whose docs are
autogenerated from the Arrow C++ details. As a result, I've written code that
does both things (pre-generating the members and supporting hand-written docs)
for the `pyarrow.compute` module:
[https://github.com/krcrouse/arrow/compare/master...krcrouse:generate-pyarrow-compute-and-improve-docs]
Major points:
* Creates a `python/docs/additions` tree that holds the reStructuredText docs
containing the sections to overwrite. These are raw reST so that code block
examples can be tested using doctest; see the README for more details.
* `pyarrow.docutils` (or maybe it should be `_docutils`) provides functions to
process `python/docs/additions` and return a data structure of the components
per function.
* `python/scripts/generate_sources.py` uses `pyarrow.docutils` and writes out
the code for the compute functions in `pyarrow/generated/compute.py`. All of
the logic from the release-branch `pyarrow.compute` module that dynamically
generated the compute functions has been moved to this script.
** I didn't check the generated file into the repo because I generally do not
think files that are produced by the build process should be in source
control, but I realize there are other perspectives on this.
* `pyarrow.compute` now imports from `pyarrow.generated.compute` for all of
the autogenerated compute bindings. Override and custom functions are still
defined here.
* The old `pyarrow._compute_docstrings` is gone because its purpose is
subsumed in the above.
* I've updated the tests so that they work with the above changes.
Below I've attached the `pyarrow/generated/compute.py` file that is currently
created by `generate_sources.py` as of commit
[13c2b0e|https://github.com/krcrouse/arrow/commit/13c2b0e14fbfbc483bec559e610c2c222ae7d367].
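As a rough illustration of the flow described in the list above (not the code from the branch), the sketch below merges per-function metadata with hand-written doc sections and writes out plain `def` statements as source text. `REGISTRY`, `ADDITIONS`, `render_function` and `generate_module` are hypothetical names, and the dictionaries are stand-ins for what the Arrow function registry and the `python/docs/additions` tree would provide. The `call_function` import in the generated header is likewise only an assumption about what a generated wrapper might delegate to.

```python
import textwrap

# Hypothetical stand-in for metadata exposed by the Arrow C++ function registry.
REGISTRY = {
    "equal": {"args": ["x", "y"], "summary": "Compare values for equality (x == y)."},
    "negate": {"args": ["x"], "summary": "Negate the argument element-wise."},
}

# Hypothetical stand-in for sections parsed from python/docs/additions.
ADDITIONS = {
    "equal": {"Examples": "(doctest-checked examples would go here)"},
}


def render_function(name, meta, extra_sections):
    """Return the source text of one statically defined wrapper."""
    doc_lines = [meta["summary"]]
    for title, body in extra_sections.items():
        # Append each hand-written section under a numpydoc-style heading.
        doc_lines += ["", title, "-" * len(title), body]
    doc = textwrap.indent("\n".join(doc_lines), "    ")
    args = ", ".join(meta["args"])
    return (
        f"def {name}({args}):\n"
        f'    """\n{doc}\n    """\n'
        f"    return call_function({name!r}, [{args}])\n"
    )


def generate_module(registry, additions):
    """Return the full text of a generated module, one wrapper per function."""
    header = (
        "# Generated file; do not edit by hand.\n"
        "from pyarrow._compute import call_function  # assumed delegate\n"
    )
    parts = [
        render_function(name, registry[name], additions.get(name, {}))
        for name in sorted(registry)
    ]
    return header + "\n\n" + "\n\n".join(parts)


if __name__ == "__main__":
    print(generate_module(REGISTRY, ADDITIONS))
```

Because the output is ordinary source text, it can be written to a file during the build, and linters and IDEs can then read the definitions and docstrings without importing anything.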
> [Python] Pre-generate pyarrow.compute members
> ----------------------------------------------
>
> Key: ARROW-12526
> URL: https://issues.apache.org/jira/browse/ARROW-12526
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Affects Versions: 4.0.0
> Reporter: Adam Lippai
> Priority: Minor
> Fix For: 9.0.0
>
> Attachments: compute.py
>
>
> Static analysis tools (e.g. pylint) don't recognize simple members like
> pyarrow.compute.equal; they report it as _missing_. Generating this file
> (well, a file imported by this file, I assume)
> [https://github.com/apache/arrow/blob/master/python/pyarrow/compute.py]
> instead of runtime wrapping of the functions would improve the developer
> experience.
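
The problem described in the quoted issue can be reproduced in miniature with only the standard library. The sketch below is a simplification and not pylint's actual analysis: it compares the function names visible from the syntax tree alone with the names that exist after the code runs. The two source strings and both helper functions are hypothetical and written only for this illustration.

```python
import ast

# Module that injects its public functions at import time (runtime wrapping).
DYNAMIC_SOURCE = '''
def _make_wrapper(name):
    def wrapper(*args):
        return (name, args)
    wrapper.__name__ = name
    return wrapper

for _name in ("equal", "less"):
    globals()[_name] = _make_wrapper(_name)
'''

# Equivalent module with the functions written out, as a generator would emit.
STATIC_SOURCE = '''
def equal(x, y):
    return ("equal", (x, y))

def less(x, y):
    return ("less", (x, y))
'''


def statically_visible_names(source):
    """Top-level function names found without executing the code."""
    return {
        node.name
        for node in ast.parse(source).body
        if isinstance(node, ast.FunctionDef)
    }


def runtime_names(source):
    """Public callables that exist after the code has actually run."""
    namespace = {}
    exec(source, namespace)
    return {
        key
        for key, value in namespace.items()
        if callable(value) and not key.startswith("_")
    }


if __name__ == "__main__":
    print("dynamic, static view: ", statically_visible_names(DYNAMIC_SOURCE))
    print("dynamic, runtime view:", runtime_names(DYNAMIC_SOURCE))
    print("static,  static view: ", statically_visible_names(STATIC_SOURCE))
```

In the dynamic variant, `equal` and `less` exist only after execution, which is why a tool that does not run the module reports them as missing. In the pre-generated variant the same names are visible in the source itself.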
--
This message was sent by Atlassian Jira
(v8.20.7#820007)