[ 
https://issues.apache.org/jira/browse/ARROW-15321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475262#comment-17475262
 ] 

Joris Van den Bossche commented on ARROW-15321:
-----------------------------------------------

More specifically, I think it doesn't work for methods of Cython classes, 
because they are method descriptors that don't define a {{__module__}} 
attribute, which is used here: 
https://github.com/apache/arrow/blob/ab86daf3f7c8a67bee6a175a749575fd40417d27/dev/archery/archery/lang/python.py#L142-L144


Example: {{Table.to_pandas}} is implemented in Cython, but {{ORCFile.read}} is 
a method of a pure-Python class:

{code}
In [14]: pa.Table.to_pandas.__module__
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-14-fd0a68c749ef> in <module>
----> 1 pa.Table.to_pandas.__module__

AttributeError: 'method_descriptor' object has no attribute '__module__'

In [17]: from pyarrow import orc

In [18]: orc.ORCFile.read.__module__
Out[18]: 'pyarrow.orc'
{code}
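
The difference shown above can be reproduced without pyarrow. The following is only a minimal sketch of one possible fallback, not necessarily how archery handles or should handle it: when an object has no {{__module__}}, look at the class that owns it via {{__objclass__}}. The helper name {{owning_module}} and the {{PurePython}} class are hypothetical, and {{str.upper}} stands in for a Cython method because it is also a {{method_descriptor}}.

```python
def owning_module(obj):
    """Return the module name for obj, or None if it cannot be determined.

    Hypothetical helper: falls back to the owning class of a method
    descriptor when the object itself has no __module__ attribute.
    """
    module = getattr(obj, "__module__", None)
    if module is None:
        # Method descriptors (C or Cython methods) expose their class
        # through __objclass__, and the class does have __module__.
        owner = getattr(obj, "__objclass__", None)
        module = getattr(owner, "__module__", None)
    return module


class PurePython:
    """Stand-in for a class written in plain Python."""

    def read(self):
        return None


# A built-in method descriptor has no __module__ of its own ...
print(hasattr(str.upper, "__module__"))
# ... but the fallback resolves it through the owning class.
print(owning_module(str.upper))
# Plain Python functions carry __module__ directly.
print(owning_module(PurePython.read) == PurePython.__module__)
```

Whether such a fallback fits the filtering logic in archery would need checking against the linked code.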

> [Dev][Archery] numpydoc validation doesn't check all class methods
> ------------------------------------------------------------------
>
>                 Key: ARROW-15321
>                 URL: https://issues.apache.org/jira/browse/ARROW-15321
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Developer Tools, Python
>            Reporter: Joris Van den Bossche
>            Assignee: Alessandro Molina
>            Priority: Major
>             Fix For: 8.0.0
>
>
> From discussion at 
> https://github.com/apache/arrow/pull/12076#discussion_r783810077
> It seems that by default, it doesn't loop over all _methods_ of classes, but 
> only module-level objects?
> For example, I notice that explicitly asking for {{pyarrow.Table.to_pandas}} 
> catches some issues:
> {code}
> $ archery numpydoc pyarrow.Table.to_pandas --allow-rule PR10
> INFO:archery:Running Python docstring linters
> PR10: Parameter "categories" requires a space before the colon separating the 
> parameter name and type
> PR10: Parameter "use_threads" requires a space before the colon separating 
> the parameter name and type
> {code}
> But with the default (check all of pyarrow) with {{archery numpydoc 
> --allow-rule PR10}} it doesn't list those errors.
> cc [~kszucs] [~amol-]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
