jorisvandenbossche commented on pull request #11726:
URL: https://github.com/apache/arrow/pull/11726#issuecomment-1010092480


   > 1. We should avoid the form `list of int` because `of` is nowhere 
recognised for containers, containers are dealt with `list[int]` form. Se 
should probably migrate all cases to that form because it solves some 
misunderstandings from numpydoc
   
   Personally I am not in favour of using type annotation syntax in the 
docstrings (if we want type annotations, we can actually use type annoations in 
the signature). My stance is that docstrings should be meant as human readable 
text, and I think only a minority of python users is familiar with typing 
syntax. 
   (I know `list[int]` is of course readable, and probably relatively easy to 
understand even for someone that doesn't know type annotation syntax, but you 
quickly get into more complicated stuff with unions etc if you want to do this 
consistently)
   
   If the "of" is the problem here, having that in the `numpydoc_xref_ignore` 
list (as you already did) solves it as well.
   
   > 2\. numpydoc is somewhat buggy, `or` is only detected as a literal in some 
cases, in some other cases is parsed as a reference and becomes `` obj:`or`  ``.
   
   Yes, I noticed that as well. I started looking into why, but didn't look 
further as just including in the ignore list also fixes it.
   
   > 3\. Some words it's correct that they are identified as generic references 
(`function`, `Mapping`, `iterator`) as they are in practice 
interfaces/protocols, so they might not have a reference in the Python docs but 
we want to identify them as code anyway.
   
   I don't think for example "function" should be identified/formatted as code? 
While it's indeed a generic python term, it's not an actual Python code keyword 
or builtin or .. (i.e. "if you type it in a console, you get an error"), so I 
think we should see that as prose text? 
   
   > For `bool`, `True`, `False`, `object` I think it would be wrong to place 
them in ignore as they are currently correctly detected and linked to the 
Python documentation.
   
   The reason that I included `None`, `True` and `False` in the ignore list in 
my comment above, is because we very often have things like "param : type, 
default None" or "..., default True". I don't see much added value in each time 
linking the None/True/False to the python docs, while this linking does add 
visual "noise" (by changing a prose text "default None" into something with 2 
colors because one word is a link and the other not). 
   
   (now, there is a bug in numpydoc that listing those 3 values in the ignore 
list doesn't have any effect, because the "ignore" list has no priority over 
some default values they link (which you can supplement with 
`numpydoc_xref_aliases`), so this discussion item is a bit moot for now)
   
   > [about "use_threads : bool, default True"] where the whole parameter is 
misinterpreted for unknown reasons.
   
   That's because there is a missing space in our code:
   
   
https://github.com/apache/arrow/blob/ccffcea3fd383c448aa9da292baf2d0805ecab4d/python/pyarrow/array.pxi#L714-L716
   
   numpydoc is sensitive to the delimiter actually being `" : "` with space 
before/after the colon. Normally the validation should catch that 
(https://github.com/apache/arrow/pull/7732), but I suppose this rule is not yet 
activated in the checks.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to