jorisvandenbossche commented on pull request #11726: URL: https://github.com/apache/arrow/pull/11726#issuecomment-1010092480
> 1. We should avoid the form `list of int` because `of` is nowhere recognised for containers, containers are dealt with `list[int]` form. Se should probably migrate all cases to that form because it solves some misunderstandings from numpydoc Personally I am not in favour of using type annotation syntax in the docstrings (if we want type annotations, we can actually use type annoations in the signature). My stance is that docstrings should be meant as human readable text, and I think only a minority of python users is familiar with typing syntax. (I know `list[int]` is of course readable, and probably relatively easy to understand even for someone that doesn't know type annotation syntax, but you quickly get into more complicated stuff with unions etc if you want to do this consistently) If the "of" is the problem here, having that in the `numpydoc_xref_ignore` list (as you already did) solves it as well. > 2\. numpydoc is somewhat buggy, `or` is only detected as a literal in some cases, in some other cases is parsed as a reference and becomes `` obj:`or` ``. Yes, I noticed that as well. I started looking into why, but didn't look further as just including in the ignore list also fixes it. > 3\. Some words it's correct that they are identified as generic references (`function`, `Mapping`, `iterator`) as they are in practice interfaces/protocols, so they might not have a reference in the Python docs but we want to identify them as code anyway. I don't think for example "function" should be identified/formatted as code? While it's indeed a generic python term, it's not an actual Python code keyword or builtin or .. (i.e. "if you type it in a console, you get an error"), so I think we should see that as prose text? > For `bool`, `True`, `False`, `object` I think it would be wrong to place them in ignore as they are currently correctly detected and linked to the Python documentation. The reason that I included `None`, `True` and `False` in the ignore list in my comment above, is because we very often have things like "param : type, default None" or "..., default True". I don't see much added value in each time linking the None/True/False to the python docs, while this linking does add visual "noise" (by changing a prose text "default None" into something with 2 colors because one word is a link and the other not). (now, there is a bug in numpydoc that listing those 3 values in the ignore list doesn't have any effect, because the "ignore" list has no priority over some default values they link (which you can supplement with `numpydoc_xref_aliases`), so this discussion item is a bit moot for now) > [about "use_threads : bool, default True"] where the whole parameter is misinterpreted for unknown reasons. That's because there is a missing space in our code: https://github.com/apache/arrow/blob/ccffcea3fd383c448aa9da292baf2d0805ecab4d/python/pyarrow/array.pxi#L714-L716 numpydoc is sensitive to the delimiter actually being `" : "` with space before/after the colon. Normally the validation should catch that (https://github.com/apache/arrow/pull/7732), but I suppose this rule is not yet activated in the checks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
