Re: [sympy] Google Season of Docs 2020 Proposal

Aaron Meurer Wed, 29 Jul 2020 15:51:18 -0700

Thanks for the writeup. I agree with you about tooling. For
documentation in particular, overuse of tooling can lead to a
situation where documentation is written more for machines than
humans. Consistency in formatting is important and certainly makes it
easier for humans to read documentation, but this can go too far.
Things that are easy for machines to parse are not necessarily the
same as the things that are easy for humans to read and understand.


With that being said, tooling can be helpful because there are a lot
of rules in our documentation style guide, and the more of them that
we can enforce with tests, the better. Otherwise it becomes a burden
on reviewers to know all the rules, and the documentation ultimately
ends up not following the guide unless we have a technical writer who
constantly reviews all documentation. Of course, the first step is to
actually make it so everything conforms to the style guide. Once that
is done, we can look into how to use tests and tooling to keep it that
way.
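
A rough sketch of what one such test might look like, assuming the
section order Brandon quotes below (Explanation, Examples, Parameters,
See Also, References) and headings underlined with = or -. The names
SECTION_ORDER, section_titles, and check_sections are hypothetical, and
this uses only the standard library rather than numpydoc.validate:

```python
import inspect
import re

# Section order as quoted later in this thread. This is an assumption:
# check it against the current SymPy docstring guide before relying on it.
SECTION_ORDER = ["Explanation", "Examples", "Parameters", "See Also", "References"]

# A heading is an alphabetic title line followed by an underline of '=' or '-'.
_HEADING = re.compile(
    r"^[ \t]*(?P<title>[A-Za-z][A-Za-z ]*?)[ \t]*\n[ \t]*(?:=+|-+)[ \t]*$",
    re.MULTILINE,
)


def section_titles(docstring):
    """Return the section headings of a docstring in order of appearance."""
    text = inspect.cleandoc(docstring or "")
    return [m.group("title") for m in _HEADING.finditer(text)]


def check_sections(docstring):
    """Return a list of problems found; an empty list means none were found."""
    titles = section_titles(docstring)
    problems = []
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    for title in dict.fromkeys(titles):
        if title not in SECTION_ORDER:
            problems.append(f"custom section: {title}")
        if titles.count(title) > 1:
            problems.append(f"repeated section: {title}")
    # Known sections must appear in the same relative order as the guide.
    known = [SECTION_ORDER.index(t) for t in titles if t in SECTION_ORDER]
    if known != sorted(known):
        problems.append("sections out of order")
    return problems
```

Run over every public object in a package (for example by walking it
with pkgutil, as the scikit-learn script mentioned below does), a check
like this could live in the test suite and fail CI when a new docstring
drifts from the guide.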

Aaron Meurer

On Wed, Jul 29, 2020 at 2:26 PM 'Brandon David' via sympy
<sympy@googlegroups.com> wrote:
>
> The numpydoc validator is available at 
> https://github.com/numpy/numpydoc/blob/master/numpydoc/validate.py and can be 
> run with python -m numpydoc --validate
>
> However, it has some limitations out-of-the-box, e.g. it does not offer 
> package-wide validation or any form of .rst parsing. Rather, it accepts the 
> name of a single object and uses importlib to fetch that object's docstring. 
> As a result, most projects maintain their own validation scripts that wrap 
> numpydoc.validate and make repeated calls to it. For example, scikit-learn 
> has a script that enumerates their functions/classes/methods (using pkgutil), 
> filters that list of objects, calls numpydoc.validate on each, ignores 
> certain error codes, and pretty prints the results:
> https://github.com/scikit-learn/scikit-learn/blob/master/maint_tools/test_docstrings.py
>
> And over at pandas, they use numpydoc.validate alongside some custom 
> validation, all rolled into their CI:
> https://github.com/pandas-dev/pandas/blob/master/scripts/validate_docstrings.py
> https://github.com/pandas-dev/pandas/blob/master/scripts/tests/test_validate_docstrings.py
>
> Note that pandas' version parses .rst files directly to enumerate the objects 
> to be validated, as that is how the validation script was written before it 
> migrated from pandas to numpydoc. To clarify, I have not contributed code to 
> numpydoc.validate but I did participate in its migration (on GitHub and via 
> email) and adapted/tested both versions for use with SciPy. One issue that 
> cropped up was that the .rst parsing of the original script assumed 
> autosummary, while many projects (like SciPy) use autodoc. For that and other 
> reasons, .rst parsing was removed entirely from numpydoc.validate, at least 
> for now. I see that SymPy has considered migrating from autodoc to 
> autosummary (#18594), which could make it easy to mimic some of the 
> sophisticated things pandas is doing with their docstring validation, 
> including CI.
>
> But regardless of any .rst parsing, the validation itself is still done through 
> importlib. This dependency makes numpydoc.validate somewhat clunky to use 
> like a linter, as that would require building from source before each 
> validation. I certainly don't want to imply that numpydoc.validate is the 
> perfect tool for all workflows and, in fact, an overenthusiasm for tooling 
> can easily generate technical debt and distract from more valuable work (like 
> actually writing docstrings). My proposed use of numpydoc.validate was just 
> as another tool in the toolbox: a convenient way to populate my task list. For 
> example, here is a quick list of the SymPy objects that have custom sections 
> or don't follow the section order specified in the SymPy docstring guide 
> (i.e. Explanation, Examples, Parameters, See Also, References):
> https://gist.github.com/brandondavid/02868ca74600897d5d61c43c43e2654a
>
> --Brandon
>
>
> On Tuesday, July 28, 2020 at 3:11:10 PM UTC-7, Aaron Meurer wrote:
>>
>> You mentioned in your proposal that you contributed to the numpydoc
>> validator. Can you reference where that work is? Where is the source
>> for the numpydoc validator?
>>
>> Aaron Meurer
>>
>> On Tue, Jul 28, 2020 at 2:23 PM Aaron Meurer <asme...@gmail.com> wrote:
>> >
>> > Thanks for doing this. Our style guide that was developed last year
>> > differs from numpydoc in a few ways, and it also has some additional
>> > things. But being able to automatically validate those things that can
>> > be validated is good.
>> >
>> > Aaron Meurer
>> >
>> > On Tue, Jul 28, 2020 at 2:18 PM 'Brandon David' via sympy
>> > <sy...@googlegroups.com> wrote:
>> > >
>> > > Hello SymPy mentors!
>> > >
>> > > I apologize that this introduction is coming so late; I was unable to 
>> > > take advantage of the "exploration" period and didn't know until just 
>> > > recently that applicants are still allowed to contact mentors even after 
>> > > the application deadline. I have been hoping to speak with someone about 
>> > > my proposal and would love to answer any questions or concerns. Would it 
>> > > be possible to do so before the review period ends?
>> > >
>> > > Since submitting my proposal, I have chipped away at its tasks and 
>> > > thought it might be helpful to share some of the early results. In 
>> > > preparation for converting all docstrings to the numpydoc format, I ran 
>> > > the entire SymPy library through the numpydoc validator. It took quite a 
>> > > bit of monkey-patching to have it not choke on SymPy's docstrings, so 
>> > > the nearly 17,000 errors it flagged are just a rough measure of the 
>> > > situation:
>> > > https://gist.github.com/brandondavid/b008667314397b24bbf4f2e6f1fc70ca
>> > >
>> > > I also noticed a few issues that the numpydoc validator didn't, such as 
>> > > malformed "See Also" sections and docstrings with repeated section names 
>> > > (e.g. crypto.encipher_hill has two "Notes" sections). Still, I think the 
>> > > above list is certainly enough to get started and I would be eager to do 
>> > > so -- or if you have already settled on a different technical writer, I 
>> > > would be happy to supply them with the work I've already done.
>> > >
>> > > Many thanks for your time and I look forward to speaking further!
>> > >
>> > > Cheers,
>> > > Brandon
>> > >
>> > > P.S. If any mentor needs a copy of my proposal, please let me know. I 
>> > > can also be reached at brando...@zoho.com
>> > >
>> > > --
>> > > You received this message because you are subscribed to the Google 
>> > > Groups "sympy" group.
>> > > To unsubscribe from this group and stop receiving emails from it, send 
>> > > an email to sy...@googlegroups.com.
>> > > To view this discussion on the web visit 
>> > > https://groups.google.com/d/msgid/sympy/21d0454c-e1af-4f84-984b-22f0ce4eb29eo%40googlegroups.com.

