On 02/26/2013 05:54 AM, Steven D'Aprano wrote:
> One week ago, "JoePie91" wrote a blog post challenging the Python
> community and the state of Python documentation, titled:
>
> "The Python documentation is bad, and you should feel bad".
>
> http://joepie91.wordpress.com/2013/02/19/the-python-documentation-is-bad-and-you-should-feel-bad/
>
> It is valuable to contrast and compare the PHP and Python docs:
tl;dr? tb

I haven't used PHP or its documentation so I can't compare it to
Python's. I have used Python's documentation and can say I agree with
many of the criticisms made by JoePie91.

One of the problems with "fixing" the Python reference docs (by which I
mean primarily the Language and Library References) is that there is no
common agreement about what a "good" reference should be. In the Python
development community that controls the overall structure and contents
of the Python documentation, there seems to be a strong minimalist
streak. It often seems like the documentation is the product of a
contest to find the minimum number of words to describe something and
still be able to defend it as correct.

Any documentation must be written with a target audience in mind and
IMO the audience for the Python reference docs should be programmers
familiar with one or two procedural or OO languages at an intermediate
level. (Obviously different sections of documentation can modify this.
Later documentation will assume knowledge of basic concepts like Python
objects, argument passing and assignment semantics and so forth that
were presented earlier, and documentation for specialized problem
domain modules, e.g. an SMTP module, would assume some knowledge of
email, SMTP and networking.)

As JoePie91 pointed out, reference material should describe its subject
matter completely and accurately. Once documentation has achieved that
minimum bar of viability, its quality is determined by how effectively
it transfers that information to the reader.

I distinguish reference from tutorial material in that the former is
optimized for looking up information and presenting it concisely, the
latter for presenting (quite possibly the same) information in a linear
fashion with no forward references and presenting it verbosely and
experientially.

I distinguish a language reference from a language standard in that the
audience for the latter is language implementors rather than users.

I would describe a reference document as a big cheat-sheet for those
already competent with Python.

A frequent failing of the Python docs is just plain poor writing. When
explaining something, start with a description of what the something
is, does, etc., in a form understandable by the target audience.

Is there anyone who can understand what the very useful
collections.defaultdict does without multiple rereadings? According to
its docs, it "returns a new dictionary-like object." That is
underspecified -- many things return dictionary-like objects. It
continues "it overrides one method and adds one writable instance
variable." OK, but WTF does it *do*?! It then goes on to describe its
use, which one has to understand without an overarching context, and
then reason backwards to eventually figure out that it is a dict that
provides for user-specified behavior when accessed with a key that
doesn't exist. [*1]

Important quality enablers are good tables of contents, indexes,
glossaries, cross references and examples.

Examples should be used to illustrate a textual description and never
used as a substitute for textual descriptions.

Cross references are particularly important in tying together related
material that is found in disparate doc locations. For example,
information on Python's "+" operator is found in:

  Lang: 2.5. Operators
  Lang: 3.3.7. Emulating numeric types
  Lang: 6.5. Unary arithmetic and bitwise operations
  Lang: 6.6. Binary arithmetic operations
  Lang: 6.15. Summary (mislabeled, actually operator precedence)
  Lib: 4.4. Numeric Types
  Lib: 4.6.1. Common Sequence Operations
  Lib: 10.3. operator

and probably other places I did not think to look. The index is not
much help in tying any of these together:

  "add" -> Lib: 2.5
  "+" -> Lib: 4.4
  "plus" -> Lang: 6.5

There are also more obscure uses that should be findable such as in
float hex strings (4.4.3.
Additional Methods on Float).

Cross references to similar information can help cover for failings in
the index -- if you can find some similar function or concept, there is
(or should be) a good chance of a cross-reference to what you really
wanted.

Good documentation will anticipate the questions a reader will have and
answer them.

----
Rebuttals to common responses to criticism of Python docs:

Python docs are already good
  * Criticisms of Python's docs pop up on the Python mailing list and
    blogs with regularity.
  * Many people confuse "usable", "I've learned to use despite", "look
    impressive", etc. with "good".

Google / blogs / stackoverflow / reddit, etc. can provide better
  * Even were it true, it is an argument that Python doesn't need good
    documentation, rather than an argument that Python's docs are good.
  * They don't provide answers for infrequent questions.
  * Answers can be conflicting, wrong, or out of date, with no way to
    correct them.
  * Even today, not everyone has access to the internet all the time.

Try it in an interactive Python session
  * This is useful practical advice, but experiments do not substitute
    for documentation because they tell you only what Python version
    3.3 on Red Hat Linux 4.2 does on a machine with 2GB of memory 3
    days after the full moon. Documentation is the ultimate authority
    for what it is *supposed* to do.

Read the source code
  * Oh please! The purpose of documentation is to alleviate the need to
    read source code.
  * Those most in need of documentation are those without the Python
    knowledge to read the source code.
  * Some source code is very complex and difficult to understand even
    for experts.
  * The behavior of source code is often obscured by details not
    directly related to the info being looked for: error handling,
    options for alternate behavior, performance optimizations, etc.

Don't complain, submit doc fixes.
  * The people with motivation to fix the docs are often not qualified
    to, and the people qualified to have no motivation because they
    already know it. (They may not even recognize there is a problem.)
  * There is a group of core developers who define (by accepting or
    rejecting patches) the nature of the changes that can be made. If
    the view of this group favors changes that continue the status quo,
    significant improvements via this route are not possible.
  * Small fixes can require orders of magnitude more effort to submit
    and defend than the fix took to write.

Tutorials are the place to explain basics.
  * Tutorials are great for some people but not everyone.
  * They are not optimized for looking up and answering specific
    questions.
  * Their linear style builds on preceding info, requiring start-to-end
    reading.
  * Since finding info in them is harder, there is an expectation the
    reader will permanently commit the information to memory as
    encountered. The best learning style for many is to memorize the
    most frequently needed info by looking things up as needed.
  * They often introduce programming or general programming language
    concepts already known to the reader from prior experience.
  * They are often bloated with exercises/examples that are not needed
    by readers with a higher level of experience.
  * They require an unreasonable time/effort commitment for those
    without a preexisting commitment to using Python.
  * They are an alternate format of, not a replacement for, information
    that should be in reference manuals.

The high standards demanded are impossible
  * There are other reference manuals that do achieve a high standard
    so it is not impossible, for example Beazley's Python Essential
    Reference [*2]. There are also examples for other languages.
  * But it may be impractical for the Python community to achieve such
    results due to various Python intra-community factors.

Python docs are excellent compared to most free software docs
  * The "most free software docs" bar is too low to be a good metric.
    Most such docs vary between "sucks" and "non-existent". Please
    compare Python docs to the best available docs (which is why
    comparison to commercial books like Beazley's Essential Reference
    is valid.)

----
[*1] I am not an advanced Python user nor a good technical writer so my
defaultdict description may well be poor. That does not mean that a
better description than currently exists can't or shouldn't be
provided.

[*2] I am not holding up Beazley's book as a gold standard; it has a
number of its own problems. But it does provide an example of reference
material with better organization and clarity than the python.org docs.

--
http://mail.python.org/mailman/listinfo/python-list
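
To illustrate the defaultdict description in [*1] ("a dict that
provides for user-specified behavior when accessed with a key that
doesn't exist"), here is a minimal sketch. The variable names and sample
data are made up for illustration; only collections.defaultdict and its
default_factory attribute come from the standard library. It is meant as
an example accompanying a textual description, not a replacement for
one.

```python
from collections import defaultdict

# A plain dict raises KeyError when a missing key is read.
plain = {}
missing_key_raised = False
try:
    plain["spam"] += 1
except KeyError:
    missing_key_raised = True

# A defaultdict is given a factory at construction time. When a missing
# key is looked up with [], it calls the factory with no arguments,
# stores the result under that key, and returns it. int() returns 0.
counts = defaultdict(int)
for word in ["spam", "eggs", "spam"]:
    counts[word] += 1

# With list as the factory, each new key starts with its own empty list.
groups = defaultdict(list)
for name, kind in [("ham", "meat"), ("kale", "veg"), ("beef", "meat")]:
    groups[kind].append(name)

# The factory is the "one writable instance variable" the docs mention.
# Setting it to None brings back the ordinary KeyError behavior.
counts.default_factory = None
```

After this runs, counts maps "spam" to 2 and "eggs" to 1, groups["meat"]
is ["ham", "beef"], and looking up an absent key in counts raises
KeyError again because the factory was cleared.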