Xah Lee wrote:
> Python Doc Problem Example: gzip
>
> Xah Lee, 20050831
>
> Today i need to use Python to compress/decompress gzip files. Since
> i've read the official Python tutorial 8 months ago, have spent 30
> minutes with Python 3 times a week since, have 14 years of computing
> experience, 8 years in mathematical computing and 4 years in unix admin
> and perl, i have quickly found the official doc:
> http://python.org/doc/2.4.1/lib/module-gzip.html
>
> I'd imagine it being a function something like:
>
> fileContent = GzipFile(filePath, compress/decompress)
>
> However, after scanning the doc for 20 seconds, there's not a single
> example showing how it is used.
>
> Instead, the doc starts with some arcane info about compatibility with
> some other compression module and other software. Then it talks in a
> very haphazard way with confused writing about the main function
> GzipFile. No perspective whatsoever about using it to solve a problem
> nor a concrete description of how to use it. Instead, jargon such as
> Class, Constructor, Object etc is thrown together with presumption of
> the reader's expertise in IO programming in Python and gzip compression
> arcana.
>
> Having understood nothing, and not being a Python expert, i wanted to
> read about file objects but there's no link.
>
> I then located the file object's doc page:
> http://python.org/doc/2.4.1/lib/bltin-file-objects.html, but it too is
> written and organized in a very unhelpful way.
>
> Here's the detail of the problems of its documentation. It starts with:
>
> «The data compression provided by the zlib module is compatible
> with that used by the GNU compression program gzip. Accordingly, the
> gzip module provides the GzipFile class to read and write gzip-format
> files, automatically compressing or decompressing the data so it looks
> like an ordinary file object.
> Note that additional file formats which
> can be decompressed by the gzip and gunzip programs, such as those
> produced by compress and pack, are not supported by this module.»
>
> This intro paragraph is about 3 things: (1) the purpose of this gzip
> module. (2) its relation with the zlib module. (3) A gratuitous piece
> of arcana: the gzip program supports “compress and pack” formats that
> Python's gzip module does not. It needs mentioning only because of
> how the writing in this paragraph is phrased. The writing itself is a
> jumble.
>
> Of the people using the gzip module, the vast majority really just need
> to decompress a gzip file. They don't need to know (2) and (3) in a
> preamble. The worst aspect here is the jumbled writing.
>
> «class GzipFile( [filename[, mode[, compresslevel[, fileobj]]]])
> Constructor for the GzipFile class, which simulates most of the methods
> of a file object, with the exception of the readinto() and truncate()
> methods. At least one of fileobj and filename must be given a
> non-trivial value. The new class instance is based on fileobj, which
> can be a regular file, a StringIO object, or any other object which
> simulates a file. It defaults to None, in which case filename is opened
> to provide a file object.»
>
> This paragraph assumes that readers are thoroughly familiar with
> Python's File Objects and their methods. The writing is haphazard and
> extremely confusing. Instead of explicitness and clarity, it tries to
> convey its meanings by side effects.
>
> • The word “simulates” is used twice inanely. The sentence
> “...GzipFile class, which simulates...” is better said by
> “GzipFile is modeled after Python's File Objects class.”
>
> • The intention to state that it has all of Python's File Object
> methods except two of them is ambiguously phrased. It is as if to say
> all methods exist, except that two of them work differently.
>
> • The use of the phrase “non-trivial value” is inane. What does a
> non-trivial value mean here?
> Does “non-trivial value” have a specific
> meaning in Python? Or, is it meant with generic English interpretation?
> If the latter, then what does it mean to say: “At least one of
> fileobj and filename must be given a non-trivial value”? Does it
> simply mean one of these parameters must be given?
>
> • The rest of the paragraph is just incomprehensible.
>
> «When fileobj is not None, the filename argument is only used to
> be included in the gzip file header, which may include the original
> filename of the uncompressed file. It defaults to the filename of
> fileobj, if discernible; otherwise, it defaults to the empty string,
> and in this case the original filename is not included in the header.»
>
> “discernible”? This writing is very confused, and it assumes the
> reader is familiar with the technical specification of Gzip.
>
> «The mode argument can be any of 'r', 'rb', 'a', 'ab', 'w', or
> 'wb', depending on whether the file will be read or written. The
> default is the mode of fileobj if discernible; otherwise, the default
> is 'rb'. If not given, the 'b' flag will be added to the mode to ensure
> the file is opened in binary mode for cross-platform portability.»
>
> “discernible”? Again, familiarity with the workings of Python's file
> object is implicitly assumed. For people who do not have expertise in
> working with files using Python, it necessitates reading Python's file
> objects documentation.
>
> «The compresslevel argument is an integer from 1 to 9 controlling
> the level of compression; 1 is fastest and produces the least
> compression, and 9 is slowest and produces the most compression. The
> default is 9.»
>
> «Calling a GzipFile object's close() method does not close
> fileobj, since you might wish to append more material after the
> compressed data. This also allows you to pass a StringIO object opened
> for writing as fileobj, and retrieve the resulting memory buffer using
> the StringIO object's getvalue() method.»
>
> huh?
> append more material? pass a StringIO? and memory buffer?
>
> Here, expertise in programming with IO is assumed of the reader.
> Meanwhile, the writing is not clear about exactly what it is trying
> to say about the close() method.
>
> Suggestions
> --------------------------
> Quality documentation should be clear, succinct, and precise. And, the
> less expertise it assumes of the reader to obtain these qualities, the
> better it is.
>
> The vast majority of programmers using this module really just want to
> compress or decompress a file. They do not need to know any more
> details about the technicalities of this module nor about the Gzip
> compression specification. Here's what Python documentation writers
> should do to improve it:
>
> • Rewrite the intro paragraph. Example: “This module provides a
> simple interface to compress and decompress files using the GNU
> compression format gzip. For detailed working with gzip format, use the
> zlib module.”. The “zlib module” phrase should be linked to its
> documentation.
>
> • Near the top of the documentation, add an example of usage. An
> example is worth a thousand words:
>
>   # decompressing a file
>   import gzip
>   fileObj = gzip.GzipFile("/Users/joe/war_and_peace.txt.gz", 'rb')
>   fileContent = fileObj.read()
>   fileObj.close()
>
>   # compressing a file
>   import gzip
>   fileObj = gzip.GzipFile("/Users/mary/hamlet.txt.gz", 'wb')
>   fileObj.write(fileContent)
>   fileObj.close()
>
> • Add at the beginning of the documentation an explicit statement
> that GzipFile() is modeled after Python's File Objects, and provide a
> link to it.
>
> • Rephrase the writing so as to not assume that the reader is
> thoroughly familiar with Python's IO. For example, when speaking of the
> modes 'r', 'rb', ... add a brief statement on what they mean. This way,
> readers may not have to take an extra step to read the page on File
> Objects.
>
> • Move arcane technical details about gzip compression to the
> bottom as footnotes.
>
> • General advice on the writing: The goal of writing on this module
> is to document its behavior, and effectively indicate how to use it.
> Keep this in mind when writing the documentation. Make clear what
> you are trying to say in each itemized paragraph. Make it precise, but
> without overdoing it. Assume your readers are familiar with the Python
> language and with gzip compression. For example, what classes and
> objects are in Python, and what compression is, compression levels,
> the file name suffix convention. However, do not assume that the
> readers are experts in Python IO, or the gzip specification, or
> compression technology and software in the industry. If exact technical
> details or warnings are necessary, move them to footnotes.
> ---------------
>
> Xah
> [EMAIL PROTECTED]
> ∑ http://xahlee.org/
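
For what it's worth, the close()/StringIO paragraph quoted above is
easier to follow with a small sketch. This is only an illustration, not
text from the official docs, and it assumes a current Python 3, where
io.BytesIO plays the role that StringIO plays in the 2.4 documentation.
The payload and the "hamlet.txt" header name are made up for the example.

```python
import gzip
import io

# Illustrative data only; any bytes would do.
payload = b"To be, or not to be, that is the question.\n" * 100

# Write gzip data into an in-memory buffer instead of a file on disk.
# The filename is used only for the gzip header because fileobj is given.
buffer = io.BytesIO()
writer = gzip.GzipFile(filename="hamlet.txt", mode="wb",
                       compresslevel=9, fileobj=buffer)
writer.write(payload)
writer.close()  # finishes the gzip stream; it does not close `buffer`

buffer_still_open = not buffer.closed
compressed = buffer.getvalue()  # the complete gzip stream as bytes

# Read the data back through a second GzipFile wrapped around those bytes.
reader = gzip.GzipFile(fileobj=io.BytesIO(compressed), mode="rb")
restored = reader.read()
reader.close()

print(buffer_still_open, len(compressed) < len(payload), restored == payload)
```

The point of the design seems to be that GzipFile only wraps the
underlying object: closing the wrapper writes the gzip trailer and leaves
the buffer usable, which is why getvalue() still works afterwards.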
"""
You want to create the world before which you can kneel: this is your
ultimate hope and intoxication.

Also sprach Zarathustra.
"""
--Friedrich Nietzsche, Thus Spoke Zarathustra, 1885

Gerard