Re: New Python regex Doc (was: Python documentation moronicities)
Xah Lee wrote: > Let me expose one another fucking incompetent part of your writing capablities? If you really had a point, there wouldn't be any need of swearing... -- John MexIT: http://johnbokma.com/mexit/ personal page: http://johnbokma.com/ Experienced programmer available: http://castleamber.com/ Happy Customers: http://castleamber.com/testimonials.html -- http://mail.python.org/mailman/listinfo/python-list
Re: New Python regex Doc (was: Python documentation moronicities)
On Saturday 07 May 2005 04:28 pm, Xah Lee wrote: > Note: “In other words, the "|" operator is never greedy.” > > Note the need to inject the high-brow jargon “greedy” here as a > latch on sentence. The first definition of "jargon" in the Collaborative International Dictionary of English is: "To utter jargon; to emit confused or unintelligible sounds; to talk unintelligibly, or in a harsh and noisy manner." Despite your misuse of the word "jargon", jargon seems to be an area in which you are carving yourself a niche. The term "greedy" has a particular meaning in regex, as does the word "algorithm" in computer science. Take a look at Mastering Regular Expressions for an exhaustive discussion of the meaning of "greedy" as it applies to regular expressions. Of course I anticipate that you will confusedly and unintelligibly bash this book, even though it is quite obvious that you have yet to read or understand it. -- James Stroud UCLA-DOE Institute for Genomics and Proteomics Box 951570 Los Angeles, CA 90095 http://www.jamesstroud.com/
Re: New Python regex Doc (was: Python documentation moronicities)
Let me expose one another fucking incompetent part of Python doc, in illustration of the Info Tech industry's masturbation and ignorant nature. The official Python doc on regex syntax ( http://python.org/doc/2.4/lib/re-syntax.html ) says: --begin quote-- "|" A|B, where A and B can be arbitrary REs, creates a regular expression that will match either A or B. An arbitrary number of REs can be separated by the "|" in this way. This can be used inside groups (see below) as well. As the target string is scanned, REs separated by "|" are tried from left to right. When one pattern completely matches, that branch is accepted. This means that once A matches, B will not be tested further, even if it would produce a longer overall match. In other words, the "|" operator is never greedy. To match a literal "|", use \|, or enclose it inside a character class, as in [|]. --end quote-- Note: “In other words, the "|" operator is never greedy.” Note the need to inject the high-brow jargon “greedy” here as a latch on sentence. “never greedy”? What is greedy anyway? “Greedy”, when used in the context of computing, describes a certain characteristic of algorithms. An algorithm for a minimizing/maximizing problem is greedy when, whenever it faces a choice, it simply chooses the shortest path, without considering whether that choice actually results in an optimal solution. The rub is that such a strategy will not obtain an optimal result in most problems. If you go from New York to San Francisco and always choose the road most directly facing your destination, you'll never get there. For an algorithm to be greedy, it is implied that it faces choices. In the case of alternatives in regex "regex1|regex2|regex3", there is really no selection involved, only the following of a given sequence. What the writer was thinking when he latched on about greediness is that the result may not be from the pattern that matches the longest substring, therefore it is not “greedy”. It's not greedy Python docer's ass.
Such blind jargon throwing, as found everywhere in tech docs, is a significant reason why the computing industry is filled with shams the likes of unix, Perl, Programming Patterns, eXtreme Programming, “Universal Modeling Language”, fucking shits. A better written doc for the complete regex module is at: http://xahlee.org/perl-python/python_re-write/lib/module-re.html See also: Responsible Software Licensing http://xahlee.org/UnixResource_dir/writ/responsible_license.html Xah [EMAIL PROTECTED] http://xahlee.org/
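The behaviour described in the quoted "|" passage can be checked directly. What follows is only a small sketch for a current Python 3 interpreter, not part of the original post; the sample patterns and variable names are made up for illustration.

```python
import re

# Alternatives are tried left to right, and the first one that matches is
# accepted, even if a later alternative would match more of the string.
short_first = re.match(r"ab|abcd", "abcd").group()
long_first = re.match(r"abcd|ab", "abcd").group()

# For comparison, quantifiers are greedy by default and lazy with "?".
greedy = re.match(r"a+", "aaab").group()
lazy = re.match(r"a+?", "aaab").group()

# Two ways to match a literal "|", as the quoted passage says.
escaped = re.findall(r"\|", "a|b|c")
in_class = re.findall(r"[|]", "a|b|c")

print(short_first, long_first, greedy, lazy, escaped, in_class)
```

Here short_first is 'ab' while long_first is 'abcd', which is the sense in which the quoted doc calls "|" "never greedy": the order of the alternatives decides the result, not the length of the match.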
Re: New Python regex Doc (was: Python documentation moronicities)
This is nice! I just might understand regex eventually. Xah Lee wrote: > erratum: > > the correct URL is: > http://xahlee.org/perl-python/python_re-write/lib/module-re.html > > Xah > [EMAIL PROTECTED] > http://xahlee.org/
Re: New Python regex Doc (was: Python documentation moronicities)
Xah> I don't know what kind of system is used to generate the Python Xah> docs, but it is quite unpleasant to work with manually, as there Xah> are egregious errors and inconsistencies. The main Python documentation is written in LaTeX. I believe most, if not all, HTML is generated by latex2html. I suspect most of the HTML cruftiness arises from latex2html. Skip
Re: New Python regex Doc (was: Python documentation moronicities)
erratum: the correct URL is: http://xahlee.org/perl-python/python_re-write/lib/module-re.html Xah [EMAIL PROTECTED] http://xahlee.org/
Re: New Python regex Doc (was: Python documentation moronicities)
HTML Problems in Python Doc I don't know what kind of system is used to generate the Python docs, but it is quite unpleasant to work with manually, as there are egregious errors and inconsistencies. For example, on the “Module Contents” page ( http://python.org/doc/2.4.1/lib/node111.html ), the closing tags for are never used, and all the tags are in lower case. However, on the regex syntax page ( http://python.org/doc/2.4.1/lib/re-syntax.html ), the closing tags for are given, and all tags are in caps. The doc's first lines declare a type of: yet in the files they use "/>" to close image tags, which is XHTML syntax. The doc litters and never closes them, making it illegal XML/XHTML by breaking the minimal requirement of well-formedness. Aside from correctness, the code is quite bloated, as is generally true of generated HTML. For example, it is littered with: which isn't used in the style sheet, and i don't think those ids can serve any purpose other than in a style sheet. Although the doc uses a huge style sheet and almost every tag comes with a class or id attribute, it also profusely uses hard-coded style tags like , and Netscape's . It also abuses tables that effectively do nothing. Here's a typical line: compile( pattern[, flags]) If Python is supposed to be a quality language, then its documentation's content and code seem to indicate otherwise. --- This email is archived at: http://xahlee.org/perl-python/re-write_notes.html Xah [EMAIL PROTECTED] http://xahlee.org/
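The well-formedness complaint above can be illustrated with a strict XML parser. This is only a sketch: the tag names in the post were lost in the archive, so the `<div>`, `<p>` and `<img>` markup and the helper name `is_well_formed` below are hypothetical stand-ins, not the actual Python doc source.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Hypothetical fragments: every element closed vs. <p> left open.
closed = '<div><p>one</p><p>two</p><img src="x.png" /></div>'
unclosed = '<div><p>one<p>two<img src="x.png" /></div>'

print(is_well_formed(closed), is_well_formed(unclosed))
```

An HTML 4 parser accepts the second fragment because closing `</p>` is optional there; an XML/XHTML parser rejects it, which is the mismatch the post points at when "/>" is mixed with unclosed tags.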
Re: New Python regex Doc (was: Python documentation moronicities)
To add to what others have said:

* Typos and lack of spell-checking, such as "occurances" vs "occurrences"
* Poor grammar, such as "Other characters that has special meaning includes:"
* You dropped version-related notes like "New in version 2.4"
* You seem to love the use of s, while docs.python.org uses them sparingly
* The category names you created, "Wildcards", "Repetition Qualifiers", and so forth, don't help me understand regular expressions any better than the original document
* Your document dropped some basic explanations of how regular expressions work, without a replacement text:

    Regular expressions can be concatenated to form new regular expressions; if A and B are both regular expressions, then AB is also a regular expression. In general, if a string p matches A and another string q matches B, the string pq will match AB. [...] Thus, complex expressions can easily be constructed from simpler primitive expressions like the ones described here.

  Instead, you start off with one unclear example ("a+" matching "hh!") and one misleading example (a regular expression that matches some tiny subset of valid e-mail addresses)
* You write

    Characters that have special meanings in regex do not have special meanings when used inside []. For example, '[b+]' does not mean one or more b; It just matches 'b' or '+'.

  and then go on to explain that backslash still has special meaning; I see that the original documentation has a similar problem, but this just goes to show that you aren't improving the accuracy or clarity of the documentation in most cases, just rewriting it to suit your own style. Or maybe just as an excuse to write offensive things like "[a] fucking toy whose max use is as a simplest calculator"

I can't see anything to make me recommend this documentation over the existing documentation.

Jeff
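Two of the points above, concatenation and what stays special inside [], are easy to try out. The snippet is a rough sketch for Python 3 (`re.fullmatch` needs 3.4 or later); the patterns A and B and the sample strings are invented for illustration.

```python
import re

# Concatenation: if "42" matches A and "px" matches B, "42px" matches AB.
A = r"[0-9]+"
B = r"[a-z]+"
p_matches_a = re.fullmatch(A, "42") is not None
q_matches_b = re.fullmatch(B, "px") is not None
pq_matches_ab = re.fullmatch(A + B, "42px") is not None

# Inside [] the "+" is an ordinary character: '[b+]' matches 'b' or '+'.
plus_in_class = re.findall(r"[b+]", "a+b+")

# Backslash keeps its special meaning inside []: \d is still a digit
# class, and \] is how a literal "]" is written.
digit_in_class = re.findall(r"[\d]", "a1]")
bracket_in_class = re.findall(r"[\]]", "a1]")

print(pq_matches_ab, plus_in_class, digit_in_class, bracket_in_class)
```

So the sentence "characters that have special meanings in regex do not have special meanings when used inside []" holds for "+" but not for the backslash, which is the inaccuracy being pointed out.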
Re: New Python regex Doc (was: Python documentation moronicities)
Xah Lee wrote: > 99% of programmers really don't need to give a flying fuck about the > history of a language. Ironically, I'm pretty confident that the same percentage of readers on this group feel _exactly the same way_ about your 'improvements'. -alex23
Re: New Python regex Doc (was: Python documentation moronicities)
Xah Lee wrote: > I have now also started to rewrite the re-syntax page. At first i > thought that page needs not to be rewritten, since its about regex and > not really involved with Python. But after another look, that page is > as incompetent as every other page of Python documentation. > > The rewritten page is here: > http://xahlee.org/perl-python/python_re-write/lib/re-syntax.html > > It's not complete and it no longer describes how things work. Study the inner workings of the RE engine some more, and try again.
Re: New Python regex Doc (was: Python documentation moronicities)
I have now also started to rewrite the re-syntax page. At first i thought that page need not be rewritten, since it's about regex and not really involved with Python. But after another look, that page is as incompetent as every other page of Python documentation.

The rewritten page is here: http://xahlee.org/perl-python/python_re-write/lib/re-syntax.html

It's not complete, but is a start. The organization is largely taken care of, except the last few paragraphs. The bottom half on capturing and extension syntax i haven't started working on. In particular, they need examples. The “repetitions” section also needs to be examined.

Here are a few notes on this whole rewriting ordeal.

---

In the doc, examples are often given in Python command line interface format, e.g.

    >>> def f(n):
    ...     return n+1
    ...
    >>> f(1)
    2

instead of:

    def f(n):
        return n+1

    print(f(1))  # prints 2

The clean format should be used because it does not require familiarity with the Python command line, it is more readable, and the code can be copied and run readily.

A significant portion of Python doc's readers, if not the majority, didn't come to Python as beginning programmers, and one way or another never used or cared about the Python command line interface.

Suppose a non-Python programmer is casually shown a page of Python doc. She will get much more from the clean example than the version cluttered with Python command line interface irrelevancies.

Suppose now we have an experienced professional Python programmer. Upon reading the Python doc, she will also find examples in plain code much more readable and familiar than the version plastered with Python command line interface irrelevancies.

The only place where the Python command line look-and-feel is appropriate is in the Python tutorial, and arguably only in the beginning sections.

-

Extra point: If the Python command line interface is actually a robust application, like so-called IDE e.g. Mathematica front-end, then things are very different.
In reality, the Python command line interface is a fucking toy whose max use is as a simplest calculator and doubles as a chanting novelty for standard coding morons. In practice it isn't even suitable as a trial'n'error pad for real-world programming. Extra point: do not use the fucking stupid meaningless jargon “interpreter”. 90% of its use in the doc should be deleted. They should be replaced with "software", "program", "command line interface", or "language" or others. (I dare say that 50% of all uses of the word interpreter in computer language contexts are inane, fathering large amounts of misunderstanding and confusion.) - The history of Python is littered all over the doc, e.g. “Incompatibility note: in the original Python 1.5 release, maxsplit was ignored. This has been fixed in later releases.” 99% of programmers really don't need to give a flying fuck about the history of a language. Inevitably software including languages change over time, however conservative one tries to be. So, move all these changes into a "New and Incompatible changes" page at some appendix of the lang spec. This way, people who are maintaining older code can find their info in one coherent place, while general programmers are not forced to wade thru the details of fuckups or whatnot of the past in every few paragraphs. (A few exceptions can be made, when the change is a major fuckup that all practicing Python coders really must be informed of regardless of whether they maintain old code.) -- Do not take an attitude like you have to stick to some artificial format or order or "correctness" in the doc. Remember, the doc's prime goal is to communicate to programmers how a language functions, not how it is implemented or how it works technically or computer scientifically speaking. In writing a language documentation, there is a question of how to organize it. This is an issue of design, and it takes thinking.
When a doc writer is faced with such a challenge, the easiest route is a no-brainer by following the way the language is implemented. For example, the doc will start with “data types” supported by the language. This no-brainer stupidity is unfortunately how most language docs are organized, and the Python doc is one of the worst. One can see this phenomenon in the official doc of Python's RE module. For example, it begins with Regex Syntax, then it follows with “Module contents”, then Regex Objects, then Match Objects. And in each page, the functions or methods are arranged in alphabetical order. This is typical of the no-brainer organization following how the module is implemented or a certain “computer scientific logic”. It has only a remote connection to how the module is used to perform a task. In general, language docs should be organized by the tasks the language is supposed to accomplish, then by each module or function's functionalities. For example, the RE module doc, organize it by the purposes of the module. To begin, we explain in t
Re: New Python regex Doc (was: Python documentation moronicities)
Re: http://xahlee.org/perl-python/python_re-write/lib/module-re.html Bill Mill <[EMAIL PROTECTED]> writes: > Alright, I feel like I'm feeding the trolls just by posting in this > thread. Just so that nobody else has to read the "revised" docs, no it > doesn't: I find that Lee's version complements the official docs quite nicely. > 1) He didn't really change anything besides the intro page and > deleting the matching vs. searching page and the examples page. He > also put a couple of breaks into the doc. Official doc: findall(pattern, string[, flags]) Return a list of all non-overlapping matches of pattern in string. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match. New in version 1.5.2. Changed in version 2.4: Added the optional flags argument. Revised doc: findall(pattern, string[, flags]) Return a list of all non-overlapping matches of pattern in string. For example: re.findall(r'@+', 'what @@@do @@you @think') # returns ['@@@', '@@', '@'] If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. For example: re.findall(r'( +)(@+)', 'what @@@do @@you @think') # returns [(' ', '@@@'), (' ', '@@'), (' ', '@')] Empty matches are included in the result unless they touch the beginning of another match. For example: re.findall(r'\b', 'what @@@do @@you @think') # returns ['', '', '', '', '', '', '', ''] need another example here showing what is meant by "unless they touch the beginning of another match." Personally I find the latter much clearer (even in its incomplete state). > 3) adding "MAY NEED AN EXAMPLE HERE" instead of actually putting one in Well, you could suggest one to him. 
Cheers, -- [EMAIL PROTECTED] Gunnm: Broken Angel http://amv.reimeika.ca http://reimeika.ca/ http://photo.reimeika.ca
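The findall examples quoted in this message can be re-run as a script. The sketch below targets a current Python 3 interpreter, reuses the sample string from the message, and adds one single-group pattern of its own to show the case between "no groups" and "more than one group". It does not attempt the still-missing example about empty matches that "touch the beginning of another match".

```python
import re

text = "what @@@do @@you @think"

# No groups in the pattern: a list of the matched strings.
no_group = re.findall(r"@+", text)

# Exactly one group: a list of what that group captured.
one_group = re.findall(r" (@+)", text)

# More than one group: a list of tuples, one tuple per match.
two_groups = re.findall(r"( +)(@+)", text)

# \b matches the empty string at each word boundary.
boundaries = re.findall(r"\b", text)

print(no_group, one_group, two_groups, len(boundaries))
```

The shape of the returned list depends only on how many groups the pattern has, which is the part of the official wording that the revised examples make concrete.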
Re: New Python regex Doc (was: Python documentation moronicities)
On 4/18/05, Jeff Epler <[EMAIL PROTECTED]> wrote: > On Mon, Apr 18, 2005 at 01:40:43PM -0700, Xah Lee wrote: > > i have rewrote the Python's re module documentation. > > See it here for table of content page: > > http://xahlee.org/perl-python/python_re-write/lib/module-re.html > > For those who have long ago consigned Mr. Lee to a killfile, it looks > like he's making an honest attempt to improve Python's documentation > here. Alright, I feel like I'm feeding the trolls just by posting in this thread. Just so that nobody else has to read the "revised" docs, no it doesn't: 1) He didn't really change anything besides the intro page and deleting the matching vs. searching page and the examples page. He also put a couple of breaks into the doc. 2) notes like "NOTE TO DOC WRITERS: The doc sayz: ..." followed by the same drum he's been beating for a while, instead of actually editing the section to be correct. 3) adding "MAY NEED AN EXAMPLE HERE" instead of actually putting one in > > Mr Lee, I hope you will submit your documentation changes to python's > patch tracker on sourceforge.net. I don't fully agree with some of what > you've written (e.g., you give top billing to the use of functions like > re.search while I would encourage use of the search method on compiled > RE objects, and I like examples to be given as though from interactive > sessions, complete with ">>>" and "..."), but nits can always be picked > and I'm not the gatekeeper to Python's documentation. > I'd suggest that he actually make an effort at improving the docs before submitting them. Peace Bill Mill bill.mill at gmail.com
Re: New Python regex Doc (was: Python documentation moronicities)
Send your feedback to Steve Holden (http://www.holdenweb.com/). If he deems it proper, he will paypal me $100, and you can thank him for the instigation and betterment of the Python doc. Meanwhile, feel free to incorporate my edits into the Python doc. Xah [EMAIL PROTECTED] http://xahlee.org/ Xah Lee wrote: > i have rewrote the Python's re module documentation. > See it here for table of content page: > http://xahlee.org/perl-python/python_re-write/lib/module-re.html > > The doc is broken into 4 sections: > * regex functions (node111.html) > * regex OOP (re-objects.html) > * matched objects (match-objects.html) > * regex syntax (re-syntax.html) > > the regex syntax page i haven't edited, except the introductory first > paragraph. The other pages are completely rewritten for about 80%. > > There are a couple fine points or 3 places in the original doc i can't > understand. They are noted as NOTE DOC WRITERS or NEED EXAMPLE HERE. > > Xah > [EMAIL PROTECTED] > http://xahlee.org/
Re: New Python regex Doc (was: Python documentation moronicities)
On Mon, Apr 18, 2005 at 01:40:43PM -0700, Xah Lee wrote: > i have rewrote the Python's re module documentation. > See it here for table of content page: > http://xahlee.org/perl-python/python_re-write/lib/module-re.html For those who have long ago consigned Mr. Lee to a killfile, it looks like he's making an honest attempt to improve Python's documentation here. Mr Lee, I hope you will submit your documentation changes to python's patch tracker on sourceforge.net. I don't fully agree with some of what you've written (e.g., you give top billing to the use of functions like re.search while I would encourage use of the search method on compiled RE objects, and I like examples to be given as though from interactive sessions, complete with ">>>" and "..."), but nits can always be picked and I'm not the gatekeeper to Python's documentation. Jeff
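For readers unsure what the two styles mentioned above look like side by side, here is a minimal sketch in Python 3. The sample string and the variable names are made up; the calls themselves (`re.search`, `re.compile`, and the `search` method of a compiled pattern with its optional start position) are standard.

```python
import re

text = "what @@@do @@you @think"

# Style 1: module-level function, pattern passed on every call.
m_function = re.search(r"@+", text)

# Style 2: compile once, then call the method on the pattern object.
at_signs = re.compile(r"@+")
m_method = at_signs.search(text)

# The method form also takes a start position, which the module-level
# function does not.
m_later = at_signs.search(text, 8)

print(m_function.span(), m_method.span(), m_later.group())
```

Both styles find the same first match; the compiled form mainly pays off when one pattern is reused many times or when the extra position arguments are needed.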
New Python regex Doc (was: Python documentation moronicities)
i have rewritten Python's re module documentation. See it here for the table of contents page: http://xahlee.org/perl-python/python_re-write/lib/module-re.html

The doc is broken into 4 sections:
* regex functions (node111.html)
* regex OOP (re-objects.html)
* matched objects (match-objects.html)
* regex syntax (re-syntax.html)

The regex syntax page i haven't edited, except the introductory first paragraph. The other pages are about 80% rewritten.

There are a couple of fine points, in 3 places in the original doc, that i can't understand. They are noted as NOTE DOC WRITERS or NEED EXAMPLE HERE.

Xah [EMAIL PROTECTED] http://xahlee.org/