Re: New Python regex Doc (was: Python documentation moronicities)

2005-05-07 Thread John Bokma
Xah Lee wrote:

> Let me expose one another fucking incompetent part of

your writing capabilities?

If you really had a point, there wouldn't be any need of swearing...

-- 
John   MexIT: http://johnbokma.com/mexit/
   personal page:   http://johnbokma.com/
Experienced programmer available: http://castleamber.com/
Happy Customers: http://castleamber.com/testimonials.html
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: New Python regex Doc (was: Python documentation moronicities)

2005-05-07 Thread James Stroud
On Saturday 07 May 2005 04:28 pm, Xah Lee wrote:
> Note: “In other words, the "|" operator is never greedy.”
>
> Note the need to inject the high-brow jargon “greedy” here as a
> latch on sentence.

The first definition of "jargon" in the Collaborative International Dictionary 
of English is:

"To utter jargon; to emit confused or unintelligible sounds; to talk 
unintelligibly, or in a harsh and noisy manner."

Despite your misuse of the word "jargon", jargon seems to be an area in which 
you are carving yourself a niche.

The term "greedy" has a particular meaning in regex, as does the word 
"algorithm" in computer science. Take a look at Mastering Regular Expressions 
for an exhaustive discussion of the meaning of "greedy" as it applies to 
regular expressions. Of course I anticipate that you will confusedly and 
unintelligibly bash this book, even though it is quite obvious that you have 
yet to read or understand it.
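
In regex terms, a greedy quantifier takes as much text as it can while still letting the overall match succeed; a non-greedy one takes as little as it can. A minimal sketch with Python's re module (the sample string is made up for illustration):

```python
import re

text = '<a><b>'  # made-up sample input

# Greedy: .* grabs as much as possible, so the match runs to the last '>'.
greedy = re.match(r'<.*>', text).group()

# Non-greedy: .*? grabs as little as possible, so the match stops at the first '>'.
lazy = re.match(r'<.*?>', text).group()

print(greedy)  # <a><b>
print(lazy)    # <a>
```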


-- 
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/

Re: New Python regex Doc (was: Python documentation moronicities)

2005-05-07 Thread Xah Lee
Let me expose one another fucking incompetent part of Python doc, in
illustration of the Info Tech industry's masturbation and ignorant
nature.

The official Python doc on regex syntax (
http://python.org/doc/2.4/lib/re-syntax.html ) says:

--begin quote--

"|"
A|B, where A and B can be arbitrary REs, creates a regular expression
that will match either A or B. An arbitrary number of REs can be
separated by the "|" in this way. This can be used inside groups (see
below) as well. As the target string is scanned, REs separated by "|"
are tried from left to right. When one pattern completely matches, that
branch is accepted. This means that once A matches, B will not be
tested further, even if it would produce a longer overall match. In
other words, the "|" operator is never greedy. To match a literal "|",
use \|, or enclose it inside a character class, as in [|].

--end quote--
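
The behaviour described in the quoted passage can be checked directly. A small sketch, with patterns and input invented for illustration: alternatives are tried left to right, and the first one that lets the match succeed wins, even when a later alternative would match more text.

```python
import re

# 'a' is tried first and matches, so 'ab' is never tried at this position.
first = re.match(r'a|ab', 'ab').group()

# With the order reversed, the longer alternative is tried first and wins.
second = re.match(r'ab|a', 'ab').group()

print(first)   # a
print(second)  # ab
```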

Note: “In other words, the "|" operator is never greedy.”

Note the need to inject the high-brow jargon “greedy” here as a
latch on sentence.

“never greedy”? What is greedy anyway?

“Greedy”, when used in the context of computing, describes a
certain characteristic of algorithms. An algorithm for a
minimizing/maximizing problem is greedy if, whenever it faces a choice,
it simply takes the locally best option (the shortest next step, say),
without considering whether that choice actually leads to an optimal
solution.

The rub is that such a strategy will often fail to obtain the optimal
result. If you go from New York to San Francisco and always choose the
road most directly facing your destination, you may never get there.
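
A greedy strategy in this sense can be sketched with coin change. The coin values and the helper names greedy_change and optimal_change below are made up for illustration; the point is only that always taking the locally best step can miss the best overall answer.

```python
def greedy_change(coins, amount):
    """Always take the largest coin that still fits (the greedy choice)."""
    used = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            used.append(coin)
    return used if amount == 0 else None


def optimal_change(coins, amount):
    """Fewest coins, found by dynamic programming over all sub-amounts."""
    best = {0: []}
    for total in range(1, amount + 1):
        candidates = [best[total - c] + [c]
                      for c in coins
                      if total - c in best]
        if candidates:
            best[total] = min(candidates, key=len)
    return best.get(amount)


coins = [1, 3, 4]  # hypothetical coin system
print(greedy_change(coins, 6))   # [4, 1, 1]
print(optimal_change(coins, 6))  # [3, 3]
```

With these coins the greedy choice uses three coins where two suffice.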

For an algorithm to be greedy, it is implied that it faces choices. In
the case of alternatives in regex "regex1|regex2|regex3", there is
really no selection involved, just following a given sequence.

What the writer was thinking when he latched on about greediness is
that the result may not come from the pattern that matches the longest
substring, therefore it is not “greedy”. It's not greedy Python
docer's ass.

Such blind jargon throwing, as found everywhere in tech docs, is a
significant reason why the computing industry is filled with shams the
likes of unix, Perl, Programing Patterns, eXtreme Programing,
“Universal Modeling Language”, fucking shits.


A better written doc for the complete regex module is at:
http://xahlee.org/perl-python/python_re-write/lib/module-re.html

See also: Responsible Software Licensing
http://xahlee.org/UnixResource_dir/writ/responsible_license.html

 Xah
 [EMAIL PROTECTED]
∑ http://xahlee.org/


Re: New Python regex Doc (was: Python documentation moronicities)

2005-05-07 Thread Philippe C. Martin
This is nice! I just might understand regex eventually.


Xah Lee wrote:

> erratum:
> 
> the correct URL is:
> http://xahlee.org/perl-python/python_re-write/lib/module-re.html
> 
>  Xah
>  [EMAIL PROTECTED]
> ∑ http://xahlee.org/


Re: New Python regex Doc (was: Python documentation moronicities)

2005-05-07 Thread Skip Montanaro

Xah> I don't know what kind of system is used to generate the Python
Xah> docs, but it is quite unpleasant to work with manually, as there
Xah> are egregious errors and inconsistencies.

The main Python documentation is written in LaTeX.  I believe most, if not
all, HTML is generated by latex2html.  I suspect most of the HTML cruftiness
arises from latex2html.

Skip




Re: New Python regex Doc (was: Python documentation moronicities)

2005-05-06 Thread Xah Lee
erratum:

the correct URL is:
http://xahlee.org/perl-python/python_re-write/lib/module-re.html

 Xah
 [EMAIL PROTECTED]
∑ http://xahlee.org/


Re: New Python regex Doc (was: Python documentation moronicities)

2005-05-06 Thread Xah Lee
HTML Problems in Python Doc

I don't know what kind of system is used to generate the Python docs,
but it is quite unpleasant to work with manually, as there are
egregious errors and inconsistencies.

For example, on the “Module Contents” page (
http://python.org/doc/2.4.1/lib/node111.html ), the closing tags for
 are never used, and all the tags are in lower case. However, on
the regex syntax page ( http://python.org/doc/2.4.1/lib/re-syntax.html
), the closing tags for  are given, and all tags are in caps.

The doc's first lines declare a type of:


yet in the files they use "/>" to close image tags, which is XHTML
syntax.

The doc litters  and never closes them, making it illegal
XML/XHTML by breaking the minimal requirement of well-formedness.

Aside from correctness, the code is quite bloated, as is generally true
of generated HTML. For example, it is littered with:  which isn't used in the style sheet, and i don't
think those ids can serve any purpose other than in a style sheet.

Although the doc uses a huge style sheet and almost every tag comes
with a class or id attribute, it also profusely uses hard-coded
style tags like ,  and Netscape's .

It also abuses tables that effectively do nothing. Here's a typical
line:

  compile(
  pattern[,
flags])


If Python is supposed to be a quality language, then its
documentation's content and code seem to indicate otherwise.
---

This email is archived at:
http://xahlee.org/perl-python/re-write_notes.html

 Xah
 [EMAIL PROTECTED]
∑ http://xahlee.org/



Re: New Python regex Doc (was: Python documentation moronicities)

2005-05-06 Thread Jeff Epler
To add to what others have said:

* Typos and lack of spell-checking, such as "occurances" vs "occurrences"

* Poor grammar, such as "Other characters that has special meaning
  includes:"

* You dropped version-related notes like "New in version 2.4"

* You seem to love the use of s, while docs.python.org uses them
  sparingly

* The category names you created, "Wildcards", "Repetition Qualifiers",
  and so forth, don't help me understand regular expressions any better
  than the original document

* Your document dropped some basic explanations of how regular
  expressions work, without a replacement text:
    Regular expressions can be concatenated to form new regular
    expressions; if A and B are both regular expressions, then AB is
    also a regular expression. In general, if a string p matches A and
    another string q matches B, the string pq will match AB. [...] Thus,
    complex expressions can easily be constructed from simpler primitive
    expressions like the ones described here.
  Instead, you start off with one unclear example ("a+" matching
  "hh!") and one misleading example (a regular expression that
  matches some tiny subset of valid e-mail addresses)

* You write
    Characters that have special meanings in regex do not have special
    meanings when used inside []. For example, '[b+]' does not mean one
    or more b; It just matches 'b' or '+'.
  and then go on to explain that backslash still has special meaning; I
  see that the original documentation has a similar problem, but this
  just goes to show that you aren't improving the accuracy or clarity of
  the documentation in most cases, just rewriting it to suit your own
  style.  Or maybe just as an excuse to write offensive things like "[a]
  fucking toy whose max use is as a simplest calculator"
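
Two of the points above can be checked directly in Python. This is only a sketch: the patterns and sample strings are invented for illustration, and re.fullmatch is a convenience from current Python 3, not something available at the time of this thread.

```python
import re

# Inside [], '+' loses its "one or more" meaning: '[b+]' matches a single 'b' or '+'.
in_class = re.findall(r'[b+]', 'bb+c')

# Backslash keeps its meaning inside []: '[\d+]' matches a digit or a literal '+'.
with_escape = re.findall(r'[\d+]', 'a1+b')

# Concatenation: if p matches A and q matches B, then p + q matches AB.
A, B = r'\d+', r'[a-z]+'
p, q = '42', 'abc'
concatenated = re.fullmatch(A + B, p + q) is not None

print(in_class)      # ['b', 'b', '+']
print(with_escape)   # ['1', '+']
print(concatenated)  # True
```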

I can't see anything to make me recommend this documentation over the
existing documentation.

Jeff



Re: New Python regex Doc (was: Python documentation moronicities)

2005-05-05 Thread alex23

Xah Lee wrote:
> 99% of programers really don't need to give a flying fuck about the
> history of a language.

Ironically, I'm pretty confident that the same percentage of readers on
this group feel _exactly the same way_ about your 'improvements'.

-alex23



Re: New Python regex Doc (was: Python documentation moronicities)

2005-05-05 Thread Fredrik Lundh
Xah Lee wrote:

> I have now also started to rewrite the re-syntax page. At first i
> thought that page need not be rewritten, since it's about regex and
> not really involved with Python. But after another look, that page is
> as incompetent as every other page of Python documentation.
>
> The rewritten page is here:
> http://xahlee.org/perl-python/python_re-write/lib/re-syntax.html
>
> It's not complete

and it no longer describes how things work.  study the inner workings
of the RE engine some more, and try again.







Re: New Python regex Doc (was: Python documentation moronicities)

2005-05-05 Thread Xah Lee
I have now also started to rewrite the re-syntax page. At first i
thought that page need not be rewritten, since it's about regex and
not really involved with Python. But after another look, that page is
as incompetent as every other page of Python documentation.

The rewritten page is here:
http://xahlee.org/perl-python/python_re-write/lib/re-syntax.html

It's not complete, but is a start. The organization is largely taken
care of, except the last few paragraphs. The bottom half on capturing
and extension syntax i haven't started working on. In particular, they
need examples. The “repetitions” section also needs to be examined.

Here are a few notes on this whole rewriting ordeal.

---

In the doc, examples are often given in Python command line interface
format, e.g.

>>> def f(n):
...     return n+1
...
>>> f(1)
2

instead of:

def f(n):
  return n+1
print(f(1))   # prints 2

the clean format should be used because it does not require familiarity
with Python command line, it is more readable, and the code can be
copied and run readily.

A significant portion of Python doc's readers, if not the majority, didn't
come to Python as beginning programers, and one way or another never
used or cared about the Python command line interface.

Suppose a non-Python programer is casually shown a page of Python doc.
She will get much more from the clean example than the version
cluttered with Python Command line interface irrelevancies.

Suppose now we have an experienced professional Python programer. Upon
reading the Python doc, she will also find examples in plain code much
more readable and familiar than the version plastered with Python
Command line interface irrelevancies.

The only place where the Python command line look-and-feel is
appropriate is in the Python tutorial, and arguably only in the
beginning sections.

-
Extra point: If the Python command line interface were actually a robust
application, like a so-called IDE, e.g. the Mathematica front-end, then
things would be very different. In reality, the Python command line
interface is a fucking toy whose max use is as a simplest calculator and
double as a chanting novelty for standard coding morons. In practice it
isn't even suitable as a trial'n'error pad for real-world programing.

Extra point: do not use the fucking stupid meaningless jargon
“interpreter”. 90% of its use in the doc should be deleted. They
should be replaced with "software", "program", "command line
interface", or "language" or others.

(I dare say that 50% of all uses of the word interpreter in computer
language contexts are inane. Fathering large amounts of
misunderstanding and confusion.)

-
History of Python is littered all over the doc, e.g.
“Incompatibility note: in the original Python 1.5 release, maxsplit
was ignored. This has been fixed in later releases.”

99% of programers really don't need to give a flying fuck about the
history of a language. Inevitably software, including languages, changes
over time, however conservative one tries to be. So, move all these
changes into a "New and Incompatible changes" page at some appendix of
the lang spec. This way, people who are maintaining older code can
find their info in one coherent place. Meanwhile, general
programers are not forced to wade thru the details of fuckups or
whatnot of the past in every few paragraphs. (A few exceptions can be
made, when the change is a major fuckup that all practicing Python
coders really must be informed of regardless of whether they maintain old
code.)

--

Do not take an attitude that you have to stick to some artificial format
or order or "correctness" in the doc. Remember, the doc's prime goal is
to communicate to programers how a language functions, not how it is
implemented or how it works technically or computer-scientifically speaking.

In writing a language documentation, there is a question of how to
organize it. This is an issue of design, and it takes thinking.

When a doc writer is faced with such a challenge, the easiest route is
the no-brainer of following the way the language is implemented. For
example, the doc will start with “data types” supported by the
language. This no-brainer stupidity is unfortunately how most language
docs are organized, and the Python doc is one of the worst.

One can see this phenomenon in the official doc of Python's RE module.
For example, it begins with Regex Syntax, then it follows with “Module
contents”, then Regex Objects, then Match Objects. And in each page,
the functions or methods are arranged in alphabetical order. This is
typical of the no-brainer organization following how the module is
implemented or certain “computer scientific logic”. It has only a remote
connection to how the module is used to perform a task.

In general, language docs should be organized by the tasks the language
is supposed to accomplish, then by each module or function's
functionalities.

For example, the RE module doc, organize it by the purposes of the
module. To begin, we explain in t

Re: New Python regex Doc (was: Python documentation moronicities)

2005-04-20 Thread marco

Re: http://xahlee.org/perl-python/python_re-write/lib/module-re.html

Bill Mill <[EMAIL PROTECTED]> writes:

> Alright, I feel like I'm feeding the trolls just by posting in this
> thread. Just so that nobody else has to read the "revised" docs, no it
> doesn't:

I find that Lee's version complements the official docs quite nicely.

> 1) He didn't really change anything besides the intro page and
> deleting the matching vs. searching page and the examples page. He
> also put a couple of  breaks into the doc.

Official doc:

findall(pattern, string[, flags])

Return a list of all non-overlapping matches of pattern in string. If
one or more groups are present in the pattern, return a list of groups;
this will be a list of tuples if the pattern has more than one
group. Empty matches are included in the result unless they touch the
beginning of another match. New in version 1.5.2. Changed in version
2.4: Added the optional flags argument.

Revised doc:

findall(pattern, string[, flags])

Return a list of all non-overlapping matches of pattern in string. For
example:

re.findall(r'@+', 'what   @@@do  @@you @think')
# returns ['@@@', '@@', '@']

If one or more groups are present in the pattern, return a list of
groups; this will be a list of tuples if the pattern has more than one
group. For example:

re.findall(r'( +)(@+)', 'what   @@@do  @@you @think')
# returns [('   ', '@@@'), ('  ', '@@'), (' ', '@')]

Empty matches are included in the result unless they touch the
beginning of another match. For example:

re.findall(r'\b', 'what   @@@do  @@you @think')
# returns ['', '', '', '', '', '', '', '']

need another example here showing what is meant by "unless they touch the
beginning of another match."


Personally I find the latter much clearer (even in its incomplete state).
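
The empty-match case that the revised doc leaves open can be sketched as follows. The pattern and input are made up for illustration. No expected output is shown, because the number of empty strings reported next to the non-empty match depends on the Python version: the handling of an empty match that touches a previous match was changed in Python 3.7.

```python
import re

# 'x*' can match the empty string, so findall also reports empty matches
# at positions where no 'x' is found.
result = re.findall(r'x*', 'abxd')

print(result)
```

On any version the list starts with an empty match (before 'a') and contains 'x' exactly once; only the count of empty strings after 'x' differs.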

> 3) adding "MAY NEED AN EXAMPLE HERE" instead of actually putting one in

Well, you could suggest one to him.

Cheers,

--
[EMAIL PROTECTED]
Gunnm: Broken Angel  http://amv.reimeika.ca
http://reimeika.ca/  http://photo.reimeika.ca


Re: New Python regex Doc (was: Python documentation moronicities)

2005-04-19 Thread Bill Mill
On 4/18/05, Jeff Epler <[EMAIL PROTECTED]> wrote:
> On Mon, Apr 18, 2005 at 01:40:43PM -0700, Xah Lee wrote:
> > i have rewritten Python's re module documentation.
> > See it here for table of contents page:
> > http://xahlee.org/perl-python/python_re-write/lib/module-re.html
> 
> For those who have long ago consigned Mr. Lee to a killfile, it looks
> like he's making an honest attempt to improve Python's documentation
> here.

Alright, I feel like I'm feeding the trolls just by posting in this
thread. Just so that nobody else has to read the "revised" docs, no it
doesn't:

1) He didn't really change anything besides the intro page and
deleting the matching vs. searching page and the examples page. He
also put a couple of  breaks into the doc.

2) notes like "NOTE TO DOC WRITERS: The doc sayz: ..." followed by the
same drum he's been beating for a while, instead of actually editing
the section to be correct.

3) adding "MAY NEED AN EXAMPLE HERE" instead of actually putting one in

> 
> Mr Lee, I hope you will submit your documentation changes to python's
> patch tracker on sourceforge.net.  I don't fully agree with some of what
> you've written (e.g., you give top billing to the use of functions like
> re.search while I would encourage use of the search method on compiled
> RE objects, and I like examples to be given as though from interactive
> sessions, complete with ">>>" and "..."), but nits can always be picked
> and I'm not the gatekeeper to Python's documentation.
> 

I'd suggest that he actually make an effort at improving the docs
before submitting them.

Peace
Bill Mill
bill.mill at gmail.com


Re: New Python regex Doc (was: Python documentation moronicities)

2005-04-19 Thread Xah Lee
Send your feedback to Steve Holden. (http://www.holdenweb.com/)
If he deems it proper, he will paypal me $100, and you can thank
him for the instigation and betterment of the Python doc.

Meanwhile, feel free to incorporate my edits into python doc.

 Xah
 [EMAIL PROTECTED]
∑ http://xahlee.org/


Xah Lee wrote:
> i have rewritten Python's re module documentation.
> See it here for table of contents page:
> http://xahlee.org/perl-python/python_re-write/lib/module-re.html
>
> The doc is broken into 4 sections:
> * regex functions (node111.html)
> * regex OOP (re-objects.html)
> * matched objects (match-objects.html)
> * regex syntax (re-syntax.html)
>
> the regex syntax page i haven't edited, except the introductory first
> paragraph. The other pages are completely rewritten for about 80%.
>
> There are a couple fine points or 3 places in the original doc i can't
> understand. They are noted as NOTE DOC WRITERS or NEED EXAMPLE HERE.
> 
>  Xah
>  [EMAIL PROTECTED]
> ∑ http://xahlee.org/



Re: New Python regex Doc (was: Python documentation moronicities)

2005-04-18 Thread Jeff Epler
On Mon, Apr 18, 2005 at 01:40:43PM -0700, Xah Lee wrote:
> i have rewritten Python's re module documentation.
> See it here for table of contents page:
> http://xahlee.org/perl-python/python_re-write/lib/module-re.html

For those who have long ago consigned Mr. Lee to a killfile, it looks
like he's making an honest attempt to improve Python's documentation
here.

Mr Lee, I hope you will submit your documentation changes to python's
patch tracker on sourceforge.net.  I don't fully agree with some of what
you've written (e.g., you give top billing to the use of functions like
re.search while I would encourage use of the search method on compiled
RE objects, and I like examples to be given as though from interactive
sessions, complete with ">>>" and "..."), but nits can always be picked
and I'm not the gatekeeper to Python's documentation.
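
The two styles contrasted above look like this side by side. A minimal sketch: the sample string is borrowed from elsewhere in the thread, and the variable names are made up.

```python
import re

text = 'what   @@@do  @@you @think'  # sample string borrowed from the thread

# Module-level function: the pattern is passed on every call.
m1 = re.search(r'@+', text)

# Compiled RE object: compile once, then call its search method.
at_signs = re.compile(r'@+')
m2 = at_signs.search(text)

print(m1.group(), m2.group())  # @@@ @@@
```

Both calls find the same match; the compiled form mainly pays off when the same pattern is reused many times.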

Jeff



New Python regex Doc (was: Python documentation moronicities)

2005-04-18 Thread Xah Lee
i have rewritten Python's re module documentation.
See it here for table of contents page:
http://xahlee.org/perl-python/python_re-write/lib/module-re.html

The doc is broken into 4 sections:
* regex functions (node111.html)
* regex OOP (re-objects.html)
* matched objects (match-objects.html)
* regex syntax (re-syntax.html)

the regex syntax page i haven't edited, except the introductory first
paragraph. The other pages are completely rewritten for about 80%.

There are a couple fine points or 3 places in the original doc i can't
understand. They are noted as NOTE DOC WRITERS or NEED EXAMPLE HERE.

 Xah
 [EMAIL PROTECTED]
∑ http://xahlee.org/
