Jim Jewett <jimjjew...@gmail.com> added the comment:

It sounds like this is a case where the docs should mention an external library; perhaps something like changing the intro of http://docs.python.org/dev/library/html.parser.html from:
"""
19.2. html.parser — Simple HTML and XHTML parser

Source code: Lib/html/parser.py

This module defines a class HTMLParser which serves as the basis for parsing text files formatted in HTML (HyperText Mark-up Language) and XHTML.
"""

to:

"""
19.2. html.parser — Simple HTML and XHTML parser

Source code: Lib/html/parser.py

This module defines a class HTMLParser which serves as the basis for parsing text files formatted in HTML (HyperText Mark-up Language) and XHTML.

Note that mainstream web browsers also attempt to repair invalid markup; the algorithms for this can be quite complex, and are evolving too quickly for the Python release cycle. Applications handling arbitrary web pages should consider using third-party modules. The Python version of html5lib ( http://code.google.com/p/html5lib/ ) is being developed in parallel with the HTML standard itself, and serves as a reference implementation.
"""

----------
_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14538>
_______________________________________
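
For illustration only, here is a minimal sketch of the gap the proposed note describes. It uses only the standard-library HTMLParser; the EventLogger class and log_events helper are hypothetical names made up for this example and are not part of any proposed patch. The input leaves the first <p> unclosed, which a browser would repair by closing it implicitly before the second one. HTMLParser just reports what it sees, so no end-tag event is produced.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Hypothetical helper: records the parser callbacks in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def log_events(markup):
    """Feed markup to a fresh parser and return the recorded events."""
    parser = EventLogger()
    parser.feed(markup)
    parser.close()  # flush any text still buffered
    return parser.events


# The first <p> is never closed; no repair is attempted.
events = log_events("<p>one<p>two")
print(events)
```

An application that needs a browser-like tree from input of this kind would have to add the repair step itself or hand the markup to a third-party parser, which is the point of the suggested wording.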