2012/6/18 Robert Zaremba <[email protected]>

> Hi, I would like to import changes from:
>
> The problem is that HTMLParser from 2.7.2 is not lenient and likes to throw
> exceptions when an HTML document is not well formed:
> http://bugs.python.org/issue13987
>
> This often triggers an exception from BeautifulSoup, which gains a great
> speedup when used with PyPy and the HTMLParser from the stdlib:
> "RuntimeWarning: Python's built-in HTMLParser cannot parse the given
> document. This is not a bug in Beautiful Soup. The best solution is to
> install an external parser (lxml or html5lib), and use Beautiful Soup with
> that parser. See
> http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser
> for help."
>
> However, lxml is not compatible with PyPy, and html5lib is slow.
>
> Can I port the HTMLParser.py from Python 2.7.3 to PyPy?
In general, no, unless you port all the rest to 2.7.3 as well.

There is already work in progress for this, in the stdlib-2.7.3 branch.
It's almost finished (and definitely worth a try), and there are some
nightly builds (only 32-bit Linux for the moment):
http://buildbot.pypy.org/nightly/stdlib-2.7.3/

Still missing are the implementation of randomized hashes (not enabled by
default anyway) and fixes for a couple of obscure bugs in the import system,
probably implementation details.

-- 
Amaury Forgeot d'Arc
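
Until a build from the stdlib-2.7.3 branch is in use, one way to cope is to pick the Beautiful Soup parser based on the stdlib level the interpreter reports. The following is only a rough sketch: the helper names are hypothetical, and it assumes that 2.7.3 is the first 2.7.x release with the lenient HTMLParser from issue13987 and that sys.version_info on PyPy reflects the stdlib level it ships.

```python
import sys

# Assumption: 2.7.3 is the first 2.7.x release that carries the lenient
# HTMLParser discussed in http://bugs.python.org/issue13987.
FIRST_LENIENT_27 = (2, 7, 3)


def stdlib_parser_is_lenient(version_info=None):
    """Return True if the reported version is at least 2.7.3.

    Hypothetical helper. It is only meant for the 2.7 line; early 3.x
    releases also compare as newer and are not examined here.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:3]) >= FIRST_LENIENT_27


def choose_parser(version_info=None):
    """Return a Beautiful Soup parser name (hypothetical helper).

    Falls back to the slower html5lib when the stdlib parser is the
    strict 2.7.2-era one.
    """
    if stdlib_parser_is_lenient(version_info):
        return "html.parser"
    return "html5lib"
```

The returned name could then be passed as the second argument to BeautifulSoup, for example BeautifulSoup(markup, choose_parser()).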
_______________________________________________
pypy-dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/pypy-dev
