Package: python-beautifulsoup
Version: 3.1.0.1-1
Severity: important

The recent upgrade from 3.0.7 to 3.1.0 caused BeautifulSoup to stop being able to parse HTML pages that contain particular forms of embedded JavaScript.
Here is a small example that parses correctly with 3.0.7.

<html>
<head>
<title>Not-So-Beautiful Soup</title>
</head>
<body>
<script>
function legalJS() {
    var str = '</p>';
    return 0<str.length;
}
</script>
</body>
</html>

With 3.1.0, it causes this failure:

  File "./souptest.py", line 7, in <module>
    soup = BeautifulSoup(page)
  File "/var/lib/python-support/python2.5/BeautifulSoup.py", line 1499, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "/var/lib/python-support/python2.5/BeautifulSoup.py", line 1230, in __init__
    self._feed(isHTML=isHTML)
  File "/var/lib/python-support/python2.5/BeautifulSoup.py", line 1263, in _feed
    self.builder.feed(markup)
  File "/usr/lib/python2.5/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/lib/python2.5/HTMLParser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.5/HTMLParser.py", line 226, in parse_starttag
    endpos = self.check_for_whole_start_tag(i)
  File "/usr/lib/python2.5/HTMLParser.py", line 301, in check_for_whole_start_tag
    self.error("malformed start tag")
  File "/usr/lib/python2.5/HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
HTMLParser.HTMLParseError: malformed start tag, at line 9, column 28

-- System Information:
Debian Release: 5.0
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'stable'), (400, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 2.6.26-1-686 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages python-beautifulsoup depends on:
ii  python                        2.5.2-3    An interactive high-level object-o
ii  python-support                0.8.7      automated rebuilding support for P

python-beautifulsoup recommends no packages.

python-beautifulsoup suggests no packages.

-- no debconf information
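
For comparison, here is a minimal sketch of the behaviour the report expects: the contents of <script> are treated as opaque text, so neither the '</p>' string literal nor the "0<str.length" comparison is read as markup. It assumes Python 3's standard html.parser module, not BeautifulSoup and not the Python 2.5 HTMLParser module from the traceback. The names ScriptCollector, PAGE and script_text are illustrative only, and the whitespace layout of the page is a guess.

```python
from html.parser import HTMLParser

# The page from the report. The whitespace layout is an assumption,
# since the original indentation was not preserved.
PAGE = """<html>
<head>
<title>Not-So-Beautiful Soup</title>
</head>
<body>
<script>
function legalJS() {
    var str = '</p>';
    return 0<str.length;
}
</script>
</body>
</html>
"""


class ScriptCollector(HTMLParser):
    """Hypothetical helper: gathers the raw text found inside <script>."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # The parser may deliver script text in several pieces,
        # so collect them all and join afterwards.
        if self.in_script:
            self.chunks.append(data)


collector = ScriptCollector()
collector.feed(PAGE)
collector.close()
script_text = "".join(collector.chunks)
```

If the script body is passed through untouched, script_text should contain both the '</p>' literal and the 0<str.length comparison, which is the behaviour 3.0.7 showed for this page.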