Package: python-beautifulsoup
Version: 3.1.0.1-1
Severity: important

Since the upgrade from 3.0.7 to 3.1.0, BeautifulSoup can no longer
parse HTML pages that contain certain forms of embedded JavaScript.

Here is a small example that parses correctly with 3.0.7.

    <html>
    <head>
    <title>Not-So-Beautiful Soup</title>
    </head>
    <body>
    <script>
    function legalJS() {
        var str = '</p>';
        return 0<str.length;
    }
    </script>
    </body>
    </html>

With 3.1.0, it causes this failure:

  File "./souptest.py", line 7, in <module>
    soup = BeautifulSoup(page)
  File "/var/lib/python-support/python2.5/BeautifulSoup.py", line 1499, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "/var/lib/python-support/python2.5/BeautifulSoup.py", line 1230, in __init__
    self._feed(isHTML=isHTML)
  File "/var/lib/python-support/python2.5/BeautifulSoup.py", line 1263, in _feed
    self.builder.feed(markup)
  File "/usr/lib/python2.5/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/lib/python2.5/HTMLParser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.5/HTMLParser.py", line 226, in parse_starttag
    endpos = self.check_for_whole_start_tag(i)
  File "/usr/lib/python2.5/HTMLParser.py", line 301, in check_for_whole_start_tag
    self.error("malformed start tag")
  File "/usr/lib/python2.5/HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
HTMLParser.HTMLParseError: malformed start tag, at line 9, column 28
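
For reference, a script along the following lines should reproduce the
failure. It is a minimal sketch, not the actual souptest.py (which is not
included in this report): the names PAGE and try_parse are made up for
illustration, and the import assumes the BeautifulSoup 3.x module layout.

```python
import sys

# The sample page from this report, with <html> on line 1.
PAGE = """<html>
<head>
<title>Not-So-Beautiful Soup</title>
</head>
<body>
<script>
function legalJS() {
    var str = '</p>';
    return 0<str.length;
}
</script>
</body>
</html>
"""


def try_parse(page, parser):
    """Call parser(page); return (True, None) on success, (False, exc) on failure."""
    try:
        parser(page)
    except Exception:
        # sys.exc_info() keeps this valid on Python 2.5 as well as newer versions.
        return False, sys.exc_info()[1]
    return True, None


if __name__ == "__main__":
    try:
        # Assumption: BeautifulSoup 3.x, which installs a top-level module
        # named BeautifulSoup.
        from BeautifulSoup import BeautifulSoup
    except ImportError:
        print("BeautifulSoup 3.x is not installed")
    else:
        ok, err = try_parse(PAGE, BeautifulSoup)
        if ok:
            print("parsed OK")
        else:
            print("parse failed: %s" % err)
```

With 3.0.7 installed this should print "parsed OK"; with 3.1.0 it should
report the HTMLParseError shown in the traceback.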

-- System Information:
Debian Release: 5.0
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'stable'), (400, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 2.6.26-1-686 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages python-beautifulsoup depends on:
ii  python                        2.5.2-3    An interactive high-level object-o
ii  python-support                0.8.7      automated rebuilding support for P

python-beautifulsoup recommends no packages.

python-beautifulsoup suggests no packages.

-- no debconf information


