Paweł Widera <mo...@man.poznan.pl> added the comment:

A simple workaround for BeautifulSoup is the following wrapper. It
sanitizes the JavaScript code before passing it to the parser by joining
the split string literals, so that "</scr"+"ipt>" becomes "</script>".

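import copy
import re
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3.x

def bs(markup):
        # Remove every "+" so that "</scr"+"ipt>" becomes "</script>".
        pattern = re.compile(r'"\+"')
        # Copy the default massage rules so the class attribute stays untouched.
        massage = copy.copy(BeautifulSoup.MARKUP_MASSAGE)
        massage.append((pattern, lambda match: ""))
        return BeautifulSoup(markup, markupMassage=massage)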
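
The effect of the pattern can be checked on its own, without
BeautifulSoup. The following is only a minimal sketch using the standard
re module; the helper name join_split_strings and the sample markup are
made up for illustration and are not part of the workaround above.

```python
import re

# Same pattern as in the wrapper: closing quote, plus sign, opening quote.
PATTERN = re.compile(r'"\+"')

def join_split_strings(markup):
    """Hypothetical helper: drop every "+" so adjacent literals merge."""
    return PATTERN.sub("", markup)

sample = '<script>document.write("<scr"+"ipt src=\'a.js\'></scr"+"ipt>");</script>'
print(join_split_strings(sample))
```

Note that the pattern only matches double-quoted literals joined without
whitespace around the plus sign; something like "a" + "b" is left
unchanged.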

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue670664>
_______________________________________