New submission from Ezio Melotti <ezio.melo...@gmail.com>:

The attached patch fixes a few problems with HTMLParser on 2.7.
Instead of raising an error when invalid markup is detected, the parser now 
consumes the invalid input and proceeds.  This patch is a partial backport of 
#1486713.

After this, two more patches will follow.
The first will get rid of errors raised while parsing declarations and should 
also solve #13576:
     def unknown_decl(self, data):
-        self.error("unknown declaration: %r" % (data,))
+        pass

The second will take care of "bogus comments" (see #13960).

Once this is done, HTMLParser should be able to parse (almost) everything.  I'm 
planning to commit this before the release of 2.7.3.
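
For illustration only, here is a minimal sketch of what the unknown_decl change 
means for calling code.  It is not part of the attached patch: the LenientParser 
class and the sample input are hypothetical, and whether unknown_decl is actually 
invoked for this input depends on the interpreter version.  The point is that a 
handler which records (or ignores) an unrecognised declaration lets the parser 
keep going instead of stopping with an error.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2.7


class LenientParser(HTMLParser):
    """Hypothetical subclass: records unknown declarations instead of failing."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.ignored_decls = []  # declarations the parser did not recognise
        self.text = []           # character data seen between tags

    def unknown_decl(self, data):
        # Same spirit as replacing self.error(...) with "pass",
        # except that the declaration is kept for inspection.
        self.ignored_decls.append(data)

    def handle_data(self, data):
        self.text.append(data)


parser = LenientParser()
# A Microsoft-style conditional section, a common source of
# "unknown declaration" errors in real-world pages.
parser.feed("<p>before<![if !IE]>middle<![endif]>after</p>")
parser.close()

print("".join(parser.text))
print(parser.ignored_decls)
```

The text around the odd declarations is still delivered to handle_data, so 
code that only cares about the page content no longer needs to wrap feed() in a 
try/except just to survive this kind of markup.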

----------
assignee: ezio.melotti
components: Library (Lib)
files: issue13987.diff
keywords: patch
messages: 153043
nosy: benjamin.peterson, eric.araujo, ezio.melotti, r.david.murray
priority: normal
severity: normal
stage: patch review
status: open
title: Handling of broken markup in HTMLParser on 2.7
type: behavior
versions: Python 2.7
Added file: http://bugs.python.org/file24475/issue13987.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13987>
_______________________________________