Ezio Melotti <[email protected]> added the comment:
HTMLParser is supposed to follow the HTML5 standard, and never raise an error.
For the example in the first comment ("<![hi world]>"), the steps should be:
* https://html.spec.whatwg.org/multipage/parsing.html#data-state:tag-open-state
* https://html.spec.whatwg.org/multipage/parsing.html#tag-open-state:markup-declaration-open-state
* https://html.spec.whatwg.org/multipage/parsing.html#markup-declaration-open-state:bogus-comment-state
* https://html.spec.whatwg.org/multipage/parsing.html#bogus-comment-state
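
To make the path through those states concrete, here is a minimal sketch. It is not HTMLParser code: `trace_bogus_comment` is a hypothetical helper that models only the single branch relevant here and ignores the real comment, DOCTYPE and CDATA branches.

```python
def trace_bogus_comment(text):
    """Return (states_visited, comment_data) for `text`.

    Hypothetical illustration only: it follows the data -> tag open ->
    markup declaration open -> bogus comment path and nothing else.
    """
    path = ["data"]
    if not text.startswith("<"):
        return path, None
    path.append("tag open")
    if not text.startswith("<!"):
        return path, None
    path.append("markup declaration open")
    rest = text[2:]
    # "--", "DOCTYPE" and "[CDATA[" lead to other states, not modeled here.
    if (rest.startswith("--")
            or rest[:7].upper() == "DOCTYPE"
            or rest.startswith("[CDATA[")):
        return path, None
    path.append("bogus comment")
    # The bogus comment state collects everything up to the next ">".
    end = rest.find(">")
    data = rest if end == -1 else rest[:end]
    return path, data


path, data = trace_bogus_comment("<![hi world]>")
print(" -> ".join(path))
print(repr(data))
```

Under this reading the input ends up as a comment whose data is "[hi world]", with no error raised.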
I agree that the error should be fixed by setting `match` to None, and a test
case that triggers the UnboundLocalError (before the fix) should be added as
well (the one provided by Karthikeyan looks good).
However, it also seems wrong that HTMLParser ends up calling self.error()
through Lib/_markupbase.py ParserBase after HTMLParser.error() and all the
calls to it have been removed. _markupbase.py is internal, so it should be
safe to remove ParserBase.error() and the code that calls it as suggested in
#31844 (and possibly to merge _markupbase into html.parser too). Even if this
is done and the call to self.error() is removed from
ParserBase.parse_marked_section(), `match` still needs to be set to None
(either in the `else` branch or before the `if/elif` block).
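
A minimal sketch of the "before the `if/elif` block" variant follows. It is a simplified stand-in, not the code in Lib/_markupbase.py: the function name, the two regexes and the keyword sets are assumptions made for illustration.

```python
import re

# Hypothetical stand-ins for the closing-delimiter patterns.
STANDARD_CLOSE = re.compile(r"\]\s*\]\s*>")   # matches "]]>"
MS_CLOSE = re.compile(r"\]\s*>")              # matches "]>"


def find_marked_section_end(rawdata, start, keyword):
    """Return the index just past the section's closing delimiter, or -1."""
    # Initializing `match` here means the unknown-keyword case can no
    # longer hit an UnboundLocalError at the `if not match` check.
    match = None
    if keyword in {"temp", "cdata", "ignore", "include", "rcdata"}:
        match = STANDARD_CLOSE.search(rawdata, start)
    elif keyword in {"if", "else", "endif"}:
        match = MS_CLOSE.search(rawdata, start)
    # Unknown keyword: no error is raised, `match` simply stays None.
    if not match:
        return -1
    return match.end()


print(find_marked_section_end("<![hi world]>", 3, "hi"))
print(find_marked_section_end("<![CDATA[x]]>", 3, "cdata"))
```

Without the `match = None` line, the first call would reach `if not match` with `match` never assigned, which is the UnboundLocalError reported in this issue.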
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue34480>
_______________________________________