New submission from Ross: If convert_charrefs is set to True, the final data section is not returned by feed(). It is held until the next tag is encountered.
---
from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self, convert_charrefs=True)
        self.fed = []
    def handle_starttag(self, tag, attrs):
        print("Encountered a start tag:", tag)
    def handle_endtag(self, tag):
        print("Encountered an end tag :", tag)
    def handle_data(self, data):
        print("Encountered some data :", data)

parser = MyHTMLParser()

parser.feed("foo <a>link</a> bar")
print("")
parser.feed("spam <a>link</a> eggs")
---

gives

Encountered some data : foo
Encountered a start tag: a
Encountered some data : link
Encountered an end tag : a

Encountered some data : barspam
Encountered a start tag: a
Encountered some data : link
Encountered an end tag : a

With 'convert_charrefs = False' it works as expected.

----------
components: Library (Lib)
messages: 233291
nosy: xkjq
priority: normal
severity: normal
status: open
title: html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text
type: behavior
versions: Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23144>
_______________________________________
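
A possible way to see where the held text goes is to call close() after the last feed(). This is only a minimal sketch, not a proposed fix. CollectingParser and its chunks list are hypothetical names introduced for illustration; it records handle_data() calls instead of printing them. On the affected version the trailing " bar" should only show up once close() is called. On an interpreter where feed() already delivers it, close() simply adds nothing.

```python
from html.parser import HTMLParser


class CollectingParser(HTMLParser):
    """Hypothetical helper: records handle_data() calls instead of printing."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


parser = CollectingParser()
parser.feed("foo <a>link</a> bar")

# Snapshot of what feed() alone has delivered so far. Whether the
# trailing " bar" is already included may depend on the Python version.
before_close = list(parser.chunks)

# close() tells the parser the input is finished, so any text it was
# still holding back is passed to handle_data().
parser.close()
after_close = list(parser.chunks)

print("after feed() :", before_close)
print("after close():", after_close)
```

After close() the collected chunks are "foo ", "link" and " bar", so the text is delayed rather than lost. That does not help the streaming case in the report, where more input follows and close() cannot be called between feeds.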