[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-09-09 Thread Larry Hastings
Larry Hastings added the comment: The Misc/NEWS entry for this was added under Python 3.5.0rc3. But, since no pull request has been made for this change, this change hasn't been merged into 3.5.0. It will ship as part of Python 3.5.1. I've moved the Misc/NEWS entry accordingly.

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-09-06 Thread Roundup Robot
Roundup Robot added the comment: New changeset ef82131d0c93 by Ezio Melotti in branch '3.4': #23144: Make sure that HTMLParser.feed() returns all the data, even when convert_charrefs is True. https://hg.python.org/cpython/rev/ef82131d0c93 New changeset 1f6155ffcaf6 by Ezio Melotti in branch

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-09-06 Thread Ezio Melotti
Ezio Melotti added the comment: Fixed, thanks for the report! -- resolution: -> fixed stage: commit review -> resolved status: open -> closed versions: +Python 3.6 -Python 2.7

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-09-04 Thread Ezio Melotti
Ezio Melotti added the comment: I'll try to take care of this during the weekend. Feel free to ping me if I don't.

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-07-29 Thread Robert Collins
Robert Collins added the comment: @ezio I think you should commit what you have so far. LGTM. -- nosy: +rbcollins ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue23144 ___

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-03-08 Thread Ezio Melotti
Ezio Melotti added the comment: "A context manager here would seem a bit strange." I still haven't thought this through, but I can't see any problem with it right now. This would be similar to:

    from contextlib import closing
    with closing(MyHTMLParser()) as parser:
        parser.feed(html)
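A runnable version of the closing() idea above might look like the following sketch. The body of MyHTMLParser and the input string are assumptions made for illustration; the thread does not show them. contextlib.closing() calls parser.close() when the block exits, so any text still buffered is flushed without an explicit close() call.

```python
from contextlib import closing
from html.parser import HTMLParser


class MyHTMLParser(HTMLParser):
    """Hypothetical subclass: collects every chunk passed to handle_data."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


html = "<p>spam</p>eggs"  # assumed input, not taken from the thread

# closing() guarantees parser.close() runs on exit, even if feed() raises.
with closing(MyHTMLParser()) as parser:
    parser.feed(html)

collected = "".join(parser.chunks)
print(collected)
```

This keeps HTMLParser itself unchanged; the flush happens only because closing() invokes the existing close() method.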

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-03-07 Thread Ezio Melotti
Ezio Melotti added the comment: Here is a patch that fixes the problem. Even though calling .close() is the correct solution, I preferred to restore the previous behavior and call handle_data as soon as possible. There is a corner case in which a charref might be cut in half while feeding
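The corner case mentioned above, a character reference split across two feed() calls, can be sketched as follows. The class and the input strings are assumptions chosen for illustration, not part of the patch. The example only checks the text seen after close(), since when each chunk is delivered may differ between Python versions.

```python
from html.parser import HTMLParser


class CollectingParser(HTMLParser):
    """Hypothetical subclass: collects every chunk passed to handle_data."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


parser = CollectingParser()
parser.feed("AT&am")   # the reference "&amp;" is cut in half here
parser.feed("p;T")     # the remainder arrives in a later chunk
parser.close()         # flush anything the parser is still holding

text = "".join(parser.chunks)
print(text)
```

If the parser passed "AT&am" to handle_data immediately, the reference could not be converted; holding the text back until the reference is complete is what makes the combined result come out right.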

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-03-07 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: Removed file: http://bugs.python.org/file38376/issue23144.diff

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-03-07 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: Added file: http://bugs.python.org/file38380/issue23144.diff

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-03-07 Thread Martin Panter
Martin Panter added the comment: I still think it would be worthwhile adding close() calls to the examples in the documentation (Doc/library/html.parser.rst). BTW I haven’t tested this, and maybe it is not a concern, but even with this patch it looks like the parser will buffer unlimited data

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-03-07 Thread Ezio Melotti
Ezio Melotti added the comment: "I still think it would be worthwhile adding close() calls to the examples in the documentation (Doc/library/html.parser.rst)." If I add context manager support to HTMLParser I can update the examples to use it, but otherwise I don't think it's worth changing

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-03-07 Thread Martin Panter
Martin Panter added the comment: A context manager here would seem a bit strange. Is there any precedent for using context managers with feed parsers? The two others that come to mind are ElementTree.XMLParser and email.parser.FeedParser. These two build an object while parsing, and close()
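For comparison, here is a short sketch of how the two standard-library feed parsers named above are typically driven. In both, close() ends the parse and returns the object that was built, which is a different role from HTMLParser.close(). The input strings are made up for illustration.

```python
from email.parser import FeedParser
from xml.etree.ElementTree import XMLParser

# email: feed text in arbitrary chunks; close() returns the Message object.
email_parser = FeedParser()
email_parser.feed("Subject: hel")
email_parser.feed("lo\n\nbody text\n")
message = email_parser.close()

# ElementTree: feed XML in arbitrary chunks; close() returns the root element.
xml_parser = XMLParser()
xml_parser.feed("<root><chi")
xml_parser.feed("ld/></root>")
root = xml_parser.close()

print(message["Subject"], root.tag)
```

Because the return value of close() is the whole point of these parsers, a with block that discards it would be an awkward fit, which may be the concern behind the question.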

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-01-09 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: -- assignee: docs@python -> ezio.melotti

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-01-01 Thread Martin Panter
Martin Panter added the comment: You “forgot” to call close():

    parser.close()
    Encountered some data : eggs

Perhaps this is a documentation bug, since there is a lot of example code given, but none of the examples call close(). -- assignee: -> docs@python components: +Documentation

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-01-01 Thread R. David Murray
Changes by R. David Murray rdmur...@bitdance.com: -- nosy: +ezio.melotti, r.david.murray

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-01-01 Thread Ross
New submission from Ross: If convert_charrefs is set to True, the final data section is not returned by feed(). It is held until the next tag is encountered.

---
from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self,
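Since the reporter's snippet is cut off above, the following is only a minimal sketch of the kind of reproduction the report describes. The class name, its body, and the input string are assumptions, not the reporter's code. It records what handle_data has received after feed() and again after close().

```python
from html.parser import HTMLParser


class CollectingParser(HTMLParser):
    """Hypothetical subclass: collects every chunk passed to handle_data."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


parser = CollectingParser()
parser.feed("<p>spam</p>eggs")      # assumed input: text after the last tag
after_feed = list(parser.chunks)    # on affected versions "eggs" may be absent here
parser.close()                      # close() flushes any text still buffered
after_close = list(parser.chunks)

print("after feed(): ", after_feed)
print("after close():", after_close)
```

On an affected release the trailing text would be expected to appear only in the second list; on releases that include the fix it should already be present after feed().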

[issue23144] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text

2015-01-01 Thread Ross
Ross added the comment: That would make sense. Might also be worth mentioning the difference in behaviour with convert_charrefs = True/False as that was what led me to think this was a bug.