Bugs item #1504676, was opened at 2006-06-12 06:41
Message generated for change (Comment added) made by fdrake
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1504676&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Submitted By: Sam Ruby (rubys)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Make sgmllib char and entity references pluggable

Initial Comment:
The changes being made to sgmllib in Python 2.5 may break existing
applications. This patch makes it easy for subclasses to revert to the
old behavior.

Additionally, it makes it easier to provide new behaviors, like
supporting unicode, hexadecimal character references, and additional
entities.

----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2006-06-14 09:06

Message:
Logged In: YES
user_id=3066

Thanks. I'll look into this again tonight.

----------------------------------------------------------------------

Comment By: Sam Ruby (rubys)
Date: 2006-06-14 09:02

Message:
Logged In: YES
user_id=141556

Note that the pre-existing code transforms tag data twice.
Ideally, the handling for entities in attributes and data
would be unified.

----------------------------------------------------------------------

Comment By: Sam Ruby (rubys)
Date: 2006-06-14 08:59

Message:
Logged In: YES
user_id=141556

updated patch with test case.

Note that in the pre-existing code tag data values are
transformed twice -- this should be corrected and ideally
the code for handling references should be unified.

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2006-06-14 01:14

Message:
Logged In: YES
user_id=3066

This patch certainly makes the subclass interface nicer; I
like that.

There is a case that it breaks (foolishly not covered by the
existing tests, but clear on reading the patch that it broke).
I've added the relevant test in this change:

http://mail.python.org/pipermail/python-checkins/2006-June/053975.html

The problem with the patch is that attribute values are
transformed twice (once for entity refs, once for character
refs), instead of just once, so entity ref expansions can
cause character refs to be located that aren't in the markup.

I'm out of time tonight, but should be able to make this
patch work with the additional tests tomorrow night if sruby
doesn't beat me to it.

Documentation and tests for the subclass interface changes
are still needed as well.

----------------------------------------------------------------------
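
For readers of the archive, the double-transformation problem described in the thread can be illustrated with a small standalone sketch. This is not the sgmllib code or the patch under discussion: the entity table, the regular expressions, and the function names expand_two_pass and expand_single_pass are all hypothetical, chosen only to show why expanding entity refs first and character refs second can produce a character ref that was never in the markup.

```python
import re

# Hypothetical entity table for illustration only; not sgmllib's table.
ENTITIES = {"amp": "&", "lt": "<", "gt": ">", "quot": '"'}

ENTITY_REF = re.compile(r"&([a-zA-Z][a-zA-Z0-9]*);")
CHAR_REF = re.compile(r"&#([0-9]+);")
# Matches either kind of reference, so each one is handled exactly once.
ANY_REF = re.compile(r"&(?:#([0-9]+)|([a-zA-Z][a-zA-Z0-9]*));")


def expand_two_pass(value):
    """Expand entity refs, then character refs over the result.

    The second pass can see text produced by the first pass, which is
    the failure mode described in the thread.
    """
    value = ENTITY_REF.sub(
        lambda m: ENTITIES.get(m.group(1), m.group(0)), value)
    return CHAR_REF.sub(lambda m: chr(int(m.group(1))), value)


def expand_single_pass(value):
    """Expand every reference once, scanning left to right."""
    def replace(match):
        number, name = match.groups()
        if number is not None:
            return chr(int(number))
        # Unknown entity names are left untouched.
        return ENTITIES.get(name, match.group(0))
    return ANY_REF.sub(replace, value)


markup = "&amp;#65;"  # an escaped ampersand followed by the text "#65;"
print(expand_two_pass(markup))     # A
print(expand_single_pass(markup))  # &#65;
```

In the two-pass version, "&amp;" becomes "&", and the resulting "&#65;" is then mistaken for a character reference. The single-pass version never rescans its own output, so the attribute value keeps the literal text the author wrote.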