New submission from Daniel Lovell <lovell.danie...@gmail.com>:
html.entities.html5 keys should require a trailing semicolon.

The Python docs say:

html.entities.html5
"A dictionary that maps HTML5 named character references [1] to the equivalent Unicode character(s), e.g. html5['gt;'] == '>'. Note that the trailing semicolon is included in the name (e.g. 'gt;'), however some of the names are accepted by the standard even without the semicolon: in this case the name is present with and without the ';'. See also html.unescape()."

https://docs.python.org/3/library/html.entities.html?highlight=html

However, it is not clear, without looking at the source, which keys require the semicolon and which do not. Looking at the source, the keys that require a trailing semicolon vastly outnumber those that do not.

For simplicity and consistency with the w3.org HTML5 Character Entity Reference Chart, I recommend that the trailing semicolon be required, as it is in HTML5:
https://dev.w3.org/html5/html-author/charref

My recommendation could then be extrapolated to say we should require the ampersand as HTML5 does, but I don't think this revision should be taken that far unless others agree.

----------
components: Library (Lib)
messages: 329105
nosy: daniellovell
priority: normal
severity: normal
status: open
title: html.entities.html5 should require a trailing semicolon
type: behavior
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35142>
_______________________________________
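
For anyone who wants to check which html5 keys are present without the semicolon, without reading the source, a minimal sketch along these lines should work. It only uses the documented html.entities.html5 dictionary; the variable names (without_semicolon, with_semicolon, missing) are my own and purely illustrative.

```python
from html.entities import html5

# Keys stored without a trailing ';' (names the standard also accepts bare).
without_semicolon = sorted(name for name in html5 if not name.endswith(';'))

# Keys stored with the trailing ';'.
with_semicolon = sorted(name for name in html5 if name.endswith(';'))

print(len(with_semicolon), "keys end with ';'")
print(len(without_semicolon), "keys have no trailing ';'")
print("sample of keys without ';':", without_semicolon[:10])

# The docs say a bare name is present both with and without the ';'.
# This lists any bare key that lacks a ';' counterpart, as a sanity check.
missing = [name for name in without_semicolon if name + ';' not in html5]
print("bare keys with no ';' counterpart:", missing)
```

Running this against a given Python version shows the actual split between the two groups, which may be useful when judging how disruptive it would be to drop the bare names.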