New submission from Ezio Melotti:

A JSON file containing all the HTML5 entities is now available at 
http://dev.w3.org/html5/spec/entities.json.
I checked it from the interpreter against the values in 
html.entities.html5, and a dozen entities need to be updated:

>>> import json
>>> s = json.load(open('entities.json'))
>>> from html.entities import html5
>>> for (k1,i1),(k2,i2) in zip(sorted(s.items()), sorted(html5.items())):
...   if i1['characters'] != i2:
...     (k1, k2, i1['characters'], i2, i1['codepoints'], list(map(ord, i2)))
... 
('&DotDot;', 'DotDot;', '⃜', '◌⃜', [8412], [9676, 8412])
('&DownBreve;', 'DownBreve;', '̑', '◌̑', [785], [9676, 785])
('&LeftAngleBracket;', 'LeftAngleBracket;', '⟨', '〈', [10216], [9001])
('&NewLine;', 'NewLine;', '\n', '␊', [10], [9226])
('&RightAngleBracket;', 'RightAngleBracket;', '⟩', '〉', [10217], [9002])
('&Tab;', 'Tab;', '\t', '␉', [9], [9225])
('&TripleDot;', 'TripleDot;', '⃛', '◌⃛', [8411], [9676, 8411])
('&lang;', 'lang;', '⟨', '〈', [10216], [9001])
('&langle;', 'langle;', '⟨', '〈', [10216], [9001])
('&rang;', 'rang;', '⟩', '〉', [10217], [9002])
('&rangle;', 'rangle;', '⟩', '〉', [10217], [9002])
('&tdot;', 'tdot;', '⃛', '◌⃛', [8411], [9676, 8411])
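
The check above pairs the two tables by sorted position, which only works while both have exactly the same set of names. A minimal sketch of the same comparison done by key follows. The function name diff_entities and the sample data are made up for illustration, and it assumes entities.json keys carry a leading '&' that the html5 keys lack:

```python
def diff_entities(spec, table):
    """Return {name: (expected, actual)} where table disagrees with spec.

    spec  -- parsed entities.json, assumed shaped like
             {'&name;': {'characters': ..., 'codepoints': [...]}}
    table -- dict shaped like html.entities.html5: {'name;': characters}
    """
    diffs = {}
    for key, info in spec.items():
        # Drop the leading '&' so the name matches the html5 key style.
        name = key[1:] if key.startswith('&') else key
        actual = table.get(name)  # None when the name is missing from table
        if actual != info['characters']:
            diffs[name] = (info['characters'], actual)
    return diffs


# Hypothetical two-entry sample standing in for the real files.
sample_spec = {
    '&lang;': {'codepoints': [10216], 'characters': '\u27e8'},
    '&amp;': {'codepoints': [38], 'characters': '&'},
}
sample_table = {'lang;': '\u2329', 'amp;': '&'}
print(diff_entities(sample_spec, sample_table))
```

With the real data, spec would come from json.load() on entities.json and table would be html.entities.html5. Names present in the JSON file but absent from the dict show up with an actual value of None instead of shifting every later pair.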

The Tools/scripts/parseentities.py script should also be updated (or possibly a 
new, separate script should be added), so it can be used to generate the html5 
dict.  I'm setting this as a release blocker so that the update gets done before 
the release (other values might change in the meantime).
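
As a rough idea of what generating the dict from entities.json could look like, here is a minimal sketch. It is not the existing parseentities.py; the function name format_html5_dict and the output layout are assumptions, and it again assumes the JSON keys have a leading '&' that the html5 keys do not:

```python
def format_html5_dict(spec):
    """Return Python source text defining an html5 dict built from spec.

    spec is assumed to be parsed entities.json:
    {'&name;': {'characters': ..., 'codepoints': [...]}}
    """
    lines = ['html5 = {']
    for key in sorted(spec):
        # Drop the leading '&' to match the html.entities.html5 key style.
        name = key[1:] if key.startswith('&') else key
        # ascii() escapes non-ASCII characters, keeping the output pure ASCII.
        value = ascii(spec[key]['characters'])
        lines.append('    {!r}: {},'.format(name, value))
    lines.append('}')
    return '\n'.join(lines) + '\n'


# Hypothetical sample standing in for the real entities.json content.
sample = {
    '&lang;': {'codepoints': [10216], 'characters': '\u27e8'},
    '&amp': {'codepoints': [38], 'characters': '&'},
}
print(format_html5_dict(sample))
```

Writing escaped values rather than literal characters keeps combining characters such as the one for DotDot; readable in the generated source.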

----------
assignee: ezio.melotti
messages: 173021
nosy: ezio.melotti
priority: release blocker
severity: normal
stage: needs patch
status: open
title: Update html.entities.html5 dictionary and parseentities.py
type: behavior
versions: Python 3.3, Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16245>
_______________________________________