Re: [whatwg] HTML5 named entity Gt; and Lt;
On Wed, 14 Dec 2011, Mike Samuel wrote:
> The table in section 12.5 ( http://www.whatwg.org/specs/web-apps/current-work/multipage/named-character-references.html ) says
>
>   GT;  U+0003E
>   Gt;  U+0226B ≫
>   gt;  U+0003E
>   GT   U+0003E
>   gt   U+0003E
>
> which I believe means that GT;, gt;, GT, and gt all encode U+003E but Gt; encodes U+226B MUCH GREATER-THAN.

Correct.

> Similarly
>
>   Lt;  U+0226A ≪

Correct.

> This is a potential source of confusion for naive HTML entity decoders that fall back to case-insensitive matching when there is no mapping for a given entity name.

Such decoders are non-conforming.

> MathML already has other succinct mappings for U+226A (ll;) and U+226B (gg;). Could HTML5 avoid confusion by deprecating Lt; and Gt; in favor of ll; and gg; or remove them entirely?

The mappings in the HTML standard are actually the MathML mappings. We literally use the same database they do to automatically generate the mapping in the spec.

On Wed, 14 Dec 2011, Ilhan Y. wrote:
> By the way, can we have Unicode names (HTML names) for Mercury, Sun, Earth and other planets. They are used by many astronomers on the internet.

The named character references used in HTML are just those provided to us by the MathML working group, so if you actually want a change here, I recommend contacting that group. In general though I doubt we will add more names. It's gotten rather out of hand.

On Wed, 14 Dec 2011, Jukka K. Korpela wrote:
> After all, there is no rationale given for the inclusion of new “named character references,” so people might see the idea as asking authors to submit new proposals for every possible and impossible character.

The rationale is compatibility with deployed MathML content.

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
[whatwg] HTML5 named entity Gt; and Lt;
The table in section 12.5 ( http://www.whatwg.org/specs/web-apps/current-work/multipage/named-character-references.html ) says

  GT;  U+0003E
  Gt;  U+0226B ≫
  gt;  U+0003E
  GT   U+0003E
  gt   U+0003E

which I believe means that GT;, gt;, GT, and gt all encode U+003E but Gt; encodes U+226B MUCH GREATER-THAN.

http://svn.whatwg.org/webapps/entities-unicode.inc includes these but the entities-legacy.inc does not.

Similarly

  Lt;  U+0226A ≪

This is a potential source of confusion for naive HTML entity decoders that fall back to case-insensitive matching when there is no mapping for a given entity name.

MathML already has other succinct mappings for U+226A (ll;) and U+226B (gg;). Could HTML5 avoid confusion by deprecating Lt; and Gt; in favor of ll; and gg; or remove them entirely?

http://www.google.com/codesearch#search/q=amp;Gt;%20file:.html$%20case:yestype=cs shows four files using Gt;, two of which treat it as synonymous with gt;.
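The fallback concern raised above can be sketched in a few lines of Python. This is only an illustration, not any real decoder: the `LEGACY` table and the `naive_decode` helper are hypothetical, invented here to stand in for a decoder that knows only the older names and retries case-insensitively. The conforming side uses the `html5` table and `html.unescape` from the Python standard library, which follow the HTML named character reference list.

```python
import html
from html.entities import html5  # HTML5 names (without "&") mapped to characters
from typing import Optional

# Hypothetical legacy table: an assumption for illustration, standing in for
# a decoder written before the MathML-derived names were added.
LEGACY = {"gt;": ">", "lt;": "<", "amp;": "&"}


def naive_decode(name: str) -> Optional[str]:
    """Exact lookup, then a case-insensitive fallback (non-conforming)."""
    if name in LEGACY:
        return LEGACY[name]
    return LEGACY.get(name.lower())


# Compare the case-sensitive HTML5 mapping with the naive fallback.
for name in ("gt;", "GT;", "Gt;", "gg;", "lt;", "Lt;", "ll;"):
    conforming = html5[name]
    print(f"{name:<4} conforming=U+{ord(conforming):04X} naive={naive_decode(name)!r}")

# html.unescape applies the same case-sensitive table to full references.
assert html.unescape("&Gt;") == "\u226b"
assert html.unescape("&gt;") == ">"
```

Under these assumptions the naive decoder turns Gt; into a plain greater-than sign, while the case-sensitive lookup yields U+226B, the same character as gg;.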
Re: [whatwg] HTML5 named entity Gt; and Lt;
By the way, can we have Unicode names (HTML names) for Mercury, Sun, Earth and other planets. They are used by many astronomers on the internet.

On Wed, Dec 14, 2011 at 7:18 PM, Mike Samuel mikesam...@gmail.com wrote:
> The table in section 12.5 ( http://www.whatwg.org/specs/web-apps/current-work/multipage/named-character-references.html ) says
>
>   GT;  U+0003E
>   Gt;  U+0226B ≫
>   gt;  U+0003E
>   GT   U+0003E
>   gt   U+0003E
>
> which I believe means that GT;, gt;, GT, and gt all encode U+003E but Gt; encodes U+226B MUCH GREATER-THAN.
>
> http://svn.whatwg.org/webapps/entities-unicode.inc includes these but the entities-legacy.inc does not.
>
> Similarly
>
>   Lt;  U+0226A ≪
>
> This is a potential source of confusion for naive HTML entity decoders that fall back to case-insensitive matching when there is no mapping for a given entity name.
>
> MathML already has other succinct mappings for U+226A (ll;) and U+226B (gg;). Could HTML5 avoid confusion by deprecating Lt; and Gt; in favor of ll; and gg; or remove them entirely?
>
> http://www.google.com/codesearch#search/q=amp;Gt;%20file:.html$%20case:yestype=cs shows four files using Gt;, two of which treat it as synonymous with gt;.
Re: [whatwg] HTML5 named entity Gt; and Lt;
2011-12-14 19:34, Ilhan Y. wrote:
> By the way, can we have Unicode names (HTML names) for Mercury, Sun, Earth and other planets. They are used by many astronomers on the internet.

Nice parody! But maybe people won’t take it as parody. After all, there is no rationale given for the inclusion of new “named character references,” so people might see the idea as asking authors to submit new proposals for every possible and impossible character.

The whole idea of extending the repertoire is wrong. We have lived with a certain set of entity references (now being renamed “named character references”), widely supported by browsers, except possibly in XHTML mode. Authors who need other characters can enter them as such, using UTF-8 (which is being favored, is it not?) or using numeric character references.

So nobody really needs any added pseudo-mnemonic “named references,” and they just cause incompatibility: pages fail on most browsers, when they would work perfectly if other methods of including characters had been used.

Allowing gt and GT and GT; as synonyms for gt; might be pragmatic, if there is sufficient evidence of their use on legacy pages, but code checkers should issue a warning (there is nothing to be gained by using such deviating forms). And adding things like Gt;, with a different meaning, is just asking for trouble.

Yucca
Re: [whatwg] HTML5 named entity Gt; and Lt;
On Wed, 14 Dec 2011 19:40:04 +0100, Jukka K. Korpela jkorp...@cs.tut.fi wrote:
> The whole idea of extending the repertoire is wrong. We have lived with a certain set of entity references (now being renamed “named character references”), widely supported by browsers, except possibly in XHTML mode. Authors who need other characters can enter them as such, using UTF-8 (which is being favored, is it not?) or using numeric character references.

Personally, I like named entities; I use them all the time to get the correct Unicode code point (e.g. data:text/html,&middot;). That is often faster than looking the character up somehow.

--
Anne van Kesteren
http://annevankesteren.nl/
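The lookup trick described above, going from a name to its code point, can also be done outside a browser. The following is a minimal sketch using only the Python standard library; the helper name `describe` is an assumption made for this example and is not something mentioned in the thread.

```python
import html
import unicodedata


def describe(reference: str) -> str:
    """Decode a character reference and report each resulting code point."""
    decoded = html.unescape(reference)
    return " ".join(
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in decoded
    )


print(describe("&middot;"))  # U+00B7 MIDDLE DOT
print(describe("&Gt;"))      # U+226B MUCH GREATER-THAN
```

Numeric character references pass through the same helper, so `describe("&#x226A;")` reports the character that Lt; and ll; name.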