Re: pre-HTML5 and the BOM

Jukka K. Korpela Tue, 17 Jul 2012 21:40:51 -0700

2012-07-18 5:09, "Martin J. Dürst" wrote:

Well, the "considered" in the BOM case applies to everybody (including
the W3C), but in the character references case, it applies only to
people who didn't understand how things were working. In fact, although
RFC 2070 and HTML4 clearly nailed down the interpretation of numeric
character references to Unicode, there were implementations (the ones I
know were in the mobile space) past 2000.

I presume that you mean that there were *faulty* implementations, with
wrong interpretations of numbers in character references. That’s true,
but it’s a different issue. What I meant is that it was widely said, and
it is still said by many people, that entity references like “&aring;”
were safer than directly entering characters like “å”. Part of this was
that not all transmissions were 8-bit safe; another part was that it was
not clear what encodings user agents can handle, so ASCII + entities was
described as the safe solution.

And these safety considerations have long ago been reversed, just as
with the BOM.
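
For illustration only, here is a minimal Python sketch of what the old
"ASCII + entities" advice amounts to: every non-ASCII character is
replaced by a numeric character reference, so the transmitted bytes are
7-bit clean, and a conforming user agent maps the references back to
Unicode. The sample string and variable names are made up for this
example; they are not from the thread.

```python
import html

# Hypothetical sample text containing non-ASCII characters.
text = "Malmö, å"

# The old "safe" form: pure ASCII, with each non-ASCII character
# written as a numeric character reference (a Unicode code point).
ascii_safe = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_safe)  # Malm&#246;, &#229;

# What a user agent is expected to do: interpret numeric and named
# references as Unicode characters.
print(html.unescape(ascii_safe))  # Malmö, å
print(html.unescape("&aring;"))   # å

# The direct form: the characters themselves, in a declared encoding.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes.decode("utf-8") == text)  # True
```

Both forms carry the same text; the difference is only in what the
transport and the receiving software must be able to handle.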

To take a more modern example, the native e-mail client on my Android
seems to systematically display character and entity references
literally when displaying message headers with small excerpts of
content, even though it correctly interprets them when displaying the
message itself.


The reason for this may simply be that email bodies can be in HTML, but
that there is no way at all to use HTML in email header fields.

Good guess, but the &aring; etc. do not appear in headers; they are
excerpts from the body in HTML format, which is sort-of parsed but
apparently without interpreting entity references. And this is modern
software, not Netscape 1.
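
A purely hypothetical sketch of how such a symptom could arise (I have
no knowledge of how the Android client is actually implemented): an
excerpt routine that strips tags from the HTML body but skips the
decoding step leaves the references visible as literal text. The
function names and the sample body below are invented for the example,
and the regular expression is a crude stand-in for real HTML parsing.

```python
import html
import re

# Crude tag stripper; real HTML parsing is more involved than this.
TAG_RE = re.compile(r"<[^>]+>")

def naive_excerpt(body_html: str, limit: int = 80) -> str:
    """Strip tags but leave character and entity references undecoded."""
    return TAG_RE.sub("", body_html)[:limit]

def decoded_excerpt(body_html: str, limit: int = 80) -> str:
    """Strip tags, decode references, and only then truncate."""
    return html.unescape(TAG_RE.sub("", body_html))[:limit]

# Hypothetical message body using named and numeric references.
body = "<p>Hyv&auml;&auml; p&auml;iv&auml;&auml;, &#229;!</p>"

print(naive_excerpt(body))    # Hyv&auml;&auml; p&auml;iv&auml;&auml;, &#229;!
print(decoded_excerpt(body))  # Hyvää päivää, å!
```

Decoding before truncating also avoids cutting a reference in half at
the excerpt boundary.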


Yucca
