[whatwg] document.write("\r"): the spec doesn't say how to handle it.

David Flanagan Wed, 02 Nov 2011 17:11:46 -0700

The spec for document.write() http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#dom-document-write says: "... have the tokenizer process the characters that were inserted, one at a time, processing resulting tokens as they are emitted, and stopping when the tokenizer reaches the insertion point..."

But what happens if the last character written by document.write() is a carriage return?

The HTML parsing spec says that CR followed by LF is ignored but CR followed by anything else is converted to LF. So if the last character is CR, then the tokenizer can't process all characters up to the insertion point because it needs to look ahead at the next character, right?

Firefox, Chrome and Safari all seem to do the right thing: wait for the next character before tokenizing the CR. And I think this means that the description of document.write needs to be changed. (Opera, on the other hand, just gets this wrong and emits a CR character).
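
To make the lookahead concrete, here is a minimal sketch of a streaming newline normalizer that holds back a trailing CR until the next character arrives. The names NewlineNormalizer, write, end and pendingCR are hypothetical; this is not taken from any browser or from the spec text, only an illustration of the behavior described above.

```javascript
// Hypothetical sketch: defers a trailing CR until the next character is known.
class NewlineNormalizer {
  constructor() {
    this.pendingCR = false; // true when the last character seen was a CR
  }

  // Feed one chunk (for example, the argument of one document.write() call)
  // and return the characters that can be handed to the tokenizer right now.
  write(chunk) {
    let out = "";
    for (const ch of chunk) {
      if (this.pendingCR) {
        this.pendingCR = false;
        // A CR followed by LF is dropped; a CR followed by anything else
        // is converted to LF.
        if (ch !== "\n") out += "\n";
      }
      if (ch === "\r") {
        this.pendingCR = true; // cannot decide yet, wait for the next character
      } else {
        out += ch;
      }
    }
    return out;
  }

  // Assumption: at end of input a held CR has no successor, so emit LF.
  end() {
    const out = this.pendingCR ? "\n" : "";
    this.pendingCR = false;
    return out;
  }
}

const normalizer = new NewlineNormalizer();
const first = normalizer.write("a\r");  // "a"   (the CR is held back)
const second = normalizer.write("\nb"); // "\nb" (CR dropped, LF kept)
```

The point of the sketch is that the first call cannot consume everything up to the insertion point: the CR stays pending until a later write (or end of input) resolves it.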

Similarly, what should the tokenizer do if the document.write emits half of a UTF-16 surrogate pair as the last character?
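
For illustration, the surrogate case can be reproduced by splitting one astral character across two writes. The helper endsWithHighSurrogate is a hypothetical name, and the sketch only detects the situation; it does not claim to show what the tokenizer should do about it.

```javascript
// Hypothetical helper: true if the text ends with a UTF-16 high (leading)
// surrogate, i.e. a code unit in the range 0xD800..0xDBFF.
function endsWithHighSurrogate(text) {
  if (text.length === 0) return false;
  const last = text.charCodeAt(text.length - 1);
  return last >= 0xd800 && last <= 0xdbff;
}

// One character outside the BMP occupies two UTF-16 code units.
const astral = "\u{1F600}";
const firstWrite = "a" + astral[0]; // ends with the high surrogate only
const secondWrite = astral[1];      // the matching low surrogate

const incomplete = endsWithHighSurrogate(firstWrite); // true
const rejoined = firstWrite + secondWrite;            // "a" + astral
```

As with the trailing CR, the last code unit of the first write cannot be interpreted on its own until the next write supplies (or fails to supply) the other half of the pair.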


    David
