Edward Z. Yang wrote:
The reason I'd like to know this is because I am the author of a tool named HTML Purifier, which takes user-input HTML and cleans it for standards-compliance as well as XSS. We insist on output being standards compliant, because the result is unambiguous.
Nothing in section 8 is going to ensure that you get output that passes a conformance check. If you transform the output into something that is conforming, you have to make up the rules yourself, so you have just shifted the ambiguity from the client (where it will hopefully disappear in a few years, once the HTML5 algorithm has large-scale adoption) to the sanitizer implementation.
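
To illustrate what "making up the rules yourself" can look like, here is a minimal Python sketch. It is not how HTML Purifier or the HTML5 parsing algorithm works; the `repair` function and its two policy names ("drop" and "unwind") are hypothetical, invented only for this example. Both policies turn the same mis-nested input into well-nested markup, yet they disagree about which text ends up inside which element.

```python
import re

# Toy tokenizer: lowercase tags without attributes, or runs of text.
TOKEN = re.compile(r"</?([a-z]+)>|[^<]+")

def repair(html, policy):
    """Return well-nested markup for a small tag-soup input.

    policy="drop":   ignore an end tag that does not match the innermost
                     open element.
    policy="unwind": close open elements until the matching one is closed.
    Both policies are illustrative assumptions, not rules from any spec.
    """
    out, stack = [], []
    for m in TOKEN.finditer(html):
        tok, name = m.group(0), m.group(1)
        if name is None:
            out.append(tok)                  # plain text
        elif not tok.startswith("</"):
            stack.append(name)               # start tag
            out.append(tok)
        elif stack and stack[-1] == name:
            stack.pop()                      # properly nested end tag
            out.append(tok)
        elif policy == "unwind" and name in stack:
            while True:                      # close everything up to the match
                top = stack.pop()
                out.append(f"</{top}>")
                if top == name:
                    break
        # any other end tag is a stray and is dropped
    while stack:                             # close whatever is still open
        out.append(f"</{stack.pop()}>")
    return "".join(out)

soup = "<b>one<i>two</b>three</i>"
dropped = repair(soup, "drop")      # <b>one<i>twothree</i></b>
unwound = repair(soup, "unwind")    # <b>one<i>two</i></b>three
print(dropped)
print(unwound)
```

Each result is properly nested, but "three" is bold and italic under one policy and unstyled under the other, so the ambiguity now lives in whichever rule the sanitizer author picked.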