Re: [whatwg] Allowing ">" in attribute values

Lachlan Hunt Fri, 25 Jun 2010 05:18:34 -0700

On 2010-06-25 11:46, Skrol29 wrote:

I agree that disallowing ">" chars in attributes greatly simplifies parsing.
Not only with regular expressions, but with any parsing.
If ">" is allowed, it means that in order to find the end of the element
you have to read all the attributes first. This is very costly. Just one
example, but there are many others: let's imagine you'd like to convert an
HTML document into flat text. To simplify your algorithm you've chosen to
retrieve the content of the <body> element and then to delete all elements
in it. This is very fast if ">" is not allowed in attributes because you're
able to find element bounds just by searching for "<" and then ">". But if
">" is allowed, the operation gets much more complicated, and you spend much
more time scanning all elements.

You seem to be conflating document conformance requirements with parsing
requirements. Even if '>' were disallowed in attribute values for
document conformance, parsers would still be required to handle it if it
were present. If your parser doesn't handle it because it just assumes
that '>' is the end of the tag, then your parser is broken. Changing
the parsing requirements such that '>' is treated as the end of a tag,
in places where it's currently treated as part of an attribute value,
would break backwards compatibility.
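
To illustrate the difference, here is a minimal Python sketch comparing the
"search for the next '>'" shortcut with a scan that tracks quoted attribute
values. The function names and the sample string are made up for this
example, and the quote handling is deliberately simplified: it is not the
HTML5 tokenizer, and it ignores comments, script content, and quote
characters that appear inside unquoted values.

```python
def naive_tag_end(html, start):
    """Assume the first '>' after `start` closes the tag.

    This is the shortcut described above; it gives the wrong answer
    when a quoted attribute value contains '>'.
    """
    return html.find(">", start)


def quote_aware_tag_end(html, start):
    """Return the index of the '>' that closes the tag opening at `start`.

    A '>' inside a single- or double-quoted attribute value is skipped.
    Returns -1 if no closing '>' is found. Simplified sketch only.
    """
    quote = None  # the quote character we are currently inside, if any
    for i in range(start + 1, len(html)):
        ch = html[i]
        if quote:
            if ch == quote:
                quote = None  # closing quote ends the attribute value
        elif ch in ('"', "'"):
            quote = ch  # opening quote starts an attribute value
        elif ch == ">":
            return i  # unquoted '>' ends the tag
    return -1


sample = '<a title="x > y" href="#">link</a>'
naive_end = naive_tag_end(sample, 0)
aware_end = quote_aware_tag_end(sample, 0)

print(sample[: naive_end + 1])  # tag cut short inside the title value
print(sample[: aware_end + 1])  # the complete start tag
```

The quote-aware version is still a single left-to-right pass over the tag,
so the extra cost over the naive search is one small piece of state per
character rather than a separate parse of every attribute.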


--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
