Simon Pieters wrote:
Hi,

From: Lachlan Hunt <[EMAIL PROTECTED]>
However, there may be a 5th option available. Consider this, using the following markup samples from the article.

1.
<em><p>X</em>Y</p>

BODY
  + P
    + EM
      + #text: X
    + #text: Y

Why would you drop the first EM? Why should this be parsed any differently from 4? I think it should look like this instead:

Because there are no text nodes between the <em> start tag and the <p> start tag, putting an EM element there would be completely redundant and useless. That said, keeping it would have no detrimental effect beyond wasting a minuscule amount of memory, so it really doesn't matter.

2.
<em><p>XY</p></em>

BODY
  + P
    + EM
      + #text: X
      + #text: Y

Why are there two text nodes?

Copy & paste error.

I don't think there's much advantage in differentiating between "well-formed" and "malformed" markup. They should be parsed the same way to keep things simple and predictable. Thus, <em><p>XY</p></em> should be parsed as:

BODY
  + EM
  + P
    + EM
      + #text: XY

...IMHO.

Agree; but again, the empty EM element is redundant.
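
As a rough illustration of the redundancy point, here is a minimal Python sketch. It is not any parser's actual algorithm. It builds the tree proposed above for <em><p>XY</p></em>, including the empty EM, and then drops childless EM elements. The names Element, Text, PRUNABLE, prune_empty and dump are all hypothetical, invented for this example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Text:
    data: str


@dataclass
class Element:
    name: str
    children: List[Union["Element", Text]] = field(default_factory=list)


# Assumption for this sketch: only EM is treated as meaningless when empty.
PRUNABLE = {"EM"}


def prune_empty(node: Element) -> Element:
    """Return a copy of node with childless PRUNABLE elements removed."""
    kept = []
    for child in node.children:
        if isinstance(child, Element):
            child = prune_empty(child)
            if child.name in PRUNABLE and not child.children:
                continue  # empty EM: contributes nothing to the tree
        kept.append(child)
    return Element(node.name, kept)


def dump(node, depth=0):
    """Render the tree in the indented '+ NAME' notation used in this thread."""
    indent = "  " * depth
    if isinstance(node, Text):
        return [f"{indent}+ #text: {node.data}"]
    prefix = "" if depth == 0 else "+ "
    lines = [f"{indent}{prefix}{node.name}"]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


# Tree proposed for <em><p>XY</p></em>, including the empty EM.
tree = Element("BODY", [
    Element("EM"),
    Element("P", [Element("EM", [Text("XY")])]),
])

print("\n".join(dump(prune_empty(tree))))
```

The pruned tree keeps only the EM inside P that wraps the text, and the original tree is left unmodified. The only cost of keeping the empty EM is the extra node.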

--
Lachlan Hunt
http://lachy.id.au/
