Hi,

From: Lachlan Hunt <[EMAIL PROTECTED]>
However, there may be a 5th option available. Consider this, using the following markup samples from the article.

1.
<em><p>X</em>Y</p>

BODY
  + P
    + EM
      + #text: X
    + #text: Y

Why would you drop the first EM? Why should this be parsed any differently from case 4? I think it should look like this instead:

BODY
  + EM
  + P
    + EM
      + #text: X
    + #text: Y

2.
<em><p>XY</p></em>

BODY
  + P
    + EM
      + #text: X
      + #text: Y

Again, I think that there should be an empty EM before the P. Why are there two text nodes?

BODY
  + EM
  + P
    + EM
      + #text: XY

3.
<em><p>X</p><p>Y</p></em>

BODY
  + P
    + EM
      + #text: X
  + P
    + EM
      + #text: Y

BODY
  + EM
  + P
    + EM
      + #text: X
  + P
    + EM
      + #text: Y

4.
<em>X<p>Y</em>Z</p>

BODY
  + EM
    + #text: X
  + P
    + EM
      + #text: Y
    + #text: Z

Agree.

I don't think there's much advantage in differentiating between "well-formed" and "malformed" markup. Both should be parsed the same way to keep things simple and predictable. Thus, <em><p>XY</p></em> should be parsed as:

BODY
  + EM
  + P
    + EM
      + #text: XY

...IMHO.
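
To make the rule concrete, here is a rough sketch of how I read it, not a spec proposal. It assumes EM is the only formatting element and P the only block element, and the names parse, dump, FORMATTING and BLOCK are made up for illustration. When a P opens, any open EM is closed where it stands (possibly empty) and reopened inside the P; an EM stays "active" until its own end tag is seen. Nested EMs and a P opened inside another P are not handled.

```python
import re

FORMATTING = {"em"}  # assumption: only EM counts as a formatting element
BLOCK = {"p"}        # assumption: only P counts as a block element


class Node:
    def __init__(self, name, text=None):
        self.name = name
        self.text = text
        self.children = []


def parse(markup):
    body = Node("BODY")
    stack = [body]  # open elements, innermost last
    active = []     # formatting tags whose end tag has not been seen yet

    def open_element(tag):
        node = Node(tag.upper())
        stack[-1].children.append(node)
        stack.append(node)

    def reopen_formatting():
        # Reopen every active formatting tag that is not currently open.
        open_names = [n.name for n in stack]
        for tag in active:
            if tag.upper() not in open_names:
                open_element(tag)

    def close_through(tag):
        # Pop up to and including the innermost open element named `tag`;
        # do nothing if no such element is open.
        for i in range(len(stack) - 1, 0, -1):
            if stack[i].name == tag.upper():
                del stack[i:]
                return

    for slash, tag, text in re.findall(r"<(/?)(\w+)>|([^<]+)", markup):
        tag = tag.lower()
        if text:
            reopen_formatting()
            stack[-1].children.append(Node("#text", text))
        elif slash:
            if tag in FORMATTING and tag in active:
                active.remove(tag)
            close_through(tag)
        elif tag in FORMATTING:
            active.append(tag)
            open_element(tag)
        elif tag in BLOCK:
            # Close open formatting elements in place, then reopen them
            # inside the new block.
            while stack[-1].name.lower() in FORMATTING:
                stack.pop()
            open_element(tag)
            reopen_formatting()
    return body


def dump(node, depth=0):
    # Render the tree in the same "+ NAME" notation used in this thread.
    label = node.name if node.text is None else f"{node.name}: {node.text}"
    prefix = ("  " * depth + "+ ") if depth else ""
    lines = [prefix + label]
    for child in node.children:
        lines.append(dump(child, depth + 1))
    return "\n".join(lines)
```

Calling print(dump(parse("<em><p>XY</p></em>"))) prints the tree in the notation above, so each of the four samples can be checked against the trees I suggested.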

Regards,
Simon Pieters

