Re: Strange way to handle white spaces during parsing

roddey 22 Mar 2000 21:49:05 -0000

Sorry, my mind wasn't in gear. I was pointing out how it could work, but
should have pointed out that we of course do the right thing (at least the
C++ parser) and always call characters() if we are not validating.

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
[EMAIL PROTECTED]



Norman Walsh <[EMAIL PROTECTED]> on 03/22/2000 11:26:02 AM

Please respond to [EMAIL PROTECTED]

To:   [EMAIL PROTECTED]
cc:
Subject:  Re: Strange way to handle white spaces during parsing



/ [EMAIL PROTECTED] was heard to say:
| If a DTD is present, it's read and the information required to make this
| decision is present. It doesn't require validation, just a check to see
| what type of content model the element has.

I'm not comfortable with that answer at all. I think an option that
ignores "element" whitespace in a non-validating parse is non-standard
and potentially dangerous. Consider:

  The XML 1.0 REC, Section 2.10:

  An XML processor must always pass all characters in a document
  that are not markup through to the application. A validating
  XML processor must also inform the application which of these
  characters constitute white space appearing in element
  content.

I can't think of any way to interpret that such that a
non-validating parse could ignore whitespace.

Consider the following example:

<!DOCTYPE test [
<!ELEMENT a (b+)>
<!ELEMENT b (#PCDATA)>
]>
<a>test<b/> <b/>this! 4 or 5?</a>

Does <a> have four children or five? The answer has to be five.
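
For anyone who wants to check the count, here is a minimal sketch using
Python's xml.parsers.expat, which is a non-validating parser. It is only
an illustration, not the Xerces code under discussion; the helper name
children_of_root is made up for this example. It records each direct
child of the root element as it is reported to the application.

```python
import xml.parsers.expat

# The example document from this message, unchanged.
DOC = """<!DOCTYPE test [
<!ELEMENT a (b+)>
<!ELEMENT b (#PCDATA)>
]>
<a>test<b/> <b/>this! 4 or 5?</a>"""


def children_of_root(doc):
    """Return the direct children of the root element, in document order.

    Hypothetical helper for illustration: each child is a tuple of
    ("element", name) or ("text", data).
    """
    children = []
    depth = 0  # 0 = outside the root, 1 = directly inside the root

    parser = xml.parsers.expat.ParserCreate()
    # Coalesce adjacent character data so each text run is reported once.
    parser.buffer_text = True

    def start_element(name, attrs):
        nonlocal depth
        if depth == 1:
            children.append(("element", name))
        depth += 1

    def end_element(name):
        nonlocal depth
        depth -= 1

    def character_data(data):
        # A non-validating parser passes all non-markup characters through,
        # including the single space between the two <b/> elements.
        if depth == 1:
            children.append(("text", data))

    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = character_data
    parser.Parse(doc, True)
    return children


if __name__ == "__main__":
    for kind, value in children_of_root(DOC):
        print(kind, repr(value))
```

The whitespace-only text between the two <b/> elements shows up as its
own child, even though the DTD declares element content for <a>.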

And what about a document with an external subset that has
parameter entities that cannot be located, so that the DTD is
really half a loaf? Does it ignore whitespace in content models
that it found, but not in others?

                                        Be seeing you,
                                          norm

--
Norman Walsh <[EMAIL PROTECTED]>      | As a general rule, the most
http://nwalsh.com/                 | successful man in life is the man
                                   | who has the best
                                   | information.--Benjamin Disraeli