Sorry, my mind wasn't in gear. I was describing how it could work, but I
should have said that we do, of course, do the right thing (at least in the
C++ parser): we always call characters() when we are not validating.
----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
[EMAIL PROTECTED]
Norman Walsh <[EMAIL PROTECTED]> on 03/22/2000 11:26:02 AM
Please respond to [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
cc:
Subject: Re: Strange way to handle white spaces during parsing
/ [EMAIL PROTECTED] was heard to say:
| If a DTD is present, it's read and the information required to make this
| decision is present. It doesn't require validation, just a check to see
| what type of content model the element has.
I'm not comfortable with that answer at all. I think an option that
ignores "element" whitespace in a non-validating parse is non-standard
and potentially dangerous. Consider:
The XML 1.0 REC, Section 2.10:

    An XML processor must always pass all characters in a document
    that are not markup through to the application. A validating
    XML processor must also inform the application which of these
    characters constitute white space appearing in element
    content.

I can't think of any way to interpret that such that a
non-validating parse could ignore whitespace.
Consider the following example:
<!DOCTYPE test [
<!ELEMENT a (b+)>
<!ELEMENT b (#PCDATA)>
]>
<a>test<b/> <b/>this! 4 or 5?</a>
Does <a> have four children or five? The answer has to be five.
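
The count can be checked with a minimal sketch in Python (not the C++ parser discussed in this thread), assuming the standard library's expat-based xml.sax reader, which does not validate. The names ChildCounter and children_of_root are made up for illustration. The handler treats anything arriving through characters() or ignorableWhitespace() as text, merges adjacent chunks, and records only the direct children of the root element.

```python
import xml.sax

# The example document from this message, with its internal DTD subset.
DOC = b"""<!DOCTYPE test [
<!ELEMENT a (b+)>
<!ELEMENT b (#PCDATA)>
]>
<a>test<b/> <b/>this! 4 or 5?</a>"""


class ChildCounter(xml.sax.ContentHandler):
    """Hypothetical handler: collects the direct children of the root."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # number of currently open elements
        self.children = []   # ("text", data) or ("element", name)

    def startElement(self, name, attrs):
        if self.depth == 1:  # an element directly inside the root
            self.children.append(("element", name))
        self.depth += 1

    def endElement(self, name):
        self.depth -= 1

    def _text(self, data):
        if self.depth != 1:  # only text directly inside the root counts
            return
        # A parser may deliver one run of text in several chunks, so
        # adjacent chunks are merged into a single text child.
        if self.children and self.children[-1][0] == "text":
            self.children[-1] = ("text", self.children[-1][1] + data)
        else:
            self.children.append(("text", data))

    def characters(self, content):
        self._text(content)

    def ignorableWhitespace(self, whitespace):
        # Whitespace reported this way is still passed to the application.
        self._text(whitespace)


def children_of_root(doc):
    handler = ChildCounter()
    xml.sax.parseString(doc, handler)
    return handler.children


children = children_of_root(DOC)
print(len(children))  # 5
```

The third child is the single space between the two <b/> elements. A parser that dropped it because the content model of <a> is (b+) would report four children for the same document.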
And what about a document with an external subset that has
parameter entities that cannot be located, so that the DTD is
really half a loaf? Does the parser ignore whitespace in the content
models that it found, but not in the others?
Be seeing you,
norm
--
Norman Walsh <[EMAIL PROTECTED]> | As a general rule, the most
http://nwalsh.com/ | successful man in life is the man
| who has the best
| information.--Benjamin Disraeli