Although it's not explicitly documented (that I could see), the XML 
source indentation algorithm used by XXE appears to turn on text fill 
for (and disable line breaks before children in) all elements that can 
have #PCDATA, whether they contain it or not.

This generally works well, but it breaks down for some of our XML 
formats.  These use modular doctypes: an extension element is declared 
with the content model ANY, and an external parameter entity provides 
the DTD subset for the elements contained within the extension 
element.  Using this mechanism, new DTD pieces can be composed to 
support new data structures without modifying the existing DTDs.  The 
attached DTDs and XML file demonstrate how this works.

The problem is that because the <extwrapper> element is defined as ANY, 
it can contain #PCDATA, although in practice it never does.  As a 
result, the contents of <extwrapper>s are filled even though they never 
contain #PCDATA.  As you can see in example.xml, this can lead to 
very long lines, since there may not be any whitespace for breaking 
lines (after stripping superfluous whitespace).

Ideally, text fill would be turned on only if text were actually 
present in the ANY element (e.g. the third <extwrapper> in example.xml). 
But if that's too tricky, I would make a case that although ANY can 
contain #PCDATA, in practice it rarely does, and that therefore elements 
declared as ANY should not be treated as though they might contain 
#PCDATA (for indentation/text fill purposes only).
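
To make the first suggestion concrete, here is a rough sketch in Python of 
the kind of check I have in mind.  It is only an illustration, not XXE's 
actual algorithm: should_fill() and the sample element names other than 
<extwrapper> are made up, and the sketch assumes the caller already knows 
from the DTD that the element is declared as ANY.

```python
import xml.etree.ElementTree as ET


def should_fill(elem):
    """Hypothetical rule for an element declared as ANY: enable text fill
    only if the element really has non-whitespace text as a direct child."""
    # Text before the first child element is stored in elem.text.
    if elem.text and elem.text.strip():
        return True
    # Text that follows a child element is stored in that child's .tail.
    return any(child.tail and child.tail.strip() for child in elem)


# Made-up sample; only <extwrapper> comes from the attached files.
SAMPLE = """<doc>
  <extwrapper><a><b/></a><c/></extwrapper>
  <extwrapper>some <em>mixed</em> text</extwrapper>
</doc>"""

if __name__ == "__main__":
    for wrapper in ET.fromstring(SAMPLE).iter("extwrapper"):
        # First wrapper holds only elements, second holds real text.
        print(wrapper.tag, should_fill(wrapper))
```

With a check like this, the first wrapper above would be indented like 
any element-only content, and only the second would be filled.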

@alex
-- 
mailto:dupuy at sysd.com
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: base.dtd
Url: 
http://www.xmlmind.com/pipermail/xmleditor-support/attachments/20030220/8083d2fa/attachment.bat
 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: extension.dtd
Url: 
http://www.xmlmind.com/pipermail/xmleditor-support/attachments/20030220/8083d2fa/attachment-0001.bat
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: example.xml
Type: text/xml
Size: 578 bytes
Desc: not available
Url : 
http://www.xmlmind.com/pipermail/xmleditor-support/attachments/20030220/8083d2fa/attachment.xml
 
