I'd like to check my understanding of xmlKeepBlanksDefault.
What I want to do is make xmlParseFile not generate whitespace nodes (i.e. I want it to generate exactly the same tree as if no additional whitespace had been provided), but have xmlSaveFormatFile write the file out with formatting. I know I can set XML_PARSE_NOBLANKS in the xmlParserOption flags passed to xmlReadFile, but this is not available in xmlParseFile. xmlReadFile also seems to do far more than xmlParseFile (currently I'm using XML_PARSE_NONET | XML_PARSE_NODICT | XML_PARSE_NOXINCNODE | XML_PARSE_NOBLANKS to turn all that off), and I've already tracked one SEGV down to it putting things in the tree that I don't want. So I'd rather keep using xmlParseFile. Setting xmlKeepBlanksDefault to 0 looks promising, and indeed appears to work. However, the manual page somewhat cryptically says:
Set and return the previous value for default blanks text nodes support. The 1.x version of the parser used an heuristic to try to detect ignorable white spaces. As a result the SAX callback was generating xmlSAX2IgnorableWhitespace() callbacks instead of characters() one, and when using the DOM output text nodes containing those blanks were not generated. The 2.x and later version will switch to the XML standard way and ignorableWhitespace() are only generated when running the parser in validating mode and when the current element doesn't allow CDATA or mixed content. This function is provided as a way to force the standard behavior on 1.X libs and to switch back to the old mode for compatibility when running 1.X client code on 2.X . Upgrade of 1.X code should be done by using xmlIsBlankNode() commodity function to detect the "empty" nodes generated. This value also affect autogeneration of indentation when saving code if blanks sections are kept, indentation is not generated.
I've read that several times and still cannot understand it. My observations are:

1. Contrary to the last line, it does not appear to affect the output format. With xmlSaveFormatFile (anyway) the output appears to be the same whether this is set or not.

2. I don't understand the sentence starting "The 2.x and later version". I am running 2.x, and even though I am not running the parser in validating mode, with xmlKeepBlanksDefault set to 1 it *does* appear to generate blank nodes.

3. I /appear/ to be using a compatibility mode, though despite reading the paragraph several times, I don't know whether xmlKeepBlanksDefault(1) (the default) is the compatibility mode, or whether xmlKeepBlanksDefault(0) is.

4. This appears to have been written from the point of view of someone writing their own parser. A description of how it affects xmlParseFile and friends would be really useful.

What am I missing here?

-- 
Alex Bligh
