Hi, I'm trying to use libxml to extract NOTATION entries from an internal DTD,
but am struggling. I can't seem to find a way to get the DTD notations
interleaved, in document order, with the other DTD declarations.

Some background - my app (which contains some XML parsing/formatting
functionality) is actually written in Objective-C, so I was originally using
NSXMLDocument (DOM-based) but for some reason the notations property on
NSXMLDTD is always nil (Apple suggests this is a libxml bug, but I am not yet
convinced). Their suggestion was to use NSXMLParser (SAX-based) - which does
actually return the notations, but the problem is that it doesn't fire an event
indicating that parsing has entered the DOCTYPE, so given the following
XML, I can't tell whether comment2 is inside the DOCTYPE or outside:

<?xml version="1.0" standalone="yes" ?>
<!-- comment1 -->
<!DOCTYPE xxx SYSTEM "XXX" [
<!-- comment2 -->
<!ENTITY blah SYSTEM "BLAH" NDATA note>
<!NOTATION note PUBLIC "my notation">
]>
<xxx>some text</xxx>

So, my next step is to fall back to libxml itself. Exploring xmllint, I can see
that the --format option does indeed find and print the notations (which is
good), but I've noticed that it doesn't preserve the original order of the
various declarations. Its output for the above XML is (note that the NOTATION
has been moved ahead of the comment/entity):

<?xml version="1.0" standalone="yes"?>
<!-- comment1 -->
<!DOCTYPE xxx SYSTEM "XXX" [
<!NOTATION note PUBLIC "my notation" >
<!-- comment2 --><!ENTITY blah SYSTEM "BLAH" NDATA note>
]>
<xxx>some text</xxx>
More digging reveals the xmlDumpNotationTable() function - which looks like it
ultimately calls an opaque hash table scanner wherein I pass in a function
pointer. OK, maybe that is what I need to do to iterate over the notations?
Some more wandering through the code leads me to xmlDtdDumpOutput() - which says
"Dump the notations first as they are not in the DTD children list".
That seems odd. Why aren't the notations treated as children?
Anyway, I've tried using xmlCtxtReadFile() and traversing the resulting
xmlDocPtr/xmlDtdPtr objects, but can't find a way to get to the notations.
I've also tried xmlNewTextReaderFilename() but it only seems to traverse the
XML elements, not the internal DTD.
Is there something I've missed? If notations aren't added as children, I'm not
sure how to get back a correctly sequenced set of elements (including
notations). Do I really need to drop all the way back to implementing my own
SAX event handler in order to preserve the list of notations? Or have I
totally missed something obvious?
Any advice would be much appreciated (sorry for the long-winded post, but I
wanted to cover off what I've already tried).
Cheers,
Craig
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
https://mail.gnome.org/mailman/listinfo/xml