On Wed, Apr 29, 2009 at 04:05:43PM +0100, Rachael Churchill wrote:
> Hi
>
> I hope I'm not being rude, but I didn't get any replies to this email
> when I sent it to the mailing list before, so I'm just sending it again
> in case it got missed. If anyone can help answer my questions that'd be
> really good.

  Why SAX and not the XML Reader? I saw your mail, and my 10-second
attention span reduced it to: why do they absolutely need SAX if they
are using entity declarations in the DTD?

> Thanks very much,
> Rachael
>
> -------- Original Message --------
> Subject: [xml] Resolving entity references from a DTD using SAX parser
> Date: Fri, 17 Apr 2009 13:13:26 +0100
> From: Rachael Churchill <[email protected]>
> To: [email protected]
>
>
>
> Hi
>
> Please could you help me get libxml2 to resolve entity references such
> as &foo; which are defined in a DTD declared in the XML document?
>
> I am using the SAX parser (I know about the warning on
> http://xmlsoft.org/entities.html, but we really have to use SAX for our
> purposes), declaring element handlers and character handlers and then
> calling XMLParseChunk. I originally assumed the parser would
> automatically read in the DTD when it found the declaration, and then

  It does, and it calls you back. You have to build the internal
structures associated with each entity in the corresponding handlers.

> resolve the custom-defined entity references as it found them, like it
> does with the predefined ones such as &amp; but this seems not to be the
> case.

  Because you didn't build the entities in the internal tables when the
internal subset was parsed.

Proof:

paphio:~/XML -> cat test/ent1
<?xml version="1.0"?>
<!DOCTYPE EXAMPLE SYSTEM "example.dtd" [
<!ENTITY xml "Extensible Markup Language">
]>
<EXAMPLE>
    &xml;
</EXAMPLE>
paphio:~/XML -> ./xmllint --sax test/ent1
SAX.setDocumentLocator()
SAX.startDocument()
SAX.internalSubset(EXAMPLE, , example.dtd)
SAX.entityDecl(xml, 1, (null), (null), Extensible Markup Language)
SAX.getEntity(xml)
SAX.externalSubset(EXAMPLE, , example.dtd)
SAX.startElementNs(EXAMPLE, NULL, NULL, 0, 0, 0)
SAX.characters(
    , 5)
SAX.getEntity(xml)
SAX.error: Entity 'xml' not defined
SAX.reference(xml)
SAX.characters(
, 1)
SAX.endElementNs(EXAMPLE, NULL, NULL)
SAX.endDocument()
paphio:~/XML ->

> I think I have to declare a reference-handling function as
> SAXHandler.reference, and then in that function, which gets called when
> an entity reference is found, call xmlGetEntityFromTable to look up what
> that entity reference resolves to. But how do I populate the table
> using the entity references defined in the DTD? The functions like

  By using the entity declaration callback and the other DTD-related
declaration callbacks.

> xmlAddDtdEntity and xmlAddDocEntity seem to require an xmlDocPtr, which
> I thought only existed when doing DOM parsing. So do I need to use

  Create one by hand.

> xmlNewEntity instead, which does the same thing without needing an
> xmlDocPtr? Also, xmlGetEntityFromTable isn't exposed by the API. It's
> called from xmlGetDocEntity, xmlGetDtdEntity and xmlGetParameterEntity,
> but they all require an xmlDocPtr too. May I change the API so that
> xmlGetEntityFromTable is exposed, or is there another way of doing it?
>
> Also, how do I get the filename of the DTD? Is there a function for

  You get it in the SAX.internalSubset callback. But really, if you
expect external subset handling with libxml2 SAX, you will have an awful
lot of work to do, or you will have to reuse all the existing default
SAX callbacks from SAX2.c. Either way it is an awful lot of work, hence
the warning in red!

> that, or do I need to manually parse the XML document looking for the
> declaration? Then when I have the DTD, I need to parse it. Calling
> xmlParseDTD with the filename as the SystemID argument seems to work;
> then can I use the "elements" field of the xmlDtdPtr it returns as the
> table I need to refer to in xmlGetEntityFromTable (and so not need
> xmlAdd*Entry)?

Daniel

--
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
[email protected]  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
