At 18.22 14/03/2002 -0800, Dean Roddey wrote:
>XMLScanner is an internal implementation, so you take your own chances if
>you use it since it could change any time. This is most likely why it isn't
>documented, since they don't want to accept the responsibility for people
>using it.
>
>Actually, some more abstract API could be added to the DOM parser to get the
>system id of the current entity. There are issues, since some entities are
>internal and have no id. The scanner, if I remember correctly, has a method
>that searchs back up the reader stack to find the most nested external
>entity, skipping over any internal entities.

I think you are talking about XMLScanner::getLastExtLocation.
In any case, I want to point out that using a custom EntityResolver will not
work when you try to use an XML Schema.

For example, suppose you have an XML file like this one

    <instance xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:noNamespaceSchemaLocation="http://www.myco.com/schema.xsd">
    ....
    </instance>

and the XML Schema is something like this

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:include schemaLocation="subschema.xsd"/>
        ...
    </xs:schema>

Now, in your C++ program you instantiate your parser and set up the
EntityResolver object

    LocalFileInputSource inputSource(L"c:\\instance.xml");
    DOMParser parser;
    MyEntityResolver resolver(&parser);
    parser.setEntityResolver(&resolver);
    parser.parse(inputSource);

The only ways the EntityResolver::resolveEntity method can know the name of
the file currently being parsed are:
- if you are using DOM, either give it a pointer to the parser (so that it
  can call getScanner().getLastExtLocation()) or implement the EntityResolver
  interface on your own DOMParser-derived class
- if you are using SAX, implement the EntityResolver interface in the same
  object that implements DocumentHandler (DocumentHandler::setDocumentLocator
  will be called with a pointer to a Locator object, and
  EntityResolver::resolveEntity will be able to call Locator::getSystemId
  etc.)

But when XMLScanner finds a reference to an XML Schema (because of an
xsi:noNamespaceSchemaLocation attribute or an xs:import, xs:include or
xs:redefine instruction) it executes this code:

    void XMLScanner::resolveSchemaGrammar(...)
    {
        ...
        IDOMParser parser;
        XMLInternalErrorHandler internalErrorHandler(fErrorHandler);
        parser.setValidationScheme(IDOMParser::Val_Never);
        parser.setDoNamespaces(true);
        parser.setErrorHandler((ErrorHandler*) &internalErrorHandler);
        parser.setEntityResolver(fEntityResolver);
        ...
        parser.parse(*srcToFill);
        ...
    }

This means that the entity resolver you specified (fEntityResolver) is
silently attached to a different parser (in this case, an IDOMParser).
EntityResolver::resolveEntity will be called, for instance, to open the
schema "subschema.xsd" (because of <xs:include schemaLocation="subschema.xsd"/>),
but it will not be able to determine the location of the current file
correctly. It will think we are still parsing c:\instance.xml, instead of
http://www.myco.com/schema.xsd.

This looks like a problem with the spec of the SAX interface (which defines
EntityResolver): given the current implementation, EntityResolver cannot be
implemented by a standalone object (it needs another interface through which
it is handed a pointer to a parser, scanner or reader manager object), yet
here it is used as if it could be attached freely to any parser object.

So, what is the solution? Fix the SAX interface...
I have changed EntityResolver to receive another parameter, specifying the
name of the entity currently being parsed.
I see now that, on Jan 30, the people working on the SAX interfaces
recognized this use case, and, in the SAX2 Extensions 1.1 (beta1), they
changed the signature of the EntityResolver2::resolveEntity function to
include the URI and the name of the current file (see
http://sax.sourceforge.net/apidoc/org/xml/sax/ext/EntityResolver2.html ).

To conclude this long e-mail (I hope my English was readable...), my final
question is: do you plan to add EntityResolver2 to Xerces any time soon?

P.S. My original intention was to provide my patched sources of
EntityResolver.hpp, but now they would be non-standard...

Thanks,
Alberto

-------------------------------
Alberto Massari
eXcelon Corp.
http://www.StylusStudio.com
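
To make the mismatch concrete, here is a minimal sketch of what a resolver can do once it is told the base URI, which is the idea behind the extra parameter. It deliberately uses no Xerces types: resolveAgainstBase is a hypothetical helper name, and the URI handling is a simplistic assumption (string splitting only, no normalization of "..", queries or fragments).

```cpp
#include <string>

// Hypothetical helper, not part of Xerces or SAX: resolve a systemId
// against the URI of the entity that contains the reference.
std::string resolveAgainstBase(const std::string& baseUri,
                               const std::string& systemId)
{
    // Simplistic assumption: anything containing "://" is already absolute.
    if (systemId.find("://") != std::string::npos)
        return systemId;

    // Keep the base up to and including its last path separator
    // ('/' for URLs, '\\' for Windows paths) and append the relative name.
    const std::string::size_type pos = baseUri.find_last_of("/\\");
    if (pos == std::string::npos)
        return systemId;            // base has no directory part
    return baseUri.substr(0, pos + 1) + systemId;
}
```

With the correct base, resolveAgainstBase("http://www.myco.com/schema.xsd", "subschema.xsd") gives http://www.myco.com/subschema.xsd. With the base the resolver wrongly assumes today, resolveAgainstBase("c:\\instance.xml", "subschema.xsd") gives c:\subschema.xsd, which is not where the included schema lives.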
