Alberto, thanks for this. What I notice is that if I have preloaded a schema into the grammar pool, then the namespace declaration alone is enough of a hook for the parser to use it, and the document is then treated as a schema-based document. Is this a bug or a side effect?
Should I really expect xercesc to treat every document as based on an empty DTD unless a schema declaration / DOCTYPE is found?

If it is safe for me to assume that a namespace declaration is enough to identify the type of document as long as the namespace is already in the grammar pool, then I have a workaround (which is a shame, as I would have preferred the scanner to do this for me): intercept the document's owning xmlns attribute, then look up and preload the grammar before the parse takes place.

When scanning the document, I notice that the method IGXMLScanner::scanStartTagNS(bool& gotData) is being called and the namespace is being identified as part of the document scan. It seems eminently sensible for xercesc to have some form of callback method that allows the calling code to respond to the namespace, and this is what I understood Michael to mean in his original post.

Best regards, and thanks for your advice on this.
Ben.

On 13 May 2010, at 15:09, Alberto Massari wrote:

> When Xerces parses an XML file, it assumes it is based on an empty DTD; only
> when a schema declaration is found, the schema validator becomes the active
> one.
> Having an element declared in a non-empty namespace doesn't make it use an
> XMLSchema, it's only a namespace declaration.
> And any resource resolver is used only when trying to actually load an
> external resource, e.g. when a xsi:schemaLocation or a DOCTYPE instruction is
> found in the XML document.
>
> Alberto
>
> On 5/13/2010 3:39 PM, Ben Griffin wrote:
>> On the xerces-j list about 3 years ago, Michael said:
>>
>> On 1/29/07, Michael Glavassevich<[email protected]> wrote:
>>
>>> If you were expecting to resolve the schema documents based on their target
>>> namespace
>>> you should use an API which has a resolver that will pass that
>>> information (see the JAXP 1.3 Validation API [2] and LSResourceResolver
>>> [3]) to you.
>>>
>> I really want to be able to do this using xercesc, but I keep hitting walls.
>>
>> I am not sure if it is because of the API statement
>> The LSParser will then allow the application to intercept any external
>> entities, including the
>> external DTD subset and external parameter entities, before including them.
>> The top-level document entity is never passed to the resolveResource method.
>>
>> but however I try, when parsing (with DOMLSParser) an xml document
>> such as
>>
>> <foo xmlns="http://www.foo.org">
>> ...
>> </foo>
>>
>> there is no callback to my resolveResource() method.
>>
>> I am setting up my DOMLSResourceResolver with
>>
>> conf->setParameter(XMLUni::fgDOMResourceResolver,myResourceHandler);
>>
>> (I have tried using XMLEntityResolver classes also, to no avail).
>>
>> Also (and this may be related) - it appears to me that unless the grammar is
>> already loaded against the root element's namespace,
>> the xml document is treated as if it were a DTD instance, rather than a
>> Schema instance,
>> even though there is no DOCTYPE declaration, and there is clearly marked an
>> xmlns 'attribute' on the root element.
>>
>> Any answers?
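
For illustration only, the lookup half of the workaround described at the top of this message (read the root element's xmlns value, then find a schema for it before the real parse) might be sketched as below. This is a rough sketch that uses only the C++17 standard library and a naive text scan, not Xerces itself. The names rootDefaultNamespace, schemaForDocument and the namespace-to-schema catalog are hypothetical, and the scan assumes a well-formed document whose root element declares a default namespace. The returned path would then presumably be handed to DOMLSParser::loadGrammar with caching enabled, and XMLUni::fgXercesUseCachedGrammarInParse set, before calling the parser on the document itself.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <optional>
#include <string>

// Return the default namespace (xmlns="...") declared on the root element,
// or std::nullopt if none is found. Naive text scan, not a real XML parser.
std::optional<std::string> rootDefaultNamespace(const std::string& xml) {
    const std::size_t npos = std::string::npos;
    std::size_t pos = 0;
    // Skip the XML declaration, processing instructions, comments and DOCTYPE.
    while ((pos = xml.find('<', pos)) != npos) {
        if (xml.compare(pos, 2, "<?") == 0)        pos = xml.find("?>", pos);
        else if (xml.compare(pos, 4, "<!--") == 0) pos = xml.find("-->", pos);
        else if (xml.compare(pos, 2, "<!") == 0)   pos = xml.find('>', pos);
        else break;  // first ordinary start tag: the root element
        if (pos == npos) return std::nullopt;
    }
    if (pos == npos) return std::nullopt;

    auto isSpace = [](char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; };
    std::size_t i = pos + 1;
    // Skip the element name.
    while (i < xml.size() && !isSpace(xml[i]) && xml[i] != '>' && xml[i] != '/') ++i;
    // Walk the attributes of the start tag: name="value" or name='value'.
    while (i < xml.size()) {
        while (i < xml.size() && isSpace(xml[i])) ++i;
        if (i >= xml.size() || xml[i] == '>' || xml[i] == '/') break;
        const std::size_t nameStart = i;
        while (i < xml.size() && xml[i] != '=' && xml[i] != '>' && !isSpace(xml[i])) ++i;
        const std::string name = xml.substr(nameStart, i - nameStart);
        const std::size_t open = xml.find_first_of("\"'", i);
        if (open == npos) break;
        const std::size_t close = xml.find(xml[open], open + 1);
        if (close == npos) break;
        if (name == "xmlns") return xml.substr(open + 1, close - open - 1);
        i = close + 1;
    }
    return std::nullopt;
}

// Map the root namespace to a schema location using a caller-supplied catalog
// (hypothetical: namespace URI -> path of the .xsd to preload).
std::optional<std::string> schemaForDocument(
        const std::string& xml,
        const std::map<std::string, std::string>& catalog) {
    const auto ns = rootDefaultNamespace(xml);
    if (!ns) return std::nullopt;
    const auto it = catalog.find(*ns);
    if (it == catalog.end()) return std::nullopt;
    return it->second;
}
```

With a catalog entry mapping http://www.foo.org to some schema file, calling schemaForDocument on the foo document quoted above would yield that file's path. Prefixed declarations such as xmlns:p are deliberately ignored here, so a root element in a prefixed namespace would need extra handling.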
