Hello-

For performance reasons, I'm caching grammars for validation of an
instance against a set of connected and disconnected schemas of an XBRL
taxonomy. Schemas can be disconnected when they are discovered via
linkbases. So far, the only way I can get this working is to regenerate
the grammars every time, letting Xerces construct the pool before
validating the instance. Letting Xerces parse the grammars more than
once is very expensive.
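
For comparison, the compile-once behaviour I am after can be sketched
with the standard JAXP validation API instead of the native Xerces
grammar pool. This is only a minimal sketch under assumptions: the two
inline schemas, their urn:example namespaces, and the class and method
names are hypothetical stand-ins, and it does not model linkbase
discovery at all.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class CompileOnceSketch {
    // Hypothetical stand-ins for two unrelated ("disconnected") root schemas.
    static final String SCHEMA_A =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:example:a'>"
        + "<xs:element name='a' type='xs:string'/></xs:schema>";
    static final String SCHEMA_B =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:example:b'>"
        + "<xs:element name='b' type='xs:int'/></xs:schema>";

    /** Compiles all schema documents once into a single reusable Schema. */
    static Schema compile(String... schemaTexts) {
        Source[] sources = new Source[schemaTexts.length];
        for (int i = 0; i < schemaTexts.length; i++) {
            sources[i] = new StreamSource(new StringReader(schemaTexts[i]));
        }
        try {
            return SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(sources);
        } catch (SAXException e) {
            throw new IllegalStateException("schema set did not compile", e);
        }
    }

    /** Validates one instance against the cached Schema; no schema is re-parsed. */
    static boolean isValid(Schema schema, String instanceXml) {
        try {
            // A Validator is cheap to create; the compiled Schema is shared.
            schema.newValidator()
                  .validate(new StreamSource(new StringReader(instanceXml)));
            return true;
        } catch (SAXException e) {
            return false; // the default error handler throws on validation errors
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Schema cached = compile(SCHEMA_A, SCHEMA_B);
        System.out.println(isValid(cached, "<a xmlns='urn:example:a'>text</a>"));
        System.out.println(isValid(cached, "<b xmlns='urn:example:b'>not a number</b>"));
    }
}
```

The Schema object is built once and can be reused for every instance,
which is the cost profile I would like to get from the grammar pool.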

The Xerces-2 grammar FAQ and other mailing list posts on the internet
suggest how to "actively" construct grammar pools (caching grammars),
but I still get the following warning(s):

Warning:

One of the grammar(s) returned from the user's grammar pool is in
conflict with another grammar.

Here is my integration code for building a grammar pool from 2 or more
root schemas:

    while (schemas.hasNext()) {
        IDTSNode schemaNode = (IDTSNode) schemas.next();
        XMLGrammarPoolImpl gp =
            (XMLGrammarPoolImpl) schemaNode.getGrammarPool();
        Grammar[] g =
            gp.retrieveInitialGrammarSet("http://www.w3.org/2001/XMLSchema");
        grammarPool.cacheGrammars("http://www.w3.org/2001/XMLSchema", g);
    }

A root schema is one that no other schema references, and the set of
schemas is reverse topologically sorted. Inspection of the graph of
grammars within grammarPool reveals that they are the same as when
Xerces populates the grammar pool. Is this the correct way to cache
grammars into the grammar pool?

If there is only 1 root schema, there is no additional "merging" of
grammars and the grammar pool from the single root schema can be used
directly to validate an instance.

A grammar pool is set on a configuration instance as a feature and
passed to the constructor of DOMParser:

    DOMParser parser = new DOMParser(config);

The grammarPool is locked before parsing and unlocked after parsing to
prevent further entity resolution. In addition to XML schema validation,
my application caches Grammars for PSVI analysis and XML prototyping.

Analysis of the grammarPool source code indicates that a warning is
generated because there must be two instances of Grammar with the same
target namespace. However, I don't see this when looking through the
grammarPool instance. My next step is to step through the Xerces-2
sources under my application, but I was hoping somebody could point out
improper usage on my part.
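
To make the check I have in mind concrete, here is a minimal sketch of
it in plain Java. It assumes (this is not confirmed) that the conflict
comes from the same target namespace being present in more than one
per-root pool before the pools are merged. The pool labels, the
namespaces in main, and the class and method names are hypothetical; in
the real application the namespaces would be read from each cached
Grammar, for example via getGrammarDescription().getNamespace().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class NamespaceConflictSketch {
    /**
     * Takes, per pool label, the set of target namespaces cached in that pool,
     * and returns only the namespaces that occur in two or more pools,
     * mapped to the labels of the pools that contain them.
     */
    static Map<String, List<String>> findSharedNamespaces(
            Map<String, Set<String>> namespacesByPool) {
        Map<String, List<String>> owners = new TreeMap<>();
        for (Map.Entry<String, Set<String>> pool : namespacesByPool.entrySet()) {
            for (String namespace : pool.getValue()) {
                owners.computeIfAbsent(namespace, k -> new ArrayList<>())
                      .add(pool.getKey());
            }
        }
        // Keep only namespaces contributed by more than one pool.
        owners.values().removeIf(pools -> pools.size() < 2);
        return owners;
    }

    public static void main(String[] args) {
        // Hypothetical data: two root schemas that both pull in a common schema.
        Map<String, Set<String>> pools = new LinkedHashMap<>();
        Set<String> rootA = new LinkedHashSet<>();
        rootA.add("urn:example:rootA");
        rootA.add("urn:example:common");
        Set<String> rootB = new LinkedHashSet<>();
        rootB.add("urn:example:rootB");
        rootB.add("urn:example:common");
        pools.put("rootA.xsd", rootA);
        pools.put("rootB.xsd", rootB);

        findSharedNamespaces(pools).forEach((namespace, labels) ->
            System.out.println(namespace + " is cached by " + labels));
    }
}
```

If this reports any namespace for my per-root pools, then each root
holds its own Grammar object for that namespace, which would at least
be a candidate explanation for the warning.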

thanks,

-Andy

 

