Re: Schemas & grammar pools

Neil Graham Tue, 11 Nov 2003 08:31:31 -0800


Hi Esmond,

> (1) I've studied the XMLGrammarBuilder sample and tried quite a few things
> but I can't get Xerces to behave as I want. What I want is that parsing an
> XML document should fail unless every schema it references is *already in*
> the grammar pool. Xerces seems to parse schemas referenced on demand (i.e.
> if not in the pool), add them to the grammarBucket, and then give a warning
> 'One of the grammar(s) returned from the user's grammar pool is in conflict
> with another grammar' when processing the final grammarBucket for the
> document. I want to *prevent* the loading of external schemas, but I'm not
> clear how to do that - e.g. what can an entity resolver do to prevent
> further processing of the entity if it's an unwanted schema?

If you register an EntityResolver, it will be called when Xerces requires a
grammar and can't find it in the registered grammar pool.  At this point,
you'll have an opportunity to abort parsing.  In effect, your entity
resolver implementation could simply abort parsing every time
resolveEntity() is called.
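
As a rough illustration of the above, here is a minimal sketch of such a
resolver using the standard SAX interface.  The class name is made up for
the example, and throwing a SAXException is just one assumed way of
aborting.  Note that a SAX EntityResolver is consulted for any external
entity, not only schemas, so a real application may want to inspect the
systemId before rejecting.

```java
import java.io.IOException;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical resolver that aborts the parse whenever it is consulted. */
final class RejectingEntityResolver implements EntityResolver {
    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        // Being called here means the parser wants to load something external,
        // e.g. a schema that was not found in the grammar pool.
        throw new SAXException("Refusing to load external entity: " + systemId);
    }
}
```

It would be registered in the usual way, e.g. with
setEntityResolver(new RejectingEntityResolver()) on the parser or reader.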

> (2) Parsing a schema with the pool unlocked results in the schema and all
> referenced schemas being placed in the pool. This is great but if I am
> parsing a directory full of schemas this results in grammar conflict
> warnings if a referenced schema is parsed which has already been loaded.

Actually, the warnings should be issued not when a referenced schema has
already been loaded (Xerces should be able to get that out of the pool) but
when you try to parse a schema that's already been loaded.  For example, if
schema A imports schema B, then when you parse schema A, the grammar pool
will acquire both schemas.  If you then ask Xerces to parse schema B, it
won't know that it already has this in its grammar pool until after it has
created a SchemaGrammar object for it (since it can't tell what
targetNamespace that schema has until it has already begun to process it).

The easiest way around this is either to parse only "root" schema
documents, or to parse all schema documents but start with schema documents
on which others have dependencies.  This is the application's
responsibility and I'm afraid there's nothing Xerces's default
implementation can do to help you.
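
One way an application might arrange the second option is sketched below.
It assumes the application has already worked out, by whatever means, which
schema documents each schema imports or includes; the class and method
names are hypothetical and nothing here is part of Xerces.  It simply
orders the documents so that each one comes after everything it depends on.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: orders schema documents so dependencies come first. */
final class SchemaOrder {
    /** Keys are schema documents; values are the documents each one references. */
    static List<String> dependenciesFirst(Map<String, List<String>> imports) {
        Set<String> done = new LinkedHashSet<>();
        for (String schema : imports.keySet()) {
            visit(schema, imports, done, new LinkedHashSet<>());
        }
        return new ArrayList<>(done);
    }

    private static void visit(String schema, Map<String, List<String>> imports,
                              Set<String> done, Set<String> inProgress) {
        if (done.contains(schema)) {
            return;                      // already placed in the order
        }
        if (!inProgress.add(schema)) {
            return;                      // circular reference; stop recursing
        }
        for (String dep : imports.getOrDefault(schema, Collections.emptyList())) {
            visit(dep, imports, done, inProgress);
        }
        inProgress.remove(schema);
        done.add(schema);                // all dependencies are already listed
    }
}
```

With the example above (A imports B), the resulting order is B then A, so
by the time A is parsed its import should already be in the pool.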

Hope that helps,
Neil
Neil Graham
XML Parser Development
IBM Toronto Lab
Phone:  905-413-3519, T/L 969-3519
E-mail:  [EMAIL PROTECTED]




                                                                                       
                                                
"Esmond Pitt" <[EMAIL PROTECTED]le.net>
To:       <[EMAIL PROTECTED]>
cc:
Subject:  Schemas & grammar pools
11/10/2003 09:02 PM
Please respond to xerces-j-dev



(1) I've studied the XMLGrammarBuilder sample and tried quite a few things
but I can't get Xerces to behave as I want. What I want is that parsing an
XML document should fail unless every schema it references is *already in*
the grammar pool. Xerces seems to parse schemas referenced on demand (i.e.
if not in the pool), add them to the grammarBucket, and then give a warning
'One of the grammar(s) returned from the user's grammar pool is in conflict
with another grammar' when processing the final grammarBucket for the
document. I want to *prevent* the loading of  external schemas, but I'm not
clear how to do that - e.g. what can an entity resolver do to prevent
further processing of the entity if it's an unwanted schema?

(2) Parsing a schema with the pool unlocked results in the schema and all
referenced schemas being placed in the pool. This is great but if I am
parsing a directory full of schemas this results in grammar conflict
warnings if a referenced schema is parsed which has already been loaded.
Should I worry about this?

Many thanks in advance for any assistance.

EJP
Motile Research Pty Ltd



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




