Implementing XML Schema 1.1 Overriding Component Definitions

udayanga wickramasinghe Wed, 05 May 2010 13:41:31 -0700

Hi Devs,

As you may know, I am currently implementing XML Schema 1.1 overriding
component definitions [2] (<xs:override>) as a GSoC project for Xerces XML
Schema. I have been working on a design plan for implementing xs:override
semantics within the XML Schema processor, and my current idea is described
below. Although I can think of several approaches (including introducing a
new phase to the Xerces schema processor), I think the following approach
would be more flexible and convenient. Please feel free to put forward your
ideas, suggestions and improvements.


The following are the basic steps of the implementation I came up with:

1) Prerequisite: making the necessary xs:override schema symbols,
XSDDescription constructs, etc. available to XSDHandler, which is the main
schema processing class.

2) I am planning to use the #constructTrees() phase of XSDHandler for applying
the necessary xs:override transformations. Since #constructTrees builds a
dependency tree for included and imported schemas, I think we can use it to
our advantage: the end result of an override transformation is the inclusion
of the augmented version of the schema document that the override element
points to. In brief, whenever an override element is encountered, the
corresponding schema location will be resolved (schema D), and the augmented
schema (D') will be generated and included in the dependency map. As usual,
#constructTrees will be applied recursively to the augmented schemas as well,
so that the complete dependency tree is generated.
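
A minimal sketch of the recursion described in step 2, written against
plain Java collections rather than the real XSDHandler internals. Every name
in it (ConstructTreesSketch, SchemaDoc, applyOverride, the resolver map) is
hypothetical, and the transformation itself is only a placeholder that marks
the result as D'.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types or methods are Xerces APIs.
class ConstructTreesSketch {

    // A schema document reduced to its system id and the schemaLocation
    // values of its xs:override children.
    static final class SchemaDoc {
        final String systemId;
        final List<String> overrideLocations;

        SchemaDoc(String systemId, List<String> overrideLocations) {
            this.systemId = systemId;
            this.overrideLocations = overrideLocations;
        }
    }

    // Stand-in for entity resolution and parsing: location -> schema D.
    private final Map<String, SchemaDoc> resolver;

    // Dependency map: system id -> system ids of the augmented schemas it includes.
    final Map<String, List<String>> dependencyMap = new LinkedHashMap<>();

    ConstructTreesSketch(Map<String, SchemaDoc> resolver) {
        this.resolver = resolver;
    }

    void constructTrees(SchemaDoc doc) {
        if (dependencyMap.containsKey(doc.systemId)) {
            return; // already processed, so repeated inclusions terminate
        }
        List<String> dependencies = new ArrayList<>();
        dependencyMap.put(doc.systemId, dependencies);
        for (String location : doc.overrideLocations) {
            SchemaDoc target = resolver.get(location);   // schema D
            if (target == null) {
                continue; // unresolved location; error reporting omitted
            }
            SchemaDoc augmented = applyOverride(target); // schema D'
            dependencies.add(augmented.systemId);
            constructTrees(augmented);                   // recurse into D'
        }
    }

    // Placeholder for the real transformation: only marks the result as D'.
    private SchemaDoc applyOverride(SchemaDoc target) {
        return new SchemaDoc(target.systemId + "'", target.overrideLocations);
    }
}
```

The visited check on the dependency map is what keeps a chain such as
A overrides B, B overrides A from recursing forever in this sketch.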

- One important consideration is how the transformations are done on the
overridden schemas. At the moment I am thinking of applying the
override.xslt [1] transformation to the schema DOM objects rather than doing
it programmatically. I guess I can use the javax.xml.transform library to do
the necessary transformations on a DOM element, taking the override schema
component and the overridden schema document as parameters. However, I am
planning to provide the necessary extension points so that anyone can plug in
a custom transformation mechanism of their own, for example a programmatic
override transformation algorithm using the native Xerces DOM implementation
(i.e. NodeImpl).
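
A minimal sketch of the DOM-to-DOM plumbing with javax.xml.transform. The
class name XsltOverrideSketch and its helper methods are hypothetical, and
the stylesheet is only an identity transform standing in for the override
stylesheet [1]. Passing the override element to the stylesheet, and whether
the XSLT processor on the classpath can run the specification's stylesheet,
are both left open here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch: shows only how a schema DOM goes in and a new DOM comes out.
class XsltOverrideSketch {

    // Placeholder identity stylesheet, NOT the override stylesheet from [1].
    static final String PLACEHOLDER_XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Parses an XML string into a namespace-aware DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    // Runs the stylesheet over the overridden schema and returns a new document.
    static Document transform(Document overridenSchema, String stylesheet)
            throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(stylesheet)));
        DOMResult result = new DOMResult(); // the transformer creates the output document
        transformer.transform(new DOMSource(overridenSchema), result);
        return (Document) result.getNode();
    }
}
```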

- Another is the introduction of a TransformationManager component [2].
Rather than implementing this component inside the XSDHandler class (which is
already big and handles a lot of things), I am thinking of delegating the
responsibility to a class such as OverrideTransformationManager. Its main
method would be
OverrideTransformationManager#transform(overrideElem,overridenSchema),
which will be responsible for applying the necessary transformations. I am
keen to know your views on this as well. What would be a suitable package
(e.g. org.apache.xerces.impl.xs.traversers) for such a class?
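
One way the delegation and the plug-in point could look, as a rough sketch.
OverrideTransformationManager#transform is the method named above; the
OverrideTransformer interface and the DomOverrideTransformer strategy are
hypothetical. The strategy is a deliberately simplified programmatic
override: it only replaces top-level components that have the same element
kind and the same name attribute, and it assumes namespace-aware DOM input.

```java
import java.util.Objects;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical extension point; not an existing Xerces interface.
interface OverrideTransformer {
    Document transform(Element overrideElem, Document overridenSchema);
}

// Thin delegate that XSDHandler could call instead of doing the work itself.
class OverrideTransformationManager {
    private final OverrideTransformer transformer;

    OverrideTransformationManager(OverrideTransformer transformer) {
        this.transformer = transformer;
    }

    Document transform(Element overrideElem, Document overridenSchema) {
        return transformer.transform(overrideElem, overridenSchema);
    }
}

// Simplified programmatic strategy (an assumption, not the full xs:override rules).
class DomOverrideTransformer implements OverrideTransformer {
    @Override
    public Document transform(Element overrideElem, Document overridenSchema) {
        // Work on a copy (D') so the original schema D stays untouched.
        Document result =
            overridenSchema.getImplementation().createDocument(null, null, null);
        Element root =
            (Element) result.importNode(overridenSchema.getDocumentElement(), true);
        result.appendChild(root);

        for (Node o = overrideElem.getFirstChild(); o != null; o = o.getNextSibling()) {
            if (!(o instanceof Element)) {
                continue; // skip whitespace and comments
            }
            for (Node c = root.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (c instanceof Element && sameComponent((Element) o, (Element) c)) {
                    root.replaceChild(result.importNode(o, true), c);
                    break; // at most one match per overriding component
                }
            }
        }
        return result;
    }

    // Same element kind (namespace + local name) and same non-empty name attribute.
    private static boolean sameComponent(Element a, Element b) {
        String name = a.getAttribute("name");
        return !name.isEmpty()
            && name.equals(b.getAttribute("name"))
            && Objects.equals(a.getNamespaceURI(), b.getNamespaceURI())
            && Objects.equals(a.getLocalName(), b.getLocalName());
    }
}
```

Keeping the strategy behind an interface means an XSLT-based implementation
and a DOM-based one can be swapped without touching the caller.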

3) Just applying transformations inside the #constructTrees() phase would be
inadequate. This is because different versions of an overridden schema may be
included during the transformations, and cyclic dependencies with name
collisions can also occur. So every time a new augmented schema is generated,
it would be checked against a map (fSystemId2OverridenDocMap) to see whether
duplicates are encountered and whether they are valid (i.e. the inclusion of
augmented/overridden schema versions D' and D'' can be considered valid if
they are identical). This structure should map system IDs to their schema
documents. By doing so we should be able to detect different augmented
versions of a schema document while applying the transformations. Additional
structures (e.g. override2XSDMap) and logic will be needed within the
#constructTrees phase to map other relations and to detect cycles (i.e.
repeated inclusions of the same augmented schema), in which case further
transformations are stopped.

- #checkDuplicateSchema(schemaElement) would be the main function used for
this purpose. It would be part of the DependencyAnalyzer [2] component I
discussed.
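
A rough sketch of the duplicate check, assuming the map is keyed by system
ID and that "identical" is decided with DOM Level 3 Node#isEqualNode. The
class name DependencyAnalyzerSketch, the Result enum and the extra systemId
parameter are hypothetical. isEqualNode is a strict structural comparison
(whitespace text nodes count), so a real implementation might need to
normalize the documents first.

```java
import java.util.HashMap;
import java.util.Map;
import org.w3c.dom.Element;

// Hypothetical sketch of the duplicate check; not existing Xerces code.
class DependencyAnalyzerSketch {

    enum Result {
        NEW,                  // first augmented version seen for this system id
        IDENTICAL_DUPLICATE,  // D' and D'' are identical: valid, stop transforming
        CONFLICTING_VERSION   // D' and D'' differ: needs an error or special handling
    }

    // Map named in step 3: system id -> root element of the augmented schema.
    private final Map<String, Element> fSystemId2OverridenDocMap = new HashMap<>();

    // The systemId parameter is an assumption; the step above only names schemaElement.
    Result checkDuplicateSchema(String systemId, Element schemaElement) {
        Element known = fSystemId2OverridenDocMap.get(systemId);
        if (known == null) {
            fSystemId2OverridenDocMap.put(systemId, schemaElement);
            return Result.NEW;
        }
        return known.isEqualNode(schemaElement)
            ? Result.IDENTICAL_DUPLICATE
            : Result.CONFLICTING_VERSION;
    }
}
```

An IDENTICAL_DUPLICATE result is also a natural place to stop further
transformations when the same augmented schema inclusion repeats.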

4) After the dependency tree has been constructed, the #buildGlobalRegistries
phase can be used to detect general name collisions, process redefine
components, etc. However, we may now encounter duplicate errors caused by the
multiple augmented schema versions produced by the override preprocessing in
the prior stage. Hence I need to modify the current duplicate checking code
in #checkForDuplicateNames() so that multiple valid versions of overridden
schemas, etc. are not reported as duplicates. The maps and data structures
generated in the #constructTrees phase can be used here to handle such
scenarios.

5) If all goes well in the above steps, the following phases,
#traverseSchemas and #traverseLocalElements in XSDHandler, would build the
grammar from the schema sources accordingly.


The above is the basic implementation strategy I can think of now, but I
guess I will encounter additional requirements as I move on to the
implementation. Hence I would really appreciate your thoughts on the above.
Thanks in advance.

Regards,
Udayanga

[1] http://www.w3.org/TR/xmlschema11-1/#override-xslt
[2] http://wiki.apache.org/xerces/gsoc_xs_override_proposal

-- 
http://www.udayangawiki.blogspot.com
