Re: About Xerces projects for GSoc 2010

Khaled Noaman Mon, 29 Mar 2010 17:20:18 -0700

Hi Udayanga,

Here's a quick overview on how schemas are processed in Xerces-J.


1. We preprocess all include, import and redefine elements and create all 
the necessary grammar objects. This way we have all the schema information 
handy (constructTrees)
2. We then go through all global declarations in each schema document we 
preprocessed. This is where we do some redefine preprocessing where we 
change the names of redefined components. This process will also store the 
first occurrence of each global component, whether it's a type, an 
element, an attribute, etc. (key is a concat of a component name and its 
namespace) as well as the corresponding schema document for that 
component. This way you can easily check for duplicates and know which 
global component to process when it's referred to by another component 
during processing of the actual components (e.g. <element name="a" 
type="ns:type1"/>). In that case, when processing element "a", we can get 
access to the representation of "type1" if it has not yet been processed. 
This will of course cause some problems with xs:override, since 
xs:override now has precedence. So, the logic will need to change to take 
xs:override into consideration.
3. We then go and process all global components in each schema document we 
have preprocessed. A global component is any schema component (excluding 
<include>, <import>, <redefine>, and <override>) that's a child of the 
<schema> element, e.g.
<xs:schema .....>
  ...
  <xs:element name="child1" type="xs:string"/>
  <xs:complexType name="ctype1">
    <xs:sequence>
      <xs:element name="gchild1" type="xs:int"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

So, we will process element "child1" and type "ctype1".

4. We then process any local element components. A local element is 
usually a child of a component such as a complex type or group. So, in the 
above example, "gchild1" is a local element. We process that element 
after we have processed all global components.
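
The registry described in step 2 can be pictured with a small Java sketch. 
It is only an illustration under stated assumptions: the class and method 
names are hypothetical and are not Xerces-J APIs, and adding the component 
kind to the key is an assumption of the sketch, made so that an element and 
a type with the same name do not collide.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are Xerces-J classes or methods.
public class GlobalRegistrySketch {

    // First occurrence of each global component, mapped to its schema document.
    private final Map<String, String> firstOccurrence = new LinkedHashMap<>();

    // Keys that were declared more than once (candidates for a duplicate error).
    private final List<String> duplicates = new ArrayList<>();

    // Key is a concatenation of kind, namespace and local name.
    // Including the kind is an assumption of this sketch.
    static String key(String kind, String namespace, String localName) {
        return kind + "|" + namespace + "," + localName;
    }

    public void register(String kind, String namespace, String localName,
                         String document) {
        String k = key(kind, namespace, localName);
        // putIfAbsent keeps the first occurrence and returns it on a repeat.
        if (firstOccurrence.putIfAbsent(k, document) != null) {
            duplicates.add(k);
        }
    }

    // Returns the document holding the first declaration, or null if unknown.
    public String documentFor(String kind, String namespace, String localName) {
        return firstOccurrence.get(key(kind, namespace, localName));
    }

    public int duplicateCount() {
        return duplicates.size();
    }

    public static void main(String[] args) {
        GlobalRegistrySketch registry = new GlobalRegistrySketch();
        registry.register("element", "urn:example", "a", "A.xsd");
        registry.register("type", "urn:example", "type1", "B.xsd");
        registry.register("type", "urn:example", "type1", "C.xsd");
        System.out.println(registry.documentFor("type", "urn:example", "type1"));
        System.out.println(registry.duplicateCount());
    }
}
```

With a lookup like documentFor, processing element "a" can locate the 
document that declares "type1" before that type has been traversed. 
Supporting xs:override would mean the first occurrence is no longer 
automatically the one to keep.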

When implementing xs:override, a lot of considerations have to be taken 
care of during preprocessing. We also need to make sure that, when 
processing a global or local component that refers to a component that is 
being overridden, we use the overriding component.

Regards,
Khaled





From:
udayanga wickramasinghe <[email protected]>
To:
[email protected]
Date:
03/27/2010 07:08 PM
Subject:
Re: About Xerces projects for GSoc 2010



Hi Khaled,
I went through quite a few of the Xerces interfaces and implementations 
(and samples, e.g. xs.QueryXS, jaxp.SourceValidator, xni.parser, etc.) 
related to my project, including the ones you mentioned. I now have a fair 
understanding of how the Xerces parsers work (configurations, pipelines, 
core interfaces, etc.). From what I have gathered so far, I am trying to 
outline how Xerces loads and processes schemas and validates instances, as 
follows. Please correct me if my understanding is wrong.

XMLSchemaValidator --> instance documents are parsed and then validated 
later in the pipeline (e.g. of an XML11Configuration parser) against the 
loaded schemas (e.g. loaded by XMLSchemaLoader).

XMLSchemaFactory, XMLSchemaLoader --> load XML schemas from various .xsd 
sources and create a grammar (or grammar pool) from the provided schema 
documents.

To do this, I suppose the schema loader wraps an XSDHandler, which is 
responsible for parsing each schema source (e.g. using SchemaDOMParser) 
and for resolving and loading the grammar (in #parseSchema(...)) for each 
schema document source.
I see several stages in XSDHandler's primary method, #parseSchema(...) 
(e.g. constructTrees, buildGlobalNameRegistries, traverseSchemas, 
traverseLocalElements, etc.).
It seems #constructTrees is a very important method for the xs:override 
implementation (although I don't have a complete understanding of its 
exact semantics), since it tries to resolve included and redefined schema 
components and build a dependency map. (Having said that, can we build 
custom dependency maps here for override components as well, to process 
the necessary schema semantics?) I see that Hiranya's patch focuses mostly 
on this method.
It would be very helpful if you could explain these schema parsing stages 
(in #parseSchema(...)) in a little detail, and how they relate to the 
xs:override implementation, so that I can get a better understanding and 
put the pieces of the puzzle together. I would also like to clarify what 
is meant by "globally" and "locally" declared components, as mentioned in 
#buildGlobalNameRegistries(), #traverseLocalElements(), etc. Thanks in 
advance.
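
The load-then-validate flow outlined above can be exercised through the 
standard JAXP validation API, as in the following minimal sketch. It is 
not taken from the Xerces-J sources: the class name, the inline schema and 
the isValid helper are made up for illustration. Which SchemaFactory 
implementation is actually returned depends on the classpath (the JDK's 
built-in one by default, or Xerces-J's if it is installed and selected).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical sketch using only standard JAXP calls.
public class LoadAndValidateSketch {

    // A tiny inline schema with one global element (illustrative only).
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='child1' type='xs:int'/>"
      + "</xs:schema>";

    public static boolean isValid(String instanceXml) {
        try {
            // Stage 1: load the schema document into a compiled Schema.
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema =
                factory.newSchema(new StreamSource(new StringReader(XSD)));

            // Stage 2: validate an instance document against that schema.
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(instanceXml)));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, validation errors are thrown.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<child1>42</child1>"));
        System.out.println(isValid("<child1>abc</child1>"));
    }
}
```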

I am currently writing a GSoC proposal for the xs:override implementation. 
I'll make a draft available to you as soon as possible, so that we can 
discuss any necessary modifications before I put the final submission in 
place. (The deadline for GSoC proposals is April 9.) Thanks again.

Regards,
Udayanga


On Thu, Mar 25, 2010 at 2:37 AM, Khaled Noaman <[email protected]> wrote:

Hi Udayanga, 

See my comments below (<kn>). 

Regards, 
Khaled




From: 
udayanga wickramasinghe <[email protected]> 
To: 
[email protected] 
Date: 
03/24/2010 04:46 PM 
Subject: 
Re: About Xerces projects for GSoc 2010




Hi Khaled,

Thanks for your feedback. Please see some of my comments below.


On Wed, Mar 24, 2010 at 9:13 PM, Khaled Noaman <[email protected]> wrote: 


Hi Udayanga, 

According to your example, B and B' would be considered 2 different 
documents and we would end up with conflicting components, not just e2 
(assuming B.xsd had other global components). The reason we consider B and 
B' as different documents is the fact that B' now contains a different 
declaration for e2. 
  
If xs:override did not apply from C->B, then in that case we can consider 
B and B' the same and there would be no duplicate components. 

Yes. Just to verify, what you mean is that if B.xsd has the following 
format,
Schema B 
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> 
  <xs:override schemaLocation="schemaC.xsd"> 
    <xs:element name="e1" type="xs:int"/> 
    <xs:element name="e2" type="xs:date"/> 
  </xs:override> 
  <xs:element name="e3" type="xs:string"/> 
</xs:schema> 
then the C->B override won't occur, since the overridden schema, B.xsd, 
doesn't have either element e2 or e1 to override. Hence there is no 
schematic difference between [B] and [B'], and the schema inclusion for 
both A.xsd and C.xsd would be identical. I suppose this is what you meant. 
<kn>Yes. That's what I meant</kn>
  

You would need to apply override to check for cyclical dependencies. As I 
mentioned above, if override does not modify the overridden schema, then 
the 2 schema documents (B and B') would be treated as the same (in other 
words, no duplicates). 

From the above example, I now see that only after applying the override 
can we say for sure whether cyclic dependency conflicts exist.
    
Consider the following case: 
A includes B and C; B and C override D. Now you end up with 2 versions of 
D (D' included by B and D'' included by C). If neither B nor C changes D, 
then both D' and D'' are considered the same. 
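
The rule used in this case, that an overridden document only becomes a 
distinct version when the override actually replaces one of its 
components, can be sketched as a simple name-intersection check. This is a 
hypothetical illustration, not Xerces-J code: the class and method names 
are invented, components are represented as plain "kind:name" strings, and 
treating any change at all as producing a distinct version is a 
conservative assumption of the sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: not part of Xerces-J.
public class OverrideEffectSketch {

    // Components of the target that the override would actually replace:
    // those declared both inside <xs:override> and in the target document.
    public static Set<String> replaced(Set<String> overrideChildren,
                                       Set<String> targetGlobals) {
        Set<String> hit = new HashSet<>(overrideChildren);
        hit.retainAll(targetGlobals);
        return hit;
    }

    // D' and D'' are considered the same when neither override changes D.
    // Assumption: if either one changes D, the versions are treated as
    // distinct, even if both happened to make identical changes.
    public static boolean sameVersion(Set<String> overrideFromB,
                                      Set<String> overrideFromC,
                                      Set<String> targetGlobals) {
        return replaced(overrideFromB, targetGlobals).isEmpty()
            && replaced(overrideFromC, targetGlobals).isEmpty();
    }

    public static void main(String[] args) {
        Set<String> d = new HashSet<>(Arrays.asList("element:e1", "element:e2"));
        Set<String> fromB = new HashSet<>(Arrays.asList("element:e9"));
        Set<String> fromC = new HashSet<>(Arrays.asList("element:e2"));
        System.out.println(replaced(fromC, d));
        System.out.println(sameVersion(fromB, fromB, d));
        System.out.println(sameVersion(fromB, fromC, d));
    }
}
```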
  

It would be great if you can start by looking at the following packages in 
Xerces-J:
* org.apache.xerces.impl.xs.traversers - schema processing classes 
(XSDHandler is a starting point)
* org.apache.xerces.impl.xs - classes representing the different schema 
components as well as the main class for schema validation 
(XMLSchemaValidator)
* org.apache.xerces.impl.xs.models - content model classes (e.g. DFA, all, 
empty) 

Sure, I'll go through the above implementations and interfaces and get 
back to you in case I want to clarify some finer points. Thanks for the 
details. By the way, are there any architecture docs or articles on Xerces 
XML Schema processing? (I found several docs related to Xerces2 parsers, 
XNI and validators, but not a lot on XML Schema.) Thanks again. 

<kn>Check the documentation for Xerces-J (
http://xerces.apache.org/xerces2-j/xml-schema.html). You can also take a 
look at the samples that are included as part of Xerces-J source 
code.</kn>

Regards,
Udayanga 






-- 
http://www.udayangawiki.blogspot.com
