Bizarre interaction between choice restrictions and substitutionGroups
----------------------------------------------------------------------

         Key: XERCESJ-1032
         URL: http://nagoya.apache.org/jira/browse/XERCESJ-1032
     Project: Xerces2-J
        Type: Bug
  Components: XML Schema Structures  
    Versions: 2.6.2    
 Environment: java 1.4 on linux
    Reporter: Lucian Holland
 Attachments: test1.xsd

You've probably seen this one before, but there's a really wacky interaction 
between the way substitutionGroups work and how choice restrictions are 
specified. The particular oddness that I'm looking at results from the 
combination of three rules from the schema spec:

1) substitutionGroups are supposed to be expanded into choices prior to 
checking the validity of a restriction
2) When validating that a choice particle is a valid restriction of another 
choice particle, MapAndSum specifies an *order-preserving* mapping between the 
particles of the base and the restriction
3) The order of the particles in the generated choice for a substitutionGroup 
is nowhere specified.
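
To make rule 2 concrete, here is a minimal sketch of an order-preserving mapping check. It is a deliberate simplification and not the Xerces implementation: particles are reduced to plain element names, "is a valid restriction of" is reduced to name equality, and the class and method names (OrderPreservingMapping, mapsInOrder) are hypothetical. It only illustrates why the same set of elements can pass or fail depending on the order in which the generated choice lists them.

```java
import java.util.Arrays;
import java.util.List;

public class OrderPreservingMapping {

    /**
     * Returns true if every name in the restriction can be matched to an
     * equal name in the base while only ever moving forward through the
     * base list. Base particles may be skipped, but never revisited.
     * (Simplified model: a real check compares particles, not names.)
     */
    public static boolean mapsInOrder(List<String> base, List<String> restriction) {
        int b = 0;
        for (String r : restriction) {
            // Advance through the base until a matching particle is found.
            while (b < base.size() && !base.get(b).equals(r)) {
                b++;
            }
            if (b == base.size()) {
                return false; // ran out of base particles: no ordered mapping
            }
            b++; // this base particle is used up; continue after it
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical expansion of the base substitutionGroup.
        List<String> base = Arrays.asList("Eb", "Er", "Es");

        // Same members in the restriction, listed in two different orders.
        System.out.println(mapsInOrder(base, Arrays.asList("Er", "Es"))); // true
        System.out.println(mapsInOrder(base, Arrays.asList("Es", "Er"))); // false
    }
}
```

Both restriction lists contain exactly the same elements; only the iteration order differs, and that alone decides whether the mapping exists.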

As a consequence, if you have an element Er in a restricting type R of a base 
type B, corresponding to an element Eb in the base type, then you run into 
difficulties if Er is in the substitutionGroup of Eb and there is at least one 
element in the substitutionGroup of Er (see the attached schema for an 
example). Basically, what you end up with is two choice groups of undefined 
ordering, one of which is supposed to be a restriction of the other.

Currently, the ordering that Xerces applies is a haphazard product of the way 
that loop iterations are conducted in the SubstitutionGroupHandler and 
XSConstraints; as a result, schemas of this nature are almost always marked as 
invalid due to a MapAndSum error.

The schema spec is silent on the ordering of the generated choices, so it 
leaves the validity of such schemas open, and the schemas in question are 
"obviously valid" to a human reader. I would therefore like to propose a patch 
that ensures that substitutionGroups always have a defined ordering (I picked 
an ordering on namespace, then localname, but it isn't really important 
provided that it's consistent). With the patch, Xerces reports schemas of the 
form described above as valid; given that all substitutionGroups are now 
consistently ordered, I don't see that this could have any negative 
side-effects from the point of view of validation.

I see only two potential downsides:

1) The performance of using a TreeSet rather than a Vector. I have done some 
very unscientific testing on some pretty large schemas with very heavy use of 
substitutionGroups, and I couldn't see any measurable difference.
2) The ordering reported for the elements in a substitutionGroup by the XSModel 
obviously changes. I don't see this as particularly significant, since I 
believe the ordering would already have been non-deterministic for 
substitutionGroups that spanned multiple grammars: the grammar bucket is based 
around a Hashtable.
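
The proposed ordering can be sketched roughly as follows. This is an illustration under stated assumptions, not the attached patch: ElementName is a hypothetical stand-in for an element declaration (it is not a Xerces class), and orderedMembers is a made-up helper. The only real API used is java.util.TreeSet with a java.util.Comparator that compares on namespace first and local name second.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class OrderedSubstitutionGroup {

    /** Hypothetical stand-in for an element declaration; not a Xerces class. */
    public static final class ElementName {
        final String namespace;
        final String localName;

        public ElementName(String namespace, String localName) {
            // Treat "no namespace" as the empty string so it sorts first.
            this.namespace = (namespace == null) ? "" : namespace;
            this.localName = localName;
        }

        @Override
        public String toString() {
            return "{" + namespace + "}" + localName;
        }
    }

    /** Orders by namespace, then by local name. */
    public static final Comparator<ElementName> BY_NAMESPACE_THEN_LOCAL_NAME =
            Comparator.comparing((ElementName e) -> e.namespace)
                      .thenComparing(e -> e.localName);

    /**
     * Returns the members in a defined order that does not depend on the
     * order in which they were discovered.
     */
    public static List<String> orderedMembers(List<ElementName> members) {
        TreeSet<ElementName> sorted = new TreeSet<>(BY_NAMESPACE_THEN_LOCAL_NAME);
        sorted.addAll(members);
        List<String> out = new ArrayList<>();
        for (ElementName e : sorted) {
            out.add(e.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        ElementName ax = new ElementName("urn:a", "x");
        ElementName ay = new ElementName("urn:a", "y");
        ElementName ba = new ElementName("urn:b", "a");

        // Two different discovery orders yield the same reported order.
        System.out.println(orderedMembers(Arrays.asList(ba, ay, ax)));
        System.out.println(orderedMembers(Arrays.asList(ay, ax, ba)));
    }
}
```

Note that a TreeSet treats two entries that compare as equal as duplicates. That is acceptable here only on the assumption that namespace plus local name uniquely identifies a global element declaration.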


