Bizarre interaction between choice restrictions and substitutionGroups
----------------------------------------------------------------------
Key: XERCESJ-1032
URL: http://nagoya.apache.org/jira/browse/XERCESJ-1032
Project: Xerces2-J
Type: Bug
Components: XML Schema Structures
Versions: 2.6.2
Environment: java 1.4 on linux
Reporter: Lucian Holland
Attachments: test1.xsd
You've probably seen this one before, but there's a really wacky interaction
between the way substitutionGroups work and how choice restrictions are
specified. The particular oddness that I'm looking at results from the
combination of three rules from the schema spec:
1) substitutionGroups are supposed to be expanded into choices prior to
checking the validity of a restriction
2) When validating that a choice particle is a valid restriction of another
choice particle, MapAndSum specifies an *order-preserving* mapping between the
particles of the base and the restriction
3) The order of the particles in the generated choice for a substitutionGroup
is nowhere specified.
As a consequence, if you have an element Er in a restricting type R of a base
type B, corresponding to an element Eb in the base type, then you run into
difficulties if Er is in the substitutionGroup of Eb and there is at least one
element in the substitutionGroup of Er (see the attached schema for an
example). Basically what you end up with is two choice groups of undefined
ordering, one of which is supposed to be a restriction of the other.
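
To make the order dependence concrete, here is a minimal sketch in Java. It is not Xerces code: mapsInOrder is a hypothetical, heavily simplified stand-in for the order-preserving mapping in rule 2, and it compares particles by plain string equality, whereas the real check applies the spec's particle restriction rules. The element names are the ones used above, plus an assumed extra member Ex.

```java
import java.util.Arrays;
import java.util.List;

public class OrderPreservingMapDemo {
    // Simplified stand-in for an order-preserving mapping: every particle of
    // the restriction must match some particle of the base, and the matched
    // base positions must be strictly increasing.
    static boolean mapsInOrder(List<String> restriction, List<String> base) {
        int next = 0; // first base position still available for matching
        for (String particle : restriction) {
            int found = -1;
            for (int i = next; i < base.size(); i++) {
                if (base.get(i).equals(particle)) {
                    found = i;
                    break;
                }
            }
            if (found < 0) {
                return false; // no match at or after the last matched position
            }
            next = found + 1;
        }
        return true;
    }

    public static void main(String[] args) {
        // Expanded choice for the base element (one possible ordering).
        List<String> base = Arrays.asList("Eb", "Er", "Ex");

        // Same restricting choice, generated in two different orders.
        System.out.println(mapsInOrder(Arrays.asList("Er", "Ex"), base)); // true
        System.out.println(mapsInOrder(Arrays.asList("Ex", "Er"), base)); // false
    }
}
```

The two calls describe the same set of elements; only the order in which the generated choice happens to list them differs, and that alone flips the result.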
Currently, the ordering that Xerces applies is a haphazard product of the way
that loop iterations are conducted in the SubstitutionGroupHandler and
XSConstraints; as a result, schemas of this nature are almost always marked as
invalid due to a MapAndSum error.
Given that the validity of such schemas is left open by the schema spec due to
its silence on the ordering of the generated choices, and the fact that the
schemas in question are "obviously valid" to a human reader, I would like to
propose a patch that ensures that substitutionGroups always have a defined
ordering (I picked an ordering on namespace, then localname, but it isn't
really important provided that it's consistent). The patch means that Xerces
reports schemas of the form described above as valid; given that all
substitutionGroups are now consistently ordered, I don't see that this could
have any negative side-effects from the point of view of validation. The only
potential downsides are the performance of using a TreeSet rather than a Vector
(I have done some very unscientific testing on some pretty large schemas with
very heavy use of substitutionGroups, and I couldn't see any measurable
difference) and the fact that it obviously changes the ordering reported for
the elements in a substitutionGroup by the XSModel (but I don't see this as
particularly significant since I believe the ordering would already have been
non-deterministic for subGroups that spanned multiple grammars - the grammar
bucket is based around a Hashtable).
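
The ordering idea can be sketched as follows. This is an illustration only, not the attached patch: ElementName, compareNames and ordered are hypothetical names, ElementName stands in for whatever element declaration type the handler really stores, and treating a missing namespace as the empty string is an assumption made here for simplicity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class OrderedSubstitutionGroupDemo {
    // Hypothetical stand-in for an element declaration; not a Xerces class.
    static final class ElementName {
        final String namespace; // null means "no namespace"
        final String localName;

        ElementName(String namespace, String localName) {
            this.namespace = namespace;
            this.localName = localName;
        }

        @Override
        public String toString() {
            return "{" + namespace + "}" + localName;
        }
    }

    // Order on namespace first, then on local name.
    // Assumption: a null namespace sorts as the empty string.
    static int compareNames(ElementName a, ElementName b) {
        String nsA = a.namespace == null ? "" : a.namespace;
        String nsB = b.namespace == null ? "" : b.namespace;
        int byNamespace = nsA.compareTo(nsB);
        return byNamespace != 0 ? byNamespace : a.localName.compareTo(b.localName);
    }

    static final Comparator<ElementName> BY_NAME =
            OrderedSubstitutionGroupDemo::compareNames;

    // Collect group members in a TreeSet so that iteration order depends
    // only on the names, not on the order in which members were added.
    static List<String> ordered(ElementName... members) {
        SortedSet<ElementName> group = new TreeSet<ElementName>(BY_NAME);
        for (ElementName member : members) {
            group.add(member);
        }
        List<String> names = new ArrayList<String>();
        for (ElementName member : group) {
            names.add(member.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        ElementName x = new ElementName("urn:b", "x");
        ElementName z = new ElementName("urn:a", "z");
        ElementName y = new ElementName("urn:a", "y");

        // Two different insertion orders give the same iteration order.
        System.out.println(ordered(x, z, y)); // [{urn:a}y, {urn:a}z, {urn:b}x]
        System.out.println(ordered(y, x, z)); // [{urn:a}y, {urn:a}z, {urn:b}x]
    }
}
```

One thing to keep in mind with this approach is that a TreeSet treats two entries that compare as equal as the same entry, so the comparator has to distinguish every pair of members that should be kept separately.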