Hi all,
Based on my recollection of a few past discussions on this list (and a
few XercesJ Jira discussions as well), I have the feeling that Xerces's
XSD 1.1 XPath 2.0 regex implementation is slightly non-compliant. I'd
like to discuss that a little here.
The XPath 2.0 F&O regex requirements are specified at
https://www.w3.org/TR/xquery-operators/#regex-syntax [1].
My current analysis is that Xerces's XSD 1.1 XPath 2.0 regex
implementation complies to a great extent with the specification cited
above [1]. Section "7.6.1.1 Flags" of [1] says the following near the
end,
x: If present, whitespace characters (#x9, #xA, #xD and #x20) in the
regular expression are removed prior to matching with one exception:
whitespace characters within character class expressions (charClassExpr)
are not removed. This flag can be used, for example, to break up long
regular expressions into readable lines. [2]
We comply with the text [2] quoted above, except for this part of it:
"with one exception: whitespace characters within character class
expressions (charClassExpr) are not removed". Xerces's XPath 2.0 regex
implementation seems to remove whitespace from within character class
expressions as well when the "x" flag is present.
To test the above claim, I wrote the following XML Schema 1.1 example,
XML document:
<?xml version="1.0"?>
<X>123</X>
XML Schema 1.1 document:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <xs:element name="X">
      <xs:simpleType>
         <xs:restriction base="xs:string">
            <xs:assertion test="matches($value, '[0- 9]{3}', 'x')"/>
         </xs:restriction>
      </xs:simpleType>
   </xs:element>
</xs:schema>
For the above XSD 1.1 validation example, Xerces reports the instance
document as valid. In my opinion, the xs:assertion above should have
failed (i.e. should have returned false), since there's a space character
within [] (it's a regex character class) in the regex, and per [2] that
space must not be removed even when the "x" flag is present.
Other than the deficiency mentioned above, I find that Xerces's XSD 1.1
XPath 2.0 processor's regex implementation complies with the XPath 2.0
regex spec.
In fact, Xerces's XSD 1.1 XPath 2.0 processor's regex implementation
(specifically, the behaviour of the XPath 2.0 regex flag "x") behaves very
much like Java's regex support.
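For comparison, the following small Java program shows how java.util.regex
treats the same regex in its free-spacing mode (Pattern.COMMENTS), which is
the closest Java counterpart of the "x" flag. The class and method names
are only for illustration, and I'm assuming here that Pattern.COMMENTS is
the relevant Java behaviour to compare against; I haven't checked whether
Xerces actually delegates to it.

```java
import java.util.regex.Pattern;

public class JavaCommentsMode {

    // Returns true if the whole input matches the regex when it is compiled
    // with Pattern.COMMENTS (whitespace in the pattern is ignored).
    public static boolean matchesWithComments(String regex, String input) {
        return Pattern.compile(regex, Pattern.COMMENTS).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Same regex and value as in the xs:assertion example.
        System.out.println(matchesWithComments("[0- 9]{3}", "123"));
    }
}
```

As far as I can tell, Java ignores the space inside the brackets in this
mode and treats the class as [0-9], which is the same outcome Xerces gives
for the xs:assertion above.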
I'd be happy to continue the discussion on this topic.
--
Regards,
Mukul Gandhi