Greg Iwinski created XERCESC-2063:
-------------------------------------

             Summary: A 4-byte UTF-8 character incorrectly failing maxLength facet.
                 Key: XERCESC-2063
                 URL: https://issues.apache.org/jira/browse/XERCESC-2063
             Project: Xerces-C++
          Issue Type: Bug
          Components: Validating Parser (XML Schema)
    Affects Versions: 3.1.3
         Environment: Windows (Affects all OS)
            Reporter: Greg Iwinski


A 4-byte UTF-8 character incorrectly fails the maxLength facet.
The data is F0 9D 90 80, a 4-byte UTF-8 sequence that represents a single character (U+1D400).
Running sax2count.exe fails with:
Error at file input.xml, line 4, char 17
  Message: value '??' has length '2' which exceeds maxLength facet value '1'

This looks like a known limitation, but I could not find it documented anywhere
in the bug list.
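
A possible explanation, offered as an assumption rather than a confirmed diagnosis: Xerces-C++ holds text as XMLCh, a UTF-16 code unit, and U+1D400 lies outside the Basic Multilingual Plane, so it needs a surrogate pair. If the facet check counts code units instead of characters, it would report length 2. The sketch below is not Xerces-C++ code; decodeUtf8 and utf16UnitCount are hypothetical helpers that only illustrate the two ways of counting.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not part of Xerces-C++: decode well-formed UTF-8
// into Unicode code points. Malformed input is not validated.
std::vector<std::uint32_t> decodeUtf8(const std::string& bytes) {
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else                         { cp = b & 0x07; extra = 3; }
        for (std::size_t k = 1; k <= extra && i + k < bytes.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// Hypothetical helper: number of UTF-16 code units needed for the code points.
// Anything above U+FFFF takes a surrogate pair, i.e. two units.
std::size_t utf16UnitCount(const std::vector<std::uint32_t>& codePoints) {
    std::size_t units = 0;
    for (std::uint32_t cp : codePoints) {
        units += (cp > 0xFFFF) ? 2 : 1;
    }
    return units;
}
```

For the bytes F0 9D 90 80, decodeUtf8 yields one code point (0x1D400) while utf16UnitCount yields 2, which matches the length in the error message. A maxLength check on xs:string is expected to count characters, so the code-point count is the one that should be compared against the facet.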

**Input XML**

<?xml version="1.1" encoding="UTF-8"?>
<Root xmlns="http://www.example.org/Test"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.example.org/Test Input.xsd">
        <Data>𝐀</Data>
</Root>


**Schema**

<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://www.example.org/Test"
        elementFormDefault="qualified"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:tns="http://www.example.org/Test">
<element name="Root">
<complexType>
<sequence>
<element name="Data">
<simpleType>
<restriction base="string">
<maxLength value="1"/>
</restriction>
</simpleType>
</element>
</sequence>
</complexType>
</element>
</schema>



