Elliotte Rusty Harold wrote:
> A W3C XML Schema Language validator needs a character based API to
> correctly implement the minLength and maxLength facets on xsd:string

As far as I understand, xsd:string is a list of "Character"-s, and a "Character" is an integer which can hold any valid Unicode code point. In other words, xsd:string is necessarily in UTF-32 (or something close to it): it cannot be in UTF-8 or UTF-16.

The numbers returned by length, minLength and maxLength are the actual, minimum and maximum number of *list elements* contained in the list. I.e., in the case of xsd:string, the *size* of the string in *encoding units*.

The fact that, in UTF-32, the *size* of the string in encoding units corresponds to the number of "characters" is coincidental. In any case, the useful information is always the *size* of the string in encoding units (octets for UTF-8, 16-bit units for UTF-16, etc.), not the number of "characters" it contains.

_ Marco
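
To make the distinction between encoding units concrete, here is a minimal Python sketch. It is not part of the original message: the helper name `lengths` and the sample string are illustrative assumptions. It reports the size of one string as a count of code points (which matches the number of UTF-32 units), UTF-16 code units and UTF-8 octets.

```python
def lengths(s: str) -> dict:
    """Return the size of s in three different units (hypothetical helper)."""
    return {
        # Python 3 str is a sequence of code points, so len() counts
        # code points, i.e. the number of UTF-32 encoding units.
        "code_points": len(s),
        # UTF-16 uses 2 octets per code unit; "-le" avoids emitting a BOM.
        "utf16_units": len(s.encode("utf-16-le")) // 2,
        # UTF-8 code units are single octets.
        "utf8_octets": len(s.encode("utf-8")),
    }


# 'a' (U+0061), 'é' (U+00E9), MUSICAL SYMBOL G CLEF (U+1D11E)
sample = "a\u00e9\U0001D11E"

print(lengths(sample))
# {'code_points': 3, 'utf16_units': 4, 'utf8_octets': 7}
```

The three counts differ as soon as the string leaves ASCII, so a length facet only has a definite meaning once the unit being counted is fixed.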