The ACES image container specification, meant to be compatible with OpenEXR,
prescribes UTF-8 for the representation of strings. Therefore I suggest
that OpenEXR adopt the following rules:
- All text strings are to be interpreted as Unicode, encoded as UTF-8.
This includes attribute names and strings contained in attributes,
for example, channel names.
- Text strings stored in files must be in Normalization Form C (NFC,
canonical decomposition followed by canonical composition).
- Where text strings need to be collated, strcmp() is used to compare
the corresponding char sequences: string A comes before (or is less
than) string B if
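strcmp(A,B) < 0
(Note: strcmp() is only guaranteed to return some negative value in
this case, not exactly -1. The ordering itself is not ambiguous; the
C99 standard specifies that strcmp() interprets the bytes that make up
a string as unsigned char.)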
- Text strings passed to the IlmImf library must be encoded as UTF-8
and in Normalization Form C.
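The collation rule above can be sketched in a few lines of C. This is
only an illustration: the helper names name_comes_before() and
names_equal() are made up for this example and are not part of the
IlmImf library.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if UTF-8 string a collates before b.
   strcmp() compares bytes as unsigned char, so only the sign of its
   result is meaningful. For valid UTF-8 this byte order is the same
   as Unicode code point order. */
bool name_comes_before(const char *a, const char *b)
{
    return strcmp(a, b) < 0;
}

/* Hypothetical helper: true if both strings are byte-for-byte equal.
   Two strings that render identically still differ here unless both
   are in the same normalization form. */
bool names_equal(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

For example, "Z" (byte 0x5A) comes before a precomposed "é" (bytes
0xC3 0xA9). The precomposed "é" and the decomposed sequence "e" plus
U+0301 (bytes 0x65 0xCC 0x81) compare as different names, which is why
the rules require a single normalization form.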
As far as I can tell, these rules are entirely compatible with all
existing versions of the IlmImf library. Users whose writing system
includes non-ASCII Unicode characters can continue to employ the
existing library versions without change.
Future versions of the library should verify that text strings are
valid UTF-8. In addition, the library should either verify that
strings are normalized to NFC, or normalize to NFC on the fly.
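As a rough sketch of what the UTF-8 check could look like, here is a
small validator in C. It is an assumption about one possible approach,
not existing library code; is_valid_utf8() is a hypothetical name. It
rejects stray or missing continuation bytes, overlong encodings,
surrogates, and code points above U+10FFFF. It does not check or
perform NFC normalization, which needs Unicode data tables (for
example from a library such as ICU).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of IlmImf: returns true if the
   NUL-terminated string s is well-formed UTF-8. */
bool is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned char c = *p;
        size_t n;          /* continuation bytes expected */
        unsigned long cp;  /* decoded code point */
        unsigned long min; /* smallest code point valid for this length */

        if (c < 0x80) {    /* plain ASCII */
            p++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            n = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            n = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            n = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;  /* stray continuation or invalid lead byte */
        }

        p++;
        for (size_t i = 0; i < n; i++, p++) {
            /* A terminating NUL fails this test too, so a truncated
               sequence is rejected without reading past the end. */
            if ((*p & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min)
            return false;  /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;  /* UTF-16 surrogate */
        if (cp > 0x10FFFF)
            return false;  /* beyond the Unicode range */
    }
    return true;
}
```

A caller could run such a check on attribute and channel names before
writing a file and reject any name that fails it.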
Florian
_______________________________________________
Openexr-devel mailing list
https://lists.nongnu.org/mailman/listinfo/openexr-devel