If such a rule exists, it means that the only supported character *repertoire* is ISO10646.

The repertoire is the set of abstract characters, and this means that no product can use custom assignments within the ISO10646 codepoint space which are not backed by a standard and published agreement. This would certainly not rule out PUAs, but it would rule out any attempt to create software that uses non-PUA codepoints which are still reserved and unassigned, or permanently assigned as non-characters. It also means that using PUAs without an explicit agreement would be ruled out.
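To make the distinction concrete, here is a minimal Python sketch that sorts a codepoint into the classes discussed above. The function name classify is only an illustration, and the "unassigned" answer depends on the Unicode version bundled with the Python interpreter, which is not necessarily the current ISO10646 edition.

```python
import unicodedata

def classify(cp: int) -> str:
    """Sort a codepoint into the classes discussed above (illustrative helper)."""
    if not 0 <= cp <= 0x10FFFF:
        return "out of range"
    # The 66 permanent non-characters: U+FDD0..U+FDEF, plus the last two
    # codepoints of every plane (U+xxFFFE and U+xxFFFF).
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        return "noncharacter"
    category = unicodedata.category(chr(cp))
    if category == "Co":
        return "private use"   # needs an explicit private agreement
    if category == "Cs":
        return "surrogate"
    if category == "Cn":
        return "unassigned"    # reserved, as of this Python's Unicode data
    return "assigned"

if __name__ == "__main__":
    for cp in (0x0041, 0xE000, 0xFDD0, 0xFFFE, 0x10FFFF):
        print(f"U+{cp:04X}: {classify(cp)}")
```

A checker along these lines could only flag the reserved and non-character cases; whether a given PUA codepoint is backed by an explicit agreement is not something the code can know.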

ISO10646 just defines the repertoire of characters and the blocks assigned to encoded scripts, with codepoint assignments, normative English and French character names, some usage hints, and a representative glyph for each character. It does not define any other property.

ISO10646 does not define the charset mappings to ISO10646 streams. It does not define the normalization forms (which create *distinct* strings, although Unicode may consider these texts canonically or compatibility "equivalent"), nor any compatibility or decomposition mappings!
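As an illustration of why normalization forms produce *distinct* strings, here is a small Python sketch using the standard unicodedata module (which implements the Unicode normalization forms, not anything from ISO10646 itself). The sample strings are my own choice.

```python
import unicodedata

# Two encodings of the same text: precomposed U+00E9, and "e" followed by
# U+0301 COMBINING ACUTE ACCENT.
composed = "\u00e9"
decomposed = "e\u0301"

# As sequences of codepoints they are different strings...
same_codepoints = composed == decomposed

# ...and only become equal after normalization, which Unicode defines.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
same_after_nfd = unicodedata.normalize("NFD", composed) == decomposed

# Compatibility equivalence goes further: U+FB01 (LATIN SMALL LIGATURE FI)
# maps to the two letters "fi" under NFKC, but is left alone by NFC.
ligature = "\ufb01"
nfkc_ligature = unicodedata.normalize("NFKC", ligature)
nfc_ligature = unicodedata.normalize("NFC", ligature)

if __name__ == "__main__":
    print(same_codepoints, same_after_nfc, same_after_nfd)
    print([f"U+{ord(c):04X}" for c in nfkc_ligature])
```

A rule that only names the ISO10646 repertoire says nothing about which of these forms a product must accept or produce.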

If such a rule is applied, it means that the only acceptable charsets for the EU are those defined with a published charset mapping to legal ISO10646 codepoints. It won't prevent using software that needs custom combining classes... And it won't prevent using charsets which are not Unicode UTFs or encoding schemes, and it won't require that all software support the mappings to Unicode itself: it is enough that the software uses supported (even if limited) charsets that can be transcoded and interpreted externally as ISO10646 streams of codepoints.

Under these circumstances, software supporting only one of the ISO-8859 charsets, or Windows ANSI or OEM codepages, or one of the legacy MacOS encodings, is legal (the MacOS case is legal because MacOS defines an explicit agreement mapping its non-standard Apple logo to a well-defined PUA codepoint, this agreement being valid only on MacOS, other systems handling the character as an unknown but legal PUA).
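A rough sketch of what "transcoded externally" can look like, using the charset tables shipped with Python's standard codecs. I am assuming those tables reflect the published mappings; the helper name to_codepoints and the sample bytes are mine.

```python
import unicodedata

def to_codepoints(data: bytes, charset: str) -> list[int]:
    """Transcode legacy bytes to a list of codepoints via a published mapping."""
    return [ord(ch) for ch in data.decode(charset)]

# One byte from each family of legacy charsets mentioned above.
SAMPLES = [
    ("iso8859-15", b"\xa4"),  # euro sign in ISO-8859-15
    ("cp1252", b"\x80"),      # euro sign in the Windows "ANSI" codepage 1252
    ("mac_roman", b"\xf0"),   # Apple logo in the legacy MacRoman encoding
]

if __name__ == "__main__":
    for charset, data in SAMPLES:
        for cp in to_codepoints(data, charset):
            # Category "Co" marks a private-use codepoint.
            print(charset, f"U+{cp:04X}", unicodedata.category(chr(cp)))
```

The MacRoman byte lands on U+F8FF, a private-use codepoint, which matches the Apple logo case described above: legal, but meaningful only under Apple's own agreement.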

Conversely, software working with non-explicit or ambiguous charset mappings would become illegal. This may effectively be a problem for Linux/Unix software that is not written to correctly handle at least the user's locale to disambiguate the charset used in filenames (the same is true for FAT filesystems, but not for FAT32 with UTF-16 encoded LFN support enabled).
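The filename ambiguity can be shown with a short Python sketch. The byte string below is a made-up example of a name as stored by a filesystem that records bytes but not their charset; the helper names are hypothetical.

```python
# Raw filename bytes as a filesystem might store them, with no charset label.
raw_name = b"caf\xe9.txt"

def decode_name(raw: bytes, charset: str) -> str:
    """Interpret raw filename bytes under an explicitly chosen charset."""
    return raw.decode(charset, errors="strict")

def is_valid(raw: bytes, charset: str) -> bool:
    """Report whether the bytes form a valid sequence in the given charset."""
    try:
        raw.decode(charset, errors="strict")
        return True
    except UnicodeDecodeError:
        return False

# The same bytes give different codepoints under different charsets...
as_latin1 = decode_name(raw_name, "iso8859-1")
as_cp437 = decode_name(raw_name, "cp437")

# ...and are not even a valid sequence under UTF-8.
valid_utf8 = is_valid(raw_name, "utf-8")

if __name__ == "__main__":
    print(as_latin1, as_cp437, valid_utf8)
```

Without the user's locale (or some other explicit declaration) to pick one charset, there is no single ISO10646 interpretation of such a name. In real programs, os.fsdecode() applies the interpreter's filesystem encoding for this purpose.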

It may also be a problem for systems working with proprietary charsets that are defined in terms of glyph variants instead of abstract characters: a text encoded with separate glyph IDs that can't easily be mapped to ISO10646 codepoints would become inappropriate for some regulated European markets (but such software would still be usable by users agreeing privately with the software provider, provided that the provider does not display a false "CE" compliance logo on its product).

But if you have pointers to more precise rules, going further than the ISO10646 repertoire and integrating other rules (from Unicode or from other European or international standards based on ISO10646), feel free to inform this list!

----- Original Message ----- From: "E. Keown" <[EMAIL PROTECTED]>
To: <unicode@unicode.org>; <[EMAIL PROTECTED]>
Sent: Thursday, December 23, 2004 11:38 PM
Subject: ISO 10646 compliance and EU law



If you reply, please cc: me--I'm always 'on vacation'
from all lists, and I only read them online.  I'm
looking for a URL for the info below.

A reliable friend told me that compliance with ISO
10646 is now part of the legal structure of the EU
(European Union).

He thought that it is illegal under certain
circumstances to sell non-ISO 10646-compliant software
in the EU.

That is, if I develop Hebrew/Castilian software which
uses custom combining classes or other deviations from
Unicode Hebrew, this software cannot be sold in Spain
(in EU since 1986).




