Question:
I've heard that the differences between Unicode v1.1 and v2.0 are the
relocation of the Hangul characters, the addition of some Hebrew and
Tibetan characters, and some bug fixes. In particular, this is what Oracle
support stated:
* Fixed decompositions with TONOS to use correct NSM: 030D.
* Removed old Hangul Syllables; mappings to new characters are in a
separate table.
* Marked compatibility decompositions with additional tags.
* Changed old tag names for clarity.
* Revision of decompositions to use first-level decomposition, instead of
maximal decomposition.
* Correction of all known errors in decompositions from earlier versions.
* Added control code names (as old Unicode names).
* Added Hangul Jamo decompositions.
* Added Number category to match properties list in book.
* Fixed categories of Koranic Arabic marks.
* Fixed categories of precomposed characters to match decomposition where
possible.
* Added Hebrew cantillation marks and the Tibetan script.
* Added place holders for ranges such as CJK Ideographic Area and the
Private Use Area.
* Added categories Me, Sk, Pc, Nl, Cs, Cf, and rectified a number of
mistakes in the database.
Generally, one of the most substantial differences between 1.1 and 2.0 is
in Korean characters. They are mapped to different code points, so if you
use them please be aware that their values will differ between the two
versions.
So here is my question:
I'm trying to determine the potential harm of an Oracle 7.3 database
publishing data in AL24UTFFSS character set (which is UTF8 for Unicode v1.1)
to a client that accepts UTF8 (Unicode v2.0) if the publisher isn't
publishing any Hangul characters.
The characters that have been added don't matter, because the Unicode v1.1
database is the publisher, not the subscriber. So the only other potential
issue would be "Revision of decompositions and correction of
decompositions", assuming Oracle gave out precomposed characters in
decomposed form. Otherwise, I don't see a problem. The program receiving
the AL24UTFFSS data and transferring it to the UTF8 application may do its
own decomposition, but that program is based on functions that are part of
Unicode 2.0 and Unicode 3.0, so that should work just fine. The only
issue is whether the wrong code points are received from Oracle. Obviously
yes when sending Hangul, but I would think not in any other case.
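
To make the Hangul check concrete, here is a minimal sketch of how the
receiving program could flag suspect data. It assumes the Unicode 1.1 Hangul
syllables occupied U+3400 through U+4DFF (the block that was removed in 2.0)
and that the AL24UTFFSS byte stream decodes as ordinary UTF-8. The function
name find_old_hangul is hypothetical, not part of any Oracle or Unicode API.

```python
# Assumed range of the Unicode 1.1 Hangul syllables that were removed in 2.0.
OLD_HANGUL_FIRST = 0x3400
OLD_HANGUL_LAST = 0x4DFF


def find_old_hangul(data: bytes):
    """Return (character offset, code point) pairs inside the old Hangul range.

    Assumes the bytes received from the database decode as plain UTF-8.
    """
    text = data.decode("utf-8")
    return [
        (offset, ord(ch))
        for offset, ch in enumerate(text)
        if OLD_HANGUL_FIRST <= ord(ch) <= OLD_HANGUL_LAST
    ]


if __name__ == "__main__":
    # Hypothetical sample: Latin text, one old-range code point, one Hebrew letter.
    sample = "abc \u3400 \u05d0".encode("utf-8")
    for offset, cp in find_old_hangul(sample):
        print(f"offset {offset}: U+{cp:04X} is in the old Hangul range")
```

If this reports nothing for the published data, then by the reasoning above
the relocated Hangul block should not be the source of any mismatch.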
Any insights?
Thanks,
Oodi