On Friday, June 27, 2003 5:53 PM, Karljürgen Feuerherm <[EMAIL PROTECTED]> wrote:
> And in any case this should NOT muck things up which aren't broken,
> like MH.

Not breaking Modern Hebrew means not changing the combining classes of the characters it uses. Adding a distinct set of characters for Traditional Hebrew may then be the only practical solution: after all, there are many such concessions in ISO 10646, which did not try to unify Greek and Cyrillic even though these two scripts are closely related...

With Unicode, there is no solution for now, so scholars will need to develop their own "legacy" encoding, with distinct mappings to a future version of the ISO 10646 and Unicode standards. For interoperability with existing documents that use this legacy 8-bit encoding, the need will then arise to map it to a distinct set of characters in ISO 10646 and Unicode. This would be the end of the nightmare. What Unicode would then publish is a set of *compatibility* equivalences between the new diacritics for Traditional Hebrew and the existing diacritics for Modern Hebrew.

I'm curious to see how legacy 8-bit documents are used with Biblical texts... Are the current conversion tables (informative in the Unicode database) for the ISO and Windows charsets correct from that perspective? If so, the conversion from these 8-bit encodings to Unicode would be less simple than a plain one-to-one mapping, as it would require looking at the position of the diacritics in the 8-bit text to see whether they can safely be normalized once in Unicode, according to their relative combining classes.

-- Philippe.
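As a rough illustration of the check described in the last paragraph above, here is a minimal Python sketch. It assumes the 8-bit text has already been mapped one-to-one to Unicode code points, and asks whether normalization would then reorder the diacritics. The function name `reordered_by_normalization` and the sample sequence (lamed, patah, hiriq) are my own choices for illustration, not part of any conversion table; only the standard `unicodedata` module is used.

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED, combining class 0
PATAH = "\u05B7"  # HEBREW POINT PATAH, combining class 17
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ, combining class 14


def reordered_by_normalization(text: str) -> bool:
    """Return True if canonical reordering would swap any adjacent marks."""
    previous = 0
    for ch in text:
        current = unicodedata.combining(ch)
        # Two adjacent marks are swapped when the first has a higher
        # class than the second and the second is itself non-zero.
        if current != 0 and previous > current:
            return True
        previous = current
    return False


# Hypothetical result of a one-to-one mapping from an 8-bit source
# in which the patah was stored before the hiriq.
converted = LAMED + PATAH + HIRIQ
normalized = unicodedata.normalize("NFC", converted)

print(reordered_by_normalization(converted))   # True
print(normalized == LAMED + HIRIQ + PATAH)     # True: hiriq now precedes patah
```

A converter could run such a check on each base letter plus its marks and flag the sequences where the original order would not survive normalization, instead of silently emitting them.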