Peter Kirk wrote:
On 03/11/2003 15:26, Markus Scherer wrote:
I suggest you try it out - http://oss.software.ibm.com/cgi-bin/icu/lx/en_US/utf-8/?_=he&EXPLORE_CollationElements=

ICU implements the UCA, including discontiguous contractions.

Thank you, Markus. Unfortunately the results are barely usable because they are in Arial Unicode MS or something (and cannot be changed) which simply fails to give a meaningful display of pointed Hebrew. There is a clear need for a mechanism for the user to specify a useful display font with good support for the text in question.

Sorry for that. I told the developer of the Locale Explorer about this, and he will look into it, although it may take a few weeks. He is thinking about an option to turn off emitting any CSS.


In the meantime, copying and pasting the results into another application should work. You may also be able to disable CSS in your browser or override the page's font setting.

But one thing is immediately clear. I sorted a set of shins with various combinations of shin and sin dot and dagesh, each followed by alef and separately by bet. The default collator sorted all the shin alefs before all the shin bets. This is probably correct for modern Hebrew. It is not the preferred ordering for biblical Hebrew.

Possible, and I am not an expert in Hebrew at all - collation or otherwise. I simply suggested this demo as a way to try out how the UCA works. In this case, the normalization on/off option is important for the discussion.
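To illustrate why the normalization option matters for pointed Hebrew, here is a minimal sketch using only Python's standard unicodedata module, not ICU itself. It assumes a shin typed with its marks in two different orders; the variable names are made up for the example. The two strings differ code point by code point, but they are canonically equivalent, so they only compare equal once both are normalized.

```python
import unicodedata

SHIN, DAGESH, SHIN_DOT = "\u05E9", "\u05BC", "\u05C1"

# Two spellings of the same pointed letter, differing only in mark order.
typed_a = SHIN + SHIN_DOT + DAGESH
typed_b = SHIN + DAGESH + SHIN_DOT

def nfd(text: str) -> str:
    """Return the canonical decomposition (NFD), which reorders marks."""
    return unicodedata.normalize("NFD", text)

# Canonical combining classes decide the normalized order of the marks:
# the mark with the lower class comes first.
classes = {unicodedata.name(m): unicodedata.combining(m)
           for m in (DAGESH, SHIN_DOT)}

raw_equal = typed_a == typed_b                # compared as typed
normalized_equal = nfd(typed_a) == nfd(typed_b)  # compared after NFD

print(classes)
print("equal as typed:", raw_equal)
print("equal after NFD:", normalized_equal)
```

A collator that skips normalization can therefore treat these two spellings as different strings, which is one reason the on/off setting can change the results for text with several points on one letter.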


Note that Hebrew collation in ICU uses not just the UCA but also a tailoring. You can edit the tailoring in the online demo, and even supply your own entirely. You could work out a tailoring for biblical Hebrew and use it in the demo as well as with runtime ICU libraries.
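As a rough illustration of what a tailoring changes, here is a toy two-level collator in plain Python. It is not ICU and does not use the real UCA weight table: base letters form the primary level, combining marks form the secondary level, and both are compared by code point. The tailoring argument and the promoted "sin" symbol are hypothetical stand-ins for real tailoring rules, and the tailored order shown is only one possible alternative, not a claim about the ordering preferred for biblical Hebrew.

```python
import unicodedata

ALEF, BET, SHIN = "\u05D0", "\u05D1", "\u05E9"
DAGESH, SHIN_DOT, SIN_DOT = "\u05BC", "\u05C1", "\u05C2"

def sort_key(word, tailoring=None):
    """Toy two-level key: base letters first, combining marks second.

    `tailoring` is a hypothetical dict mapping a (letter, mark) pair to a
    replacement primary symbol; it stands in for real tailoring rules.
    """
    tailoring = tailoring or {}
    primary, secondary = [], []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            # A tailored mark changes the primary weight of its base letter;
            # any other mark only contributes at the secondary level.
            promoted = tailoring.get((primary[-1], ch)) if primary else None
            if promoted is not None:
                primary[-1] = promoted
            else:
                secondary.append(ch)
        else:
            primary.append(ch)
    return (tuple(primary), tuple(secondary))

words = [
    SHIN + SHIN_DOT + BET,
    SHIN + SIN_DOT + ALEF,
    SHIN + SHIN_DOT + DAGESH + ALEF,
    SHIN + SIN_DOT + BET,
    SHIN + SHIN_DOT + ALEF,
]

# Untailored: marks are secondary, so every shin+alef precedes every shin+bet.
default_order = sorted(words, key=sort_key)

# Hypothetical tailoring: sin-dotted shin becomes its own primary letter,
# placed directly after shin.
sin_as_letter = {(SHIN, SIN_DOT): SHIN + "~"}
tailored_order = sorted(words, key=lambda w: sort_key(w, sin_as_letter))

for label, order in (("default", default_order), ("tailored", tailored_order)):
    print(label, [[f"U+{ord(c):04X}" for c in w] for w in order])
```

In the untailored run the dots and dagesh only break ties between words with the same letters, which matches the behaviour described above. In the tailored run the sin-dotted words form their own group, so a sin followed by alef sorts after a shin followed by bet.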

If you think that the current Hebrew tailoring is incorrect, then the best place to submit a bug is with the CLDR: http://www.openi18n.org/subgroups/lade/locale/index.htm

Best regards,
markus



