Hello all,

The question of case changing in Greek has come up in another thread. Whilst the details here aren't XeTeX (or even TeX) specific, given the interest shown by members of the list I hope I can take the opportunity to ask about the area.
For work on LaTeX3/expl3 we've put together an approach to case changing in XeTeX (and LuaTeX) that is not tied to a 1-1 mapping. One of the design ideas behind the code was to allow a way to tackle context- and language-dependent changes. At the same time, to date we have used the Unicode docs to define case mappings. Thus the 'standard' mappings follow those in UnicodeData.txt (1-1 lower/title/upper) and SpecialCasing.txt (more complex cases).

Included in that 'standard' setup is the final sigma rule for Greek text. For performance reasons that code has been set up to assume that a sigma is final if it is followed by a space, a control sequence or a character from the list

    ) ] } . : ; , ! ? ' "

Other potential additions are welcome, as is testing of what we have done. (There seem to be a lot of edge cases. For example, what happens if a sigma is immediately followed by a number, say in a computational identifier?)

What has not been covered at all to date is any special handling of accents. As indicated in the other thread, it seems that the handling of accents in Greek is non-trivial. Notably, we have an implementation which separates out title case from upper case, and we have the idea of language-dependent mappings. Thus it would be perfectly possible to have logic such as 'Retain accents on the first letter of a word when title casing; remove them when upper casing'.

Similarly, I wonder if there are differences in practice related to the nature of the text: modern writing vs. historical text, etc. Again, this can be added if there is a clear set of rules to follow. Detailed information is most welcome.

--
Joseph Wright
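
P.S. To make the final sigma rule described above concrete, here is a minimal Python sketch of the look-ahead test. It is not the expl3 code, only an illustration of the rule as stated. The names lower_greek, FINAL_SIGMA_FOLLOWERS and end_is_final are hypothetical. The 'control sequence' case has no equivalent in a plain string and is left out, and treating the end of the input as final is an assumption of the sketch.

```python
# Characters after which a capital sigma is lowercased to the final form,
# taken from the list in the post: ) ] } . : ; , ! ? ' "
FINAL_SIGMA_FOLLOWERS = set(")]}.:;,!?'\"")

CAPITAL_SIGMA = "\u03a3"  # Σ
SMALL_SIGMA = "\u03c3"    # σ
FINAL_SIGMA = "\u03c2"    # ς


def lower_greek(text: str, end_is_final: bool = True) -> str:
    """Lowercase text, choosing final sigma purely by the next character.

    end_is_final is a hypothetical switch: the post does not say what
    happens when a sigma is the last character of the input.
    """
    out = []
    for i, ch in enumerate(text):
        if ch == CAPITAL_SIGMA:
            if i + 1 < len(text):
                nxt = text[i + 1]
                # Final only before a space or one of the listed characters.
                is_final = nxt == " " or nxt in FINAL_SIGMA_FOLLOWERS
            else:
                is_final = end_is_final
            out.append(FINAL_SIGMA if is_final else SMALL_SIGMA)
        else:
            # Per-character lowercasing, so no built-in context rule applies.
            out.append(ch.lower())
    return "".join(out)
```

With this sketch a sigma directly followed by a digit (the identifier edge case mentioned above) is not treated as final, because digits are not in the list. Note also that Python's own str.lower() applies the Unicode final sigma condition to whole strings, which is why the sketch lowercases one character at a time.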