On Mon, Feb 8, 2016 at 12:30 PM, Doug Ewell <d...@ewellic.org> wrote:
> James Tauber wrote:
>
> > I'm wondering what potential objections / problems I should be aware
> > of before trying to put together a proposal for these extra
> > precomposed characters to be included.
>
> It sounds from the blog post that the basic rationale for adding
> precomposed characters is that existing fonts, input methods, and other
> tools don't always work correctly with the combining sequences.
>
> I suppose one potential challenge you might face is to explain why the
> following FAQ items, though phrased in terms of Latin base letters,
> don't apply equally to Greek:
>
> http://www.unicode.org/faq/char_combmark.html#11
> http://www.unicode.org/faq/char_combmark.html#12b
>

Yes, I read those FAQs and hesitated before even posting because of them.

The Greek Extended block already somewhat contradicts them by having the
precomposed characters it does, but I presume that was largely for legacy
reasons and existing font encodings.

There's no doubt that fonts and input methods can be improved right now,
regardless of any change to Unicode. That said, I still have questions
about the relative ordering of combining characters and about the
interaction of combining characters with precomposed characters. At the
very least I'd like to put together some best practices for those dealing
with polytonic Greek, even before I go to font foundries and keyboard
software developers.

Even with all this, though, my own work includes accentuation and
syllabification algorithms, all of which are made more cumbersome by the
lack of precomposed characters indicating vowel length. I'm currently
leaning towards adding a layer of "character" processing on top of
Python 3's otherwise decent support that effectively treats the relevant
character sequences as single characters even if they aren't (and can't
be precomposed).

I'd be interested to hear if others have tackled similar issues outside
of Greek.

James
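
For illustration only, here is a minimal sketch of what such a "character"
layer might look like. It is not the code from the message above. It
assumes that decomposing to NFD and attaching every combining mark to the
preceding base character is good enough for polytonic Greek; the names
clusters, has_mark and MACRON are hypothetical. Note that NFD only
reorders marks with different combining classes, so a macron (U+0304) and
an acute (U+0301), which share class 230, stay in the order they were
typed.

```python
import unicodedata

MACRON = "\u0304"  # COMBINING MACRON, used here to mark a long vowel


def clusters(text):
    """Split text into units: one base character plus its combining marks.

    The text is first decomposed to NFD so that precomposed characters and
    combining sequences are handled the same way.
    """
    units = []
    for ch in unicodedata.normalize("NFD", text):
        # combining() returns the canonical combining class; non-zero
        # means the character attaches to the preceding base character.
        if unicodedata.combining(ch) and units:
            units[-1] += ch
        else:
            units.append(ch)
    return units


def has_mark(unit, mark):
    """Return True if a decomposed unit carries the given combining mark."""
    return mark in unit[1:]


# U+1FB1 (alpha with macron) + combining acute + nu:
# three code points, but two units for accentuation purposes.
word = "\u1fb1\u0301\u03bd"
units = clusters(word)
print(len(word), len(units))
print(has_mark(units[0], MACRON))
```

Accentuation or syllabification code could then iterate over units
instead of code points, and recompose with
unicodedata.normalize("NFC", ...) only on output.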