Re: Experiments with classical Greek keyboard input
On Fri, 10 Feb 2006 20:14:16 +0100, Jan Willem Stumpel <[EMAIL PROTECTED]> wrote:

Πιστιόλης Κωνσταντίνος wrote:

In that page you propose: ...A font which includes all accent combinations for Classical Greek is, for instance, FreeSerif. The efont bitmap fonts (for xterm) also have them... Which may or may not be valid, depending on which symbol your keymap produces for the acute (oxia or tonos). FreeSerif has different symbols for 'tonos' and 'oxia', and ancient Greek is probably not displayed correctly if someone types using the gr(polytonic) keymap with the el_GR.UTF-8 locale.

You are right of course. But this (I am sorry) is in the 'keyboard input' section of my page, which I have not updated yet, and I am still not quite sure what it should say. Should there, or should there not, be input methods for both 'oxia' and 'tonos', given that they are 'officially' the same? I mean, what should be the advice to the classicists? My request for comment was, so far, only on the new 'font' section of the document, section 4.5.

Ok, quite explanatory! Just one comment:

... Typographical fashions in Greece have now changed, so this solution is right for modern Greek also...

It's not really a change of typographic fashion; modern Greek may still use any glyph for 'tonos'. You may see a dot, an acute, a line, a triangle, even a comma if it is on a capital letter (like capital A-acute Ά; the accent usually goes to the left of capital letters).

Let me explain more. There is only one accent mark in modern Greek, and it doesn't really matter how it is drawn. It is just that the Greek government acknowledged that the 'tonos', which replaced the former three accents (oxia, varia, perispomeni), is actually nothing more than the 'oxia'. In other words, formally speaking, oxia replaced both varia and perispomeni.

Why is it valid for the monotonic tonos (oxia) to have any glyph? Because, at least as far back as my parents remember (1940), no one cared about the difference between varia (`) and oxia (΄).
Books printed them correctly, but no one bothered in handwriting, whether in the formal 'katharevousa' or in 'dimotiki' Greek. People made a distinction only between perispomeni and tonos (meaning oxia or varia), and for this tonos they usually preferred the oxia glyph or a vertical line above. Modern polytonic Greek texts usually don't use varia (grave); oxia is mostly used in its place.

Technically speaking, a 'correct' font may be:
1. monotonic (with no polytonic characters at all), in which case it doesn't matter which glyph it uses for tonos;
2. polytonic, in which case it should define the same glyph at 0x1f71 as at 0x3ac, and that glyph should be oxia. (If it is not oxia, the font is still usable for monotonic Greek, and even for polytonic if one does not use varia, but not for ancient Greek or modern polytonic Greek with varia.)

The 'correct' way to render different glyphs for every case is probably a 'smart' font implementation (unfortunately too far from today's reality).

Some Greek terminology which may be useful:

'Tonos' (τόνος) in Greek means 'accent (mark)' in general, so this word was used to indicate an accent without specifying which one. There are three 'tonos'es (οξεία, βαρεία, περισπωμένη).

'pnevma' (πνεῦμα) is the breathing mark. There are two of them:
- 'psili' (ψιλή), the smooth breathing mark (comma above), and
- 'dasia' (δασεία), the rough breathing mark (reversed comma above).
Neither exists in modern monotonic Greek.

'ypogegrameni' (ὑπογεγραμμένη) is the iota subscript (as in ῃ, ᾳ), and it also does not exist in monotonic Greek.

'monotonic' and 'polytonic' Greek stand for using only one 'tonos' or all the symbols, respectively. Modern Greek is officially monotonic, but some people (old men, the church, men of literature) still use polytonic (me too).

There were two branches in the evolution of the Greek language: the informal language of the people, called 'dimotiki' (δημοτική, which means 'public'), and the formal language of educated people, 'katharevousa' (καθαρεύουσα, which means 'pure').
Katharevousa comes in many versions, depending on how close it is to ancient Greek. Today dimotiki is the official language, and practically only the church sometimes uses 'simple' katharevousa (the most modern version). The church always uses polytonic Greek, but it doesn't distinguish between oxia and varia (it uses oxia only).

I hope this helped. Feel free to ask any question about Greek.

regards,
Konstantinos

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
Re: Experiments with classical Greek keyboard input
On Fri, 10 Feb 2006 12:06:07 +0100, Jan Willem Stumpel <[EMAIL PROTECTED]> wrote:

Πιστιόλης Κωνσταντίνος wrote:

On Mon, 06 Feb 2006 21:58:13 +0100, Jan Willem Stumpel <[EMAIL PROTECTED]> wrote:

In ancient Greek and modern "katharevousa" (a formal archaic Greek) there were three accents. [..]

Thanks very much for this explanation. I put a digest of it on my ‘user-level’ utf-8 page.

In that page you propose: ...A font which includes all accent combinations for Classical Greek is, for instance, FreeSerif. The efont bitmap fonts (for xterm) also have them... Which may or may not be valid, depending on which symbol your keymap produces for the acute (oxia or tonos). FreeSerif has different symbols for 'tonos' and 'oxia', and ancient Greek is probably not displayed correctly if someone types using the gr(polytonic) keymap with the el_GR.UTF-8 locale.

Check http://ptolemy.tlg.uci.edu/~opoudjis/unicode/unicode_gkbkgd.html#oxia to see which fonts define different symbols.

Kostas
Re: Experiments with classical Greek keyboard input
On Mon, 06 Feb 2006 21:58:13 +0100, Jan Willem Stumpel <[EMAIL PROTECTED]> wrote:

Imitating the difficult-to-learn Windows system for 'multiple diacriticals' should IMHO be offered as an option, but not as the only option. The ease with which diacriticals can be combined by means of xkb/Compose could be a 'Linux selling point' in the academic world.

BTW I am now terribly confused about the tonos/oxia issue.

-- "Tonos and oxia are considered equivalent in Unicode" - but why, then, are there different code points for them (U+1FFD, and all the letters "with oxia", vs. U+0384 and all the letters "with tonos")? Where does it actually say that they are equivalent?

In ancient Greek and modern "katharevousa" (a formal archaic Greek) there were three accents (I don't know the English names): perispomeni (~), oxia (acute) and grave (`), which were collectively called 'tonos' (accents). Yet in modern Greek practically no one actually distinguished between acute and grave, so the accents in use were oxia and perispomeni.

The next step was to deprecate all these accent marks and use only one simple accent, for words that have more than one syllable. This was called 'monotonic Greek'. That simple accent was simply called "tonos" (accent) and was actually the acute. Still, typographically there was no preference about the slope of the tonos (/ \ or |), and modern "monotonic" Greek fonts may use a | glyph or a dot above.

Such a glyph may be fine for monotonic Greek, but it is completely unsuitable for ancient or polytonic Greek, so in the meantime font designers were making different glyphs and using different character codes for each case. This is a very stupid distinction, because there is no such difference between tonos and oxia (acute), and no such symbol as a "vertical line above" or a "dot above" in Greek. The issue was finally resolved by the Greek government, which declared that tonos is actually the acute (oxia).
But this came TOO LATE, because EL.O.T. (the Hellenic Standardization Organization) had already proposed different characters to the Unicode Consortium. After that, many people who were using polytonic Greek (outside Greece) had already converted their texts from the original 8-bit encodings to Unicode using the new characters with 'oxia'.

This FAQ describes the story: http://www.unicode.org/faq/greek.html
and for more info: http://ptolemy.tlg.uci.edu/~opoudjis/unicode/unicode.html
The difference between 'oxia' and 'tonos' and the problems related to it are described in more detail here: http://ptolemy.tlg.uci.edu/~opoudjis/unicode/unicode_gkbkgd.html#oxia

-- Many (maybe most) font creators made different glyphs for oxia and tonos (although others did not, see the Gentium font), because they were "looking at unicode". But, surely, that was the correct place to look?

Well, there is no other way for modern Greek. There can be no distinction between tonos and oxia, nor may we have two different keycodes for the same character. Imagine what would happen if a Greek user used a polytonic keyboard to enter a filename. It's just a matter of fonts. Someone who wants to write monotonic Greek is free to use any font he/she likes, but for polytonic Greek he/she has to use a polytonic font (which must define the polytonic glyphs correctly).

Font designers claim the opposite: that the user should keep oxia and tonos combinations distinct. But this is incorrect according to Unicode and, as I said, is extremely dangerous when mixed with modern Greek. Then again, the actual reason is that Unicode canonical equivalence is not correctly implemented by either applications or fonts. According to Unicode, a process must not treat equivalent characters differently, nor assume that some other process does.
Even more, a text may be automatically normalized at any time (without the user or any other program knowing it) by the system or an intermediate process, with some characters decomposed or replaced by their canonical equivalents.

-- Kostas calls it "a bug of the fonts". If there is a bug, isn't it in the Unicode standard ?

As Simos said, this is a way of thinking rather than a bug. Unicode has not altered existing encodings. It has included them all and defined the relationships and the equivalences for future use. The problem is that most applications do not yet implement these rules. And since people are still treating equivalent characters as not equal, some font designers decide to do so too. When it comes to Greek there is another reason: usually a font implements the basic symbols first (with tonos) in the monotonic way, and the polytonic accents are just added later.

I hope there is a way to put the genie back into the bottle. Just making the keyboard entry for oxia "hard, forcing people not to use it" does not seem to be the right way.

The correct way is the maturity of Unicode: when all the texts are being normalized, all programs will become awar
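As an illustration of the canonical equivalence discussed in this message, here is a minimal sketch using Python's standard unicodedata module. It only inspects the two alpha code points mentioned elsewhere in the thread (0x3ac and 0x1f71); the helper name nfc is made up for this example and is not part of any library.

```python
import unicodedata

TONOS_ALPHA = "\u03ac"  # GREEK SMALL LETTER ALPHA WITH TONOS
OXIA_ALPHA = "\u1f71"   # GREEK SMALL LETTER ALPHA WITH OXIA


def nfc(text):
    """Hypothetical helper: normalize text to Unicode form NFC."""
    return unicodedata.normalize("NFC", text)


# The two code points are distinct as raw strings...
print(TONOS_ALPHA == OXIA_ALPHA)

# ...but the 'oxia' letter has a canonical decomposition to the 'tonos' one,
# so normalization folds it away.
print(unicodedata.name(OXIA_ALPHA))
print(unicodedata.decomposition(OXIA_ALPHA))
print(nfc(OXIA_ALPHA) == TONOS_ALPHA)
```

This is why a normalizing process can silently replace a typed 'oxia' letter with the 'tonos' one: after NFC only the 'tonos' code point survives, whatever the keymap produced.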
Re: Experiments with classical Greek keyboard input
You know, there really should be a way to create a keyboard layout on X11 compatible with the Windows XP / typewriter one. Is this currently possible? To do this, either many more "generic" dead keys are needed, or a way to have a single keypress produce many keysyms, for use in a compose sequence.

For reference, here's the Windows XP way to produce polytonic Greek characters: http://support.microsoft.com/default.aspx?scid=kb;el;GR750052 According to the table there, the dead keys used are [ ] - = | \ / ; ' combined with Shift, Alt, and AltGr. In total, 27 different "virtual" dead keys... Not an easy system to learn, but I think anyone who's learned it should be able to keep using it under X11. Is it possible to implement this with the current xkb plus simple Compose-file infrastructure? Or is it only possible with complex input method software?

I thought of this too, but I don't see an easy way to do this with xkb. Anyway, the idea of using combinations of dead keys instead of a dead key for every mark combination was used before on the Macintosh, and as long as the single-symbol dead keys are in the same positions as in the old keymap... perhaps it is enough for now. It is probably better to implement this legacy keyboard map with some complex input method at a later time, instead of messing up xkb now.

...

I don't know if the latter odd combination would produce conflicts in an international Compose file, but this idea was used in the past in the Greek keyboard, in the following combinations:
dead_tonos + . : above (middle) dot
dead_tonos + < : «
dead_tonos + > : »

I don't think there are any conflicts, and these combinations are very nice from a usability point of view: you don't have to memorize obscure AltGr combinations, just remember that putting an accent on a character that doesn't take one produces a "special" (less common) character that looks similar. The three combinations listed above were also used in some old MS-DOS keyboard drivers.
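The question of conflicts in a shared Compose file can be pictured with a small model. The sketch below is purely illustrative: it is a Python dictionary lookup, not real xkb or Compose-file syntax, and the names GREEK_RULES, OTHER_RULES, compose and find_conflicts are invented for this example. The choice of U+00B7 for the middle dot and the rule in OTHER_RULES are assumptions as well.

```python
# Hypothetical model of dead-key composition; this is not real xkb or
# Compose-file syntax, only a dictionary lookup for illustration.

# The three Greek combinations quoted above.
# Using U+00B7 for the "above (middle) dot" is an assumption.
GREEK_RULES = {
    ("dead_tonos", "."): "\u00b7",  # above (middle) dot
    ("dead_tonos", "<"): "\u00ab",  # left guillemet
    ("dead_tonos", ">"): "\u00bb",  # right guillemet
}

# An invented rule set for some other keymap that reuses one sequence.
OTHER_RULES = {("dead_tonos", "."): "\u02d9"}


def compose(rules, dead_key, key):
    """Return the composed character, or None when no rule matches."""
    return rules.get((dead_key, key))


def find_conflicts(rules_a, rules_b):
    """List sequences that the two rule sets map to different results."""
    return sorted(
        seq for seq in rules_a.keys() & rules_b.keys()
        if rules_a[seq] != rules_b[seq]
    )


print(compose(GREEK_RULES, "dead_tonos", "<"))
print(find_conflicts(GREEK_RULES, OTHER_RULES))
```

A sequence can map to only one result per table, so two keymaps that want different results for the same sequence cannot share a single table; keeping one table per keymap sidesteps that.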
Yes, it is a very good idea, but in an international Compose file there would be a conflict if the Greek keymap wanted to use:
dead_acute + . : above (middle) dot
and some other language's keymap uses:
dead_acute + . :
The dead_XXX definitions are accessible to all languages (and this is correct). The correct way to do this would be to have xkb define a different Compose file for every keymap.

...

Another idea is to use the same kind of rules to increase the usability of the polytonic keyboard for writing technical texts: To have a double press of a dead_key and the altGr + dead_key to produce the "lost" symbol so that the user wouldn't have to ...

I agree with this. But:
1. it could cause the same kind of conflicts as mentioned above
2. in the proposed keymap dead_horn is placed on ' so we want the rule
dead_horn + dead_horn : '\''
But if someone creates a new keymap with dead_horn placed on ], we won't be able to add a new rule. This will work for only one keymap, messing up all the (future) others (if we ever need any).

Another proposed use of altGr is for the dead acute. ELOT, the Hellenic Standardization Organization, has proposed and defined different symbols for acute and tonos (which is actually the same symbol), which are equivalent in Unicode. That was a mistake...

My opinion is that having different glyphs for OXIA and TONOS in fonts is a bug. Upright and slanted oxia don't have ... are equivalent according to Unicode, and without a justification in representing actual Greek text. ... is some justification. But the correct way to solve this according to the Unicode model is with higher-level protocols and smart fonts. For example, with modern smart fonts (OpenType etc.), it's possible to have both U+00B7 and U+0387 assume their correct shape and position depending on their surrounding characters.

I agree.

The combination altGr-dead_tonos + vowel is proposed to produce the letter with accent, in case someone needs it.

Well...
it probably won't hurt much, except in perpetuating the idea that tonos/accent and oxia/acute are different. And also systems which do their own keysym processing (i.e. GTK+) will have to add some more illogical combinations...

It could hurt, because many people will prefer to use it in order to avoid this bug of the fonts (and this will cause a lot of trouble when mixed with the monotonic Greek of a Linux system with a Hellenic locale). This is why I propose altGr-dead_acute, so that the combination will be hard, forcing people not to use it.

Unfortunately this is necessary, because a lot of polytonic Greek texts are encoded like that. If you want to search text with Google you will have to use this accent. Look at the Google search results. Searching for:
ἀνθρώπου (with tonos) yields 584 results, and
ἀνθρώπου (with the polytonic set's acute) yields 21,400 results!
(I think that this happens because most texts were converted from older 8-bit encodings.) This is a Google bug (?) too, because
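The search mismatch reported above can be reproduced in miniature. The following is a sketch only, using Python's standard unicodedata module: it assumes the two spellings differ solely in the omega (U+03CE with tonos, U+1F7D with oxia), and the function names search_key and matches are invented for this example. It says nothing about how Google actually indexes text.

```python
import unicodedata

# The same word, spelled with the monotonic 'tonos' omega (U+03CE)
# and with the polytonic 'oxia' omega (U+1F7D). That this is the only
# difference between the two spellings is an assumption.
WITH_TONOS = "\u1f00\u03bd\u03b8\u03c1\u03ce\u03c0\u03bf\u03c5"
WITH_OXIA = "\u1f00\u03bd\u03b8\u03c1\u1f7d\u03c0\u03bf\u03c5"


def search_key(text):
    """Hypothetical helper: fold canonically equivalent spellings together."""
    return unicodedata.normalize("NFC", text)


def matches(query, document):
    """Substring search that normalizes both sides before comparing."""
    return search_key(query) in search_key(document)


# A naive comparison treats the two spellings as different words.
print(WITH_TONOS in WITH_OXIA)

# Normalizing both sides first makes the query find the other spelling.
print(matches(WITH_TONOS, WITH_OXIA))
```

A search that normalizes both the query and the indexed text would count the two spellings together instead of reporting separate totals.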