Re: Standardized variation sequences for the Deseret alphabet?
Martin J. Dürst wrote,

> What is right for Deseret has to be decided by
> and for Deseret users, rather than by script
> historians.

The Universal Character Set is used by everyone, including script historians. While modern-day deployment of the script is determined by its users, the proper encoding of the script should be determined by character encoders based upon expert input from all interested parties.

Best regards,

James Kass
Re: Standardized variation sequences for the Deseret alphabet?
Hello Michael, others,

On 2017/03/23 09:03, Michael Everson wrote:

> It’s the same diphthong (a sound) written with different letters.

On 23.03.2017 at 06:54, Martin J. Dürst wrote:

> I think this may well be the *historically* correct analysis. And that may
> have some influence on how to encode this, but it shouldn't be dominant.
> What's most important is (past and) *current use*.

Same issue as with German sharp S: The blackletter »ß« derives from an ſ-z ligature (thence its German name »Eszett«), whilst the Roman type »ß« derives from an ſ-s ligature. Still, we encode both variants as identical letters. I’ve got a print from 1739 with legends in both German (blackletter) and French (Roman italics), comprising both types of ligatures in one single document.

Best wishes,

Otto
Re: Standardized variation sequences for the Deseret alphabet?
On Thu, 23 Mar 2017 11:23:27 +0100 Otto Stolz wrote:

> Same issue as with German sharp S: The blackletter »ß« derives from an
> ſ-z ligature (thence its German name »Eszett«), whilst the Roman type
> »ß« derives from an ſ-s ligature. Still, we encode both variants as
> identical letters. I’ve got a print from 1739 with legends in both
> German (blackletter) and French (Roman italics), comprising both types
> of ligatures in one single document.

There's another, lesser German analogy. If I understand correctly, in some styles the diaeresis and umlaut marks may be distinguished visually. While it is permissible to use CGJ to mark the difference, the TUS claims (TUS 9.0 p833, in Section 23.2) that CGJ does not affect rendering, except for the direct effect of blocking canonical reordering. (This does appear to be in contrast to its seemingly archaic effect in inhibiting line-breaking.)

However, combining marks are, by policy, unified more readily than letters.

Richard.
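For readers who want to see the CGJ behaviour mentioned above, here is a minimal sketch using Python's standard `unicodedata` module. It only illustrates normalization (blocked canonical reordering, and a base letter plus diaeresis left uncomposed when CGJ intervenes); it says nothing about rendering. The letter `q` and the particular marks are arbitrary choices for the illustration, not taken from the thread.

```python
import unicodedata

CGJ = "\u034F"        # COMBINING GRAPHEME JOINER, canonical combining class 0
DOT_ABOVE = "\u0307"  # canonical combining class 230
DOT_BELOW = "\u0323"  # canonical combining class 220
DIAERESIS = "\u0308"  # canonical combining class 230

# Without CGJ, canonical ordering sorts adjacent marks by combining class,
# so the dot below (220) moves in front of the dot above (230).
plain = "q" + DOT_ABOVE + DOT_BELOW
reordered = unicodedata.normalize("NFD", plain)

# With CGJ between them, the two marks are no longer adjacent non-starters,
# so their original order survives normalization.
blocked = "q" + DOT_ABOVE + CGJ + DOT_BELOW
kept = unicodedata.normalize("NFD", blocked)

# CGJ between a base letter and a mark also prevents NFC composition, which is
# how a diaeresis could be tagged as distinct from an umlaut in plain text.
umlaut = unicodedata.normalize("NFC", "u" + DIAERESIS)
tagged = unicodedata.normalize("NFC", "u" + CGJ + DIAERESIS)

for label, text in [("reordered", reordered), ("kept", kept),
                    ("umlaut", umlaut), ("tagged", tagged)]:
    print(label, [f"U+{ord(c):04X}" for c in text])
```

Whether a font or a collation tailoring then does anything with the tagged sequence is a separate question that this sketch does not address.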
Re: Standardized variation sequences for the Deseret alphabet?
2017-03-23 6:54 GMT+01:00 Martin J. Dürst:

> Hello Michael, others,
>
> On 2017/03/23 09:03, Michael Everson wrote:
>> On 22 Mar 2017, at 21:39, David Starner wrote:
>>> There's the same characters here, written in different ways.
>>
>> No, it’s not. It’s the same diphthong (a sound) written with different
>> letters.
>
> The closest to the current case that I was able to find was the German ß.
> It has roots in both an ss and an sz (to be precise, an ſs and an ſz)
> ligature (see https://en.wikipedia.org/wiki/ß). And indeed in some fonts,
> its right part looks more like an s, and in other fonts more like a z (and
> in lower case, more often like an s, but in upper case, much more like a
> (cursive) Z). Nevertheless, there is only one character (or two if you
> count upper case) encoded, because anything else would be highly confusing
> to virtually all users.

This is a good case for encoding explicit variants, including for the two German ß, to distinguish letter forms in historic (medieval?) texts where ſs and ſz were more distinguished. This does not require disunification, and fonts that can have both forms can choose the correct glyph to use for each variant, and take a default form for the unified character depending on the contextual language (if it is detected) or based on the font style itself (if it was initially designed for a specific language, notably in medieval styles).

> What is right for Deseret has to be decided by and for Deseret users,
> rather than by script historians.

In historic texts it is not clear which letter form is better than the other, and historic Deseret was basically for a single language (but there may have been regional variants preferring one form over the other). I think that the distinction is in fact more recent, with some people now wanting to distinguish the forms for new uses.

Here also a variant encoding would solve these special cases, but we should not disunify the character (and in fact there are not a lot of fonts except for fancy usages, such as trying to mimic handwritten styles for specific authors about how they draw these shapes; I've not seen however any conclusive case of distinction in typeset texts).

In fact we are in a situation similar to the case of shapes for decimal digits like 4 (open or closed), 7 (with an overstriking bar or none), 0 (with an overstriking slash or dot, or none), or 3 (with an angular or circle top part), or letters like g (with a curled leg drawn counterclockwise, or just a bottom foot from right to left: here a distinctive shape was encoded for the IPA symbol).

> Regards, Martin.
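To make the "variant encoding without disunification" idea concrete, here is a small Python sketch of how a variation sequence behaves as plain text. The slashed-zero sequence U+0030 U+FE00 is, to the best of my knowledge, listed in StandardizedVariants.txt since Unicode 9.0; the Deseret sequence is purely hypothetical and is not in the standard. The helper names `with_variant` and `strip_variation_selectors` are invented for this illustration.

```python
import unicodedata

VS1 = "\uFE00"  # VARIATION SELECTOR-1


def with_variant(base: str, selector: str = VS1) -> str:
    """Return base followed by a variation selector (a variation sequence)."""
    return base + selector


def strip_variation_selectors(text: str) -> str:
    """Drop VS1..VS16 so variant-tagged text compares equal to untagged text."""
    return "".join(c for c in text if not ("\uFE00" <= c <= "\uFE0F"))


# Believed to be a standardized sequence: DIGIT ZERO, short diagonal stroke form.
slashed_zero = with_variant("0")

# HYPOTHETICAL: DESERET CAPITAL LETTER OI (U+10426) plus VS1, standing in for
# "the other letter form". No such sequence is defined by the standard.
deseret_oi_variant = with_variant("\U00010426")

# The selector survives normalization, and the character identity stays unified:
# removing the selector gives back exactly the base character.
print(unicodedata.name(VS1))
print(unicodedata.normalize("NFC", slashed_zero) == slashed_zero)
print(strip_variation_selectors(deseret_oi_variant) == "\U00010426")
```

A font that does not know the sequence simply shows its default glyph for the base character, which is the fallback behaviour the argument above relies on.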
Re: Standardized variation sequences for the Deseret alphabet?
> On 23 Mar 2017, at 05:54, Martin J. Dürst wrote:
>
> Hello Michael, others,
>
> [Fixed script name in subject.]
>
> On 2017/03/23 09:03, Michael Everson wrote:
>> On 22 Mar 2017, at 21:39, David Starner wrote:
>>> There's the same characters here, written in different ways.
>>
>> No, it’s not. It’s the same diphthong (a sound) written with different
>> letters.
>
> I think this may well be the *historically* correct analysis. And that may
> have some influence on how to encode this, but it shouldn't be dominant.

Well, Martin, maybe you’re comfortable with shifting goalposts, but we have used historically correct analysis to identify characters in the past, and to continue with this precedent is consistent with good practice.

> What's most important is (past and) *current use*. If the distinction is an
> orthographic one (e.g. different words being written with different shapes),
> then that's definitely a good indication for splitting.

It *is* an orthographic one. For one thing, the 1859 glyphs look NOTHING LIKE the 1855 glyphs.

> On the other hand, if fonts (before/outside Unicode) only include one variant
> at the time, if people read over the variant without much ado, if people
> would be surprised to find both corresponding variants in one and the same
> text (absent font variations), if there are examples where e.g. the variant
> is adjusted in quotes from texts that used the 'old' variant inside a text
> with the 'new' variants, and so on, then all these would be good indications
> that this is, for actual usage purposes, just a font difference, and should
> therefore best be handled as such.

Um, yeah. Why have Unicode at all? I mean people in Georgia were happy with ASCII-based font hacks. Lots of people are still using them. Sure, people put up with the unification of Coptic and Greek. Just font differences. Yeah.

> The closest to the current case that I was able to find was the German ß. It
> has roots in both an ss and an sz (to be precise, an ſs and an ſz) ligature
> (see https://en.wikipedia.org/wiki/ß). And indeed in some fonts, its right
> part looks more like an s, and in other fonts more like a z (and in lower
> case, more often like an s, but in upper case, much more like a (cursive) Z).
> Nevertheless, there is only one character (or two if you count upper case)
> encoded, because anything else would be highly confusing to virtually all
> users.

The situation of the Deseret diphthong letters isn’t anything like German ß. Yes, you can analyse it as something like ſs and ſȥ, but THOSE LOOK VERY NEARLY ALIKE. Ignoring the stroke of SHORT I which is the same for all the Deseret letters being discussed, we have EW represented by 𐐅 and 𐐋 (which look nothing alike) and OI represented by 𐐉 and 𐐃 (which look nothing alike). A unification of these as “glyph variants” is perverse and not consistent with the way we have encoded things in the past.

> What is right for Deseret has to be decided by and for Deseret users, rather
> than by script historians.

Odd. That view doesn’t seem to be applicable to CJK unification.

Michael
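As a reference aid, the following Python sketch lists the encoded Deseret letters named in this thread as the components of the 1855 and 1859 diphthong letters, next to the two diphthong letters that are already encoded. The component order follows the description elsewhere in the thread (SHORT I first for EW, last for OI); the dictionary name `COMPONENTS` and its labels are illustrative, and the mapping from glyphs to code points is my reading of the characters quoted above.

```python
import unicodedata

SHORT_I = "\U00010406"

# Component letters of each diphthong letter, as described in the thread.
COMPONENTS = {
    "EW (1855)": (SHORT_I, "\U00010405"),
    "EW (1859)": (SHORT_I, "\U0001040B"),
    "OI (1855)": ("\U00010409", SHORT_I),
    "OI (1859)": ("\U00010403", SHORT_I),
}

# The diphthong letters that are already encoded as single characters.
ENCODED = {"OI": "\U00010426", "EW": "\U00010427"}


def describe(ch: str) -> str:
    """Format one character as 'U+XXXXX NAME'."""
    return f"U+{ord(ch):05X} {unicodedata.name(ch)}"


for label, letters in COMPONENTS.items():
    print(label, "=", " + ".join(describe(c) for c in letters))

for label, ch in ENCODED.items():
    print("encoded", label, "=", describe(ch))
```

The printed names make the point of contention visible: the two EW forms differ in LONG OO versus SHORT OO, and the two OI forms in SHORT AH versus LONG AH.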
Re: Standardized variation sequences for the Desert alphabet?
On 23 Mar 2017, at 06:28, David Starner wrote:

>>> Does "Яussia" require a new Latin letter because the way R was written has
>>> a different origin than the normal R?
>>
>> But it doesn’t. It’s the Latin letter R turned backwards by a designer for a
>> logo. We wouldn’t encode that, because it’s a logo.
>
> What logo?

Oh, sorry. “Toys Я Us” which is what I saw when I saw your “Яussia”.

> I honestly don't know what logo you're talking about, but a quick Google
> search confirms it's used outside of a logo. I was thinking of
> http://www.sjgames.com/gurps/books/Russia/img/cover_lg.jpg which actually
> doesn't use the reversed R, but uses other Cyrillic characters.

Decorative display type and font play on book covers is a very different thing from the development of the Deseret alphabet we are discussing here.

>> We don’t encode diphthongs. We encode the elements of writing systems. The
>> “idea” here is represented by one ligature of 𐐆 + 𐐅 (1855 EW), one ligature
>> of 𐐆 + 𐐋 (1859 EW), one ligature of 𐐉 + 𐐆 (1855 OI), and one ligature of 𐐃 +
>> 𐐆 (1859 OI).
>
> If they're ligatures, they should be encoded as ligatures; if they're
> indivisible characters, then their glyph forms are of less interest.

We don’t encode ligatures. We encode letters which are historically derived from ligation. That’s what the existing EW and OI are, and that’s what the 1859 revised letters were.

>> Those ligatures are not glyph variants of one another. You might as well say
>> that Æ and Œ are glyph variants of one another.
>
> Æ and Œ have contrasting use; they're used in the same text in distinct ways.

That happens to be the case, but the analogy has to do with the origin of the ligatures.

> Note that n and v̆ are considered glyph variants of each other, because v̆ is
> used in Sütterlin in exactly the places that n is used in typewritten
> versions of the text.

It’s n and ǔ in Sütterlin, not n and v̆.

> æ is not œ even when they are printed in fonts that make it nearly impossible
> to tell them apart. It has nothing to do with the glyphs or how those glyphs
> were created, it's because they're used in different ways.

It was an analogy about the structural development of the ligated letters.

> The example of Sütterlin strikes me as quite relevant here; characters get
> all sorts of weird shapes in handwriting. Sometimes they end up immortalized
> in printing, and then they usually get encoded.

Usually not.

Again: The source of 1855 EW and OI uses *different* letters than the 1859 EW and OI do. This wasn’t accidental. It’s not hard to puzzle out or to see. This isn’t random or even systematic natural development of handwriting styles. It was a principled revision done on the basis of phonetic analysis. English diphthongs EW and OI were first represented by ligatures representing [ɪuː] and [ɒɪ], and then later by ligatures representing [ɪʊ] and [ɔːɪ].

Indeed I would say to John Jenkins and Ken Beesley that the richness of the history of the Deseret alphabet would be impoverished by treating the 1859 letters as identical to the 1855 letters.

Michael Everson
Re: Standardized variation sequences for the Desert alphabet?
On Thu, Mar 23, 2017 at 6:54 AM Michael Everson wrote:

> Again: The source of 1855 EW and OI uses *different* letters than the 1859
> EW and OI do. This wasn’t accidental. It’s not hard to puzzle out or to
> see. This isn’t random or even systematic natural development of
> handwriting styles. It was a principled revision done on the basis of
> phonetic analysis. English diphthongs EW and OI were first represented by
> ligatures representing [ɪuː] and [ɒɪ], and then later by ligatures
> representing [ɪʊ] and [ɔːɪ].

Sütterlin was created by Ludwig Sütterlin in 1915. There's lots of principled revision going on all the time in the world's scripts that doesn't get recorded by Unicode, and this goes double for young constructed scripts, where people are playing around with them.

> Indeed I would say to John Jenkins and Ken Beesley that the richness of
> the history of the Deseret alphabet would be impoverished by treating the
> 1859 letters as identical to the 1855 letters.

And yet the richness of the history of the Latin alphabet is not impoverished by treating https://commons.wikimedia.org/wiki/File:I_littera_in_manuscripto.jpg (a monocase Latin cursive) as identical to part of the modern Latin-script alphabet, which, besides casing, has split the i/j and u/v on the basis of phonetic analysis?