Re: Standardized variation sequences for the Deseret alphabet?

Michael Everson Thu, 06 Apr 2017 07:16:50 -0700

On 6 Apr 2017, at 08:01, Martin J. Dürst <[email protected]> wrote:


> Hello Michael,

Hi Martin.

>> It’s as though you’d not participated in this work for many years, really.
> 
> Well, looking back, my time commitment to Unicode has definitely varied over 
> the years. But that might be true for everybody.

I just get frustrated when everyone including the veterans seems to forget 
every bit of precedent that we have for the useful encoding of characters. 

> What's more important is that Unicode covers such a wide range of areas, and 
> not everybody has the same experience or knowledge. If we did, we wouldn't 
> need to work together; it would be okay to just have one of us. Indeed, 
> what's really very valuable and interesting in this work is the many very 
> varied backgrounds and experiences everybody has.

I do not disagree, particularly. 

>>> - That suggests that IF this script is in current use,
>> 
>> You don’t even know? You’re kidding, right?
> 
> Everything is relative. And without being part of the user community, it's 
> difficult to make any guesses.

Hm, but you did make a guess. 

>> Yeah, it doesn’t “seem” anything but a whole lot of special pleading to 
>> bolster your rigid view that the glyphs in question can be interchangeable 
>> because of the sounds they may represent.
> 
> I don't remember ever claiming that the glyphs must be used interchangeably, 
> only that we should carefully examine whether they are or not, and that 
> because they represent the same sound (in a phonetic alphabet, as it is)

We don’t encode sounds, we encode writing systems, the marks on paper, and in 
Latinate scripts (I’ll ignore CJK) we have never unified characters which are 
formed of historical ligatures like these… I guess ſs and ſʒ might possibly be 
the exception, but I think nobody would find a use for distinguishing them. 

> and are shown in the same position in alphabet tables, we shouldn't a priori 
> exclude such a possibility.

As it happens, at least one writer used the 𐐅-with-stroke (encoded for /juː/) 
for /ɔɪ/, but I wouldn’t substitute the 𐐉-with-stroke (𐐦) for it in a 
diplomatic transcription. Normalized spelling is something else, but the 
orthography of Deseret manuscripts themselves is what it is. Subtle things like 
the dialect of writers can be gleaned from them, and letterforms may help to 
date a text. 

>>> - There may not be enough information to understand how the creators and 
>>> early users of the script saw this issue,
>> 
>> Um, yeah. As if there were for Phoenician, or Luwian hieroglyphs, right?
> 
> Well, there's well over an order of magnitude difference in the time scales 
> involved. The language that Deseret is used to write is still in active use, 
> including in this very discussion. Quite different from Phoenician or Luwian 
> hieroglyphs.

The language is still in use, but we have no access to the minds of the dead 
users of Deseret unless they write about their orthographic practices 
explicitly. Accurate transcription can tell us if the speaker was from Boston 
or Britain, if for instance they regularly drop -r- in words like “start”. 

> In addition, we have meta-information such as alphabet tables, which we may 
> not have for the scripts you mention, as well as the fact that printing 
> technology may have forced a better identification of what's a character and 
> what not than inscriptions and other older technologies.

Well, we know there was a script reform in Deseret with regard to these and 
some other characters. 

>> Nobody worried about the number of modern users of the Insular letters we 
>> encoded. Why put such constraints on users of Deseret? Ꝺꝺ Ꝼꝼ Ᵹᵹ Ꝿ Ꞃꞃ Ꞅꞅ Ꞇꞇ.
> 
> Because it's modern users, and future users, not users some hundred years or 
> so ago, that will use the encoding. In the case of Insular letters, my guess 
> is that nobody wants to translate/transcribe xkcd, for example, whereas there 
> is such a transcription for Deseret:
> http://www.deseretalphabet.info/XKCD/

Modern users use the insular letterforms for accurate representation of some 
texts. John does the XKCD transcriptions, I believe, and he doesn’t use the 
diphthong letters anyway, and that’s his orthographic practice. 

>> Most readers and writers of Deseret today use the shapes that are in their 
>> fonts, which are those in the Unicode charts, and most texts published today 
>> don’t use the EW and OI ligatures at all, because that’s John Jenkins’ 
>> editorial practice.
> 
> So I was wrong to write "modern practitioners", and should have written 
> "modern publishers" or "modern published texts". Or is the impression that I 
> get from what you wrote above wrong that most texts published these days are 
> edited by John, or by people following his practice?

John is active in the area of making and publishing modern editions in Deseret. 
Ken has worked in the area of manuscripts and their representation. 

> I don't remember denying the value of separate encodings for historic 
> research. I only wanted to make sure that present-day use isn't 
> inconvenienced to make historic research easier.

Adding new characters won’t affect people who don’t want to use those 
characters in particular, though. 

> If the claims are correct that present-day usage is mostly a reconstruction 
> based on the Unicode encoding and the Unicode sample glyphs, then I'm fine 
> with helping historic research.

OK, good. Those modern users who want to use 𐐦 and 𐐧 will still be able to do 
so. Those who want to use the 𐐃-with-stroke and 𐐋-with-stroke characters will 
be able to do so if they are encoded. And there are some other letters not yet 
encoded. 

>> This is exactly the same thing as the medievalist Latin abbreviation and 
>> other characters we encoded. There is neither sense nor logic nor utility in 
>> trying to argue for why editors of Deseret documents shouldn’t have the same 
>> kinds of tools that medievalists have. And as far as medievalist concerns 
>> go, many of the characters are used by relatively few researchers. Some of 
>> the characters we encoded are used all over Europe at many times. Some are 
>> used only by Nordicists, some by Celticists, and some by subsets within the 
>> Nordicist and Celticist communities.
> 
> Maybe, maybe not. If e.g. somebody came and said that they wanted to disunify 
> the ſs and ſz ligatures for (German) ß in order to better analyze some old 
> manuscripts, and the modern users from here on had to make sure they used the 
> right one depending on the font they used, then I'm sure a lot of Germans 
> would complain quite clearly, because it would make their current use more 
> complicated.

That’s not true, though. We have both s and ſ encoded, and we have both r and ꝛ 
encoded, and the long s and r rotunda do not bother any modern user of the 
Latin script or force them to alter their orthography. 
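
As a quick check of that point, here is a minimal Python sketch using the 
standard unicodedata module. It only illustrates that the long s and the r 
rotunda are code points separate from s and r, and that canonical 
normalization (NFC) leaves them alone; the helper name describe() is made up 
for this illustration.

```python
import unicodedata

def describe(ch):
    """Return 'U+XXXX NAME' for one character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# (modern letter, historic counterpart): s / long s, r / r rotunda
pairs = [("s", "\u017F"), ("r", "\uA75B")]

for plain, historic in pairs:
    print(describe(plain), "|", describe(historic))

# NFC does not fold the historic letters into the modern ones,
# so text typed with plain s and r is untouched by their existence.
for plain, historic in pairs:
    assert unicodedata.normalize("NFC", historic) == historic
    assert unicodedata.normalize("NFC", plain) == plain
```

A modern text that never contains U+017F or U+A75B is byte-for-byte the same 
whether or not those characters are in the standard.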

>> Harm? What harm? Recently the UTC looked at a proposal for capital letters 
>> for ʂ and ʐ. Evidence for their existence was shown. One person on the call 
>> to the UTC said he didn’t think anyone needed them. Two of us do need them. 
>> I needed them last weekend and I had to use awkward workarounds. They 
>> weren’t accepted. There wasn’t any good rationale for the rejection. I mean, 
>> the letters exist. Case is a normal function of the script. But they weren’t 
>> accepted. For the guy who didn’t think he needed them, well, so what? If 
>> they’re encoded, he doesn’t have to use them.
> 
> I have no idea what the reasons for this were, because I wasn't involved in 
> the discussion.

As I recall, because one person ended up agreeing “We don’t need to encode 
characters for failed orthographies”. The entire Deseret script is a failed 
orthography of course, and that viewpoint ignores (in this case) the historical 
importance of Pinyin and its development. But from a functional point of view I 
needed capitals for those two letters (not related to early Pinyin) and had to 
use workarounds. That is not a satisfactory situation. 

>> People who use Deseret use it for historical purposes and for cultural 
>> reasons. Everybody in Utah reads English in standard Latin orthography.
> 
> I haven't been in Utah except for a one-time flight change in Salt Lake City 
> more than 10 years ago. So please don't assume that everybody on this list 
> knows the state of usage for all the scripts that get discussed.

OK, but https://en.wikipedia.org/wiki/Deseret_alphabet is a pretty good 
article. 

>> I didn’t “come up” with separate historical derivations for the four 
>> characters in question.
> 
> I didn't mean "come up" in the sense of "make up out of thin air", but in the 
> sense of "discover". If it wasn't you but somebody else who discovered these 
> derivations, please let us know.

All it took was a look at 
https://en.wikipedia.org/wiki/Deseret_alphabet#/media/File:Deseret_glyphs_ew_and_oi_transformation_from_1855_to_1859.svg
 to KNOW without question the derivation of these letters, namely 𐐅/𐐋/𐐉/𐐃 with 
the stroke of 𐐆. It’s blindingly obvious! :-) 

>>>> What Deseret has is this:
>>>> 
>>>> 10426 DESERET CAPITAL LETTER LONG OO WITH STROKE
>>>>    * officially named “ew” in the code chart
>>>>    * used for ew in earlier texts
>>>> 10427 DESERET CAPITAL LETTER SHORT AH WITH STROKE
>>>>    * officially named “oi” in the code chart
>>>>    * used for oi in earlier texts
>>>> 1xxxx DESERET CAPITAL LETTER LONG AH WITH STROKE
>>>>    * used for oi in later texts
>>>> 1xxxx DESERET CAPITAL LETTER SHORT OO WITH STROKE
>>>>    * used for ew in later texts
>>> 
>>> Currently, it has this:
>>> 
>>> 10426 𐐦 DESERET CAPITAL LETTER OI
>>> 
>>> 10427 𐐧 DESERET CAPITAL LETTER EW
>> 
>> You are being deliberately obtuse. Note that I stated clearly “officially 
>> named ‘ew/oi’ in the code chart”.
> 
> Well, if you think I'm deliberately obtuse, then I'd have to say that I think 
> you're (deliberately?) obscure.

I was making a point; sorry if you didn’t catch it. The names as given in that 
list above are the kinds of descriptions of the letters that we often give. We 
have LATIN LETTER THORN WITH STROKE. We might have named it LATIN LETTER THAT. 

> You repeat hypothetical, non-existing names

They’re descriptive of the letter, not of the diphthong.

> such as "DESERET CAPITAL LETTER LONG OO WITH STROKE" over and over, using 
> capitals to make them look like the actual names, and bury the actual names 
> (such as "DESERET CAPITAL LETTER OI") by shortening and lowercasing them.

Well, I lowercased them because lowercase is used in informative notes. Anyway, 
sorry if my rhetoric failed to hit the mark. :-) 
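
For anyone following the naming argument, the formal character names can be 
looked up with Python's standard unicodedata module. This is only a sketch: 
it shows which names the standard assigns to U+10426 and U+10427, and that a 
descriptive "WITH STROKE" label is not a formal name (as far as the Unicode 
data shipped with current Python versions goes). The helper has_formal_name() 
is hypothetical.

```python
import unicodedata

# Formal names of the two Deseret diphthong capitals already encoded
for cp in (0x10426, 0x10427):
    ch = chr(cp)
    print(f"U+{cp:05X} {ch} {unicodedata.name(ch)}")

def has_formal_name(name):
    """True if `name` is a formal character name known to this Python build."""
    try:
        unicodedata.lookup(name)
        return True
    except KeyError:
        # unicodedata.lookup raises KeyError for unknown names
        return False

print(has_formal_name("DESERET CAPITAL LETTER OI"))
print(has_formal_name("DESERET CAPITAL LETTER LONG OO WITH STROKE"))
```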

> But even if that weren't the case, we would still want to treat it as one and 
> the same character, with a single code point. It would still be hopelessly 
> impractical for Germans to use two different characters, when they only can 
> decide which character to type once they have seen the actual character in 
> the font they type, and have to potentially change the character if they 
> change the font.

But even if we did encode an ſʒ letter (similar to the T-Z ligature-letter Ꜩ ꜩ 
we did encode) it would be encoded for a special purpose, and wouldn’t be 
intended to affect standard German. Look, we can write schön and we can write 
ſchoͤn and nobody’s affected by the latter. 

> And while we currently have no evidence that Deseret had developed a 
> typographic tradition where some type styles would use one set of ligatures, 
> and other styles would use another set, it wouldn't be possible to reject 
> this possibility without actually trying to find evidence one way or another.

There was type during the heyday of Deseret use, and evidence for several sorts 
but no typographic “tradition” really. That’s happened latterly. 

>> Your argument seemed to be based solely on the use of the letters for the 
>> sounds, ignoring the historical derivation and the facts of the spelling 
>> reform in Deseret.
> 
> The spelling reform is fine. What is important is what happened after the 
> spelling reform. Were the 1855 variants replaced by the 1859 variants? Was it 
> two different traditions, separated in some way or other? Or was it in effect 
> more like a mixture of both?
> (or maybe we don't know, or it's a little of everything?)

Where they were replaced, it helps to identify the provenance of a text. There 
are also some texts where there’s a bit of a mix. In fact adding some letters 
to the standard for Deseret will improve users’ ability to represent the 
historical texts. For those relatively few people who are creating new texts 
now, they will be able to choose what letters they need. Some, like John, don’t 
use the diphthong letters at all. In fact most modern readers read John’s 
texts, so few would probably worry about the other letters. 

> Examining these questions and bringing the available data to light and 
> clarifying the limits of our data and our understanding is very important. 
> Only in this way can we make decisions that will hopefully be valid for the 
> rest of the existence of Unicode (which might be quite a few decades at 
> least), or decisions that at a minimum might be evaluated as "well, they 
> didn't know better then", rather than as "they definitely should have known 
> better, even then".

Really, my practice when approaching this is the same as it has been for 
additions to Latin or Greek or Cyrillic. I’m quite consistent. :-) 

>> A proposal will be forthcoming. I want to thank several people who have 
>> written to me privately supporting my position with regard to this topic on 
>> this list. I can only say that supporting me in public is more useful than 
>> supporting me in private.
> 
> I'm looking forward to your proposal. I hope it clearly indicates why (you 
> think) there's no danger of inconveniencing modern practitioners.

To be honest, we didn’t have to say “r rotunda will not affect modern users of 
the Latin script”, now, did we? :-)

Today I received Ken’s book on the Deseret-script English-Hopi vocabulary. This 
will help us move forward with a proposal.

Best,
Michael Everson
