Re: Standaridized variation sequences for the Desert alphabet?

Martin J. Dürst Thu, 06 Apr 2017 00:07:57 -0700

Hello Michael,

[I started to write this mail quite some time ago. I decided to try to let things cool down a bit by waiting a day or two, but it has become more than a week now.]


On 2017/03/29 22:08, Michael Everson wrote:

Martin,

It’s as though you’d not participated in this work for many years, really.

Well, looking back, my time commitment to Unicode has definitely varied over the years. But that might be true for everybody.

What's more important is that Unicode covers such a wide range of areas, and not everybody has the same experience or knowledge. If we did, we wouldn't need to work together; it would be okay to just have one of us. Indeed, what's really very valuable and interesting in this work is the many very varied backgrounds and experiences everybody has.

In addition to variations in background, we also have a wide variety of ways of thinking, e.g. ranging from abstract to concrete, and so on.

On 29 Mar 2017, at 11:12, Martin J. Dürst <due...@it.aoyama.ac.jp> wrote:

- That suggests that IF this script is in current use,


You don’t even know? You’re kidding, right?

Everything is relative. And without being part of the user community, it's difficult to make any guesses.

- As far as we have heard (in the course of the discussion, after questioning 
claims made without such information), it seems that:


Yeah, it doesn’t “seem” anything but a whole lot of special pleading to bolster 
your rigid view that the glyphs in question can be interchangeable because of 
the sounds they may represent.

I don't remember ever claiming that the glyphs must be used interchangeably, only that we should carefully examine whether they are or not, and that because they represent the same sound (in a phonetic alphabet, as it is) and are shown in the same position in alphabet tables, we shouldn't a priori exclude such a possibility.

 - There may not be enough information to understand how the creators and early 
users of the script saw this issue,


Um, yeah. As if there were for Phoenician, or Luwian hieroglyphs, right?

Well, there's well over an order of magnitude difference in the time scales involved. The language that Deseret is used to write is still in active use, including in this very discussion. Quite different from Phoenician or Luwian hieroglyphs.

In addition, we have meta-information such as alphabet tables, which we may not have for the scripts you mention, as well as the fact that printing technology may have forced a better identification of what's a character and what not than inscriptions and other older technologies.

 - Similarly, there seem to be not enough modern practitioners of the script 
using the ligatures that could shed any light on the question asked in the 
previous item in a historical context,


Completely irrelevant. Nobody worried about the number of modern users of the 
Insular letters we encoded. Why put such a constraint on users of Deseret? Ꝺꝺ 
Ꝼꝼ Ᵹᵹ Ꝿ Ꞃꞃ Ꞅꞅ Ꞇꞇ.

Because it's modern users, and future users, not users some hundred years or so ago, that will use the encoding. In the case of Insular letters, my guess is that nobody wants to translate/transcribe xkcd, for example, whereas there is such a transcription for Deseret:

http://www.deseretalphabet.info/XKCD/

first apparently because there are not that many modern practitioners at all, 
and second because modern practitioners seem to prefer spelling with individual 
letters rather than using the ligatures.


This is equally ridiculous. John Jenkins chooses not to write the digraphs in the 
works which he transcribed, because that’s what *he* chooses. He doesn’t speak 
for anyone else who may choose to write in Deseret, and your assumption that 
“modern practitioners” do this is groundless.


You wrote:

Most readers and writers of Deseret today use the shapes that are in their fonts, which are those in the Unicode charts, and most texts published today don’t use the EW and OI ligatures at all, because that’s John Jenkins’ editorial practice.

So I was wrong to write "modern practitioners", and should have written "modern publishers" or "modern published texts". Or is the impression that I get from what you wrote above wrong, namely that most texts published these days are edited by John, or by people following his practice?

It also ignores the fact that the script had a reform and that the value of 
separate encodings for the various characters is of value to those studying the 
provenance and orthographic practices of those who wrote Deseret when it was in 
active use.

I don't remember denying the value of separate encodings for historic research. I only wanted to make sure that present-day use isn't inconvenienced to make historic research easier. If the claims are correct that present-day usage is mostly a reconstruction based on the Unicode encoding and the Unicode sample glyphs, then I'm fine with helping historic research.

This is exactly the same thing as the medievalist Latin abbreviation and other 
characters we encoded. There is neither sense nor logic nor utility in trying 
to argue for why editors of Deseret documents shouldn’t have the same kinds of 
tools that medievalists have. And as far as medievalist concerns go, many of 
the characters are used by relatively few researchers. Some of the characters 
we encoded are used all over Europe at many times. Some are used only by 
Nordicists, some by Celticists, and some by subsets within the Nordicist and 
Celticist communities.

Maybe, maybe not. If e.g. somebody came and said that they wanted to disunify the ſs and ſz ligatures for (German) ß in order to better analyze some old manuscripts, and the modern users from here on had to make sure they used the right one depending on the font they used, then I'm sure a lot of Germans would complain quite clearly, because it would make their current use more complicated.

- IF the above is true, then it may be that these ligatures are mostly used for 
historic purposes only, in which case it wouldn't do any harm to present-day 
users if they were separated.


Harm? What harm? Recently the UTC looked at a proposal for capital letters for 
ʂ and ʐ. Evidence for their existence was shown. One person on the call to the 
UTC said he didn’t think anyone needed them. Two of us do need them. I needed 
them last weekend and I had to use awkward workarounds. They weren’t accepted. 
There wasn’t any good rationale for the rejection. I mean, the letters exist. 
Case is a normal function of the script. But they weren’t accepted. For the guy 
who didn’t think he needed them, well, so what? If they’re encoded, he doesn’t 
have to use them.

I have no idea what the reasons for this were, because I wasn't involved in the discussion.

If the above is roughly correct, then it's important that we reached that 
conclusion after explicitly considering the potential of a split to create 
inconvenience and confusion for modern practitioners,


People who use Deseret use it for historical purposes and for cultural 
reasons. Everybody in Utah reads English in standard Latin orthography.

I haven't been in Utah except for a one-time flight change in Salt Lake City more than 10 years ago. So please don't assume that everybody on this list knows the state of usage for all the scripts that get discussed.

not after just looking at the shapes only, coming up with separate historical 
derivations for each of them, and deciding to split because history is way more 
important than modern practice.


I didn’t “come up” with separate historical derivations for the four characters 
in question.

I didn't mean "come up" in the sense of "make up out of thin air", but in the sense of "discover". If it wasn't you but somebody else who discovered these derivations, please let us know.

On 2017/03/28 22:56, Michael Everson wrote:

On 28 Mar 2017, at 11:39, Martin J. Dürst <due...@it.aoyama.ac.jp> wrote:

An æ ligature is a ligature of a and of e. It is not some sort of pretzel.


Yes. But it's important that we know that because we have been faced with many cases where 
"æ" and "ae" were used interchangeably.


Irrelevant. This is just spelling. It’s no different than colour/color or 
maximize/maximise or aluminium/aluminum.

Whether we use "æ" or "ae" is indeed a matter of spelling. But I meant something else, namely that we know that what may look like a "pretzel" to the uninitiated is a ligature of 'a' and 'e' exactly because we use it as a spelling variant for "ae".

What Deseret has is this:

10426 DESERET CAPITAL LETTER LONG OO WITH STROKE
        * officially named “ew” in the code chart
        * used for ew in earlier texts
10427 DESERET CAPITAL LETTER SHORT AH WITH STROKE
        * officially named “oi” in the code chart
        * used for oi in earlier texts
1xxxx DESERET CAPITAL LETTER LONG AH WITH STROKE
        * used for oi in later texts
1xxxx DESERET CAPITAL LETTER SHORT OO WITH STROKE
        * used for ew in later texts


Currently, it has this:

10426 𐐦 DESERET CAPITAL LETTER OI

10427 𐐧 DESERET CAPITAL LETTER EW


You are being deliberately obtuse. Note that I stated clearly “officially named 
‘ew/oi’ in the code chart”.

Well, if you think I'm deliberately obtuse, then I'd have to say that I think you're (deliberately?) obscure. You repeat hypothetical, non-existing names such as "DESERET CAPITAL LETTER LONG OO WITH STROKE" over and over, using capitals to make them look like the actual names, and bury the actual names (such as "DESERET CAPITAL LETTER OI") by shortening and lowercasing them.
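For anybody who wants to check the actual names independently, here is a minimal sketch in Python. It assumes that the standard unicodedata module reflects the published character names; the helper name actual_names is made up for illustration only and is not part of any library.

```python
import unicodedata

# The two code points listed in the "Currently, it has this" excerpt above.
CODE_POINTS = [0x10426, 0x10427]

def actual_names(code_points):
    """Return (label, formal character name) pairs.

    Hypothetical helper for illustration: it simply looks each code point
    up in the character database bundled with Python.
    """
    return [(f"U+{cp:04X}", unicodedata.name(chr(cp))) for cp in code_points]

for label, name in actual_names(CODE_POINTS):
    print(label, name)
```

This only reports the formal names; it says nothing about which glyph shape a given font shows for these code points, which is the actual point of contention.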

Don’t go trying to tell me that EW and SHORT OO WITH STROKE are glyph variants 
of the same character.

Don’t go trying to tell me that LONG AH WITH STROKE and OI are glyph variants 
of the same character.

They’re not. The origin of all those letterforms is obvious,

You don't have to repeat that. I clearly said, maybe even more than once, that I can agree with your hypothesis on the origin of these letter forms.

and we do not encode sounds, we encode the elements of writing systems.

Yes. And we know that individual elements of a writing system sometimes can have multiple origins.

But we have seen cases where such a merge happens. ß is one of them.


That’s even arguable because ſʒ only really occurs in the whole-font Fraktur 
style. It’s pretty rare to see it in Antiqua. Of course it must be attested 
there, but it’s by no means common.

Do you mean that the merge didn't happen style-wise? That we therefore don't need separate code points because historians don't need to distinguish between the two; they can just rely on the font used?

But even if that weren't the case, we would still want to treat it as one and the same character, with a single code point. It would still be hopelessly impractical for Germans to use two different characters, when they can only decide which character to type once they have seen the actual character in the font they type, and have to potentially change the character if they change the font.

And while we currently have no evidence that Deseret had developed a typographic tradition where some type styles would use one set of ligatures, and other styles would use another set, it wouldn't be possible to reject this possibility without actually trying to find evidence one way or another.

There are quite a few in Han (not surprising because there are tons of 
ideographs there to begin with).

But that experience doesn't mean that we have to rush to a conclusion without 
examining as much of the evidence as we can get hold of.


I haven’t rushed to a conclusion. I’ve made a thorough analysis.


You made a thorough analysis of the graphic shapes.

You may have made some analysis with respect to usage, but you didn't present it initially, and it took quite some time to get to it in this discussion.

You’re smarter than that. So are Asmus and Mark and Erkki and any of the other 
sceptics who have chimed in here.


Skepticism when presented with options without background facts is a virtue 
in my opinion.


Your argument seemed to be based solely on the use of the letters for the 
sounds, ignoring the historical derivation and the facts of the spelling reform 
in Deseret.

The spelling reform is fine. What is important is what happened after the spelling reform. Were the 1855 variants replaced by the 1859 variants? Was it two different traditions, separated in some way or other? Or was it in effect more like a mixture of both?

(or maybe we don't know, or it's a little of everything?)

Examining these questions and bringing the available data to light and clarifying the limits of our data and our understanding is very important. Only in this way can we make decisions that will hopefully be valid for the rest of the existence of Unicode (which might be quite a few decades at least), or decisions that at a minimum might be evaluated as "well, they didn't know better then", rather than as "they definitely should have known better, even then".

On 28 Mar 2017, at 11:59, Mark Davis ☕️ <m...@macchiato.com> wrote:

I agree with Martin.

Simply because someone used a particular shape at some time to mean a letter 
doesn't mean that Unicode should encode a letter for that shape.


Coming to a forum like this out of a concern for the corpus of Deseret 
literature is not some sort of attempt to encode things for encoding’s sake.


And coming to a discussion like this out of a concern for modern practitioners 
of the script (even if it seems, after a lot of discussion, that there aren't 
that many of these, and the issue at hand may indeed not concern them that 
much) is not some sort of attempt to unify things for unification's sake.


I think you made a lot of assumptions about “modern practitioners” which you 
didn’t disclose.

Maybe. But likewise, you made a lot of assumptions about (the absence of) modern practitioners which you didn't disclose.

A proposal will be forthcoming. I want to thank several people who have written 
to me privately supporting my position with regard to this topic on this list. 
I can only say that supporting me in public is more useful than supporting me 
in private.

I'm looking forward to your proposal. I hope it clearly indicates why (you think) there's no danger of inconveniencing modern practitioners.


I'd also like to thank the people who supported me, all of them on the list.


Regards,   Martin.
