Re: Is the Unicode Standard "The foundation for all modern software and communications around the world"?

2019-11-19 Thread Michael Everson via Unicode
Of course it’s not “misleading”. Human language is best conveyed by text. 

Michael Everson

> On 19 Nov 2019, at 18:59, Costello, Roger L. via Unicode 
>  wrote:
> 
> Hi Folks,
>  
> Today I received an email from the Unicode organization. The email said this: 
> (italics and yellow highlighting are mine)
>  
> The Unicode Standard is the foundation for all modern software and 
> communications around the world, including all modern operating systems, 
> browsers, laptops, and smart phones—plus the Internet and Web (URLs, HTML, 
> XML, CSS, JSON, etc.).
>  
> That is a remarkable statement! But is it entirely true? Isn’t it assuming 
> that everything is text? What about binary information such as JPEG, GIF, 
> MPEG, WAV; those are pretty core items to the Web, right? The Unicode 
> Standard is silent about them, right? Isn’t the above quote a bit misleading?
>  
> /Roger




Re: Proposal to extend the U+1F4A9 Symbol

2019-05-31 Thread Michael Everson via Unicode
No, thank you.

> On 31 May 2019, at 11:18, bristol_poo via Unicode  wrote:
> 
> Greetings,
> 
> I hope I don't intrude too much on this list with a proposal.
> 
> U+1F4A9, aka the 'pile of poo' emoji, has gained somewhat of a legendary 
> status in the modern society [1]. 
> 
> With the somewhat recent addition of skin tones in the Emoji Modifier 
> Sequences, I think there is some small room to add more depth to the emoji by 
> modulating it via the Bristol Scale [2].
> 
> This would produce 7 variants of the U+1F4A9 emoji, including existing (Which 
> I believe is about Type 4 on the scale). 
> 
> Why? I think this would really benefit the medical profession, with a large 
> uptick in e-doctor/text only conversations towards the medical profession. 
> 
> Cheers
> /BP
> 
> [1] We even have plush toys dedicated to this emoji 
> https://www.amazon.co.uk/Emoji-Shape-Pillow-Cushion-Stuffed/dp/B00VL55Q8O
> [2] https://en.wikipedia.org/wiki/Bristol_stool_scale




Re: Latin capital letter Is (Ꝭ)

2019-03-06 Thread Michael Everson via Unicode
On 6 Mar 2019, at 10:57, Fredrick Brennan via Unicode  
wrote:

>> Draw it as you wish. Most likely it will be the same shape as your 
>> lower-case one, adjusted to fit caps height. 
> 
> As I'm working on a blackletter font, it's unfortunately not this easy.

Sure it is.

> It seems like there is no blackletter style for the capital form from the 
> period… so I'll have to perhaps either (A) leave it empty, assuming users of 
> my font would never attempt to typeset a Ꝭ in blackletter but would choose 
> e.g. Junicode instead,

That’s not a good idea.

> (B) look at examples in the Roman style and make up my own glyph as I've 
> already done for Greek and Cyrillic,

That is a better idea.

> or (C) just make the glyph an "IS" ligature as I've already done for e.g. 
> LATIN CAPITAL LIGATURE IJ (U+0132).

That is a very bad idea. If a text has a Ꝭ in it, a Ꝭ should be displayed, not 
an IS. Particularly as in Middle English the correct reading might be ES, and 
in Middle Cornish the reading might be YS.
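
One way to see the difference from the IJ case is to look at the character data itself. The sketch below is an editorial illustration, not part of the original message; it assumes Python's standard `unicodedata` module, and the helper name `compat_fold` is made up for the example. U+0132 carries a compatibility decomposition to I + J, whereas U+A76C has no decomposition at all, so nothing in the standard equates it with the letters "IS".

```python
import unicodedata

IJ = "\u0132"   # LATIN CAPITAL LIGATURE IJ
IS = "\ua76c"   # LATIN CAPITAL LETTER IS

def compat_fold(text):
    """Apply NFKC, which replaces compatibility characters by their expansions."""
    return unicodedata.normalize("NFKC", text)

# The IJ ligature has a compatibility decomposition to the two letters I + J.
print(unicodedata.decomposition(IJ))        # <compat> 0049 004A
print(compat_fold(IJ))                      # IJ

# LATIN CAPITAL LETTER IS has no decomposition, so folding leaves it alone.
print(repr(unicodedata.decomposition(IS)))  # ''
print(compat_fold(IS) == IS)                # True
```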

> If anyone has any idea or example glyph from the period I'd love to see it, 
> but I doubt such exists :-)

You are the type designer. You may live in the 21st century, but you could just 
as easily have lived in the 16th. Your client says “I need a Ꝭ glyph” and it’s 
up to you to design one. The easiest thing for your purposes (since you may not 
find a capital Ꝭ easily) is to take the ꝭ glyph and modify it to fit between 
caps height and baseline.

Cheers,
Michael Everson


Re: Latin capital letter Is (Ꝭ)

2019-03-03 Thread Michael Everson via Unicode
Fredrick,

> I sent this query to Michael Everson directly on Feb. 19 but did not hear 
> anything back. I assume that he was too busy to respond, perhaps I even broke 
> some unwritten rule of etiquette, for which I apologize; so I am hoping that 
> someone on the mailing list knows the answer instead.

I’m fairly sure you could have asked your question without this irritating 
paragraph. 

> I am trying to find examples of the glyph encoded as U+A76C (Ꝭ), the 
> so-called Latin Capital Letter Is. I have found ample proof and examples of 
> its younger brother, the so-called Latin Small Letter Is encoded as U+A76D 
> (ꝭ).

There is no reason to use “so-called”. One is called LATIN CAPITAL LETTER IS, 
and one is called LATIN SMALL LETTER IS.

> I checked Everson's proposal to encode these letters, and unfortunately found 
> there no proof of the existence of the capital variant.

The encoding allows you to write “sperꝭ” and it allows you to write “SPERꝬ”. To 
write “SPERꝭ” with the lower-case one would not be good. It is the same with 
“angꝯ” and “ANGꝮ”. The same with “romanoꝝ” and “ROMANOꝜ”. 
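
As an editorial illustration of what the casing pairs buy in practice, here is a minimal sketch assuming Python's built-in `str.upper()` and the standard `unicodedata` module. The constant names and the helper `all_caps` are invented for the example. Code points are written as escapes so the snippet does not depend on font support.

```python
import unicodedata

# Lower-case abbreviation letters from the three examples, by code point.
SMALL_IS = "\ua76d"
SMALL_CON = "\ua76f"
SMALL_RUM_ROTUNDA = "\ua75d"

def all_caps(word):
    """Upper-case a word using the Unicode case mappings behind str.upper()."""
    return word.upper()

# Show each small letter and the capital it maps to (names are plain ASCII).
for small in (SMALL_IS, SMALL_CON, SMALL_RUM_ROTUNDA):
    capital = small.upper()
    print(f"U+{ord(small):04X} {unicodedata.name(small)} -> "
          f"U+{ord(capital):04X} {unicodedata.name(capital)}")
```

Because the capitals are encoded, `all_caps("sper\ua76d")` yields the all-caps spelling with U+A76C at the end rather than leaving a lower-case letter stranded in a heading.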

In my view, any writer should be able to choose the casing form of any letter. 
If you’re typesetting a journal, for instance, and want to use all caps or 
small caps in your page header, and the title of a chapter or article has such 
characters in it, it is problematic if the capital form is missing. I don't 
think it has been advantageous to the standard to balk at adding capital forms. 
A very few lower-case Latin letters probably can’t admit of a capital, but most 
of them can, and yet we are stuck with waiting for someone to find a gap when we 
could with some reasonable design principles fill the gaps ourselves.

> Is it a dreaded Unicode-ism?

There is no such thing, and casing pairs are a normal part of the Latin script. 
These abbreviation characters were proposed, accepted, and added because of 
this structure. 

> How should I handle this in my font?

Draw it as you wish. Most likely it will be the same shape as your lower-case 
one, adjusted to fit caps height.

> Best,
> Fredrick Brennan
> 




Re: Spiral symbol

2019-02-18 Thread Michael Everson via Unicode
Emoji proposals aren’t notable for their uniform quality. There are guidelines, 
but essentially the subcommittee approves things they like and don’t approve 
things they don’t like.

> On 18 Feb 2019, at 12:52, Andrés Sanhueza via Unicode  
> wrote:
> 
> I understand the difference. My question was why it was needed to have both 
> spirals as different characters instead of a single one that can be either, 
> as the proposal didn't specify an use case where there is a semantic 
> difference between each one.
> 
> 
> On Sat, 16 Feb 2019 at 11:26, Michael Everson via Unicode wrote:
> > Question: Why both a right and left facing spiral are exactly need? Isn't a 
> > single one (whose direction is just a glyph variant) enough? There was a 
> > previous thread that also suggested these very symbols, but otherwise I 
> > have found no evidence of the specific need for it.
> 
> Clockwise and anticlockwise are not the same thing.
> 
> Michael Everson




Re: Spiral symbol

2019-02-16 Thread Michael Everson via Unicode
> Question: Why both a right and left facing spiral are exactly need? Isn't a 
> single one (whose direction is just a glyph variant) enough? There was a 
> previous thread that also suggested these very symbols, but otherwise I have 
> found no evidence of the specific need for it.

Clockwise and anticlockwise are not the same thing.

Michael Everson


Re: Ancient Greek apostrophe marking elision

2019-01-28 Thread Michael Everson via Unicode
The hell I do, Julian. 

http://evertype.com/polynesian.html

> On 27 Jan 2019, at 21:00, Julian Bradfield via Unicode  
> wrote:
> 
> You have a very low opinion of Polynesian users. 




Re: Ancient Greek apostrophe marking elision

2019-01-27 Thread Michael Everson via Unicode
Yes, yes. It doesn’t matter. The discussion applies to both the two quotation 
marks and the two modifier letters.

> On 27 Jan 2019, at 15:08, Tom Gewecke via Unicode  wrote:
> 
> 
>> On Jan 26, 2019, at 11:08 PM, Richard Wordingham via Unicode 
>>  wrote:
>> 
>> It may be a matter of literacy in Hawaiian.  If the test readership
>> doesn't use ʼokina, 
> 
> I think the Unicode Hawaiian ʻokina is supposed to be U+02BB (instead of 
> U+02BC).
> 




Re: Ancient Greek apostrophe marking elision

2019-01-27 Thread Michael Everson via Unicode
have problems double-clicking “d’ Artagnan” you should probably just write 
“d’Artagnan”. 

> 
>>> Will your coding decision be machine readable for the readership?  
>> 
>> I don’t know what you mean by “readable”.
> 
> Will the difference between U+02BC and U+2019 be discernible by the readers?

They should be, in Polynesian languages. Otherwise the text isn't easily 
legible. 

> If one could copy a phrase to a general application and select a word by 
> double-clicking, then the difference would be visible.

If you know what the behaviour is then you can take it into account when you 
are copying a word. You can’t fix this by character encoding. Certainly not by 
screwing with 02BC.
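
The selection behaviour under discussion can be imitated, very roughly, with a regular expression. This is an editorial sketch, not a description of any real application: it assumes Python's `re` module, whose `\w` class counts U+02BC as a word character and U+2019 as a non-word character, and the helper name `words` is hypothetical. Actual double-click selection usually follows platform rules or UAX #29 and may behave differently.

```python
import re

RIGHT_QUOTE = "\u2019"     # RIGHT SINGLE QUOTATION MARK (punctuation)
MODIFIER_APOS = "\u02bc"   # MODIFIER LETTER APOSTROPHE (a letter)

def words(text):
    """Split text into runs of word characters: a crude stand-in for selection."""
    return re.findall(r"\w+", text)

# Without a space, the choice of character changes the token count.
print(len(words("d" + RIGHT_QUOTE + "Artagnan")))      # 2
print(len(words("d" + MODIFIER_APOS + "Artagnan")))    # 1

# With a space after the apostrophe there are two tokens either way,
# so swapping the character does not join the pieces.
print(len(words("d" + RIGHT_QUOTE + " Artagnan")))     # 2
print(len(words("d" + MODIFIER_APOS + " Artagnan")))   # 2
```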

> If the result of the publishing is simply a printed book, then your choice of 
> U+2019 or U+02BC will depend only on font differences.

That non-argument can be applied to everything. 

> Not that it makes much difference to the issue,  but isn't the correct 
> encoding for the ʻokina U+02BB MODIFIER LETTER TURNED COMMA? 

Yes, but both 02BB and 02BC are used in linguistic transcriptions and in 
Polynesian languages, and the graphic identity with 2018 and 2019 is 
problematic and unnecessary.

Using 02BC for the apostrophe is a mistake, in my view.

Michael Everson


Re: Ancient Greek apostrophe marking elision

2019-01-26 Thread Michael Everson via Unicode
Fair enough, but I didn’t wait.

> On 27 Jan 2019, at 01:59, James Kass via Unicode  wrote:
> 
> 
> Richard Wordingham responded to Michael Everson,
> 
> >> I’ll be publishing a translation of Alice into Ancient Greek in due
> >> course. I will absolutely only use U+2019 for the apostrophe. It
> >> would be wrong for lots of reasons to use U+02BC for this.
> >
> > Please list them.
> 
> Let's see the list of reasons why U+02BC should be used first.
> 




Re: Ancient Greek apostrophe marking elision

2019-01-26 Thread Michael Everson via Unicode
On 27 Jan 2019, at 01:37, Richard Wordingham via Unicode  
wrote:
> 
>> I’ll be publishing a translation of Alice into Ancient Greek in due
>> course. I will absolutely only use U+2019 for the apostrophe. It
>> would be wrong for lots of reasons to use U+02BC for this.
> 
> Please list them.

The Greek use is of an apostrophe, often a mark of elision (as here); that’s what 
2019 is for.

02BC is a letter. Usually a glottal stop. 

I didn’t follow the beginning of this. Evidently it has something to do with 
word selection of d’ + a space + what follows. If that’s so, then there’s no 
argument at all for 02BC. It’s a question of the space, and that’s got nothing 
to do with the identity of the apostrophe.
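
For readers who want to check the letter-versus-punctuation distinction against the character database, the following editorial sketch assumes Python's standard `unicodedata` module. The list of candidates (including U+0027, U+2018 and U+02BB, which come up elsewhere in this thread) and the helper names `describe` and `is_letter` are chosen for illustration only.

```python
import unicodedata

# Characters that come up in the apostrophe discussion, as escapes.
CANDIDATES = ["\u0027", "\u2018", "\u2019", "\u02bb", "\u02bc"]

def describe(ch):
    """Return (code point, general category, character name) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

def is_letter(ch):
    """True when the general category starts with L (Lu, Ll, Lt, Lm, Lo)."""
    return unicodedata.category(ch).startswith("L")

for ch in CANDIDATES:
    print(*describe(ch))
```

The two modifier letters report category `Lm`, while the quotation marks and the ASCII apostrophe report punctuation categories.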

> Will your coding decision be machine readable for the readership?

I don’t know what you mean by “readable”.

>> Moreover, implementations of U+02BC need to be revised. In the
>> context of Polynesian languages, it is impossible to use U+02BC if it
>> is _identical_ to U+2019. Readers cannot work out what is what. I
>> will prepare documentation on this in due course.
> 
> It looks as though you've found a new character - or a revived
> distinction.

It may not be “revived”. In origin, linguists took the lead-type 2019 and used 
it as a consonant letter. Now, in the 21st century, where Harry Potter is 
translated into Hawaiian, and where Harry Potter has glottals alongside both 
single and double quotation marks, the 02BC’s need to be bigger or the text 
can’t be read easily. In our work we found that a vertical height of 140% 
bigger than the quotation mark improved legibility hugely. Fine typography asks 
for some other alterations to the glyph, but those are cosmetic.

If the recommended glyph for 02BC were to be changed, it would in no case 
impact adversely on scientific linguistics texts. It would just make the mark a 
bit bigger. But for practical use in Polynesian languages where the character 
has to be found alongside the quotation marks, a glyph distinction must be made 
between this and punctuation.

Michael Everson





Re: Ancient Greek apostrophe marking elision

2019-01-26 Thread Michael Everson via Unicode
Polynesians are using 0027 as a fallback, and this has to do with education, 
keyboarding, and training.

The typography of the fallback is of no consequence. It’s a fallback.

> On 27 Jan 2019, at 01:43, Richard Wordingham via Unicode 
>  wrote:
> 
> On Sat, 26 Jan 2019 17:11:49 -0800
> Asmus Freytag via Unicode  wrote:
> 
>> To make matters worse, users for languages that "should" use U+02BC
>> aren't actually consistent; much data uses U+2019 or U+0027. Ordinary
>> users can't tell the difference (and spell checkers seem not
>> successful in enforcing the practice).
> 
> That appears to contradict Michael Everson's remark about a Polynesian
> need to distinguish the two visually.
> 
> Richard.




Re: Ancient Greek apostrophe marking elision

2019-01-26 Thread Michael Everson via Unicode
I’ll be publishing a translation of Alice into Ancient Greek in due course. I 
will absolutely only use U+2019 for the apostrophe. It would be wrong for lots 
of reasons to use U+02BC for this.

Moreover, implementations of U+02BC need to be revised. In the context of 
Polynesian languages, it is impossible to use U+02BC if it is _identical_ to 
U+2019. Readers cannot work out what is what. I will prepare documentation on 
this in due course.

> On 26 Jan 2019, at 23:52, James Tauber via Unicode  
> wrote:
> 
> Well, my desire is simply to know whether to tell people doing digital 
> editions of Ancient Greek texts whether to use U+2019 or U+02BC for the 
> apostrophe marking elision (or at least accurately describe the trade-offs of 
> each).




Re: The encoding of the Welsh flag

2018-11-21 Thread Michael Everson via Unicode
What really annoys me about this is that there is no flag for Northern Ireland. 
The folks at CLDR did not think to ask either the UK or the Irish 
representatives to SC2 about this. Yes, there is no “official flag” for 
Northern Ireland. But there is one _universally_ used in sport, and that should 
have been made into an emoji at the same time when flags for Scotland, Wales, 
and England were made. And it still should. 
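
For context, the England, Scotland and Wales flags are encoded as emoji tag sequences: U+1F3F4 WAVING BLACK FLAG, then tag characters spelling a region-plus-subdivision code, then U+E007F CANCEL TAG. The sketch below is an editorial illustration in Python; the function name `subdivision_flag` is made up, and the code `gbnir` for Northern Ireland is shown only as a well-formed possibility. As far as I know it is not among the recommended sequences, so fonts should not be expected to draw it.

```python
WAVING_BLACK_FLAG = "\U0001F3F4"
CANCEL_TAG = "\U000E007F"
TAG_OFFSET = 0xE0000   # tag characters mirror ASCII at this offset

def subdivision_flag(code):
    """Build an emoji tag sequence from a lower-case region+subdivision code."""
    if not (code.isascii() and code.isalnum() and code == code.lower()):
        raise ValueError("expected lower-case ASCII letters or digits")
    tags = "".join(chr(TAG_OFFSET + ord(c)) for c in code)
    return WAVING_BLACK_FLAG + tags + CANCEL_TAG

# Print the code points only, to avoid depending on emoji font support.
for code in ("gbeng", "gbsct", "gbwls", "gbnir"):
    print(code, " ".join(f"U+{ord(c):04X}" for c in subdivision_flag(code)))
```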

Michael Everson


Re: A sign/abbreviation for "magister"

2018-10-28 Thread Michael Everson via Unicode
I think that it is the _superscription_ that indicates the fact that it is an 
abbreviation. 

In English “þe” was written “ye” and “yͤ” and “yᵉ”, and the last of these might 
have a dot or a line or a squiggle underneath it, or not, and in no case was 
that dot or line or squiggle either _meaningful_ or necessary.

Michael Everson

> On 28 Oct 2018, at 21:43, Piotr Karocki  wrote:
> 
>> The squiggle in your sample, Janusz, does not indicate anything; it is only 
>> a decoration, and the abbreviation is the same without it.
> 
> I disagree. This squiggle means "warning, this is abbreviation", and is
> present in many abbreviations in many centuries (sometimes, although,
> 'abbrev symbol' is rendered differently). So yes, it is important symbol and
> shouldn't be lost in transliteration.
> 
> Piotr Karocki



Re: A sign/abbreviation for "magister"

2018-10-28 Thread Michael Everson via Unicode
This is no different from the Irish name McCoy which can be written MᶜCoy where the 
raising of the c is actually just decorative, though perhaps it was once an 
abbreviation for Mac. In some styles you can see a line or a dot under the 
raised c. This is purely decorative. 

I would encode this as Mʳ if you wanted to make sure your data contained the 
abbreviation mark. It would not make sense to encode it as M=ͬ or anything else 
like that, because the “r” is not modifying a dot or a squiggle or an equals 
sign. The dot or squiggle or equals sign has no meaning at all. And I would not 
encode it as Mr͇, firstly because it would never render properly and you might 
as well encode it as Mr. or M:r, and second because in the IPA at least that 
character indicates an alveolar realization in disordered speech. (Of course it 
could be used for anything.)
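
The three candidate spellings can be compared mechanically. The following is an editorial sketch assuming Python's standard `unicodedata` module; the dictionary keys and the helper `compat_fold` are invented for the example. One practical side effect, which is my observation rather than a claim from the message, is that only the modifier-letter spelling folds to plain "Mr" under NFKC, which may matter for searching.

```python
import unicodedata

# The three encodings discussed, written as escapes.
CANDIDATES = {
    "modifier letter": "M\u02b3",            # M + MODIFIER LETTER SMALL R
    "equals sign base": "M=\u036c",          # M + = + COMBINING LATIN SMALL LETTER R
    "combining equals below": "Mr\u0347",    # M + r + COMBINING EQUALS SIGN BELOW
}

def compat_fold(text):
    """Apply NFKC compatibility normalization."""
    return unicodedata.normalize("NFKC", text)

for label, text in CANDIDATES.items():
    names = [unicodedata.name(c) for c in text]
    folds_to_mr = compat_fold(text) == "Mr"
    print(f"{label}: {names} folds to 'Mr': {folds_to_mr}")
```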

I like palaeographic renderings of text very much indeed, and in fact remain in 
conflict with members of the UTC (who still, alas, do NOT communicate directly 
about such matters, but only in duelling ballot comments) about some actually 
salient representations required for medievalist use. The squiggle in your 
sample, Janusz, does not indicate anything; it is only a decoration, and the 
abbreviation is the same without it.

Michael Everson

> On 28 Oct 2018, at 17:28, Janusz S. Bień via Unicode  
> wrote:
> 
> For me only the latter seems acceptable. Using COMBINING LATIN SMALL
> LETTER R is a natural idea, but I feel uneasy using just EQUALS SIGN as
> the base character. However in the lack of a better solution I can live
> with it :-)




Re: Unicode 11 Georgian uppercase vs. fonts

2018-07-28 Thread Michael Everson via Unicode
But this behaviour is desirable. It is desirable to be able to select a 
Georgian word and to 

The only thing that seems to annoy people is that modern Georgian doesn’t do 
titlecasing. But that is orthographic, and automatic titlecasing doesn’t work 
properly anyway. French rules and English rules differ. “The Fellowship of the 
Ring” is acceptable in English. “The Fellowship Of The Ring” is not.
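
To make the behaviour concrete, here is an editorial sketch in Python. It assumes an interpreter whose character database is Unicode 11 or later (Python 3.7 and up), and the helper names are hypothetical. The sample word is the Georgian abbreviation for "USA" that appears elsewhere in this thread, written as escapes. Upper-casing maps Mkhedruli to Mtavruli and lower-casing maps back, while title-casing leaves the Mkhedruli word unchanged.

```python
import unicodedata

# Needs Unicode 11+ data, where the Mtavruli capitals U+1C90.. were added.
assert tuple(int(p) for p in unicodedata.unidata_version.split(".")) >= (11, 0, 0)

USA_MKHEDRULI = "\u10d0\u10e8\u10e8"   # AN, SHIN, SHIN in lower case

def to_mtavruli(word):
    """Upper-case a Georgian word; Mkhedruli letters map to Mtavruli."""
    return word.upper()

def to_mkhedruli(word):
    """Lower-case a Georgian word; Mtavruli letters map back to Mkhedruli."""
    return word.lower()

upper = to_mtavruli(USA_MKHEDRULI)
print([f"U+{ord(c):04X}" for c in upper])        # ['U+1C90', 'U+1CA8', 'U+1CA8']
print(to_mkhedruli(upper) == USA_MKHEDRULI)      # True

# Title-casing does not produce a capital first letter for Mkhedruli.
print(USA_MKHEDRULI.title() == USA_MKHEDRULI)    # True
```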

Michael Everson

> On 28 Jul 2018, at 18:01, Kent Karlsson  wrote:
> 
> I know it is too late now, but... Could have added the characters,
> without adding the case mappings. Just as it was done for the LATIN
> CAPITAL LETTER SHARP S (ẞ), where the proper case mapping was relegated
> to "special purpose software" (or just a special setting in common
> software). The (proper) case-mapping for ẞ is nowhere to be found in the
> Unicode database (which I think is a pity, but that is a different matter).
> 
> I think "specialcasing.txt" is not really maintained anymore, but I'll
> disregard that here.
> 
> One could add a special-casing for each modern Georgian lowercase letter
> to (continue to) uppercase-map to itself (for the Georgian language at
> least).
> 
> /Kent K
> 
> 
> 
> On 2018-07-28 15:26, "Michael Everson via Unicode" wrote:
> 
>> Mtavruli could not be represented in the UCS before we added these 
>> characters.
>> Now it can. 
>> 
>> Michael Everson
>> 
>>> On 28 Jul 2018, at 14:10, Richard Wordingham via Unicode
>>>  wrote:
>>> 
>>> On Sat, 28 Jul 2018 01:45:53 +
>>> Peter Constable via Unicode  wrote:
>>> 
>>>> (iii) gave
>>>> indication of intent to develop a plan of action for preparing their
>>>> institutions for this change as well as communicating that within
>>>> Georgian industry and society. It was only after that did UTC feel it
>>>> was viable to proceed with encoding Mtavruli characters.
>>> 
>>> It is dangerous to rely on declarations of intent when making
>>> irreversible decisions.  The UTC should have learnt that from the
>>> Mongolian mess.
>>> 
>>> Richard.
>> 
>> 
> 
> 




Re: Unicode 11 Georgian uppercase vs. fonts

2018-07-28 Thread Michael Everson via Unicode
Mtavruli could not be represented in the UCS before we added these characters. 
Now it can. 

Michael Everson

> On 28 Jul 2018, at 14:10, Richard Wordingham via Unicode 
>  wrote:
> 
> On Sat, 28 Jul 2018 01:45:53 +
> Peter Constable via Unicode  wrote:
> 
>> (iii) gave
>> indication of intent to develop a plan of action for preparing their
>> institutions for this change as well as communicating that within
>> Georgian industry and society. It was only after that did UTC feel it
>> was viable to proceed with encoding Mtavruli characters.
> 
> It is dangerous to rely on declarations of intent when making
> irreversible decisions.  The UTC should have learnt that from the
> Mongolian mess.
> 
> Richard.




Re: Unicode 11 Georgian uppercase vs. fonts

2018-07-27 Thread Michael Everson via Unicode
On 27 Jul 2018, at 13:42, James Kass via Unicode  wrote:
> 
> Michael Everson wrote,
> 
>> No, James is mistaken. Georgian is structurally casing, and the difference 
>> is not stylistic, but orthographic.
> 
> I am not mistaken; I never said Georgian wasn't structurally casing and I 
> never said the difference is stylistic.

The use made of MTAVRULI is orthographic, based on the casing structure of the 
script. 

> If members of the Georgian user community want to consider this a stylistic 
> difference, they are free to do so.

It isn’t a stylistic difference. It is a different use of capital letters than 
Latin, Cyrillic and other scripts use them. 

> Unicode/UCS doesn't impose orthographic rules on user communities, it makes 
> no judgments of user practices, and it doesn't mandate or suggest how the 
> actual users use the encoded characters. The UCS
> provides a standard encoding scheme to preserve and exchange plain text 
> computer data.  In order to be UNIVERSAL, the UCS provides encoding not only 
> for day-to-day use, but also for scholarly and historic preservation purposes.

Nobody imposed anything. N4712 describes behaviour, points out that the 
behaviour cannot be supported by Unicode 10 and earlier, and proposes a 
solution to support that behaviour which has been accepted and formally adopted.

Michael Everson


Re: Unicode 11 Georgian uppercase vs. fonts

2018-07-27 Thread Michael Everson via Unicode
On 27 Jul 2018, at 13:42, Alexey Ostrovsky via Unicode  
wrote:

> Michael, can you please provide an example for the modern Georgian?

N4712 Figures 7 through 13.

> It is in present continuous tense, so, samples from 19th century are not 
> valid. (They are probably also not valid formally, but I have to check those 
> books first.)

What is “formal validity”? Those books exist. They are facts. We analyse 
material in order to describe the structure of scripts. 

Michael Everson


Re: Unicode 11 Georgian uppercase vs. fonts

2018-07-27 Thread Michael Everson via Unicode
On 27 Jul 2018, at 13:28, Alexey Ostrovsky via Unicode  
wrote:
> 
> On Fri, Jul 27, 2018 at 3:44 PM, Michael Everson via Unicode 
>  wrote:
>> You have me to thank for undoing that mistake. And some other mistakes. We 
>> all make mistakes. 
> 
> I would like to avoid personal discussions if possible. 

You are addressing the author of N2608R2 and N4712. I will say what I want 
about the documents I wrote and the ideas in them. 

>> > Those institutes were consulted. I met with representatives of both of 
>> > them on my trip to Tbilisi to work with the font designers who spearheaded 
>> > this project. The analysis in N4712 is correct. 
> 
> Could you please comment how the samples in photos prove that it is not small 
> caps? 

Where, in N4712? There are NO EXAMPLES of small caps in N4712. To use Latin 
examples (with explicitly-encoded Latin small caps):

mtavruli is lowercase, ᴍᴛᴀᴠʀᴜʟɪ is small caps, MTAVRULI is uppercase.

There is no evidence of Georgian being written in small caps. 

> N4712 does not contain analysis on that, only statements. (Simple assertions 
> that it is correct will add nothing to what is already stated in N4712.)

If you want to continue this line of argument, you have to cite individual 
Figures in N4712 and say what you think about them. The analysis in N4712 is 
sound, and convinced the Georgian authorities and the UTC to encode the 
characters which have been encoded.

You are going to have to live with them. 

>> So don’t go quoting me in 2003 in order to argue against me in 2016.
> 
> I will quote what I think is appropriate, please.

Do what you want. I was wrong in 2003, because I made the same mistaken 
assumption you have made. If the title-casing material had been available to 
me, I would have made a different analysis. 

> Also, note that the quote was made to demonstrate that N4712 denies what was 
> stated in N2608R2, introduces some changes, and then re-asserts some of 
> denied statements.

I am the author of N4712. I deny what I stated in N2608R2, of which I am the 
author. What I said in N2608R2 was based on a mistaken analysis. 

> The mistake in interpretation is yours. Here:
> 1) Latin script cases. It has capital letters and small letters.

Is this true? Yes or no. 

> 1a) English orthography uses capital letters at the beginnings of sentences 
> and of names and of the names of the months and weekdays. Sometimes ALL CAPS 
> are used.

The boy shouted “HELLO!” to Thomas from his car on Tuesday.

> 1b) French orthography uses capital letters at the beginnings of sentences 
> and of names but not at the beginnings of the names of the months and 
> weekdays. Sometimes ALL CAPS are used.

Le garçon a crié “BONJOUR!” à Thomas de sa voiture le mardi.

> 1c) German orthography uses capital letters at the beginnings of sentences 
> and of names and of the names of the months and weekdays. Sometimes ALL CAPS 
> are used, but not in Fraktur font styles.

Am Dienstag rief der Junge “HALLO!” zu Thomas aus seinem Auto.

(In Fraktur one would write “ℌ𝔞𝔩𝔩𝔬!” not “ℌ𝔄𝔏𝔏𝔒!”.)

> 2) Georgian script cases. It has capital letters and small letters.
> 2a) When Georgian orthography uses capital letters it uses them on every 
> letter in the word where they are used, regardless of what kind of word it is.
> 
> This is the same as small caps style, it cannot be used to assert existence 
> of cases.

It is not, for two reasons. First, small caps is pretty much writing capital 
letters the height of small letters. In the UCS, you can apply small caps 
styling to Adlam, Armenian, Cherokee, Coptic, Cyrillic, Deseret, Glagolitic, 
Greek, Khutsuri, Latin, Ol Chiki, Old Hungarian — and now Georgian.

> 2b) In the 19th and early 20th century there was an orthography which used 
> capital letters at the beginnings of sentences and of names, as well as 
> full-word ALL CAPS. 
> 
> No, there were attempts in some books. 

Yes, and those books are things. They are facts. They can be read. Now, they 
can be transcribed accurately in Unicode. And the examples in N4712 are dated 
1865, 1876, 1890, 1912, 1913, and 1924. Publishers did this on purpose. They 
invested money in producing books which they expected to sell to readers. The 
orthographic experiment was not a success. 

> > The key question is whether Georgian is caseless or not in plain text 
> > encoding, and N2608R2 does not provide any evidence for casing in modern 
> > Georgian. 
> N2608R2 was written in 2003 and has been superseded by N4712. Mtavruli is ALL 
> CAPS. Mtavruli is not small caps.
> <...> 
> Mtavruli is not small caps. Mtavruli is ALL CAPS. 
>  
> This is the key statement. How can you prove that?

1) In 19th and early 20th-century texts they are not mixing small caps with 
Mkhedruli. They are writing Mtavru

Re: Unicode 11 Georgian uppercase vs. fonts

2018-07-27 Thread Michael Everson via Unicode
No, James is mistaken. Georgian is structurally casing, and the difference is 
not stylistic, but orthographic.

Other people made the argument you are making, Alex. My Georgian colleagues and 
I made the better, more accurate argument. Now Georgian users will be able to 
use Mtavruli in plain text, which is what they want to do.

Michael Everson 

> On 27 Jul 2018, at 13:11, Alexey Ostrovsky via Unicode  
> wrote:
> 
> On Fri, Jul 27, 2018 at 3:34 PM, James Kass via Unicode  
> wrote:
> There's nothing preventing the Georgian user community to continue to
> consider this a stylistic difference. 
> 
> Yes. The only issue here is that Unicode encoding does not reflect the actual 
> state, but (implicitly) promotes some actively pursuing point of view. 
> (Please, do not treat it as a kind of accusation, I simply think that that 
> move was a mistake.)
>  
> Sincerely,
> Alex.




Re: Unicode 11 Georgian uppercase vs. fonts

2018-07-27 Thread Michael Everson via Unicode
On 27 Jul 2018, at 12:22, Alexey Ostrovsky via Unicode  
wrote:
> 
> It is a mistake or misinterpretation of evidence provided (modern samples and 
> samples from 19th c., provided in N4712 in the same context, are of different 
> nature, it is clear even from images) and §8 of the document states opposite. 

No, it is a question of orthography, as I have shown with my 
English/French/German examples. Structurally, the script has case. 
Orthographically case is used in a way differently from other casing scripts.

> The criteria for presence of orthographic distinction between cases is clear: 
> there must be either some typical usage of a case (like USA) or there must be 
> a semantic difference between different cases (like smith vs. Smith).

Your analysis is mistaken. There is no “must”. 


> Neither one is correct for Georgian, use of "case" is totally optional (the 
> same §8 agrees with that): there is no difference between "ašš" and "AŠŠ" 
> (აშშ, USA) in the text, so use of uppercase is exactly the same as small caps 
> (and samples provided in photos only confirm it). There is no Georgian 
> orthography rules that regulate use of upper-case. If I am wrong, I will be 
> happy to see an orthographic rule that distinguish between upper- and 
> lowercase or, at least, recommends to use uppercase.

The rule is given clearly in N4712 §8. 

“Any word is written either in all-smalls (Mkhedruli) or in all-caps 
(Mtavruli).”

> What about samples from 19th century, it was the same attempt (under Cyrillic 
> influence), as an attempt of Shanidze in the middle of 20th century (however, 
> Shanidze used Asomtavruli, which, again, only proves that there were no 
> uppercase for Mkhedruli except on the level of an idea).

Figures 1 through 6 show examples of Georgian using an orthographic rule which 
is common to Latin, Cyrillic, Armenian and so on. 

> There were no orthography rules on that and, even more, it was not 
> orthography as well.

Perhaps your error is in thinking that there were formally codified 
orthographic rules published by some Academy or other. Probably there wasn’t. 
Most Georgians (and I asked a room full of them) do not remember learning 
Mtavruli. It’s sort of taken for granted. These facts are clear: 

A) Modern Georgian orthography uses lowercase letters always, unless uppercase 
letters are used in which all the letters in the word are uppercase. 

B) Some 19th and early 20th-century orthography uses lowercase and uppercase 
letters in the same way that they are used in Cyrillic and Latin. 
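
Rules A) and B) can be stated as a small classifier. This is an editorial sketch, not anything from N4712: it assumes Python 3.7 or later (Unicode 11+ data, so that Mkhedruli counts as lower case and Mtavruli as upper case), and the function names are hypothetical.

```python
def casing_pattern(word):
    """Classify a word as 'all-smalls', 'all-caps', 'capitalised' or 'mixed'."""
    if word.islower():
        return "all-smalls"
    if word.isupper():
        return "all-caps"
    if word[:1].isupper() and word[1:].islower():
        return "capitalised"
    return "mixed"

def fits_rule_a(word):
    """Modern orthography: a word is entirely Mkhedruli or entirely Mtavruli."""
    return casing_pattern(word) in {"all-smalls", "all-caps"}

def fits_rule_b(word):
    """19th/early 20th-century experiment: Latin-style capitalisation also allowed."""
    return casing_pattern(word) != "mixed"

# Sample words as escapes: AN + SHIN + SHIN in three casings.
for sample in ("\u10d0\u10e8\u10e8", "\u1c90\u1ca8\u1ca8", "\u1c90\u10e8\u10e8"):
    print(casing_pattern(sample), fits_rule_a(sample), fits_rule_b(sample))
```

A word with an initial Mtavruli letter followed by Mkhedruli is rejected by A) but accepted by B), which is exactly the kind of text that could not be represented before the capitals were encoded.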

> Vast majority of samples from the same period of 19th century are caseless 
> (manuscripts, archive papers, official papers, books, journals, newspapers -- 
> everything).

Yes, we know.

> Either majority of texts from that period are orthographically incorrect, or 
> there was no such orthography like uppercase that time. 

There are two sets of orthographic rules, A) and B) above. 

> One have to distinguish clearly between experiments and a common practice, 
> and N4712 only provide samples, it does not clarify whether it was an 
> orthography or small caps -like usage. an assertion that those couple samples 
> prove that the georgian script had case in 19th century is the same as an 
> assertion that the latin script is caseless in 21st century just because we 
> have enough caseless samples (including this one).

All it means is that A) is the predominant Georgian orthography and B) is a 
failed experiment that Georgians don’t like any more. Isn’t it wonderful that 
the UCS can now support both, however? 

Be happy. Or don’t, but you’re going to have to live with the Georgian encoding 
as it is. 

> With all my respect, N2608R2 is right and N4712 is wrong about case in 
> Georgian.

You are mistaken. Also, in Unicode you can’t have small-caps styling without 
encoded capital letters because small caps are dependent on the encoded 
characters.

> Sincerely,
> Alex.

Regards,
Michael


Re: Unicode 11 Georgian uppercase vs. fonts

2018-07-27 Thread Michael Everson via Unicode
On 27 Jul 2018, at 09:35, Alexey Ostrovsky via Unicode  
wrote:

> On Fri, Jul 27, 2018 at 8:54 AM, James Kass via Unicode  
> wrote:
> https://unicode.org/wg2/docs/n4712-georgian.pdf
> 
>> The revised proposal to change the Georgian encoding model from caseless to 
>> casing was convincing and compelling.  (It's bilingual, too, English and 
>> Georgian.)
> 
> It may look so, but my statement is still correct.

No, it’s not.

> This is not the first time, when the consortium mistreats Georgian (one can 
> remember a story of encoding the ecclesiastic minuscule).

You have me to thank for undoing that mistake. And some other mistakes. We all 
make mistakes. 

> Just two points:
> 1) "compelling" (less important). The supporters are either font designers or 
> non-specialists organizations. There are several institutions in Georgia that 
> had to be involved IMHO (like Institute of Georgian Language, Institute of 
> Manuscripts and Academy of Sciences; Ministry of Economy is not an 
> institution competent in the script issues).

Those institutes were consulted. I met with representatives of both of them on 
my trip to Tbilisi to work with the font designers who spearheaded this 
project. The analysis in N4712 is correct. 

> 2) "convincing". I will not discuss all the controversies here, but will only 
> cite §1.1 and §8:
> §1.1, on "Mkhedruli… is caseless, and no casing behaviour is expected or 
> permitted by Georgian users. The mtavruli titling style of Mkhedruli… is not 
> case; it is a style analogous to small caps or bold or italic. <...> 
> Mtavruli-style letters are never used as “capitals”; a word is always 
> entirely presented in mtavruli or not. Mtavruli-style is used in titles, 
> newspaper headlines, and other kinds of headings." of the original encoding 
> (N2608R2):

I wrote N2608R2 and I said explicitly in N4712 that this was mistaken. It was 
mistaken because I had never seen the 19th-century title-casing material, and 
because I made assumptions about how to handle it as a style. The UTC also made 
a mistake in taking this at face-value, at least insofar as the suggestion that 
“small caps” styling be used, since small caps is dependent upon encoded 
capitals. So don’t go quoting me in 2003 in order to argue against me in 2016.

> — "This statement was not correct."
> At the same time, §8 on successful implementation of the proposal in 
> question: "Within a sentence a given word might be written IN ALL CAPS 
> (MTAVRULI) for emphasis. An entire sentence or header may also be written in 
> Mtavruli." And all the sample photos of the modern books and journals 
> demonstrate exactly the same behavior as described in N2608R2: " 
> Mtavruli-style is used in titles, newspaper headlines, and other kinds of 
> headings".
> (I can provide more information if needed)

The mistake in interpretation is yours. Here:

1) Latin script cases. It has capital letters and small letters.

1a) English orthography uses capital letters at the beginnings of sentences and 
of names and of the names of the months and weekdays. Sometimes ALL CAPS are 
used.

1b) French orthography uses capital letters at the beginnings of sentences and 
of names but not at the beginnings of the names of the months and weekdays. 
Sometimes ALL CAPS are used.

1c) German orthography uses capital letters at the beginnings of sentences and 
of names and of the names of the months and weekdays. Sometimes ALL CAPS are 
used, but not in Fraktur font styles.

2) Georgian script cases. It has capital letters and small letters.

2a) When Georgian orthography uses capital letters it uses them on every letter 
in the word where they are used, regardless of what kind of word it is.

2b) In the 19th and early 20th century there was an orthography which used 
capital letters at the beginnings of sentences and of names, as well as 
full-word ALL CAPS. 

> The key question is whether Georgian is caseless or not in plain text 
> encoding, and N2608R2 does not provide any evidence for casing in modern 
> Georgian. 

N2608R2 was written in 2003 and has been superseded by N4712. Mtavruli is ALL 
CAPS. Mtavruli is not small caps. 

> Basically, the issues addressed are the low level of technical support for 
> implementing small caps in Georgian typesetting (but this must not be Unicode 
> issue) and incorrect idea that small caps must be preserved in plain text 
> encoding (just because someone loves it), it is obvious from §1.1 (right 
> after the text I cited).

It is now possible to use Georgian small caps, since both capital letters and 
small letters are encoded. Previously it would not have been possible to do so, 
since small caps is a fancy-text style of presenting lowercase letters with 
uppercase glyphs. 

Mtavruli is not small caps. Mtavruli is ALL CAPS. 
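
A minimal Python sketch of why encoded capitals matter here, assuming an interpreter whose Unicode data is at version 11.0 or later (CPython 3.7 and up should qualify). It only shows that the Mkhedruli letters now have an uppercase mapping into the Mtavruli range starting at U+1C90; the sample word is an illustration and is not taken from N4712.

```python
# Assumption: the interpreter's Unicode database is 11.0 or later (CPython 3.7+).
mkhedruli = "\u10e5\u10d0\u10e0\u10d7\u10e3\u10da\u10d8"  # sample word in lowercase Georgian

# With capitals encoded, ordinary case mapping gives the ALL CAPS (Mtavruli) form.
mtavruli = mkhedruli.upper()

# Lowercasing goes back to the original Mkhedruli letters.
round_trip = mtavruli.lower()

print([f"U+{ord(c):04X}" for c in mkhedruli])
print([f"U+{ord(c):04X}" for c in mtavruli])
print(round_trip == mkhedruli)
```

A small-caps style in a rich-text system could then be layered on top of these two encoded cases, which is the dependency described above.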

Michael Everson


Re: Unicode 11 Georgian uppercase vs. fonts

2018-07-27 Thread Michael Everson via Unicode
Yes and it explains clearly that “effectively caseless Georgian” is incorrect. 
Georgian has case. Georgian uses case differently from other scripts. This is 
an orthographic distinction, not a structural one. In fact as it is also stated 
in the proposal, there are 19th-century texts which do titlecase. It’s just 
that that orthography is no longer in use and that behaviour no longer 
desirable.
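
The distinction between having case and using titlecase can be sketched in Python, again assuming Unicode 11.0 or later data (CPython 3.7+). In that data the titlecase mapping of a Mkhedruli letter is the letter itself, so a generic titlecasing call leaves modern Georgian alone while uppercasing still works. The sample word is only an illustration.

```python
# Assumption: the interpreter's Unicode database is 11.0 or later (CPython 3.7+).
georgian = "\u10d7\u10d1\u10d8\u10da\u10d8\u10e1\u10d8"  # sample word in Mkhedruli
latin = "tbilisi"

# Titlecasing changes the Latin word but leaves the Georgian word untouched.
georgian_titled = georgian.title()
latin_titled = latin.title()

# Uppercasing changes both: Georgian does have case.
georgian_upper = georgian.upper()
latin_upper = latin.upper()

print(latin_titled, latin_upper)
print(georgian_titled == georgian, georgian_upper == georgian)
```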

Michael Everson

> On 27 Jul 2018, at 05:54, James Kass via Unicode  wrote:
> 
> Alexey Ostrovsky wrote,
> 
>> "The Georgian community understood" — sorry, but
>> here "the Georgian community" means a small group
>> of Georgian font designers who promote upper-case
>> for effectively caseless Georgian.
> 
> https://unicode.org/wg2/docs/n4712-georgian.pdf
> 
> The revised proposal to change the Georgian encoding model from
> caseless to casing was convincing and compelling.  (It's bilingual,
> too, English and Georgian.)
> 




Re: The Unicode Standard and ISO

2018-06-12 Thread Michael Everson via Unicode
All right, if you want a clear explanation.

Yes, I think the ISO 8859-4 character names for the Latvian letters were 
mistaken. Yes, I think that mapping them to decompositions with CEDILLA rather 
than COMMA BELOW was a mistake. Evidently some felt that the normative mapping 
was important. This does not mean that SC2 “failed to do its part” and it did 
not cause a lack of desire for cooperation, and it bloody well did not “damage 
the reputation of the whole ISO/IEC”. 
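
For anyone who wants to see the mapping in question, here is a short Python sketch using the standard unicodedata module. It prints the character names and canonical decompositions of four Latvian letters and, for contrast, one letter that was encoded with COMMA BELOW. The choice of sample letters is mine.

```python
import unicodedata

# Four Latvian letters whose names and decompositions use CEDILLA.
latvian = "\u0123\u0137\u013c\u0146"

rows = []
for ch in latvian:
    base, mark = unicodedata.normalize("NFD", ch)  # each decomposes to base + one mark
    rows.append((unicodedata.name(ch), f"U+{ord(mark):04X}", unicodedata.name(mark)))

# For contrast: U+0219, which is named and decomposed with COMMA BELOW.
contrast = unicodedata.normalize("NFD", "\u0219")
contrast_mark = unicodedata.name(contrast[1])

for row in rows:
    print(row)
print(unicodedata.name("\u0219"), contrast_mark)
```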

As to ISO 15924, it was developed bilingually, and there was consensus on the 
names that are there. Last year you suggested a massive number of name changes 
to the French translation of ISO/IEC 10646, and I criticized you for foregoing 
stability for your own preferences. When it came to the names in 15924, I told 
you that I do not trust your judgement, and that I would consider revisions to 
the French names when you came back with consensus on those changes with 
experts Alain LaBonté, Patrick Andries, Denis Jacquerye, and Marc Lodewijck. As 
I have not heard from them, I conclude that no such consensus exists. 

ISO 15924 is an ISO standard. Aspects of its content may be mirrored in other 
places, but “moving its content” to CLDR makes no sense. 

Michael Everson

> On 12 Jun 2018, at 16:20, Marcel Schneider via Unicode  
> wrote:
> On Tue, 12 Jun 2018 15:58:09 +0100, Michael Everson via Unicode wrote:
>> 
>> Marcel,
>> You have put words into my mouth. Please don’t. Your description of what I 
>> said is NOT accurate. 
>> 
>>> On 12 Jun 2018, at 03:53, Marcel Schneider via Unicode  wrote:
>>> And in this thread I wanted to demonstrate that by focusing on the wrong 
>>> priorities, i.e. legacy character names instead of the practicability of 
>>> on-going encoding and the accurateness of specified decompositions—so that 
>>> in some instances cedilla was used instead of comma below, Michael pointed 
>>> out—, ISO/IEC JTC1 SC2/WG2 failed to do its part and missed its mission—and 
>>> thus didn’t inspire a desire of extensive cooperation (and damaged the 
>>> reputation of the whole ISO/IEC).
> 
> Michael, I’d better quote your actual e-mail:
> 
> On Fri, 8 Jun 2018 13:01:48 +0100, Michael Everson via Unicode wrote:
> […]
>> Many things have more than one name. The only truly bad misnomers from that 
>> period were related to a mapping error,
>> namely, in the treatment of Latvian characters which are called CEDILLA 
>> rather than COMMA BELOW. 
> 
> Now I fail to understand why this mustn’t be reworded to “the accurateness of 
> specified decompositions—so that in some instances cedilla was used instead 
> of comma below[.]” If any correction can be made, I’d be eager to take note. 
> Thanks for correcting.
> 
> Now let’s append the e-mail that I was about to send:
> 
> Another ISO Standard that needs to be mentioned in this thread is ISO 15924 
> (script codes; not ISO/IEC). It has a particular status in that Unicode is 
> the Registration Authority. 
> 
> I wonder whether people agree that it has a French version. Actually it does 
> have a French version, but Michael Everson (Registrar) revealed on this List 
> multiple issues with synching French script names in ISO 15924-fr and in Code 
> Charts translations.
> 
> Shouldn’t this content be moved to CLDR? At least with respect to localized 
> script names.





Re: The Unicode Standard and ISO

2018-06-12 Thread Michael Everson via Unicode
Marcel,

You have put words into my mouth. Please don’t. Your description of what I said 
is NOT accurate. 

> On 12 Jun 2018, at 03:53, Marcel Schneider via Unicode  
> wrote:
> 
> And in this thread I wanted to demonstrate that by focusing on the wrong 
> priorities, i.e. legacy character names instead of the practicability of 
> on-going encoding and the accurateness of specified decompositions—so that in 
> some instances cedilla was used instead of comma below, Michael pointed out—, 
> ISO/IEC JTC1 SC2/WG2 failed to do its part and missed its mission—and thus 
> didn’t inspire a desire of extensive cooperation (and damaged the reputation 
> of the whole ISO/IEC).




Re: The Unicode Standard and ISO

2018-06-08 Thread Michael Everson via Unicode
On 8 Jun 2018, at 04:32, Marcel Schneider via Unicode  
wrote:

> the registration of the French locale in CLDR is still surprisingly 
> incomplete despite the meritorious efforts made by the actual contributors

Nothing prevents people from working to complete the French locale in CLDR. 
Synchronization with an unused ISO standard is not necessary to do that. 

Michael Everson


Re: The Unicode Standard and ISO

2018-06-08 Thread Michael Everson via Unicode
On 7 Jun 2018, at 20:13, Marcel Schneider via Unicode  
wrote:

> On Fri, 18 May 2018 00:29:36 +0100, Michael Everson via Unicode responded:
>> 
>> It would be great if mutual synchronization were considered to be of benefit.
>> Some of us in SC2 are not happy that the Unicode Consortium has published 
>> characters
>> which are still under Technical ballot. And this did not happen only once. 
> 
> I’m not happy catching up this thread out of time, the less as it ultimately 
> brings me where I’ve started 
> in 2014/2015: to the wrong character names that the ISO/IEC 10646 merger 
> infiltrated into Unicode.

Many things have more than one name. The only truly bad misnomers from that 
period were related to a mapping error, namely, in the treatment of Latvian 
characters which are called CEDILLA rather than COMMA BELOW.

> This is the very thing I did not vent in my first reply. From my point of 
> view, this misfortune would be 
> reason enough for Unicode not to seek further cooperation with ISO/IEC.

This is absolutely NOT what we want. What we want is for the two parties to 
remember that industrial concerns and public concerns work best together. 

> But I remember the many voices raising on this List to tell me that this is 
> all over and forgiven.

I think you are digging up an old grudge that nobody thinks about any longer. 

> Therefore I’m confident that the Consortium will have the mindfulness to 
> complete the ISO/IEC JTC 1 
> partnership by publicly assuming synchronization with ISO/IEC 14651,

There is no trouble with ISO/IEC 14651. 

> and achieving a fullscale merger with ISO/IEC 15897, after which the valid 
> data stay hosted entirely in CLDR, and ISO/IEC 15897 would be its ISO mirror. 

I wonder if Mark Davis will be quick to agree with me when I say that ISO/IEC 
15897 has no use and should be withdrawn. 

Michael Everson


Re: The Unicode Standard and ISO

2018-06-07 Thread Michael Everson via Unicode
On 7 Jun 2018, at 14:20, Mark Davis ☕️ via Unicode  wrote:
> 
> A few facts. 
> 
>> > ... Consortium refused till now to synchronize UCA and ISO/IEC 14651.
> 
> ISO/IEC 14651 and Unicode have longstanding cooperation. Ken Whistler could 
> speak to the synchronization level in more detail, but the above statement is 
> inaccurate.

Mark is right. 

>> > ... For another part it [sync with ISO/IEC 15897] failed because the 
>> > Consortium refused to cooperate, despite of repeated proposals for a 
>> > merger of both instances.
> 
> I recall no serious proposals for that. 

Nor do I.

> (And in any event — very unlike the synchrony with 10646 and 14651 — ISO 
> 15897 brought no value to the table. Certainly nothing to outweigh the 
> considerable costs of maintaining synchrony. Completely inadequate structure 
> for modern system requirement, no particular industry support, and scant 
> content: see Wikipedia for "The registry has not been updated since December 
> 2001”.)

Mark is right.

Michael Everson


Re: Major vendors changing U+1F52B PISTOL  depiction from firearm to squirt gun

2018-05-23 Thread Michael Everson via Unicode
I consider it a significant semantic shift from the intended meaning of the 
character in the source Japanese character set. 

Michael Everson


Re: The Unicode Standard and ISO

2018-05-17 Thread Michael Everson via Unicode
It would be great if mutual synchronization were considered to be of benefit. 
Some of us in SC2 are not happy that the Unicode Consortium has published 
characters which are still under Technical ballot. And this did not happen only 
once.

> On 17 May 2018, at 23:26, Peter Constable via Unicode  
> wrote:
> 
> Hence, from an ISO perspective, ISO 10646 is the only standard for which 
> on-going synchronization with Unicode is needed or relevant.




Re: L2/18-181

2018-05-16 Thread Michael Everson via Unicode
It sounds to me like a fault in the keyboard software, which could be fixed by 
the people who own and maintain that software.

> On 17 May 2018, at 01:20, Richard Wordingham via Unicode 
> <unicode@unicode.org> wrote:
> 
> On Thu, 17 May 2018 00:34:35 +0100
> Michael Everson via Unicode <unicode@unicode.org> wrote:
> 
>> This is not a fault of the encoding.
>> 
>>> On 16 May 2018, at 23:01, Richard Wordingham via Unicode
>>> <unicode@unicode.org> wrote:
>>> 
>>> I think simple Windows keyboards have a limit of 4 16-bit code
>>> units; for an Indic SMP script, one couldn't map  to a single
>>> key, as it would require 6 code units.  
> 
> It is a consequence of the policy of avoiding precomposed characters.
> If there were a precomposed character for , the keyboard could emit
> that character - job done.
> 
> One objection is that one would need a sequence of decompositions:
> 
>  = <KA_PLUS, SSA>
>  = <KA, VIRAMA>
> 
> Some people are vehemently opposed to unnatural characters like
> .
> 
> Presumably the official view is that Windows Text Services have taken us
> beyond that point, and the likes of  above are not needed.
> 
> If X persists, perhaps named sequences should be assigned numbers so
> that X can make a generic allocation of keysym codes to named
> sequences.
> 
> Richard. 




Re: L2/18-181

2018-05-16 Thread Michael Everson via Unicode
And Icelandic. And Irish. And so on. 

> On 16 May 2018, at 23:41, Anshuman Pandey via Unicode  
> wrote:
> 
>> 2. Collation is different between the Assamese and Bengali languages,
>> and code point order should reflect collation order.
> 
> The same issue applies to dictionary order for Hindi, Marathi, which
> differ from the conventional Sanskrit order for Devanagari.
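
A rough Python sketch of why code point order cannot stand in for collation, using Icelandic. The alphabet string is a hand-written stand-in for real collation data (CLDR or similar), handles lowercase only, and the word list is just an illustration.

```python
# Simplified assumption: a hand-written alphabet string replaces real
# collation data; it covers lowercase Icelandic letters only.
ICELANDIC = "aábdðeéfghiíjklmnoóprstuúvxyýþæö"

def icelandic_key(word):
    # Map every letter to its position in the Icelandic alphabet.
    return [ICELANDIC.index(ch) for ch in word]

words = ["þak", "ár", "bær", "dagur"]

by_code_point = sorted(words)                  # plain code point order
by_alphabet = sorted(words, key=icelandic_key) # alphabet order

print(by_code_point)
print(by_alphabet)
```

The two lists differ because the accented and special letters sit above the basic Latin letters in the code charts, whatever the language's alphabet says.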




Re: L2/18-181

2018-05-16 Thread Michael Everson via Unicode
This is not a fault of the encoding.

> On 16 May 2018, at 23:01, Richard Wordingham via Unicode 
>  wrote:
> 
> I think simple Windows keyboards have a limit of 4 16-bit code units;
> for an Indic SMP script, one couldn't map  to a single key, as it
> would require 6 code units.
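
The arithmetic behind the quoted figure can be checked with a few lines of Python. The limit of 4 is taken from the message above and is not verified here; the three supplementary-plane code points are simply picked from the Brahmi block for illustration.

```python
def utf16_units(text: str) -> int:
    # Each UTF-16 code unit is two bytes; supplementary characters take two units.
    return len(text.encode("utf-16-le")) // 2

KEY_LIMIT = 4  # figure quoted in the message above, taken as an assumption

bmp_conjunct = "\u0915\u094d\u0937"              # Devanagari KA, VIRAMA, SSA (all BMP)
smp_sequence = "\U00011013\U00011046\U00011031"  # three code points from the Brahmi block

bmp_units = utf16_units(bmp_conjunct)
smp_units = utf16_units(smp_sequence)

print(bmp_units, bmp_units <= KEY_LIMIT)
print(smp_units, smp_units <= KEY_LIMIT)
```

Nothing in this depends on how the characters are encoded as characters; the count only reflects how many 16-bit units the keyboard layer is asked to emit.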




Re: 0027, 02BC, 2019, or a new character?

2018-02-20 Thread Michael Everson via Unicode
I absolutely disagree. There’s a whole lot of related languages out there, and 
the speakers share some things in common. Orthographic harmonization between 
these languages can ONLY help any speaker of one to access information in any 
of the others. That expands people’s worlds. That would be a good goal.

> On 21 Feb 2018, at 02:24, James Kass via Unicode  wrote:
> 
> A desire to choose their own writing system rather than have one
> imposed upon them is understandable.  If they also want it to be
> distinctive, who could blame them?




Re: 0027, 02BC, 2019, or a new character?

2018-02-20 Thread Michael Everson via Unicode
Stalin would be very pleased. Divide and conquer.

> On 21 Feb 2018, at 01:15, Garth Wallace via Unicode <unicode@unicode.org> 
> wrote:
> 
> AIUI "doesn't look like Turkish" was one of the design criteria, for 
> political reasons.
> 
> On Tue, Feb 20, 2018 at 1:07 PM Michael Everson via Unicode 
> <unicode@unicode.org> wrote:
> Not using Turkic letters is daft, particularly as there was a widely-used 
> transliteration in Kazakhstan anyway. And even if not Ç Ş, they could have 
> used Ć and Ś.
> 
> There’s no value in using digraphs in Kazakh particularly when there could 
> be a one-to-one relation with the Cyrillic orthography, and I bet you 
> anything there will be ambiguity where some morpheme ends in -s and the next 
> begins with h- where you have [sx] and not [ʃ].
> 
> Groan.
> 
> > On 20 Feb 2018, at 20:40, Christoph Päper <christoph.pae...@crissov.de> 
> > wrote:
> >
> > Michael Everson:
> >> Why on earth would they use Ch and Sh when 1) C isn’t used by itself and 
> >> 2) if you’re using Ǵǵ you may as well use Çç Şş.
> >
> > I would have argued in favor of digraphs for G' and N' as well if there 
> > already was a decision for Ch and Sh.
> >
> > Many European orthographies use the digraph Qu although the letter Q does 
> > not occur otherwise.
> 
> 




Re: 0027, 02BC, 2019, or a new character?

2018-02-20 Thread Michael Everson via Unicode
Not using Turkic letters is daft, particularly as there was a widely-used 
transliteration in Kazakhstan anyway. And even if not Ç Ş, they could have used 
Ć and Ś. 

There’s no value in using digraphs in Kazakh particularly when there could be 
a one-to-one relation with the Cyrillic orthography, and I bet you anything 
there will be ambiguity where some morpheme ends in -s and the next begins with 
h- where you have [sx] and not [ʃ]. 
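
The ambiguity is easy to sketch in Python. Both mapping tables below are cut-down assumptions for illustration (they are not the decree's full alphabet), and the two input strings are made-up letter sequences, not attested Kazakh words.

```python
# Hypothetical, partial mapping tables for illustration only.
DIGRAPH = {"ш": "sh", "с": "s", "х": "h", "а": "a"}
ONE_TO_ONE = {"ш": "ş", "с": "s", "х": "h", "а": "a"}

def transliterate(text, table):
    # Replace each Cyrillic letter with its Latin counterpart from the table.
    return "".join(table.get(ch, ch) for ch in text)

single_letter = "аша"   # contains ш
two_letters = "асха"    # contains с followed by х

digraph_a = transliterate(single_letter, DIGRAPH)
digraph_b = transliterate(two_letters, DIGRAPH)
one_to_one_a = transliterate(single_letter, ONE_TO_ONE)
one_to_one_b = transliterate(two_letters, ONE_TO_ONE)

print(digraph_a, digraph_b, digraph_a == digraph_b)
print(one_to_one_a, one_to_one_b, one_to_one_a == one_to_one_b)
```

With the digraph table the two different Cyrillic spellings collapse into the same Latin string, so the conversion cannot be reversed; with a one-to-one table they stay distinct.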

Groan.

> On 20 Feb 2018, at 20:40, Christoph Päper <christoph.pae...@crissov.de> wrote:
> 
> Michael Everson:
>> Why on earth would they use Ch and Sh when 1) C isn’t used by itself and 2) 
>> if you’re using Ǵǵ you may as well use Çç Şş.
> 
> I would have argued in favor of digraphs for G' and N' as well if there 
> already was a decision for Ch and Sh.
> 
> Many European orthographies use the digraph Qu although the letter Q does not 
> occur otherwise.




Re: 0027, 02BC, 2019, or a new character?

2018-02-20 Thread Michael Everson via Unicode
Why on earth would they use Ch and Sh when 1) C isn’t used by itself and 2) if 
you’re using Ǵǵ you may as well use Çç Şş.

Groan.

> On 20 Feb 2018, at 19:40, Christoph Päper via Unicode  
> wrote:
> 
> Apparently the presidential decree prescribing the new Kazakh Latin 
> orthography and alphabet has been amended recently. The change completely 
> dumps the previous approach of digraphs with an apostrophe in second position 
> in favor of an acute diacritic mark above the base letters for vowels Á/á, 
> Í/í, Ó/ó, Ú/ú, Ý/ý and two consonants Ǵ/ǵ and Ń/ń, while the other two become 
> commonly encountered H digraphs, Ch/ch and Sh/sh.
> 
> Rejoice.
> 
> https://tengrinews.kz/kazakhstan_news/novyiy-variant-kazahskogo-alfavita-latinitse-utverdil-338010
> 
> http://www.akorda.kz/kz/legal_acts/decrees/kazak-tili-alipbiin-kirillicadan-latyn-grafikasyna-koshiru-turaly-kazakstan-respublikasy-prezidentinin-2017-zhylgy-26-kazandagy-569-zharlygy
> 
> http://www.akorda.kz/upload/media/files/785986f23c47a407facbfa52b935fc85.doc
> -- 
> Diese Nachricht wurde von meinem Android-Gerät mit K-9 Mail gesendet.




Re: 0027, 02BC, 2019, or a new character?

2018-01-19 Thread Michael Everson via Unicode
I won’t. 

> On 19 Jan 2018, at 13:51, Andrew West via Unicode <unicode@unicode.org> wrote:
> 
> On 19 January 2018 at 13:19, Michael Everson via Unicode
> <unicode@unicode.org> wrote:
>> 
>> I’d go talk with him :-) I published Alice in Kazakh. He might like that.
> 
> Damn, you'll have to reprint it with apostrophes now.
> 
> Andrew
> 




Re: 0027, 02BC, 2019, or a new character?

2018-01-19 Thread Michael Everson via Unicode
There’s no redeeming this orthography. 

> On 19 Jan 2018, at 13:42, Philippe Verdy via Unicode  
> wrote:
> 
> Hmmm that character exists already at U+0315 (a combining comma above 
> right). It would work for the new Kazakh orthographic system, including for 
> collation purpose.  I don't think IDN rejects this combining version.
> 
> 
> 2018-01-19 14:37 GMT+01:00 Philippe Verdy :
> May be the IDN could accept a new combining diacritic (sort of right-side 
> acute accent). After all the Kazakh intent is not to define a new separate 
> character but a modification of base letter to create a single letter in 
> their alphabet.
> So a proposal for COMBINING APOSTROPHE (whose spacing non-combining version 
> is 02BC), so that SPACE+COMBINING APOSTROPHE will render exactly like 02BC
> 
> 2018-01-18 19:51 GMT+01:00 Asmus Freytag via Unicode :
> Top level IDN domain names can not contain 02BC, nor 0027 or 2019. 
> 
> (RFC 6912 gives the rationale and RZ-LGR the implementation, see MSR-3)
> 
> A./
> 
> 
> On 1/18/2018 3:00 AM, Andre Schappo via Unicode wrote:
>> 
>> 
>>> On 18 Jan 2018, at 08:21, Andre Schappo via Unicode  
>>> wrote:
>>> 
>>> 
>>> 
>>>> On 16 Jan 2018, at 08:00, Richard Wordingham via Unicode 
>>>>  wrote:
>>>> 
>>>> On Mon, 15 Jan 2018 20:16:21 -0800
>>>> James Kass via Unicode  wrote:
>>>> 
>>>>> It will probably be the ASCII apostrophe.  The stated intent favors
>>>>> the apostrophe over diacritics or special characters to ensure that
>>>>> the language can be input to computers with standard keyboards.
>>>> 
>>>> Typing U+0027 into a word processor takes planning.  Of the three, it
>>>> should obviously be the modifier letter U+02BC, but I think what gets
>>>> stored will be U+0027 or the single quotation mark U+2019.
>>>> 
>>>> However, we shouldn't overlook the diacritic mark U+0315 COMBINING COMMA
>>>> ABOVE RIGHT.
>>>> 
>>>> Richard.
>>> 
>>> I have just tested twitter hashtags and as one would expect, U+02BC does 
>>> not break hashtags. See twitter.com/andreschappo/status/953903964722024448
>>> 
>> 
>> ...and, just in case twitter.com/andreschappo/status/953944089896083456
>> 
>> André Schappo
>> 
> 
> 
> 
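
For reference, the four candidates mentioned in this thread can be compared with a short Python sketch. It prints each character's name and General_Category, then checks whether a generic "word" pattern treats a string containing it as one word. The regular expression is only a stand-in for a tokenizer; it is an assumption and not how Twitter or IDN validation actually works.

```python
import re
import unicodedata

# The candidates discussed in the thread: U+0027, U+02BC, U+2019, U+0315.
candidates = ["\u0027", "\u02bc", "\u2019", "\u0315"]

info = {
    f"U+{ord(c):04X}": (unicodedata.name(c), unicodedata.category(c))
    for c in candidates
}

# Stand-in tokenizer: does a\w-style matching keep "a<candidate>b" as one word?
one_word = {
    f"U+{ord(c):04X}": re.fullmatch(r"\w+", f"a{c}b") is not None
    for c in candidates
}

for code, (name, category) in info.items():
    print(code, category, name, one_word[code])
```

The modifier letter is the only one of the three spacing candidates with a letter category, which is consistent with the hashtag test reported above.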




Re: 0027, 02BC, 2019, or a new character?

2018-01-19 Thread Michael Everson via Unicode
I’d go talk with him :-) I published Alice in Kazakh. He might like that. 

Michael

> On 19 Jan 2018, at 09:39, Andrew West via Unicode  wrote:
> 
> On 19 January 2018 at 09:16, Shriramana Sharma via Unicode
>  wrote:
>> Wow. Somebody really needs to convey this to the Kazakhs. Else a
>> short-sighted decision would ruin their chances at native IDNs. Any Kazakhs
>> on this list?
> 
> There's only one Kazakh who counts, and I'm pretty sure he's not on this list.
> 
> Andrew




Re: Linearized tilde?

2017-12-30 Thread Michael Everson via Unicode
On 30 Dec 2017, at 18:59, Doug Ewell via Unicode <unicode@unicode.org> wrote:
> A defining characteristic of the 1982 African Reference Alphabet was that it 
> was lowercase-only. An uppercase form would be an invention with no basis in 
> history or usage.

Which is why it failed. Everybody who used anything like it or derived from it 
ended up devising capital letters. 

Doke’s click letters are better candidates for encoding.

Michael Everson


Re: Word_Break for Hieroglyphs

2017-12-14 Thread Michael Everson via Unicode
On 14 Dec 2017, at 14:14, Mark Davis ☕️ via Unicode <unicode@unicode.org> wrote:

> The Word_Break property doesn't have a value Complex_Context, but I think 
> that was just a typo in your message.
> 
> The word break and line break properties for 1,057 [:Script=Egyp:] characters 
> are currently
> 
> Word_Break=ALetter
> Line_Break=Alphabetic
> 
> Off the top of my head, I think the best course would be to make them both 
> the same as for most of [:Script=Hani:]
> 
> Word_Break=Other
> Line_Break=Ideographic

Egyptian is not ideographic and is certainly not fixed-width. CJK does not 
cluster. Why should you want to make them the same? Moreover, these properties 
were defined at the beginning, were they not? Bob Richmond and others will 
certainly have a view on this. 

> We would only need to use Complex_Context [:lb=SA:] for scripts that keep 
> some letters together and break others apart (typically needing dictionary 
> lookup). I would suspect for modern use of Egyp, that is not the case;

Please do not “suspect”. It is not hard to ask experts.

> most people would expect the characters to just flow like ideographs, 
> breaking between any pair:

NO. Clusters cannot be broken up just anywhere. 

> you wouldn't need to disallow breaks between a  axe> and a , for example.
> 
> Also, I noticed that the 14 Egyp characters with Line_Break≠Alphabetic have a 
> linebreak and general category properties that seem odd and inconsistent to 
> me.
> 
> Line_Break=Close_Punctuation
> General_Category=Other_Letter (items: 8)
> Egyptian Hieroglyphs — O. Buildings, parts of buildings, etc. (items: 6)
> 
>  U+1325B EGYPTIAN HIEROGLYPH O006D
>  U+1325C EGYPTIAN HIEROGLYPH O006E
>  U+1325D EGYPTIAN HIEROGLYPH O006F
>  U+13282 EGYPTIAN HIEROGLYPH O033A
>  U+13287 EGYPTIAN HIEROGLYPH O036B
>  U+13289 EGYPTIAN HIEROGLYPH O036D
> Egyptian Hieroglyphs — V. Rope, fiber, baskets, bags, etc. (items: 2)
> 
>  U+1337A EGYPTIAN HIEROGLYPH V011B
>  U+1337B EGYPTIAN HIEROGLYPH V011C
> Line_Break=Open_Punctuation
> General_Category=Other_Letter (items: 6)
> Egyptian Hieroglyphs — O. Buildings, parts of buildings, etc. (items: 5)
> 
>  U+13258 EGYPTIAN HIEROGLYPH O006A
>  U+13259 EGYPTIAN HIEROGLYPH O006B
>  U+1325A EGYPTIAN HIEROGLYPH O006C
>  U+13286 EGYPTIAN HIEROGLYPH O036A
>  U+13288 EGYPTIAN HIEROGLYPH O036C
> Egyptian Hieroglyphs — V. Rope, fiber, baskets, bags, etc. (items: 1)
> 
>  U+13379 EGYPTIAN HIEROGLYPH V011A

These properties were chosen explicitly when Egyptian was first defined. Those 
are enclosing punctuation characters. 
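
A small Python sketch for checking the part of this that the standard library can see. Python's unicodedata module exposes General_Category and names but not Line_Break or Word_Break, so those values would have to come from the UCD data files or a third-party library; the grouping into "open" and "close" below just copies the list quoted above.

```python
import unicodedata

# Code points copied from the quoted list; grouping follows its Line_Break values.
open_like = [0x13258, 0x13259, 0x1325A, 0x13286, 0x13288, 0x13379]
close_like = [0x1325B, 0x1325C, 0x1325D, 0x13282, 0x13287, 0x13289, 0x1337A, 0x1337B]

# General_Category for every listed character.
categories = {
    f"U+{cp:05X}": unicodedata.category(chr(cp))
    for cp in open_like + close_like
}

first_open_name = unicodedata.name(chr(open_like[0]))

print(len(categories), set(categories.values()))
print(first_open_name)
```

So the 14 characters are letters by General_Category while carrying punctuation-like line-breaking classes, which is the combination described in the quoted message.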

Michael Everson.


Re: Word_Break for Hieroglyphs

2017-12-14 Thread Michael Everson via Unicode
On 14 Dec 2017, at 08:09, Richard Wordingham via Unicode <unicode@unicode.org> 
wrote:
> 
> Is there any valid reason for Egyptian hieroglyphs to have
> Word_Break=ALetter rather than Complex_Context?  So far as I am aware,
> hieroglyphs lack visible word breaks in both inscriptions and in modern
> transcriptions.

Why should visibility matter here?

Michael Everson


Re: Question about Karabakh Characters

2017-10-05 Thread Michael Everson via Unicode
It is legitimate to add characters for Armenian dialectology, and if you can 
provide additional evidence of usage in lexicography and (if possible) in other 
literature, we can see if a proposal can be made. 

We may do this offline so as to save the list from too many files. I look 
forward to hearing from you. Nothing will happen, though, without further 
information. 

Michael

> On 5 Oct 2017, at 06:09, via Unicode <unicode@unicode.org> wrote:
> 
> Thank you for your reply.
> I am currently handling technical support to publish in multi-language.
> 
> This was found when we were handling a project on the Karabakh language.
> I was informed that Karabakh has a dictionary containing over 40,000 words 
> that was produced in 2013 which employs the three characters.
> I personally have not seen this dictionary, but it seems that are ones that 
> need these characters.
> So I decided to make a post.
> 
> Kazunari Tsuboi
> 
> -Original Message-
> From: Michael Everson [mailto:ever...@evertype.com] 
> Sent: Wednesday, October 4, 2017 11:31 PM
> To: Tsuboi, Kazunari
> Cc: unicode Unicode Discussion
> Subject: Re: Question about Karabakh Characters
> 
> They are not encoded, but that example is not sufficient. If you’d like to 
> contact me offline we can discuss this further.
> 
> Michael Everson
> 
>> On 4 Oct 2017, at 08:39, via Unicode <unicode@unicode.org> wrote:
>> 
>> Hi there,
>> 
>> The Karabakh language uses Armenian characters, but the following 
>> characters do not have a Unicode assigned. (image1.JPG attached) They 
>> are pronounced “Yi”, “Ini” and “Eh” and used with several 
>> combinations. (Image2.JPG attached)
>> 
>> Is there any reason these characters are not supported by Unicode?
>> I would appreciate any related information.
>> 
>> Thank you!
>> 
>> Kazunari Tsuboi
>> 
> 
> 




Re: Question about Karabakh Characters

2017-10-04 Thread Michael Everson via Unicode
They are not encoded, but that example is not sufficient. If you’d like to 
contact me offline we can discuss this further.

Michael Everson

> On 4 Oct 2017, at 08:39, via Unicode <unicode@unicode.org> wrote:
> 
> Hi there,
>  
> The Karabakh language uses Armenian characters, but the following characters 
> do not have a Unicode assigned. (image1.JPG attached)
> They are pronounced “Yi”, “Ini” and “Eh” and used with several combinations. 
> (Image2.JPG attached)
>  
> Is there any reason these characters are not supported by Unicode?
> I would appreciate any related information.
>  
> Thank you!
>  
> Kazunari Tsuboi
> 




Re: LATIN CAPITAL LETTER SHARP S officially recognized

2017-07-02 Thread Michael Everson via Unicode
Now that it has been added, however, the situation is different. 

> On 2 Jul 2017, at 16:59, Jörg Knappen via Unicode  wrote:
> 
> > Is it possible to design fonts that will render ẞ as SS?
>  
> In fact, that has happened long before the capital letter sharp s was added 
> to Unicode: The T1 encoding (aka Cork encoding) of LaTeX
> does this since 1990. The reason for this was correct hyphenation for German 
> words rendered in all caps.
>  
> --Jörg Knappen




Re: LATIN CAPITAL LETTER SHARP S officially recognized

2017-07-01 Thread Michael Everson via Unicode
On 1 Jul 2017, at 10:34, Werner LEMBERG via Unicode <unicode@unicode.org> wrote:

> It's even more complicated.  Take for example the word `Straße'
> (street), which gets capitalized as `STRASSE’.

Or as STRAẞE. 

> In Germany and Austria this word gets hyphenated as `STRA-SSE' (since 
> hyphenation is not
> influenced by the ß→SS substitution).  However, in Switzerland it gets
> hyphenated as `STRAS-SE', since Swiss German doesn't use ß; instead,
> `ss' gets treated as a normal double consonant.

It would be hyphenated STRA-ẞE in any case.

Michael Everson


Re: LATIN CAPITAL LETTER SHARP S officially recognized

2017-06-30 Thread Michael Everson via Unicode
It would be sensible to case-map ß to ẞ however.

> On 30 Jun 2017, at 16:29, Otto Stolz via Unicode  wrote:
> 
> Hello,
> 
> der Rat für deutsche Rechtschreibung which is responsible for the further 
> development of the official German orthography has finally recognized LATIN 
> CAPITAL LETTER SHARP S as a possible upper-case equivalent for the LATIN SMALL 
> LETTER SHARP S.
> 
> The report announcing the change is dated 2016-12-08, but the official rules 
> have been updated only yesterday, so the change is currently in the news (not 
> very prominently, though).
> 
> The pertinent section from the official 2017 rules reads thusly:
>> § 25 E3
>> Bei Schreibung mit Großbuchstaben schreibt man SS. Daneben ist auch die 
>> Verwendung des Großbuchstabens ẞ möglich.  Beispiel: Straße – STRASSE 
>> – STRAẞE.
> 
> Which translates to:
>> When writing all caps, you spell SS. Alternatively, it is possible to use 
>> the upper-case ẞ. Example: Straße – STRASSE – STRAẞE.
> 
> So, SS remains the primary upper-case equivalent of ß. Yesterday’s note to 
> the press says that the capital ẞ is meant mainly for passports and similar 
> official documents, which have to reproduce personal names faithfully in their 
> respective spelling variants. E. g., Passports used to distinguish proper 
> names such as GROẞMANN and GROSSMAN; up to now, they usually have spelled 
> GROßMANN, with a small ß between the capitals, which renders ugly, in most 
> fonts.
> 
> Best wishes,
>   Otto
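
The case mappings discussed in this message can be checked with a minimal sketch in Python. It assumes a Python 3 interpreter whose built-in string methods apply the default Unicode case mappings; the helper name upper_with_capital_sharp_s is hypothetical and only illustrates the optional ẞ spelling, it is not part of any library.

```python
import unicodedata

SMALL_SHARP_S = "\u00DF"    # ß LATIN SMALL LETTER SHARP S
CAPITAL_SHARP_S = "\u1E9E"  # ẞ LATIN CAPITAL LETTER SHARP S

# Default upper-casing still produces SS, the primary equivalent.
default_upper = "Straße".upper()

# Lower-casing the capital sharp s gives back the small sharp s.
round_trip = ("STRA" + CAPITAL_SHARP_S + "E").lower()


def upper_with_capital_sharp_s(text):
    """Hypothetical helper: upper-case text, spelling ß as ẞ instead of SS."""
    return text.replace(SMALL_SHARP_S, CAPITAL_SHARP_S).upper()


alternative_upper = upper_with_capital_sharp_s("Straße")

print(unicodedata.name(CAPITAL_SHARP_S))
print(default_upper, alternative_upper, round_trip)
```

Replacing ß before calling upper() is what keeps names such as GROẞMANN and GROSSMANN distinct in all-caps text.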




Re: Announcing The Unicode® Standard, Version 10.0

2017-06-21 Thread Michael Everson via Unicode
High 101F6, high 101F6, High FE4F… ♫ 

> On 21 Jun 2017, at 18:01, Ken Whistler via Unicode  
> wrote:
> 
> I wonder IF 9 times suffice,
> But IF more are required,
> I'll tweet ILY, tweet it twice --
> Since spelling's been retired.
> 
> 
> On 6/21/2017 8:37 AM, William_J_G Overington via Unicode wrote:
>> Here is a mnemonic poem, that I wrote on Monday 20 February 2017, now 
>> published as U+1F91F is now officially in The Unicode Standard.
>> 
>> One eff nine one eff
>> Is the code number to say
>> In one symbol
>> A very special message
>> To a loved one far away
>> 
>> In an email
>> Or a message of text
>> 
>> 
> 




Re: Proposal to add standardized variation sequences for chess notation

2017-04-12 Thread Michael Everson via Unicode
On 12 Apr 2017, at 10:16, Kent Karlsson via Unicode <unicode@unicode.org> wrote:

> Unicode has (only) these for Shogi pieces:
> 
> 2616;WHITE SHOGI PIECE;So;0;ON;N;
> 2617;BLACK SHOGI PIECE;So;0;ON;N;
> 26C9;TURNED WHITE SHOGI PIECE;So;0;ON;N;
> 26CA;TURNED BLACK SHOGI PIECE;So;0;ON;N;
> 
> Which seems insufficient…

Yes, we know. One thing at a time, please.

Michael Everson


Re: Xiangqi Game Symbols (was Re: Proposal to add standardized variation sequences for chess notation)

2017-04-12 Thread Michael Everson via Unicode
On 12 Apr 2017, at 10:13, Andrew West via Unicode <unicode@unicode.org> wrote:

> My Xiangqi proposal (http://www.unicode.org/L2/L2016/16255-n4748-xiangqi.pdf) 
> proposed a minimal set of logical game pieces for Xiangqi/Janggi, regardless 
> of shape (circular or octagonal) or design (traditional characters, 
> simplified characters, cursive characters, or pictures) which I consider a 
> font design issue, and explicitly did not seek to encode circled ideographs. 
> My proposal was rejected, and a different proposal by Michael Everson 
> (http://www.unicode.org/L2/L2016/16270-n4766-xiangqi.pdf) to encode all 
> circled ideographs and negative circled ideographs attested in Xiangqi game 
> diagrams was accepted instead.

Not quite. At the WG2 meeting it was proposed, I believe by experts from the 
US, to use circled ideographs to represent xiangqi characters. “In for a penny, 
in for a pound,” I thought, and so said that if we were to do that we’d have to 
encode all the attested circled ideographs, because you can’t have a circled 士 
(58EB) and say that a circled 仕 (4ED5) is a valid glyph variant of it. Then I 
wrote that proposal so that we could have an actionable document with which to 
get characters on the ballot. 

> The accepted proposal for circled ideographs is a glyph encoding model not a 
> character encoding model as for other game symbols (Chess,
> Dominos, Mahjong, Playing Cards, etc.),

This is true. 

> and in my opinion it is a very bad model for several reasons. It makes the 
> interchange of Xiangqi game data and game diagrams problematic; it hinders 
> normal text processing operations on Xiangqi game pieces (for example, to 
> search for a red horse piece you have to search for three different 
> characters);

Yes, it does. It is important to remember that this use of symbols is a text 
usage.

> and in modern computer usage Xiangqi game pieces may not be represented as 
> simple circled ideographs, but may be coloured designs showing characters or 
> images.

Or black and white designs showing for instance an actual elephant rather than 
象 8C61.

> It is also very likely that vendors will want to produce emoji versions of 
> Xiangqi pieces,



> and these could not reasonably be considered to be glyph variants of circled 
> ideographs.

True.

> There has been some negative feedback on the circled ideographs model on the 
> internet, and I believe that Michael has now been convinced that this model 
> is wrong, and should be replaced by a model using logical game pieces.

I was convinced, and my proposals to rectify this were provided as Irish ballot 
comments to PDAM 1.2.

Michael Everson


Re: Coloured Punctuation and Annotation

2017-04-10 Thread Michael Everson via Unicode
Michael isn’t trying to make any coloured fonts.

Michael

> On 10 Apr 2017, at 23:08, Peter Constable via Unicode <unicode@unicode.org> 
> wrote:
> 
> William:
> 
> Michael's scenario doesn't require a special palette index value such as you 
> propose since (i) he could implement a font with alternate palettes to 
> provide different colouring options of his choosing, and (ii) an app can 
> always expose customization options to allow the user to customize any of the 
> palette entries that are being used, even on a character-by-character basis 
> if the app really wanted to.
> 
> Moreover, defining palette index 0xFFFE with a special meaning would be a 
> breaking change that could negatively impact existing implementations. Also, 
> it would create a potential ambiguity about what colour to use: whereas text 
> drawing operations _always_ have a foreground colour specified, there is no 
> convention for specifying a "first decoration colour". 
> 
> For these reasons, this is not going to happen.
> 
> 
> Peter
> 
> 
> -Original Message-
> From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of William_J_G 
> Overington
> Sent: Thursday, April 6, 2017 11:40 AM
> To: ever...@evertype.com; richard.wording...@ntlworld.com; unicode@unicode.org
> Subject: Re: Coloured Punctuation and Annotation
> 
> Michael Everson wrote:
> 
>> No. Here is an example of a font available in two variants. In one variant, 
>> all those grey swirls are fused to the letters, and it can all be printed in 
>> black or one colour ink. 
> 
>> http://cdn.myfonts.net/s/aw/original/255/0/131020.png
>>  
> 
>> There is also a second set of fonts included which separates the swirls from 
>> the letters, and those can be used in typesetting to get the two-colour 
>> effect you see here. That can’t really be done using standard encoding. 
>> You’d probably see IIVVOORRYY in the backing store for that word, with every 
>> other letter being set in the letter font and the swirl font. 
> 
> Richard Wordingham mentioned the following.
> 
>> The third glyph would use 'index' 0x to specify that it be displayed in 
>> the foreground colour.
> 
> If the OpenType specification were augmented so that 'index' 0xFFFE were to 
> specify that the appropriate part of the glyph be displayed in the "first 
> decoration colour", a colour specified in the application program and not in 
> the font; and an application program were augmented so that an end user were 
> able to choose first decoration colour as well as choosing foreground colour, 
> then would that produce the result for which Michael is looking?
> 
> William Overington
> 
> Thursday 6 April 2017
> 
> 
> 
> 




Re: Coloured Punctuation and Annotation

2017-04-10 Thread Michael Everson via Unicode
On 10 Apr 2017, at 17:30, Peter Constable  wrote:

Sorry, Peter. I didn’t realize you weren’t talking about chess fonts. 

Michael


Re: Coloured Punctuation and Annotation

2017-04-10 Thread Michael Everson via Unicode
On 10 Apr 2017, at 19:32, Peter Constable via Unicode  
wrote:
> 
> Michael, your two-tone effect can easily be added into your first font using 
> COLR and CPAL tables, so that the one font can support a monochrome rendering 
> that uses glyphs in which the swirls are fused with the letters, and can also 
> support a poly-chrome rendering in which those glyphs are decomposed into 
> separate glyphs that get layered on top of one another in an order you 
> specify with different RGBA colours.

Thank you, but I don’t have any need to represent chess diagram glyphs in 
colour. I also don’t have any font editing tools which edit COLR and CPAL 
tables in colour. 

Michael


Re: Proposal to add standardized variation sequences for chess notation

2017-04-10 Thread Michael Everson
On 10 Apr 2017, at 11:40, Christoph Päper <christoph.pae...@crissov.de> wrote:

> Even if I were [wrong], nobody has proven that. Everybody is just shouting 
> out their presumptions and prejudices, full of falsehoods.

I have stated that emoji is a different world. It brings with it specific 
implications for burdening vendors in a particular way. I am not having this 
simple, feasible, sensible, and effective proposal derailed by mixing it in 
with colourful emoji fonts. I have stated nothing false. 

>> Emoji has a special relationship with vendors and a particular implementation 
>> environment. 
> 
> That is true. It does not mean that
> 
> a) this environment would not be used to interchange chess diagrams nor 

Emoji is for sending stuff to your friends via various messaging services. 

Chess diagrams have been set in plain type for going on two hundred years. 
That’s what the proposal supports. That’s all it supports. It solves the 
problem of using the UCS to set such diagrams. That’s it. 

> b) parties interested in rendering textual chess diagrams couldn't take 
> advantage of it and bend it to their requirements.

I’ve worked with vendors providing colour emoji glyphs and black-and-white 
emoji lists. Implementation is time-consuming and expensive. I just want 
standardized variation sequences for chess notation so that chess fonts can be 
sorted out. 

>> Vendors via the UTC look at symbol and pictograph and other characters and 
>> decide if they want to give these symbols and pictographs and other 
>> characters the special characteristic which implies generally colour 
>> rendering and implies an obligation to supply input methods for those 
>> characters.
> 
> Yes, Unicode's emojification process is still seriously broken. It's not an 
> argument against reusing the underlying techniques, though.

I said it once already. Now I’m saying it again. Only the UTC assigns the emoji 
category to symbols. I’m not asking, and am not going to ask, the UTC to assign 
the emoji category to chess symbols. 

>> Please stop trying to conflate emoji and chess characters. It is NOT, I 
>> think, a solution which the UTC would agree to. I would oppose it in SC2. 
> 
> That's why I'm trying to convince you (but not just you) in this early stage.

I’m not convinced and I’m not going to be convinced. The emoji VS would not 
solve the problem I have in any case. I need two VS characters, one for light 
squares and one for dark squares and the emoji VS only say “you can make it 
colourful”. Emojification of chess characters is not the correct solution to 
the problem. 

>>> In all existing implementations they are.
>> 
>> That’s not true. 
> 
> Could you please provide a counter example?

I’ve seen chess fonts that have free-standing chesspiece characters as well as 
chess characters on light squares and dark ones. 

> *Emojis* are always square. I didn't say anything about fonts used for chess 
> diagrams here.

Square also does not mean “em-square sized” which is pretty much what you need 
for chess diagrams. 

That’s all. 

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-10 Thread Michael Everson
On 10 Apr 2017, at 09:49, Christoph Päper <christoph.pae...@crissov.de> wrote:

> If Unicode chess diagrams used VS-16 instead of VS-1 and VS-2, users could 
> one day choose a font that fakes marble, wood, glass, steel or just some 
> random color or even animation for pieces and board squares. Since this kind 
> of customization is a common feature in chess applications, I'd expect it to 
> be a welcome feature for textual diagrams as well, even if it's not used 
> (much) in print books. Within the web ecosystem that relies upon CSS for 
> styling, authors and readers could very well and easily use different 
> designs. With non-emoji chess characters, the differences would mostly be 
> limited to glyph outlines.

Well, no. 

> Is that "no reason”?

Yes. If the UTC wants to make chess characters into emoji then they can do 
that. Garth and I are not asking for it. We're asking for interchangeability 
and stability in representing chess diagram data. This is not the same thing as 
what you are talking about, and so it is not relevant to the proposal. 

>> And risking that some consistent monochrome glyphs would be replaced with 
>> colorful pictures  by overly aggressive systems is also something that 
>> should be avoided with the chess symbols.
> 
> VS-15 should be better at that than any other variation selector. 

That just tells the glyph to be the text glyph from the code charts. That 
ignores completely the piece-on-square glyphs the proposal requires. You’re 
talking about something irrelevant to the proposal, Christoph. It’s not 
helpful. 

As Garth says:

>> Look, this proposal is not about "Wouldn't it be a neat idea if we could 
>> make chess diagrams in text?" People had that neat idea before they had the 
>> neat idea for Unicode, or for computers for that matter. This is about 
>> removing a barrier to people using Unicode instead of various 
>> mutually-incompatible dingbat fonts for something they already regularly do.
> 
> I understand that perfectly well. Currently, Unicode chess pieces are only 
> well-suited for figurine notation, not for 2D diagrams. I even agree with the 
> approach to use variation selectors.

Thank you.

> I just think that there would be significant positive synergy from reusing 
> the infrastructure already established for emojis.

I think this is a huge distraction from the simple and robust proposal made. 
Emoji is a different kind of thing. 

>> One nice thing about the existing VS proposal is that it does not require 
>> any heuristics at all. Each square is explicitly marked as light or dark, 
>> with no guessing needed.
> 
> The UI/UX drawback is that authors have to explicitly mark every field — 
> unless you put the heuristics there.

Yes. Authors should have to explicitly mark every field. That gives a 
consistent number of encoded characters for each square, which helps to 
facilitate fallback reading when the ligation is not available. 
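
The explicit marking described above can be sketched in Python. This is only an illustration under stated assumptions: VS1 (U+FE00) is taken to mean "on a light square" and VS2 (U+FE01) "on a dark square", with U+25A1 and U+25A8 as the empty light and dark squares; the function names encode_rank and strip_selectors are hypothetical.

```python
VS1 = "\uFE00"  # VARIATION SELECTOR-1, assumed here: piece on a light square
VS2 = "\uFE01"  # VARIATION SELECTOR-2, assumed here: piece on a dark square
LIGHT_EMPTY = "\u25A1"  # WHITE SQUARE, assumed empty light square
DARK_EMPTY = "\u25A8"   # SQUARE WITH UPPER RIGHT TO LOWER LEFT FILL, assumed empty dark square


def encode_rank(pieces, rank):
    """Encode one rank (1..8) of eight squares; '.' stands for an empty square."""
    encoded = []
    for file_index, piece in enumerate(pieces):
        # a1 (file index 0, rank 1) is a dark square, and colours alternate from there.
        dark = (file_index + rank) % 2 == 1
        if piece == ".":
            base = DARK_EMPTY if dark else LIGHT_EMPTY
        else:
            base = piece
        encoded.append(base + (VS2 if dark else VS1))
    return "".join(encoded)


def strip_selectors(text):
    """Simulate a renderer that ignores the variation selectors (fallback reading)."""
    return text.replace(VS1, "").replace(VS2, "")


# Black's back rank: rook, knight, bishop, queen, king, bishop, knight, rook.
back_rank = "\u265C\u265E\u265D\u265B\u265A\u265D\u265E\u265C"
encoded_rank = encode_rank(back_rank, 8)
empty_first_rank = encode_rank("........", 1)
```

Every square occupies exactly two code points, so a rank is always 16 code points long, and dropping the selectors still leaves eight readable symbols per rank.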

OK, nothing new has been offered on this topic for a long time. Thank you for 
your support of the VS proposal, Christoph. Your supplementary proposals didn’t 
make it better to achieve the goal: to remove the barrier to people using 
Unicode instead of various mutually-incompatible dingbat fonts for something 
they already regularly do.

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-10 Thread Michael Everson
Kent,

I believe the box drawing characters are for drawing boxes and grids on 
computer terminals, which is not the same thing as scoring a line around a set 
of 64 graphic images. I don’t want to get mixed up in using the box-drawing 
characters. The characters which I have chosen work fine and to my mind suit 
the application better. 

I also don’t want to complicate chess fonts by having to have multiple choices 
within a font for bordering. For one thing, single-rule and double-rule 
bordering is by no means the gamut of possibility. Chess fonts do not have to 
be swiss-army knives. 

Thank you for your consideration, but I will stick with using the ⅛-block and 
quadrant characters. 

Michael Everson

> On 9 Apr 2017, at 18:02, Kent Karlsson <kent.karlsso...@telia.com> wrote:
> 
> 
> Den 2017-04-06 01:25, skrev "Michael Everson" <ever...@evertype.com>:
> 
> > Oh, here. This is what I would add. 
> > 
> > 2581 FE00; Chessboard box drawing; # LOWER ONE EIGHTH BLOCK
> > 258F FE00; Chessboard box drawing; # LEFT ONE EIGHTH BLOCK
> > 2594 FE00; Chessboard box drawing; # UPPER ONE EIGHTH BLOCK
> > 2595 FE00; Chessboard box drawing; # RIGHT ONE EIGHTH BLOCK
> > 2596 FE00; Chessboard box drawing; # QUADRANT LOWER LEFT
> > 2597 FE00; Chessboard box drawing; # QUADRANT LOWER RIGHT
> > 2598 FE00; Chessboard box drawing; # QUADRANT UPPER LEFT
> > 259D FE00; Chessboard box drawing; # QUADRANT UPPER RIGHT
> 
> Instead of that, I'd suggest:
> 2500 FE00; Chessboard box drawing (top); # BOX DRAWINGS LIGHT HORIZONTAL 
> (U+2500)
> 2500 FE01; Chessboard box drawing (bottom); # BOX DRAWINGS LIGHT HORIZONTAL 
> (U+2500)
> 2502 FE00; Chessboard box drawing (left); # BOX DRAWINGS LIGHT VERTICAL 
> (U+2502)
> 2502 FE01; Chessboard box drawing (right); # BOX DRAWINGS LIGHT VERTICAL 
> (U+2502)
> 250C FE00; Chessboard box drawing; # BOX DRAWINGS LIGHT DOWN AND RIGHT 
> (U+250C)
> 2510 FE00; Chessboard box drawing; # BOX DRAWINGS LIGHT DOWN AND LEFT (U+2510)
> 2514 FE00; Chessboard box drawing; # BOX DRAWINGS LIGHT UP AND RIGHT (U+2514)
> 2518 FE00; Chessboard box drawing; # BOX DRAWINGS LIGHT UP AND LEFT (U+2518)
> 
> These are more likely to be supported (by (fixed-width) fonts) in fallback 
> than the ones you suggest.
> They are also intended for box drawing (unlike the ones you suggest).
> 
> Perhaps also, since you exemplify also with double borders in your document:
> 2550 FE00; Chessboard box drawing (top); # BOX DRAWINGS DOUBLE HORIZONTAL 
> (U+2550)
> 2550 FE01; Chessboard box drawing (bottom); # BOX DRAWINGS DOUBLE HORIZONTAL 
> (U+2550)
> 2551 FE00; Chessboard box drawing (left); # BOX DRAWINGS DOUBLE VERTICAL 
> (U+2551)
> 2551 FE01; Chessboard box drawing (right); # BOX DRAWINGS DOUBLE VERTICAL 
> (U+2551)
> 2554 FE00; Chessboard box drawing; # BOX DRAWINGS DOUBLE DOWN AND RIGHT 
> (U+2554)
> 2557 FE00; Chessboard box drawing; # BOX DRAWINGS DOUBLE DOWN AND LEFT 
> (U+2557)
> 255A FE00; Chessboard box drawing; # BOX DRAWINGS DOUBLE UP AND RIGHT (U+255A)
> 255D FE00; Chessboard box drawing; # BOX DRAWINGS DOUBLE UP AND LEFT (U+255D)
> 
> /Kent K
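
The entries quoted in this message follow a "base selector; description; # name" layout. As a minimal sketch, the Python below parses the eight ⅛-block and quadrant entries and checks that each comment matches the character name reported by the standard unicodedata module. The sequences are proposals from this thread only; the function name parse_entry is hypothetical.

```python
import unicodedata

# The eight entries as quoted in the thread (proposed, not taken from the standard).
ENTRIES = """\
2581 FE00; Chessboard box drawing; # LOWER ONE EIGHTH BLOCK
258F FE00; Chessboard box drawing; # LEFT ONE EIGHTH BLOCK
2594 FE00; Chessboard box drawing; # UPPER ONE EIGHTH BLOCK
2595 FE00; Chessboard box drawing; # RIGHT ONE EIGHTH BLOCK
2596 FE00; Chessboard box drawing; # QUADRANT LOWER LEFT
2597 FE00; Chessboard box drawing; # QUADRANT LOWER RIGHT
2598 FE00; Chessboard box drawing; # QUADRANT UPPER LEFT
259D FE00; Chessboard box drawing; # QUADRANT UPPER RIGHT
"""


def parse_entry(line):
    """Split one entry into (base code point, selector, description, name)."""
    fields, _, comment = line.partition("#")
    sequence, description = [field.strip() for field in fields.split(";")[:2]]
    base, selector = (int(code_point, 16) for code_point in sequence.split())
    return base, selector, description, comment.strip()


parsed = [parse_entry(line) for line in ENTRIES.splitlines()]

# Entries whose comment disagrees with the name known to unicodedata.
mismatches = [entry for entry in parsed if unicodedata.name(chr(entry[0])) != entry[3]]

for base, selector, description, name in parsed:
    print(f"U+{base:04X} U+{selector:04X}  {chr(base)}  {name}")
```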




Re: Proposal to add standardized variation sequences for chess notation

2017-04-08 Thread Michael Everson
On 8 Apr 2017, at 22:23, Asmus Freytag <asm...@ix.netcom.com> wrote:

> Time for Sarasvati to pull the plug on this thread?

Useful input has been gratefully received. I thank those who gave it.

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-08 Thread Michael Everson
> On 8 Apr 2017, at 15:14, Philippe Verdy <verd...@wanadoo.fr> wrote:
> 
> 2017-04-08 15:59 GMT+02:00 Michael Everson <ever...@evertype.com>:
> >> We’re not proposing to “implement a game”.
> >
> > You were yourself speaking about applications, me too, not just a "game".
> 
> No, I wasn’t.
> 
> I can quote your own message just posted 3 hours ago? YOU REALLY USED the 
> term "game" and wanted developers to use fonts for them.

Please learn to read. 

> This is definitely not what most chess game developers do and have done since 
> long, because fonts are definitely not easily integrable and give 
> unpredictable results. They would not accept the kind of fallbacks you 
> document for encoding in plain text.
> 
> QUOTING YOUR OWN MESSAGE BELOW.
> 
> 2017-04-08 13:10 GMT+02:00 Michael Everson <ever...@evertype.com>:
> Developers can already use the encoded chess characters in game apps if they 
> want.
> 
> If we have a set of standardized variation sequences for chess notation, then 
> if game developers want to use them, who is to complain?

This means, I would not complain, because that isn’t the point of the proposal, 
and if they use text and fonts or if they use graphics is of no consequence. 
They can do whatever they need for their purposes. 

> But that is not the point of this proposal,

See? Gaming apps is not the point of the proposal. 

> which is to enable people working with chess notation to be able to use the 
> UCS (which they aren’t doing). An app interface has not the same plain-text 
> requirement that people working with chess data do.
> 
> (They ARE using fonts, which shows they want to do this in text. They are NOT 
> using UCS characters, and they do NOT have a coherent model amongst any of 
> their hacks.) 
> 




Re: Proposal to add standardized variation sequences for chess notation

2017-04-08 Thread Michael Everson
On 8 Apr 2017, at 14:50, Philippe Verdy <verd...@wanadoo.fr> wrote:

>>> May be they use fonts,
>> 
>> There is no maybe about it.
> 
> There REALLY IS a "maybe", because this is not required at all, and most 
> chess applications do not use any "font" (most of them display bitmap icons, 
> or custom 2D/3D graphics) 

The proposal is not based on the practice of chess game applications. The 
proposal deals with the problem of typesetting chess diagrams. This is a 
publishing function. 

>>> And SVG glyphs are easier to integrate in derived documents.
>> 
>> Nonsense.
> 
> Non sense reply !!! Custom fonts

What? Fonts. You know. Fonts. Truetype with OpenType tables for glyph 
substitution. This is nothing special. This is bog-standard. 

> are hard to integrate as they depend on renderers (which most applications 
> don't want to support directly, they are part of a browser or OS). And 
> OpenType fonts are much less flexible for what applications want to do. SVG 
> allows much easier variations and effects. There are tons of tools or 
> stylesheets for that, which will not work on glyphs in OpenType fonts.

This has nothing to do with the proposal. 

>> We’re not proposing to “implement a game”.
> 
> You were yourself speaking about applications, me too, not just a "game".

No, I wasn’t. 

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-08 Thread Michael Everson
On 8 Apr 2017, at 13:01, Philippe Verdy <verd...@wanadoo.fr> wrote:

> (They ARE using fonts, which shows they want to do this in text. They are NOT 
> using UCS characters, and they do NOT have a coherent model amongst any of 
> their hacks.)
> 
> May be they use fonts,

There is no maybe about it. 

> but is OpenType the best tool for applications to create indexed collections 
> of glyphs?

Standardized variation sequences for specific glyph presentation is a part of 
our standard. I have implemented this for the purposes described and it works. 
I implemented it with William’s font and it works. William implemented it in his 
font on his own and it works. 

What does this have to do with “indexed collections of glyphs”?

> SVG fonts are much easier to develop and change as they want.

Red herring.

> And SVG glyphs are easier to integrate in derived documents.

Nonsense. 

> For implementing a simple game, they don't need large collections. They can 
> more easily integrate photographic features, or 3D features. OpenType 
> implementations suffer from a huge resistance for newer features many 
> features don't work if at the same time the Opentype renderer is not updated 
> on the supporting platform (OS or web browser)

We’re not proposing to “implement a game”. 

> OK there are some new SVG features as well, but they are much more tested 
> than those in OpenType and much better documented, and don't suffer from 
> various proprietary extensions (such as font hinting which is definitely not 
> "Open" and extremely poorly documented with many internal tricks made to 
> restrict their use on specific OSes, plus stupid limitations/bugs in the way 
> they were encoded, with no vision at all for their evolution or interaction 
> with other features)...

This has nothing to do with our proposal, or with the current practice of the 
chess community.

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-08 Thread Michael Everson
On 8 Apr 2017, at 02:02, Asmus Freytag <asm...@ix.netcom.com> wrote:

>> This isn’t about game play.
> 
> Why rule this out? Once you have a plain text solution, you'll enable any 
> plain text platform.
> 
> Seems almost churlish to want to limit what you can do...in what would be 
> after the fact.

Developers can already use the encoded chess characters in game apps if they 
want. 

If we have a set of standardized variation sequences for chess notation, then 
if game developers want to use them, who is to complain? But that is not the 
point of this proposal, which is to enable people working with chess notation 
to be able to use the UCS (which they aren’t doing). An app interface has not 
the same plain-text requirement that people working with chess data do. 

(They ARE using fonts, which shows they want to do this in text. They are NOT 
using UCS characters, and they do NOT have a coherent model amongst any of 
their hacks.)

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-07 Thread Michael Everson
On 8 Apr 2017, at 00:28, Rebecca T <637...@gmail.com> wrote:

> > while evidently there are users who need to send BROCCOLI to one another,
> > nobody but nobody needs to send an 8 x 8 chessboard matrix in a tweet. Get
> > it?
> 
> I simply must disagree; sending a textual chessboard sounds awesome! A 
> twitter bot that plays chess with you and shows you a graphical 
> representation of the board would be great!

This isn’t about game play.

Even if you get the UTC to bless chess pieces as emoji (why?) that would not 
affect this proposal, as other VS characters are used for emoji. 

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-07 Thread Michael Everson
tors in order to ensure that rows in rectangular grid patterns 
made with ▨︁ and □︀ look correct. The examples in the proposal were made with 
fonts using the base characters and VS sequences proposed.) 

> (and there will still be problems with △▽ because they will actually need to 
> cover more than their rectangular cells with two corners extending outside 
> of it with additional kerning, not suitable for mathematics).

△ and ▽ have nothing to do with this proposal, because those shapes are not 
used in Chess. Actually, I don’t think they have anything to do with shogi, 
either. None of the boards on the Wikipedia pages about shogi (in English, 
French, or Japanese) have any triangle-shaped board cells on them. 

What’s the French for “red herring”?

> And the problem with such grid patterns is more generic than just chess 
> diagrams.

All symbol systems have potential similarities to one another. 

> We should be able to represent directly at least several well known patterns 
> of cells/tiles (optionally colored when this matters), and then be able to 
> combine them with any character/cluster inside them (for example for classic 
> crosswords, Scrabble, triominos and similar games).

I don’t need to do that. I need a simple way to use the UCS to do what people 
have been doing with chess data for 

> We need a way to represent grids made with square/rectangular cells, or 
> triangular/hexagonal cells (for triangular and hexagonal cells we need 
> additional half-cells to properly align rows at least at start or rows, and 
> hexagonal cells will partly extend over the previous and next row

I don’t even know if all of that’s feasible in fonts. 

> So I would prefer a proposal to:
> * add specific symbol characters for these common patterns of cells 
> (rectangular/square, triangular, hexagonal), plus half-cells for use at start 
> and end of rows (if rows are not aligned vertically but instead create triangular 
> layouts),

You can write proposals for anything you want to. 

> * optionally followed by some variant selectors for mapping some semantic 
> colors on them (semantic color means "light" and "dark" may be "white" and 
> "black, or "ivory" and "wood", or "yellow" and "red", or "empty/transparent" 
> vs. "hatched" with monochromatic rendering where colors are replaced by fill 
> patterns such as ///, or dots with some density; we should have about 8 
> semantic colors, representable with actual colors or grey or fill patterns). 
> The common "black square" and "white square" (the white version would be the 
> default semantic color and would not need any additional variant).

“The more you overthink the plumbing, the easier it is to stop up the drain.” — 
Cmdr Montgomery Scott. 

> * and then use ZWJ to combine them with letters/symbols to be centered within 
> them (possibly some extended clusters such as letters+combining subscript 
> digits in Scrabble)

Scrabble. My word. 

No. The present proposal meets a particular need: To enable the UCS to be able 
to set chess diagrams. 

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-07 Thread Michael Everson
On 7 Apr 2017, at 23:17, Christoph Päper <christoph.pae...@crissov.de> wrote:

>> The only connection this has with emoji is that it uses the variation 
>> selector system.
> 
> As I've shown, that's not the *only* connection.

Christoph, YOU ARE WRONG.

Emoji has a special relationship with vendors and a particular implementation 
environment. 

Vendors via the UTC look at symbol and pictograph and other characters and 
decide if they want to give these symbols and pictographs and other characters 
the special characteristic which implies generally colour rendering and implies 
an obligation to supply input methods for those characters. That is expensive, 
and while evidently there are users who need to send BROCCOLI to one another, 
nobody but nobody needs to send an 8 x 8 chessboard matrix in a tweet. Get it?

Emoji has nothing to do with the proposal to support standardized variation 
sequences for use with chess characters to provide support for their usage in 
chessboard diagrams. 

Please stop trying to conflate emoji and chess characters. It is NOT, I think, 
a solution which the UTC would agree to. I would oppose it in SC2. 

>> None of that is necessary, or relevant, to chess diagrams.
> 
> Chess diagrams (unlike chess notation) are often rendered as graphics, not 
> text.

Because there is no robust text representation of chess diagrams. This proposal 
shows how very easy it is to support that behaviour, in parseable and 
interchangeable text, so that unparseable graphics don’t have to be used. 

> Board and glyphs may have fancy designs and colors, e.g. wooden fields.

Two centuries of standard chess diagramming practice is all that’s needed to 
support. That’s text. That’s data. That’s what’s important. You want a pretty 
chess program, you can go download one. That’s not the same as this. 

>> I don't believe emoji are even necessarily fixed-width.
> 
> In all existing implementations they are.

That’s not true. 

> They are even always square. I'm not sure whether their em square always 
> matches the sinographic ("ideographic”) square, but it seems as if it usually 
> does.

Not always, and that’s enough chaos. There is no standardization currently in 
chess fonts. One of them splits queens and rooks into two separate characters. 
This proposal solves that.

> Without the need for ZWJ sequences, Opentype fonts can employ their 
> Contextual Alternates `calt` feature to select the correct background color 
> in diagram notation: In a sequence of up to eight chess pieces without an 
> empty square with explicit color, an initial U+2656-FE0F White Rook, 
> U+2654-FE0F White King, U+265B-FE0F Black Queen or U+265F-FE0F Black Pawn 
> would default to a black background, U+2659-FE0F White Pawn, U+2655 White 
> Queen, U+265A-FE0F Black King or U+265C-FE0F Black Rook to a white 
> background. Other than that, each character uses the alternate glyph with 
> opposing background color from its preceding (left-side) glyph. The empty 
> squares work as explicit anchors.

Well that’s a lot of effort to go to. And there’s no legible fallback if the 
“calt” features can’t be invoked. This is a bad solution. Thank you for 
suggesting it. 

> If you want, I could write and post the code in Adobe OT feature file 
> notation required for `calt` to demonstrate that this would yield results as 
> expected for all full-size 8*8 diagrams and even for many detail diagrams of 
> a section of the board.

And when “calt” substitutions can’t be displayed? What kind of fallback do you 
have?

Michael Everson


Re: Coloured Punctuation and Annotation

2017-04-07 Thread Michael Everson
On 7 Apr 2017, at 11:01, Richard Wordingham <richard.wording...@ntlworld.com> 
wrote:
> 
> Of course, if U+25A1 WHITE SQUARE is the outline of a square, it then seems 
> odd that a valid presentation form should be just a spacing glyph, as seems 
> to be preferred for chess boards!  I suppose this could be considered an edge 
> case :-)

Using SP or NBSP would not be a good idea. Spaces separate things and have 
complex properties. The light and dark squares on a chessboard are squares, not 
one square and one nirvāṇic emptiness. Yes, the VS applied to WHITE SQUARE 
makes it em-square sized and removes the outline, but that’s a specific glyph 
for a specific purpose. 

Michael Everson


Re: PETSCII mapping?

2017-04-06 Thread Michael Everson
On 6 Apr 2017, at 17:36, Rebecca Bettencourt <beckie...@gmail.com> wrote:
> 
> At some point this should be taken off the main list since discussion will 
> get very detailed very quickly.
> 
> I agree. How should we get all the interested parties together?

Everybody interested, raise your hand…

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-06 Thread Michael Everson
On 6 Apr 2017, at 17:24, Kent Karlsson <kent.karlsso...@telia.com> wrote:

> One in one single font (according to your current proposal), one can only 
> have EITHER terminal emulator version, OR chess border version. Not both. 
> Using variant selectors for the chess border variants allow for both glyph 
> variants. Maybe it does not make much difference in a proportional font. But 
> for a "mono-width" font the terminal emulator versions for these border 
> characters would be "narrow", but the chess border versions should be 
> "fullwidth"/"square" (compare CJK in terminals; double the width of, e.g., 
> Latin characters).

Hm. Time for me to put VS support into Everson Mono, then, and see what 
happens. But I think you’re probably right, though. 

Thanks for the help.

Michael Everson


Re: Standaridized variation sequences for the Desert alphabet?

2017-04-06 Thread Michael Everson
On 6 Apr 2017, at 16:05, Mark Davis ☕️ <m...@macchiato.com> wrote:

>> I just get frustrated when everyone including the veterans seems to forget 
>> every bit of precedent that we have for the useful encoding of characters.
> 
> ​Nobody's forgetting anything. ​Simply because people disagree with you 
> doesn't mean they are forgetful or stupid. One could just as well respond 
> that you are forgetting that Unicode is not a glyph standard. Merely because 
> a character have multiple shapes is not grounds for disunifying it.

The ignoring of reasonable precedent does not make the UTC seem reasonable. In 
terms of Deseret, the suggestion that characters Ѕ/Ћ/Ѓ/Љ with a stroke derived 
from І are glyph variants of one another simply makes no sense at all. We have 
honed over many years our understanding of writing systems, and saying “Oh, 
Љ-with-stroke and Ѓ-with stroke are variant shapes of the same thing”… Anyone 
can see that this is not true. 

The vexing thing is that one can never rely on consistency in the UTC’s 
approaches to any proposal. I have discussed this with other successful and 
prolific proposal writers. It’s always a coin-toss as to how a proposal will be 
viewed. 

The recent instance of adding attested capital letters for ʂ and ʐ is a perfect 
example. We have seen before some desire to see evidence for casing pairs 
(though often it has not been sought.) We have never before seen evidence for 
casing pairs to be thrown out. Case, of course, is a function of the Latin 
script, just as it is of Greek and Cyrillic and Armenian and Cherokee and both 
Georgian scripts and others. The UTC’s refusal to encode attested capitals for 
ʂ and ʐ simply makes no sense. 

Your statement "Merely because a character have multiple shapes is not grounds 
for disunifying it” suggests an underlying view that "everything is already 
encoded and additions are disunifications”. I do not subscribe to this view. 

Michael Everson


Re: Standaridized variation sequences for the Desert alphabet?

2017-04-06 Thread Michael Everson
r oi in earlier texts
>>>> 1 DESERET CAPITAL LETTER LONG AH WITH STROKE
>>>>* used for oi in later texts
>>>> 1 DESERET CAPITAL LETTER SHORT OO WITH STROKE
>>>>* used for ew in later texts
>>> 
>>> Currently, it has this:
>>> 
>>> 10426 Ц DESERET CAPITAL LETTER OI
>>> 
>>> 10427 Ч DESERET CAPITAL LETTER EW
>> 
>> You are being deliberately obtuse. Note that I stated clearly “officially 
>> named ‘ew/oi’ in the code chart”.
> 
> Well, if you think I'm deliberately obtuse, then I'd have to say that I think 
> you're (deliberately?) obscure.

I was making a point; sorry if you didn’t catch it. The names as given in that 
list above are the kinds of descriptions of the letters that we often give. We 
have LATIN LETTER THORN WITH STROKE. We might have named it LATIN LETTER THAT. 

> You repeat hypothetical, non-existing names

They’re descriptive of the letter, not of the diphthong.

> such as "DESERET CAPITAL LETTER LONG OO WITH STROKE" over and over, using 
> capitals to make then look like the actual names, and bury the actual names 
> (such as "DESERET CAPITAL LETTER OI") by shortening and lowercasing them.

Well, I lowercased them because lowercase is used in informative notes. Anyway, 
sorry if my rhetoric failed to hit the mark. :-) 

> But even if that weren't the case, we would still want to treat it as one and 
> the same character, with a single code point. It would still be hopelessly 
> impractical for Germans to use two different characters, when they only can 
> decide which character to type once they have seen the actual character in 
> the font they type, and have to potentially change the character if they 
> change the font.

But even if we did encode an ſʒ letter (similar to the T-Z ligature-letter Ꜩ ꜩ 
we did encode) it would be encoded for a special purpose, and wouldn’t be 
intended to affect standard German. Look, we can write schön and we can write 
ſchoͤn and nobody’s affected by the latter. 

> And while we currently have no evidence that Deseret had developed a 
> typographic tradition where some type styles would use one set of ligatures, 
> and other styles would use another set, it wouldn't be possible to reject 
> this possibility without actually trying to find evidence one way or another.

There was type during the heyday of Deseret use, and evidence for several sorts 
but no typographic “tradition” really. That’s happened latterly. 

>> Your argument seemed to be based solely on the use of the letters for the 
>> sounds, ignoring the historical derivation and the facts of the spelling 
>> reform in Deseret.
> 
> The spelling reform is fine. What is important is what happened after the 
> spelling reform. Were the 1855 variants replaced by the 1859 variants? Was it 
> two different traditions, separated in some way or other? Or was it in effect 
> more like a mixture of both?
> (or maybe we don't know, or it's a little of everything?)

Where they were replaced, it helps to identify the provenance of a text. There 
are also some texts where there’s a bit of a mix. In fact adding some letters 
to the standard for Deseret will improve users’ ability to represent the 
historical texts. For those relatively few people who are creating new texts 
now, they will be able to choose what letters they need. Some, like John, don’t 
use the diphthong letters at all. In fact most modern readers read John’s 
texts, so few would probably worry about the other letters. 

> Examining these questions and bringing the available data to light and 
> clarifying the limits of our data and our understanding is very important. 
> Only in this way can we make decisions that will hopefully be valid for the 
> rest of the existence of Unicode (which might be quite a few decades at 
> least), or decisions that at a minimum might be evaluated as "well, they 
> didn't know better then", rather than as "they definitely should have known 
> better, even then”.

Really, my practice when approaching this is the same as it has been for 
additions to Latin or Greek or Cyrillic. I’m quite consistent. :-) 

>> A proposal will be forthcoming. I want to thank several people who have 
>> written to me privately supporting my position with regard to this topic on 
>> this list. I can only say that supporting me in public is more useful than 
>> supporting me in private.
> 
> I'm looking forward to your proposal. I hope it clearly indicates why (you 
> think) there's no danger of inconveniencing modern practitioners.

To be honest, we didn’t have to say “r rotunda will not affect modern users of 
the Latin script”, now, did we? :-)

Today I received Ken’s book on the Deseret-script English-Hopi vocabulary. This 
will help us move forward with a proposal.

Best,
Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-06 Thread Michael Everson
On 6 Apr 2017, at 13:19, Christoph Päper <christoph.pae...@crissov.de> wrote:
> 
> Although Michael Everson readily dismisses any connection to emojis, e.g. 
> L2/16-021 or L2/16-087+088, and hence the Emoji and Emoji_Presentation 
> character properties as well as sequences with variation selectors 15 and 16 
> (U+FE0E/F), normal emoji design actually matches "diagram" notation quite 
> nicely in that all emoji glyphs are rendered within an (ideographic / em) 
> square. 

No, no. Emojis are something else very specific and very expensive with 
implications for vendors and having to do with colour. Look at zero:

U+0030 - 0 - DIGIT ZERO
U+0030 FE00 - 0︀ - short diagonal stroke form
U+0030 FE0E - 0︎ - text style
U+0030 FE0F - 0️ - emoji style

Emoji is something else. Emoji is a fine thing, but it’s not chessboard 
typesetting. 
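
As an illustration of the four forms of zero listed above, here is a minimal Python sketch that builds each sequence and prints its code points. The dictionary labels and the helper name `describe` are made up for this example; only the code points come from the list above. Whether a given font actually renders the sequences differently is up to the font.

```python
import unicodedata

ZERO = "\u0030"  # DIGIT ZERO

# The four forms listed above: the bare digit, then digit + variation selector.
zero_forms = {
    "plain": ZERO,
    "short diagonal stroke form": ZERO + "\uFE00",  # VS1
    "text style": ZERO + "\uFE0E",                  # VS15
    "emoji style": ZERO + "\uFE0F",                 # VS16
}


def describe(text):
    """Return 'U+XXXX NAME' for each code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for label, form in zero_forms.items():
    print(f"{label}: {describe(form)}")
```

The base character is the same in every case; only the selector that follows it changes.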

Michael Everson


Re: Eszett variation sequence

2017-04-06 Thread Michael Everson
Can you give an example of any font which has two glyphs in it for ß?

I mean, I was in Berlin and I took this picture:

http://evertype.com/standards/unicode-list/seydlitzstr.jpg

Do you think we should encode a Latin straight y (like the Cyrillic one) so we 
can write Seүdlitzstraſʒe?

> On 6 Apr 2017, at 13:37, Christoph Päper  wrote:
> 
> U+00DF Latin Letter Sharp S ⟨ß⟩ has at least two rather different visual 
> styles resulting from a ligature of either long and round lowercase S, ⟨ſs⟩, 
> or of long S and normal or tailed lowercase Z, ⟨ſz⟩ or ⟨ſʒ⟩. Most modern 
> typeface designs follow the first style and sometimes the right-hand side is 
> quite distinct from the shape of the round S in the same font. In some cases 
> it makes sense to distinguish the glyphic origins, because, by orthographic 
> or graphotactic means, for instance, an _sz_ digraph is appropriate in 
> different places than an _ss_ repeated letter.
> 
> Would it make sense to propose standardized variation sequences for these 
> styles or should this be left to font features like `cv##` or `calt` in 
> Opentype?





Re: Proposal to add standardized variation sequences for chess notation

2017-04-06 Thread Michael Everson
On 6 Apr 2017, at 11:00, Christoph Päper <christoph.pae...@crissov.de> wrote:
> 
> Michael Everson <ever...@evertype.com>:
>> 
>> Standardized variation sequences are the best way to achieve this simply and 
>> without needless duplication. :-)
> 
> I still agree with this assertion.

So do I. ;-)

>> Yes but you still want it to be reasonably legible when the OpenType 
>> ligatures fail.
> 
> This is where I don't follow.

Why wouldn’t you want it to be reasonably legible when the OpenType ligatures 
can’t be displayed?

▗▖
▕□︀▨︁□︀▨︁□︀▨︁♞︀▨︁▏
▕▨︁□︀▨︁□︀▨︁□︀▨︁□︀▏
▕□︀▨︁♔︀▨︁□︀▨︁□︀▨︁▏
▕▨︁□︀▨︁□︀▨︁♘︀▨︁□︀▏
▕□︀▨︁□︀▨︁♚︀▨︁□︀▨︁▏
▕▨︁□︀▨︁□︀▨︁□︀▨︁□︀▏
▕□︀▨︁□︀♙︁♛︀▨︁□︀▨︁▏
▕▨︁□︀♕︁□︀▨︁♖︀▨︁□︀▏
▝▘
is far better than this:
▗▖
▕□︀□︀□︀□︀□︀□︀♞︀□︀▏
▕□︀□︀□︀□︀□︀□︀□︀□︀▏
▕□︀□︀♔︀□︀□︀□︀□︀□︀▏
▕□︀□︀□︀□︀□︀♘︀□︀□︀▏
▕□︀□︀□︀□︀♚︀□︀□︀□︀▏
▕□︀□︀□︀□︀□︀□︀□︀□︀▏
▕□︀□︀□︀♙︁♛︀□︀□︀□︀▏<< Is it the pawn or the queen that’s on the black square?
▕□︀□︀♕︁□︀□︀♖︀□︀□︀▏
▝▘

> It *looks* far better in a multi-line plain text environment, but that's a 
> glyphic/typographic/stylistic argument.

It’s an argument for legibility. 

> The semantics conveyed are redundantly encoded this way, so I wouldn't say it 
> was far better. This alternating pattern is far more redundant than, say, 
> pairs of opening and closing characters (brackets, quotation marks).

It’s not redundant to the reader. The reader of the second one has to remember 
that the dark square is the lower left, and then count in order to know the 
colour of any given square. The reader of the first one doesn’t have to do 
this, because we have both ▨︁ and □︀, two encoded characters, and we use them 
for convenience. 
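
To make the counting explicit: this is roughly the arithmetic a reader of the second diagram has to do in their head for every square. It is a minimal sketch in Python; the function name `square_colour` is invented for the example, and the only chess fact it relies on is that a1 is a dark square.

```python
def square_colour(square):
    """Return 'dark' or 'light' for a square in algebraic notation, e.g. 'a1'."""
    file_number = ord(square[0]) - ord("a") + 1  # a=1 ... h=8
    rank_number = int(square[1])                 # 1 ... 8
    # a1 (1 + 1 = 2) is dark, so an even sum means a dark square.
    return "dark" if (file_number + rank_number) % 2 == 0 else "light"


for name in ("a1", "h1", "d1", "a8"):
    print(name, square_colour(name))
```

With both ▨︁ and □︀ in the text, none of this is needed; the colour is simply there to be read.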

> Aside, good fallback isn't something the UTC seems to be concerned with 
> lately, 

Inconsistency on the part of the UTC is not my concern. I have to 

> see emoji subregion flags that are all represented by Waving Black Flag in 
> legacy implementations (possibly followed by TOFU).

Yes, well, that’s an example of a decision that didn’t have good oversight or 
feedback, perhaps. I do know that falling back to a black flag rather than to 
the Union flag for Wales, England, and Scotland doesn’t seem very sensible. 
Leaving out the de-facto flag of Northern Ireland wasn’t very wise either, 
though nobody asked the UK or Irish representatives of SC2 their opinion about 
it. 

>> See? To parse this one you have to remember which of the white squares are 
>> the alternating black ones.
> 
> No, you only have to remember that A1, i.e. the lower left square initially 
> occupied by a white rook, is black.

You have to remember that, and then you have to count every other square in 
whatever direction to know what colour a given square is. That’s not very 
user-friendly. And it’s easy to be user friendly. Just use both ▨︁ and □︀. 

> For legal moves, the color pattern hardly matters, unless - regarding pawns - 
> it was common practice to render the board turned, i.e. with the white player 
> not at the bottom, but at the top (or left or right) side, and without 
> alphabetic column and numeric row labels. 

For legal moves, no. But this is text. The table is meant to be read. Since it 
is, good fallback is better than bad fallback. 

>> The colour of the matrix is NOT redundant for a human reader.
> 
> That's what this proposal is all about. It's a good and sound proposal, 
> except for the empty square.

Do you mean “except for the light and dark squares without a piece on them” or 
“except for the light square without a piece on it”? The convention is to have 
two alternating shades on the squares and there’s no advantage to the human 
reader to quash this distinction. 

What is your specific counter-proposal?

Michael Everson


Re: PETSCII mapping?

2017-04-06 Thread Michael Everson
On 6 Apr 2017, at 04:32, Rebecca Bettencourt <beckie...@gmail.com> wrote:

> We do have to provide Unicode with fonts, I believe. We can use an existing 
> C64 font, such as Pet Me. Or, we can create a new font with vectorized 
> versions of the characters.

I’ll help with that; we should harmonize with other characters in the standard.

At some point this should be taken off the main list since discussion will get 
very detailed very quickly.

Michael Everson


Re: Coloured Punctuation and Annotation

2017-04-06 Thread Michael Everson

> On 6 Apr 2017, at 05:41, Richard Wordingham <richard.wording...@ntlworld.com> 
> wrote:
> 
> On Thu, 6 Apr 2017 01:11:09 +0100
> Michael Everson <ever...@evertype.com> wrote:
> 
>> On 5 Apr 2017, at 22:48, Richard Wordingham
>> <richard.wording...@ntlworld.com> wrote:
>> 
>>> I tried to read it from UTS#51 ‘Unicode Emoji', which is not part of TUS, 
>>> but I couldn't deduce that a font that enables U+10B99 PSALTER PAHLAVI 
>>> SECTION MARK to have exactly two (as opposed to none or four) red dots is 
>>> in breach of the guidelines therein. 
>> 
>> Kindly explain how ANY font could do this.
> 
> Is this a trick question?

No. Here is an example of a font available in two variants. In one variant, all 
those grey swirls are fused to the letters, and it can all be printed in black 
or one colour ink. http://cdn.myfonts.net/s/aw/original/255/0/131020.png 

There is also a second set of fonts included which separates the swirls from 
the letters, and those can be used in typesetting to get the two-colour effect 
you see here. That can’t really be done using standard encoding. You’d probably 
see IIVVOORRYY in the backing store for that word, with every other letter 
being set in the letter font and the swirl font. 
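
A small Python sketch of what that backing store would look like. The function names and the font labels are hypothetical, and which of the two fonts comes first in each pair is an assumption made only for the illustration.

```python
def layered_backing_store(word):
    """Double every letter: one copy per font layer."""
    return "".join(ch * 2 for ch in word)


def font_runs(backing, fonts=("letter font", "swirl font")):
    """Pair each character with the font it would be set in, alternating."""
    return [(ch, fonts[i % 2]) for i, ch in enumerate(backing)]


backing = layered_backing_store("IVORY")
print(backing)
for ch, font in font_runs(backing):
    print(ch, font)
```

The doubled string is what a search or a screen reader would see, which is why this cannot really be called standard encoding.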

Emoji-style colour fonts use other mechanisms for colour.

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-06 Thread Michael Everson
On 6 Apr 2017, at 04:24, Martin J. Dürst <due...@it.aoyama.ac.jp> wrote:

>> http://evertype.com/standards/unicode-list/looking-glass-yellow-blue.png
> 
> [OT]
> It looks neat. But I noticed three very small gaps in each of the top and 
> bottom borders.

I have not done anything to optimize display in these fonts. They were 
proof-of-concept fonts for the sequences. It’s easy to fix those… just drag the 
glyph and make it a bit longer. One does the same thing in Arabic fonts. 

> Also, it's probably not the best choice of colors, because my eyes tend to 
> associate the yellow figures with white, and the blue ones with black, but 
> thinking it through makes it clear that it's the other way round.

I just picked the process colours cyan and yellow, but it was Richard who had 
specified the colours: “Now, what happens to the two scheme if rendered with 
yellow text ('foreground') on a blue background?" 

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
On 6 Apr 2017, at 02:05, Kent Karlsson <kent.karlsso...@telia.com> wrote:

>> Do generic font makers intend to support both graphic terminal emulation and 
>> chess?
> 
> I don't know. But it should not be impossible to do so.

And you think the proposal as it stands leads to that?

>> Should chess font makers be burdened with graphic terminal emulation glyphs 
>> they know nothing about?
> 
> If it is really a chess font, they can just use the glyphs for the chess 
> variety also as the "plain" (terminal emulator variety), and it would not 
> matter (as long as no-one insist on using it for terminal emulation).

Ha, so you’re saying it’s mostly for things like Everson Mono that it matters… 
;-)

> All that is needed for that is a manoeuvre to copy a few glyphs within the 
> font (when creating the font). I guess that is not very hard…

It is not.

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
On 6 Apr 2017, at 01:54, Kent Karlsson  wrote:

>>> - some bidi fix [preferably making the box/border drawing characters bidi 
>>> "L", if possible; otherwise a caveat that if there is an expectation to 
>>> paste in such a board into an RTL document, bidi controls need be used to 
>>> LTR the board]).
>> 
>> I don’t know if there is a problem here and am not able to offer a solution 
>> if there is. I don’t object to a solution, if there is a problem.
> 
> I would think

Come on. This is a serious proposal. I’m glad you support it, but if you are 
going to raise an issue like this, “I would think and guess about a problem” 
isn’t the same as “I have tried and here’s an actual problem”. 

Roozbeh, there’s an issue that might benefit from your expertise. Can you look 
into it? Discussion needn’t occur here, but offline with Kent and me, if you 
prefer. 

> that anyone pasting a chess board (à la your proposal) to an RTL context will 
> see that something went amiss,

Will they? Why?

> and also know enough about bidi to set the bidi context to LTR for the chess 
> board(s),

RTL users understand the problems of cutting and pasting LTR text and symbols, 
certainly. LTR users don’t. 

> either by some setting, or by inserting bidi control characters.

Well, if there’s a problem it should be well-defined so it can be tackled. 

> So a small caveat is all that is necessary. Like: "The chess boards are 
> assumed to be set in a left-to-right bidi context.”

THAT I can put into the document, but since chess is as important in both the 
RTL and LTR worlds, it would be good to know what’s what. 
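
One way to start pinning this down is to look at the bidirectional classes of the characters involved. The Python sketch below is only a diagnostic, not a finding about any particular renderer: it prints the class of each character in a sample row, and shows one possible way (an assumption, not something the proposal specifies) of wrapping a row in LEFT-TO-RIGHT ISOLATE and POP DIRECTIONAL ISOLATE. The names `bidi_classes`, `isolate_ltr` and `sample_row` are made up for the example.

```python
import unicodedata

LRI, PDI = "\u2066", "\u2069"  # LEFT-TO-RIGHT ISOLATE, POP DIRECTIONAL ISOLATE


def bidi_classes(text):
    """Map each code point in text to its Bidi_Class."""
    return {f"U+{ord(ch):04X}": unicodedata.bidirectional(ch) for ch in text}


def isolate_ltr(line):
    """Wrap one board row so that it is laid out left-to-right."""
    return f"{LRI}{line}{PDI}"


# Border, light square + VS1, dark square + VS2, knight + VS1, border.
sample_row = "\u2595\u25A1\uFE00\u25A8\uFE01\u265E\uFE00\u258F"
print(bidi_classes(sample_row))
print(repr(isolate_ltr(sample_row)))
```

The border, square and piece characters are all class ON (other neutral) and the variation selectors are NSM, so nothing in a row has a strong direction of its own. That suggests a row would take its direction from the surrounding context, which is what the caveat is about.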

Thank you again for your thoughtfulness,

Michael


Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
On 6 Apr 2017, at 01:53, Kent Karlsson  wrote:
> 
>> Oh, you misunderstood me. I knew it was raw HTML. I didn’t expect it to 
>> render. But it was meaningless code.
> 
> It was a response to Marcus, in that HTML might be used (with existing 
> characters and no VSs) to format chess boards. And he is right, as proven by 
> the HTML code I (basically) copied from stackoverflow.

Yes, I know this of course. (Well, whatever stackoverflow is.)

> And it does typeset better plain text chess boards à la your proposal…

Not with ordinary fonts and Unicode characters. And typographic care. 


Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
Well, see my follow-up to James Kass and evaluate the merits of the two 
choices. Do generic font makers intend to support both graphic terminal 
emulation and chess? Should chess font makers be burdened with graphic terminal 
emulation glyphs they know nothing about?

> On 6 Apr 2017, at 01:31, Kent Karlsson <kent.karlsso...@telia.com> wrote:
> 
> 
> Exactly.
> 
> /K
> 
> Den 2017-04-06 01:25, skrev "Michael Everson" <ever...@evertype.com>:
> 
>> 2581 FE00; Chessboard box drawing; # LOWER ONE EIGHTH BLOCK
>> 258F FE00; Chessboard box drawing; # LEFT ONE EIGHTH BLOCK
>> 2594 FE00; Chessboard box drawing; # UPPER ONE EIGHTH BLOCK
>> 2595 FE00; Chessboard box drawing; # RIGHT ONE EIGHTH BLOCK
>> 2596 FE00; Chessboard box drawing; # QUADRANT LOWER LEFT
>> 2597 FE00; Chessboard box drawing; # QUADRANT LOWER RIGHT
>> 2598 FE00; Chessboard box drawing; # QUADRANT UPPER LEFT
>> 259D FE00; Chessboard box drawing; # QUADRANT UPPER RIGHT
>> 
>> I guess I see your point. It does no harm, especially if the font might
>> possibly be used for graphics terminal emulation. ;-)
> 
> 




Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
On 6 Apr 2017, at 00:12, James Kass <jameskass...@gmail.com> wrote:
> 
> Kent Karlsson wrote,
> 
>> - with the extra requirement to have VSs also for the border line drawing 
>> characters (to make them fit for drawing chess board borders, in a general 
>> purpose font), and
> 
> This doesn't seem necessary.  A general purpose font modified to display the 
> chess board in plain text in accordance with Michael Everson's proposal would 
> be expected to use the same metrics as the box drawing glyphs for all of the 
> VS-produced glyphs.  A general purpose font *not* so modified would not be 
> expected to display the chessboard in a perfect square, anyway.  (Yet the 
> display would still be legible.)

Well.

1) A general purpose font that wanted to support chessboards as well as legacy 
graphic terminals would make use of VS for the border characters in order to be 
able to do both.

2) If we decided to standardize on that, we would have to burden chess-font 
designers with either

a) learning how to draw graphic terminal characters correctly in their chess 
fonts along with the characters + VS for actual use

b) ignoring graphic terminal character shapes and just pasting in the chess 
shapes to those code positions

Michael Everson


Re: Coloured Punctuation and Annotation

2017-04-05 Thread Michael Everson

> On 5 Apr 2017, at 23:16, Asmus Freytag <asm...@ix.netcom.com> wrote:
> 
> Do you have any examples of plain text that is rendered with parts of 
> characters having white (opaque) background?
> 
> I'm not aware of any

There are certainly MSS (in many languages) where some punctuation made of dots 
have some of the dots red and some black. 

Michael Everson


Re: PETSCII mapping?

2017-04-05 Thread Michael Everson
I agree with Rebecca. It’s going to be a handful of characters, used by the 
handful of people who use legacy character sets. Those people exist (I run Mac 
OS 9 regularly because it’s necessary for some of my work) and since some of 
these legacy characters are encoded, it makes sense to make sure all of them 
are. It’s no harm to the standard to support them. 

Asmus is right. It needs a proposal. 

> On 5 Apr 2017, at 23:14, Asmus Freytag (c)  wrote:
> 
> On 4/5/2017 2:25 PM, Rebecca T wrote:
>> > If there's a credible need to convert files between Unicode-based systems 
>> > and
>> > those using PETSCII
>> 
>> There is! It’s called “sharing textual information” and it’s how our society
>> functions. Can we afford to blithely abandon data from the best selling
>> computer in history [1] because nobody cared to standardize its?
> 
> There's no need for inflammatory rhetoric.
> 
> If you believe there is a credible need, then it should be easy to document 
> that as part of a proposal.
> 
> Nothing gets decided by the UTC unless there's a proposal on the table.





Re: Coloured Punctuation and Annotation

2017-04-05 Thread Michael Everson
On 5 Apr 2017, at 22:48, Richard Wordingham <richard.wording...@ntlworld.com> 
wrote:

> I tried to read it from UTS#51 ‘Unicode Emoji', which is not part of TUS, but 
> I couldn't deduce that a font that enables U+10B99 PSALTER PAHLAVI SECTION 
> MARK to have exactly two (as opposed to none or four) red dots is in breach 
> of the guidelines therein.

Kindly explain how ANY font could do this.

> Are we really going to have to set up Psalter Pahlavi emoji? There's also 
> some encoded Ethiopic punctuation that certainly used to have red dots.

If you want 10B99 to have different coloured dots (the rings? the dots?) the 
only precedent we have in the UCS is (1) to name a whole glyph with a colour 
like RED APPLE and then to hatch the glyph in black and white or (2) use the 
emoji property.

> I think the emoji database has overlooked an entire script of emoji - the 
> Egyptian hieroglyphs!

Put it out of your mind. 

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
On 5 Apr 2017, at 22:13, Kent Karlsson  wrote:
> 
> Kent, I can’t read this in a plain-text e-mail.
> 
> Well, it was SUPPOSED to be explicit HTML code in the email. It was NOT the 
> intent that the given example was to be
> rendered directly in the email (even if you have HTML emails enabled).

Oh, you misunderstood me. I knew it was raw HTML. I didn’t expect it to render. 
But it was meaningless code. 

The proposal for standardized variation sequences for chess treats it as text. 
Whether that text is analogous to ASCII art or not is irrelevant. The proposal 
solves a problem, giving good visual fallback, and excellent rendering if 
properly employed. It’s incredibly simple and uses 

> I agree that the HTML code is a bit of a mouthful (and I would also do it a 
> bit differently), and also has the problem
> mentioned in the previous paragraph). Which is why I support your proposal, 
> but with these modifications:
> 
>  - with the extra requirement to have VSs also for the border line drawing 
> characters (to make them fit for 
>drawing chess board borders, in a general purpose font), and

Look, if that’s the price I’d have to pay to move forward with this I would. I 
don’t think it’s necessary. I *do* think that the definition of the resulting 
glyph “suitable for the chess glyphs in this font that supports … 

Oh, here. This is what I would add. 

2581 FE00; Chessboard box drawing; # LOWER ONE EIGHTH BLOCK
258F FE00; Chessboard box drawing; # LEFT ONE EIGHTH BLOCK
2594 FE00; Chessboard box drawing; # UPPER ONE EIGHTH BLOCK
2595 FE00; Chessboard box drawing; # RIGHT ONE EIGHTH BLOCK
2596 FE00; Chessboard box drawing; # QUADRANT LOWER LEFT
2597 FE00; Chessboard box drawing; # QUADRANT LOWER RIGHT
2598 FE00; Chessboard box drawing; # QUADRANT UPPER LEFT
259D FE00; Chessboard box drawing; # QUADRANT UPPER RIGHT

I guess I see your point. It does no harm, especially if the font might 
possibly be used for graphics terminal emulation. ;-)
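
Since the eight lines above follow the "sequence; description; # name" layout, they are easy to check mechanically. Here is a minimal Python sketch that parses them and confirms that each comment matches the character name of the base code point. The helper `parse_line` and the variable names are invented for the example; the data lines are copied from above.

```python
import unicodedata

PROPOSED_LINES = """\
2581 FE00; Chessboard box drawing; # LOWER ONE EIGHTH BLOCK
258F FE00; Chessboard box drawing; # LEFT ONE EIGHTH BLOCK
2594 FE00; Chessboard box drawing; # UPPER ONE EIGHTH BLOCK
2595 FE00; Chessboard box drawing; # RIGHT ONE EIGHTH BLOCK
2596 FE00; Chessboard box drawing; # QUADRANT LOWER LEFT
2597 FE00; Chessboard box drawing; # QUADRANT LOWER RIGHT
2598 FE00; Chessboard box drawing; # QUADRANT UPPER LEFT
259D FE00; Chessboard box drawing; # QUADRANT UPPER RIGHT
"""


def parse_line(line):
    """Split one line into (base, selector, description, name)."""
    data, _, comment = line.partition("#")
    sequence, description = [field.strip() for field in data.split(";")[:2]]
    base, selector = (int(cp, 16) for cp in sequence.split())
    return base, selector, description, comment.strip()


entries = [parse_line(line) for line in PROPOSED_LINES.splitlines() if line.strip()]
for base, selector, description, name in entries:
    # The comment should be the formal name of the base character.
    status = "ok" if unicodedata.name(chr(base)) == name else "MISMATCH"
    print(f"U+{base:04X} + U+{selector:04X}: {description} ({name}) {status}")
```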

>  - some bidi fix [preferably making the box/border drawing characters bidi 
> "L", if possible; otherwise a caveat that
>if there is an expectation to paste in such a board into an RTL document, 
> bidi controls need be used to LTR the board]).

I don’t know if there is a problem here and am not able to offer a solution if 
there is. I don’t object to a solution, if there is a problem. 

> Nit: You sometimes seem to have made the line spacing slightly larger (like 2 
> points) than the character width.

Different fonts have different metrics. The Ludus font supports many games, not 
just chess. 

> Should they not be exactly the same, to get the best (square) display of the 
> chess boards? (Not that it is very visible,
> but a bit.)

I didn’t overcompensate in the proposal document to make absolutely perfect 
charts; it’s reasonable to know that from font to font control over leading may 
be necessary.

> I think the "ligatures" approach is a dead end.

I hope others will think so too. 

[1]
>  - As you mention, the fallback will have very different line lengths for the 
> lines of a board display,
>and thus basically unreadable.

Since the proposal takes as read that chess data should be parseable and 
plain-text, an approach with better legibility should be considered superior to 
an approach with poorer legibility. 

[2]
>  - If ZWJ is not needed, one will need two *new* characters that (in some 
> fonts) ligate with chess pieces.
>No existing character should ever ligate with chess pieces.

I’d agree, for even if there were “ligate with light/dark chess square”, 
fallback would be illegible per [1] above. 

[3]
>  - If ZWJ is needed, then one can use some existing characters as board 
> squares.

Not sure what you mean, but it’s probably not important since ZWJ is a bad idea 
and because of [1] above.

[4]
>  - In either case, it is not clear (or obvious) which should come first, a 
> chess piece or a board square.
>There will surely be mistakes, giving them in the wrong order (not a 
> problem in your proposal).

The one thing about my proposal is that a parser could tell someone if there 
were a missing VS or the wrong VS, though when you are typesetting with a 
conformant font, the visual feedback is enough. 
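
A sketch of what such a parser check might look like, in Python. It rests on assumptions read off the diagrams in this thread rather than on the proposal text: that VS1 (U+FE00) marks a light square and VS2 (U+FE01) a dark one, that the top-left square is light, and that the border characters have already been stripped from the row. The function name `check_row` and the message strings are invented for the example.

```python
VS1, VS2 = "\uFE00", "\uFE01"  # assumed: VS1 = light square, VS2 = dark square


def check_row(row, row_index):
    """Return a list of (column, message) problems for one board row.

    row holds only the squares (borders stripped); row_index 0 is the top rank.
    """
    problems = []
    column = 0
    i = 0
    while i < len(row):
        has_vs = i + 1 < len(row) and row[i + 1] in (VS1, VS2)
        # The top-left square is light, so an even (row + column) means light.
        expected = VS1 if (row_index + column) % 2 == 0 else VS2
        if not has_vs:
            problems.append((column, "missing VS"))
        elif row[i + 1] != expected:
            problems.append((column, "wrong VS"))
        i += 2 if has_vs else 1
        column += 1
    return problems


good = "\u25A1\uFE00\u25A8\uFE01\u265E\uFE00"  # light, dark, knight on light
bad = "\u2659\u265B\uFE00"                     # pawn with no VS, queen with VS1
print(check_row(good, 0))
print(check_row(bad, 0))
```

A ligature-based scheme gives a checker nothing comparable to test, since a missing joiner just looks like two separate characters.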

[5]
>  - My personal guesstimate is that there will be much fewer fonts that would 
> implement the ligation
>(if that approach was to be chosen), than would implement the VS approach 
> you are suggesting.

And THAT's the reason it has been proposed as a standardized sequence. Chess is 
an important activity and chess literature is vast and should be properly 
supported by the UCS. 

> Thus I support your proposal, since that gives:
>   - Good fallback (readable, though ugly).
>   - Fairly good display when the VS sequences are interpreted (and the font 
> is otherwise reasonable),
> and "good" context (line height setting, not too short lines so that auto 
> line breaking is 

Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
It’s wonderful that Mr Verdy opposes my proposal. I must be doing something 
right. 

On 5 Apr 2017, at 20:13, Philippe Verdy <verd...@wanadoo.fr> wrote:

> 2017-04-05 18:28 GMT+02:00 William_J_G Overington <wjgo_10...@btinternet.com>:
> For example, where WOMAN ZWJ ROCKET produces a glyph for a LADY ASTRONAUT, 
> thus a change of meaning and I think that it went to UTC as there was a 
> change of meaning but I am not congruently sure of that..
> 
> SQUARE ZWJ CHESSPIECE or CHESSPIECE ZWJ SQUARE produces a CHESSPIECE ON A 
> SQUARE, thus a change of meaning.
> 
> You're right here. The absence of ZWJ clearly means separate symbols side by 
> side

Wrong. ZWJ has no particular directional semantics. 

> (whether they will align vertically or match their metrics is not relevant 
> here but we already see that this is a problem for displaying actual boards 
> with the "method" proposed by Michael Everson for use in plain text,

I have no trouble whatsoever making use of the three prototype fonts which make 
use of variation selectors to set chessboards of various sizes and with pieces 
anywhere I need them to be. The proposal document clearly shows examples of the 
boards, set with the fonts using the substitutions I specify. 

What, then, is the problem for display? 

> which just looks for me as only a hack (not a serious encoding proposal),

It is quite serious. It solves a long-standing problem which everyone has 
ignored. 

> just as if we were replacing all German sharp s letters by Greek beta 
> letters, only because they more or less "look the same”.

Lovely! A completely random analogy that has nothing whatsoever to do with this 
proposal.

> You can perfectly have a board displayed beside normal text which may contain 
> some chess pieces, not intended to combine with the surrounding board, even 
> if both symbols may also appear side by side (with independent metrics) in 
> text paragraphs.

Yes, Mr Verdy. That’s just exactly what my proposal says. You can use one font, 
with some extra glyphs attained by use of VS, to set chesspieces in text and to 
set chessboards alongside them. All using Unicode characters, not competing 
ASCII encodings which prevent harmonization of chessboard data now. 

There’s even an example of this in my proposal. Perhaps you didn’t read it. Can 
you find the Figure I refer to? 

> Given what has been encoded for other Emojis, ZWJ should be used between 
> symbols that are supposed to combine visually (such as MAN+WOMAN).

Chess characters aren’t emojis. 

> The encoding should still respect the logic,

The logic of the use of VS in this proposal is no different from the logic used 
with them in maths, or in Myanmar, or even in some emoji. 

> just like we do in normal scripts (independently of the fact they may have 
> different visual ordering/layout, or could have similar glyphs properly 
> disunified because of their needed distinct semantic properties).

A pawn is a pawn is a pawn. Sometimes I need the glyph for a pawn to appear in 
a certain way in order to do something nice like set a chessboard. 

> Note also that these "chess pieces" are not just intended to be used only 
> with chesses,

If there are other uses which can be made of chess pieces, then those uses can 
be investigated in due course by someone interested in that. 

> and various board types may be used (not only with square cells,  for example 
> there are rectangular ones or triangular for Shogi pieces in Japan,

Shogi is not chess. Shogi notation is not like chess notation, either. Try to 
focus on the actual proposal. 

> the cell colors also have their own meanings, and special boards may have 
> their own cells changing colors to add other rules).

Red herring. This has nothing to do with the PRIMARY USE of chess characters, 
which is inline in text to describe chess problems in various notations, and 
also to set chessboard diagrams. 

> Note that Shogi has other pieces with distinct semantics.

Shogi isn’t chess. 

> The pieces are generally flat and can be turned to the other side to show 
> their promotion. Traditional pieces use cursive Kanjis, but there are 
> modernised **variants** using linear glyph shapes, or westernized shapes with 
> Latin letters or geometric symbols, or even reusing the chess pieces 
> (including the Queen for the Gold General; or the King for the Jewel/Jade 
> General/Master and for its "White" Challenger), but making distinctions 
> between horses (horses-dragoons) and cavalry. When promoting using chess 
> pieces, the promotion may be shown by placing the chess piece.on top of a 
> draught piece or coin/token. Coins/tokens are used to promote pawns (just 
> stack two pieces like in draught game).

Shogi isn’t chess. 

I thank Mr Verdy for his defence of my proposal. 

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
On 5 Apr 2017, at 11:05, Asmus Freytag <asm...@ix.netcom.com> wrote:

> Actually, I'm now leaning towards a preference for any scheme that does not 
> use VS, but relies on ligatures.

This would make editing the text more difficult and would yield less legible 
results in environments where the ligatures aren’t supported. 

> Such a scheme would need
> a) no matching spacing for the bare pieces (the ligature with the empty 
> square would result in the correct spacing)

Well, that’s no different at all than my scheme except you ligate pawn and 
empty square as I ligate pawn and VS. But your scheme has the disadvantage of 
being similar to the emoji sequences, which would appear to require ZWJ between 
the pawn and the empty square. That means you have more characters to deal with 
and in fact you end up with variable length chessboard lines, which yields the 
worst possible results in fallback. 
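The line-length point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ligation scheme is taken to be piece + ZWJ + square (one of the two orders discussed in this thread), and U+25A1 / U+25A0 are stand-in empty-square characters chosen for the example, not characters named in the proposal. The helper names are hypothetical.

```python
# Sketch only: compare per-row code point counts under the two schemes.
ZWJ = "\u200D"                          # ZERO WIDTH JOINER
VS1, VS2 = "\uFE00", "\uFE01"           # light / dark square selectors
LIGHT_SQ, DARK_SQ = "\u25A1", "\u25A0"  # assumption: stand-in empty squares

def encode_row_zwj(cells, row_index):
    """Assumed ligation form: piece + ZWJ + square, or a bare square."""
    out = []
    for col, piece in enumerate(cells):
        square = LIGHT_SQ if (row_index + col) % 2 == 0 else DARK_SQ
        out.append(square if piece is None else piece + ZWJ + square)
    return "".join(out)

def encode_row_vs(cells, row_index):
    """VS form: every square is one base character plus one selector."""
    out = []
    for col, piece in enumerate(cells):
        vs = VS1 if (row_index + col) % 2 == 0 else VS2
        # Assumption: an empty square uses a stand-in base character.
        out.append((piece or LIGHT_SQ) + vs)
    return "".join(out)

back_rank = list("♜♞♝♛♚♝♞♜")
empty_rank = [None] * 8
sparse_rank = [None, None, "♙", None, None, None, "♔", None]

for name, cells in [("back", back_rank), ("empty", empty_rank),
                    ("sparse", sparse_rank)]:
    # Print the row name, then the ZWJ-scheme and VS-scheme lengths.
    print(name, len(encode_row_zwj(cells, 0)), len(encode_row_vs(cells, 0)))
```

Under these assumptions the ZWJ rows vary in length with the number of pieces on them, while every VS row has 16 code points.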

> b) no pieces with built-in dark background (pieces simply ligate with the 
> empty "black" square).

Or as I have it, pawn and VS.

>> Now, what happens to the two scheme if rendered with yellow text 
>> ('foreground') on a blue background?
> 
> According to Michael, the effect should be that of lead typography.

Well that’s not really what I was talking about with lead typography. (That’s 
more the ASCII-art argument.)

> This would mean that the entire ligature has the same ink color, and all 
> parts that are not "ink" are the background color (paper color).

Yes, paper and ink. As in 
http://evertype.com/standards/unicode-list/looking-glass-yellow-blue.png

> Unlike lead typography, the ink can be perfectly opaque, allowing a lighter 
> color to show on a dark background. Or the opacity of the foreground can be 
> selected to an intermediate level, allowing the ink to look greenish in your 
> example.

In any case this is a red herring. 

> (The results with a VS based system are not really different, because I 
> imagine, the actual glyph repertoire is identical in all alternatives 
> discussed so far - relying solely on ligatures has the benefit of not 
> involving the UTC at all, therefore it could be implemented today without 
> delay).

Except that ligation is problematic for actually making chessboards. The risk 
that fallback becomes illegible is hugely magnified. Here:

http://evertype.com/standards/unicode-list/ligation-vs-VS.png

On the left we have your scheme, shown in a mono-width font; on the right, 
mine. Ligation, in fallback, will lead to variable-width text on each of the 
eight lines, which will differ depending on how many chess pieces, if any, 
appear. With the VS solution, *all* chess data will have the same number of 
characters in each line. In fact, parsers could identify misplaced VS 
characters (VS1 where VS2 would have to be there) or missing ones. Moreover, 
reverse-parsers (or whatever the term could be) could take narrative text data 
as in:

http://evertype.com/standards/unicode-list/34-variantim.png

and generate tables from it (if the narrative data were well-formed). 
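A parser of the kind mentioned above might look roughly like the following Python sketch. It is not taken from the proposal. It assumes VS1 (U+FE00) marks a light square and VS2 (U+FE01) a dark one, that the board is written from White's side so the top-left square is light, and that empty squares use a stand-in base character (U+25A1 here). The function name check_row is hypothetical.

```python
VS1, VS2 = "\uFE00", "\uFE01"   # light / dark square selectors
PIECES = {chr(cp) for cp in range(0x2654, 0x2660)}  # U+2654..U+265F
EMPTY = "\u25A1"                # assumption: stand-in empty-square base
BASES = PIECES | {EMPTY}

def check_row(row, row_index):
    """Return a list of problems found in one board row (empty if none)."""
    problems = []
    col = 0
    i = 0
    while i < len(row):
        ch = row[i]
        if ch not in BASES:
            problems.append(f"unexpected U+{ord(ch):04X} at offset {i}")
            i += 1
            continue
        # Light and dark squares alternate; top-left is assumed light.
        expected = VS1 if (row_index + col) % 2 == 0 else VS2
        following = row[i + 1] if i + 1 < len(row) else ""
        if following not in (VS1, VS2):
            problems.append(f"square {col + 1}: missing selector")
            i += 1
        else:
            if following != expected:
                problems.append(f"square {col + 1}: wrong selector")
            i += 2
        col += 1
    if col != 8:
        problems.append(f"expected 8 squares, found {col}")
    return problems

good = "".join(p + (VS1 if c % 2 == 0 else VS2)
               for c, p in enumerate("♜♞♝♛♚♝♞♜"))
print(check_row(good, 0))                      # well-formed row
print(check_row("♜" + VS2 + good[2:], 0))      # VS2 where VS1 belongs
print(check_row("♜" + good[2:], 0))            # selector left out
```

Because every square occupies exactly two code points, the checker can walk the row with a fixed stride and report the square number of each problem.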

All the UTC has to do is approve the set of VS sequences as a *standardized* 
way of doing this. Ad-hoc ligation is just going to lead to continued chaos, as 
well as continued dependence on differently-encoded ASCII fonts. 

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson

> On 5 Apr 2017, at 16:25, Asmus Freytag <asm...@ix.netcom.com> wrote:
> 
>> http://evertype.com/standards/unicode-list/looking-glass-yellow-blue.png
>> 
> This matches the reply I gave Richard. Very nice.

15 seconds’ work, too.

> I think you could achieve the same with using just ligatures (no VS) and get 
> the same result when using a proper font.

No, because yours isn’t as well thought-out in terms of the structure of 
plaintext chessboard data. (Probably only because I’ve been working on this 
with real fonts for a good while now.) See my next e-mail.

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
On 5 Apr 2017, at 15:52, Garth Wallace <gwa...@gmail.com> wrote:

> […] I'm just saying that if having symbols without VS not match either of the 
> VSes is a sticking point, it's not hard to work around.

Oh, I see.  Well, yes, I agree with you in part. But here’s the thing. 

It is *permissible* for proportional-inline-chesspieces to be identical to 
emsquare-chessboard-chesspiece if a designer *wants* to do it that way. But it 
is *just* as permissible for proportional-inline-chesspieces to be truly 
proportional and unsuitable for chessboard typesetting (and that’s how it has 
been since Unicode 1.1).

Look, here is a choice:

U+2654 - WHITE CHESS KING whose width might or might not be 
U+2654 FE00 - WHITE CHESS KING whose glyph is a white/light em-square for 
chessboards
U+2654 FE01 - WHITE CHESS KING whose glyph is a black/dark em-square for 
chessboards

I think this is enough. Or it could be:

U+2654 - WHITE CHESS KING whose width might or might not be 
U+2654 FE00 - WHITE CHESS KING whose glyph is the same as the unmodified 
U+2654, whatever it is
U+2654 FE01 - WHITE CHESS KING whose glyph is a white/light em-square for 
chessboards
U+2654 FE02 - WHITE CHESS KING whose glyph is a black/dark em-square for 
chessboards

There’s some precedent for this, where some symbols have one VS for “text 
glyph” and a different VS for “emoji glyph” and of course the unmodified symbol 
can be used and will display as the font has it.

I don’t think the second is necessary. It’s not necessary for this, for example:

U+0030 - DIGIT ZERO
U+0030 FE00 - short diagonal stroke form
U+0030 FE0E - text style
U+0030 FE0F - emoji style

OK, “text style” is identical to unmodified U+0030, but the only reason that 
attribute exists is in distinction to “emoji style”. Compare also:

U+1000 - MYANMAR LETTER KA
U+1000 FE00 - dotted form
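For readers who want to see what these sequences are in terms of code points, here is a small Python sketch. It uses the first of the two chess options above (FE00 for the light square, FE01 for the dark square), which is what this message proposes, not something already standardized. The dictionary labels and the describe helper are only illustrative.

```python
import unicodedata

# A variation sequence is just a base character followed by a selector.
sequences = {
    "white king, light square (proposed)": "\u2654\uFE00",
    "white king, dark square (proposed)": "\u2654\uFE01",
    "digit zero, text style": "\u0030\uFE0E",
    "Myanmar KA, dotted form": "\u1000\uFE00",
}

def describe(seq):
    """List each code point in the sequence with its Unicode name."""
    return " + ".join(f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in seq)

for label, seq in sequences.items():
    print(f"{label}: {describe(seq)}")
```

The unmodified base character stays a perfectly ordinary character in each case; only the appended selector asks for a particular glyph.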

>>> Currently, chess fonts can be (roughly) divided into "diagram fonts" and 
>>> "notation fonts”.
>> 
>> That’s not true. There are some which do all three.
> 
> There are, sure. I said roughly: many don't do both & rely on font-switching.

But even more of them can’t rely on font-switching because the encoding of the 
piece on light and dark chessboard varies from supplier to supplier. All 
current chess fonts are ASCII hacks. 

>>> None of the features required for a diagram font are unacceptable in 
>>> figurine notation:
>> 
>> The white ones may be too wide for use in text.
> 
> Not visually ideal, but legible.

Yes but if we were to unify unmodified chesspieces with the pieces on white 
squares it could invalidate the metrics of text like 
http://evertype.com/standards/unicode-list/34-variantim.png

As I say, it’s *permissible* to have the unmodified chesspiece glyph be the 
same as the white-square chesspiece glyph, but it’s not obligatory, and we must 
preserve font designer choice here. 

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
On 5 Apr 2017, at 09:10, Richard Wordingham <richard.wording...@ntlworld.com> wrote:

> Now, what happens to the two scheme if rendered with yellow text 
> ('foreground') on a blue background?

The same thing that happens to ANY graphic character if you choose to render 
the background as blue and the text as yellow. 

> I believe the 'empty black square' will have yellow hatching on a blue back 
> ground.

Well, it is good that you believe this. 

> Will the empty white square be white or blue?

It will be blue, obviously. 

> Will the 'piece with matching spacing' have a white background around the 
> depiction of the piece, or a blue background?  What of a 'white
> square with a specific piece on it’?

This isn’t a problem and has nothing to do with my proposal. 

> A piece with a *white* background is different to a piece that is merely an 
> outline, whether filled or not.

I don’t think I can consider your comments to be relevant to the proposal any 
longer. You don’t even address the proposal. 

Oh, here is the answer to your question. It took me 15 seconds to change the 
background and text colour in Quark XPress. It has nothing to do with the 
proposal for variation sequences. 

http://evertype.com/standards/unicode-list/looking-glass-yellow-blue.png

Michael Everson



Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
Kent, I can’t read this in a plain-text e-mail. I can’t paste it into an 
ordinary word-processor like Word as in my previous response to Markus, or in 
Pages (left) or LibreOffice (right) as shown here. (I simply pasted in the text 
from Word to each of those. It’s odd to see that there is some variation in 
displaying the text without selecting it and applying the correctly-configured 
font to it, but when that’s done, the correct display is given, modulo some 
leading issues which I didn’t focus on in either.) 

The workaround you give is just that. It works. It’s not usefully portable or 
user-friendly, and as higher-level protocols go, it hasn’t swept away all 
competition for presenting chessboards. People use ASCII or MS Symbol-based 
fonts not even with any Unicode characters in them.

http://evertype.com/standards/unicode-list/libreoffice-lg.png

http://evertype.com/standards/unicode-list/pages-lg.png

> On 3 Apr 2017, at 19:46, Kent Karlsson wrote:
> 
> 
> Den 2017-04-03 19:51, skrev "markus@gmail.com":
> 
> > It seems to me that higher-level layout (e.g, HTML+CSS) is appropriate for 
> > the 
> > board layout (e.g., via a table), board frame style, and cell/field shading.
> > In each field, the existing characters should suffice.
> > 
> > markus
> 
> True, and one can easily find an example online.
> 
> Slightly modified from 
> http://stackoverflow.com/questions/18505921/chess-using-tables
> 
> 
> a {
>   color:#000;
>   display:block;
>   font-size:12px;
>   height:16px;
>   position:relative;
>   text-decoration:none;
>   text-shadow:0 1px #fff;
>   width:16px;
> }
> #chess_board { border:2px solid #333; }
> #chess_board td {
>   background:#fff;
>   background:-moz-linear-gradient(top, #fff, #eee);
>   background:-webkit-gradient(linear,0 0, 0 100%, from(#fff), to(#eee));
>   box-shadow:inset 0 0 0 1px #fff;
>   -moz-box-shadow:inset 0 0 0 1px #fff;
>   -webkit-box-shadow:inset 0 0 0 1px #fff;
>   height:16px;
>   text-align:center;
>   vertical-align:middle;
>   width:16px;
> }
> #chess_board tr:nth-child(odd) td:nth-child(even),
> #chess_board tr:nth-child(even) td:nth-child(odd) {
>   background:#ccc;
>   background:-moz-linear-gradient(top, #ccc, #eee);
>   background:-webkit-gradient(linear,0 0, 0 100%, from(#ccc), to(#eee));
>   box-shadow:inset 0 0 10px rgba(0,0,0,.4);
>   -moz-box-shadow:inset 0 0 10px rgba(0,0,0,.4);
>   -webkit-box-shadow:inset 0 0 10px rgba(0,0,0,.4);
> }
> 



Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
NOTE: A number of messages I sent in the last two days were scrubbed by the 
Unicode list software because they contained images. I will re-send these with 
links now. 

From: William_J_G Overington <wjgo_10...@btinternet.com>
Date: 2 April 2017 at 12:05:03 IST

> I included the regular Unicode chess pieces themselves, and for each chess 
> piece also versions on a white square and on a black square in the Private 
> Use Area of my Quest text font.

OK, I’m looking at this. William’s description uses rather different terms than 
mine does, so I’ll try to translate. 

First, he’s describing a font he made in 2004 or 2005, not an implementation of 
my proposal. 

> Free download of the Quest text font from 
> http://www.users.globalnet.co.uk/~ngo/fonts.htm
> 
> Thus, for, say, White King, there are three glyphs.

Correct, just as my proposal would have it. And the metrics for e.g. 
white-king-for-use-in-text and white-king-for-use-in-chessboard are different. 

> The Quest text font has descenders, so that while the glyph for White King 
> itself is sat on the baseline, the glyph for the White King on a white square 
> has the chess piece positioned lower vertically. The background shading for 
> White King on a black background goes down to the WinDescent level.

By this he means more or less that the glyphs intended to produce a board have 
em-square metrics, while the base glyphs do not have square metrics and would 
be more suitable for in-text usage.

[Picture of William’s three characters showing metrics]

http://evertype.com/standards/unicode-list/overington-glyphs.png

> Line spacing could be an issue, but it need not be as long as the 
> OpenType-supporting application where the font is used has the facility to 
> set type with no additional spacing. I use Serif PagePlus X7 and the facility 
> is there, so diagrams look fine.
> 
> I hope that Michael's proposal goes forward and is accepted.

So does Michael. 

> Regarding the borders. I note that use of a variation selector is not 
> suggested.

Nor should it be. 

> As it happens, Quest text also has eight glyphs for producing a border, all 
> eight being in the Private Use Area. They are rather ornate. They are at 
> U+E5B0 through to U+E5B7.

They are there. I had to figure out how they should be used. They are put 
together in a very different way than the borders of any other font I have seen 
are. I am not sure, but I think he’s intended to use them thus:

[Pic of the Looking-Glass board in William’s font]

http://evertype.com/standards/unicode-list/overington-board.png

William’s design is decidedly non-traditional, and not (to my eye) particularly 
easy to read, but it doesn’t matter. The picture here shows his glyphs 
configured in exactly the same way as specified in my proposal. IT WORKS. 
(There are some hairline gaps in the border and the top left corner piece is a 
little less well aligned than one would want if one were preparing to ship the 
font.) 

The underlying text is just the same text that I used to set the Looking-Glass 
board in my proposal, variation selectors and all. There are no variation 
selectors used (or needed) for the border, though its glyphs are certainly 
unconventional horizontal lines. ;-) 

> The empty squares in the chess diagram each use a variation selector.

I don’t see how. There were no OpenType instructions in the font or variation 
selector characters in the font.

> I opine that it would be helpful if a variation selector were to be used for 
> each of the eight border items. 

No, because this would lead to potentially infinite variety of borders within 
any font, and it would be better to restrict this. I wouldn’t even want a VS to 
distinguish a single-line border from a double-line border, and there’s an 
enormous variety of ornamental borders one could put on the glyphs for 

> Using a variation selector would mean that a diagram could be produced 
> without relying on the basic designs of the eight character sorts used to 
> produce the border and also would allow a stylish border design to be 
> included in a font.

A chess font is best when optimized to the design the designer wants, but 
honestly, the model proposed is simple and robust and does not need more 
tinkering or more complexity. It is able to support William’s design as well as 
the more traditional ones in the proposal while remaining parseable plain text. 
That should be enough. That is what takes the mess that current non-Unicode 
chess fonts are in and normalizes them for use. 

> Best regards,
> 
> William Overington

Thank you for sharing your font, William. I’ll send you the ttf of this one so 
you can tinker with glyph placement as you wish, if the proposal is accepted 
and the standardized variation sequences accepted.  

Michael Everson

Re: Proposal to add standardized variation sequences for chess notation

2017-04-05 Thread Michael Everson
ph) for the chess pieces. 

> If one has no control over the fallback sequence for glyphs, arguably the 
> situation for truly 'plain text', then the escape route for plain
> text is to have the font with good chess glyphs for use in running text 
> declare that it has the glyph for such use.

Richard, I’ve shown some examples of the Looking-Glass problem where the VS 
sequences are ignored. Did you see these? Why don’t you refer to them? You’re 
talking in the abstract as though you haven’t read the proposal or looked at 
the examples it gives. In Figure 3, you can see the base glyphs in the font 
which might be used for any purpose. For the special purpose of setting a 
chessboard, you need to use the VS sequences. If your font or your app can’t 
display that, that’s a problem, but that is no different for ANY app or font 
that can’t display fi/fl ligatures, or maths characters with VS, or Myanmar 
characters with VS, or emoji characters either in colour or black-and-white 
with or without ligatures. Why is this such a dreadful problem for chess when 
it’s not for any of the other character types which use VS sequences?

You haven’t explained this inchoate worry of yours. My proposal admits freely 
that VS-derived glyphs for chessboards might fail in some environments (but 
also shows that a board set using this scheme is still legible by humans and 
parseable by software even if the good display may fail). 
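To make the fallback claim concrete, here is a rough Python sketch of what a reader or a program is left with when the selectors are ignored. It only simulates fallback by dropping U+FE00..U+FE0F from the text; a real renderer ignoring the sequences would show the base glyphs in much the same way. The function name and the sample row are made up for the illustration.

```python
def strip_selectors(text):
    """Drop variation selectors U+FE00..U+FE0F, keeping the base characters."""
    return "".join(ch for ch in text if not 0xFE00 <= ord(ch) <= 0xFE0F)

VS1, VS2 = "\uFE00", "\uFE01"
# One board row: eight base characters, each followed by a selector.
row = "".join(piece + (VS1 if col % 2 == 0 else VS2)
              for col, piece in enumerate("♜♞♝♛♚♝♞♜"))

fallback = strip_selectors(row)
print(len(row), len(fallback))   # code points before and after
print(fallback)
```

What remains is one base character per square, so every row still has the same length and the pieces can still be read off in order.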

But the point is that chess fonts are specialized fonts, and if people want to 
set chess problems they need special chess fonts. Such fonts exist right now, 
but only with conflicting ASCII encodings. THAT is worse than maybe some fonts 
having ugly glyphs or maybe some environments being unable to display OpenType 
sequences properly. THAT is the problem that is solved by this proposal. 

> That requires the definition of a variation sequence to force the choice of 
> suitable glyph.

These sequences are what the proposal gives. Why are you saying this?

> Now, having to use variation sequences for chess pieces in plain text is 
> unfortunate,

No, it isn’t. It’s a better idea to use those for 96 chess characters than to 
have to get the committees to accept 288 chess characters (which I very much 
doubt they will) which by the way also puts off a solution for chess for 
another two years. 

> but should also work with existing fonts supporting chess pieces.

It can, if and only if the makers of those fonts add the new glyphs and new VS 
sequences to their fonts. This is also true for maths fonts, for Myanmar fonts, 
and for emoji fonts. 

> There would be transitional effects as existing fonts were modified to 
> declare that they supported this variation sequence - the effects of font 
> fallback would vary as the new fonts were added to the system.

Yes, that is what will happen if this proposal is selected. No fonts are 
magically altered. 

Are you supporting my proposal or objecting to it? I can’t even tell any more. 

>> But nobody making a chess font with actual support for chess would do that. 
> 
> Note that the font I have in mind is just supporting chessboards. The idea 
> would be that other fonts would be used for high quality rendering.

WHAT? What does this mean? Who would make a font supporting just chessboards 
and not supporting display of chess characters otherwise? Anyway you can’t. You 
can't even MAKE a substitution table if the base characters aren’t in the font. 
FontLab complains and then politely asks if you want to add the glyphs to the 
font, and when you say Yes (which you must if you want your OT sequences to 
work) then it puts those characters in the font so you can put glyphs into 
them. Is this explained clearly enough for you? If you’re worrying about 
whether the font designer will bother to make NICE glyphs for those characters, 
well, that is up to the wit of the font designer. And that has nothing to do 
with the proposal. 



> 
>> Nothing prevents someone from drawing the 16 Myanmar base characters
>> with rings at the ends of their glyphs even though now VS are being
>> recommended for that presentation. Is it legitimate to do that? Of
>> course it is.
> 
> You seem to be declaring that it would not be wrong for chess piece 
> characters in running text to be automatically depicted with dark chess 
> square backgrounds.

If a font designer is perverse enough to do that, his font won’t be nice and 
nobody will use it. It is possible for someone to depict chess piece characters 
in running text with dark chess square backgrounds, but since that’s not the 
convention for doing so, nobody would like the font and nobody would use it. 

>> Can you identify an actual problem? 
> 
> See above.

You failed to identify an actual problem with the proposal. 

The actual problem is (1) chess fonts aren’t using Unicode characters, (2) 
variation selectors can help provide a standardized way that enables chess 
fonts to do so, and (3) the proposal gives a mechanism for doing that which 
will work in environments where VS substitution glyphs are supported. 

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-04 Thread Michael Everson
On 4 Apr 2017, at 18:54, Richard Wordingham <richard.wording...@ntlworld.com> 
wrote:
> 
> On Tue, 4 Apr 2017 01:30:05 +0100
> Michael Everson <ever...@evertype.com> wrote:
> 
>>> I'm trying to work out whether we need a variation sequence for "chesspiece 
>>> in a sentence”.  
>> 
>> Of course! Haven’t you ever seen chess problem texts? Check out the Fairy 
>> Chess proposal for encoding additional characters. Plenty of examples there. 
> 
> Your examples did not have to contend with the possibility of fonts that only 
> support the variants for drawing chessboards.

Um, what? 

Why would anyone make a font that supports the variants for drawing chessboards 
(which require the encoded characters 2654..265F) and not put in glyphs for 
those? 

FontLab is the program I use to add OpenType features to my fonts, and if I try 
to add a sequence like 2654 + FE00 and the font doesn’t have a 2654, it flags 
it as an error and insists that the character appear in the font. OK, someone 
could be perverse and not add glyphs to those code positions, but…  

But nobody making a chess font with actual support for chess would do that. 

So this is another red herring. As far as I can see, your worries are 
groundless, and nothing has suggested that there’s something wrong with the 
proposal. 

Also, having implemented it in three or four different fonts now, I find that 
it works. It does the job, and it’s easy to use to edit.

>> Sorry, I meant “Of course **not**!” that is, chesspiece in a sentence is 
>> extremely common, and should be the default (not stylized) form. We can’t 
>> repurpose that to be “chesspiece on a white square” because it hasn’t been 
>> previously and changing that would affect the layout of existing data.
> 
> But would not your proposal make it legitimate for a font to supply only 
> chess pieces on dark backgrounds for the chess piece characters?

What does “legitimate” mean? 

Nothing prevents someone from drawing the 16 Myanmar base characters with rings 
at the ends of their glyphs even though now VS are being recommended for that 
presentation. Is it legitimate to do that? Of course it is. It’s legitimate to 
make Myanmar fonts with square glyphs rather than circular ones. 

This proposal provides a stable encoding model for drawing chessboards simply, 
with fonts. Currently there are other fonts which do this, but they do not 
share encodings, and so sharing chessboard data is dependent on whether you 
have set up your board in the same font encoding that somebody else is using. 
Otherwise it doesn’t work, and your text is corrupt and you have to re-key 
various elements in order to use the glyphs of the other font. This problem is 
described in detail at the beginning of the proposal. It is the same problem we 
had with ISO/IEC 8859-1, -2, -3, -4 etc. before we had the UCS. So: we have 
unstable non-Unicode encodings for chessboards now, this proposal provides 
stable Unicode encodings. 
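The ISO/IEC 8859 comparison can be seen directly in a short Python sketch. It is only an analogy for the chess-font situation: the same bytes mean different characters depending on which part of 8859 the reader assumes, just as the same ASCII-encoded chessboard text means different pieces depending on which chess font is applied.

```python
# The same two bytes, read under two different parts of ISO/IEC 8859.
data = bytes([0xA3, 0xB1])

for codec in ("iso8859-1", "iso8859-2"):
    text = data.decode(codec)
    points = " ".join(f"U+{ord(ch):04X}" for ch in text)
    print(codec, text, points)
```

Once the text is stored as Unicode code points, the reading no longer depends on a guess about the encoding, which is the benefit claimed here for chessboard data.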

This can only benefit the community of users of chess fonts. Anybody who isn’t 
setting chessboards is unaffected, just as I am unaffected by variation 
selectors used for glyph variation in mathematical fonts. (I might add the 
slashed zero glyph to Everson Mono, though.)

This proposal does this while leaving the base characters alone so they can be 
used as chesspieces in text (as they have been since Unicode 1.1) and by adding 
a mechanism to construct the glyphs necessary for presenting chessboard data.

This proposal uses a mechanism which has already been used for dozens of 
regular characters and 310 times for some popular pictographs. No new 
characters need to be added. Just a list of items in a text file. 

Can you identify an actual problem? 

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-04 Thread Michael Everson
On 4 Apr 2017, at 17:58, Mark Davis ☕️ <m...@macchiato.com> wrote:

> Amusing as this is, hard to believe that people are spending this much time 
> on an April Fool's posting. 

I wondered how long it would take for someone to be taken in. The joke, of 
course, was hidden not inside the proposal, but inside the date. 

> I'm looking forward to similar postings on checkers

You haven’t bothered to read the proposal, have you? 

> and go pieces.

Gō notation is rather different and this kind of solution might not be 
appropriate for it. That, however, is a different problem unrelated to this 
proposal. 

> As a matter of fact, one that proposes adding new characters for every 
> possible configuration of a go board would be imaginative.

You really haven’t bothered to read the proposal, have you? 

> And I'm looking also forward to the ♖+ZWJ+⬛️  (etc) proposal.

I recommend that you read the proposal before attempting to dismiss it. 

Michael Everson

PS. Interested readers may wish to review some other proposals by myself and 
others.

N4014 2011-04-01 was successful
N4012 2011-04-01 was successful
N4011 2011-04-01 was not successful*
N3412 2008-04-01 was not successful
N3066 2006-04-01 was successful
N2935 2005-04-01 was successful
N258A 2003-04-01 was not successful
N2338 2001-04-01 was successful
N2326 2001-04-01 was not successful

*Though given recent symbol work by some it might be prudent to revive some 
part of this one.

PPS: While games like chess, draughts, gō, and xiàngqí are pastimes, they are 
also complex intellectual pursuits which have amassed a sizeable literature 
over many centuries. Chess notation and chess diagrams are a good example. Kifu 
notation for gō is another. 

The UCS encodes characters which represent the pieces of many games. It is 
reasonable to expect that people may wish to use these characters to represent 
game data. 

Asmus’ idea that the 12 chess characters be duplicated or triplicated in order 
to set chess diagrams is wasteful of encoding space and not extensible either. 
We have seen that some 84 additional chess characters have been proposed; it 
would be a very bad idea to expand that to 168 or 252 characters. The 
appropriate way to respond to the great many differences in the ASCII-encoded 
existing chess fonts is to simply make use of existing characters in the 
standard to alter, in a systematic and standardized way, the glyph 
representation of the 12 already-encoded characters with 2 other 
already-encoded characters, as described in the proposal. 
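
A small sketch of the counting argument, assuming the 2 other already-encoded 
characters are VS1 (U+FE00) and VS2 (U+FE01): pairing each of the 12 chess 
characters with each selector yields 24 sequences built entirely from code 
points that already exist.

```python
import unicodedata

# The 12 already-encoded chess characters, U+2654..U+265F.
CHESS_PIECES = [chr(cp) for cp in range(0x2654, 0x2660)]

# Assumption: the two already-encoded characters are VS1 and VS2.
VS1, VS2 = "\uFE00", "\uFE01"

# Every piece combined with every selector: 12 x 2 sequences.
sequences = [piece + vs for piece in CHESS_PIECES for vs in (VS1, VS2)]

# Each sequence is two existing code points; no new character is needed.
names = [[unicodedata.name(ch) for ch in seq] for seq in sequences]
```

Duplicating or triplicating the characters instead would mean encoding 12 or 
24 new ones for the basic set alone.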

Years ago a proposal similar to Asmus’ was made, in discussion if not in a 
formal document. The answer was “a higher level protocol would be best for 
chessboard notation”. Well, the simplest higher-level protocol for this is to 
use variation selectors to alter the font display, just as we use them for 
DIGIT ZERO, 16 Myanmar letters, INTERSECTION, UNION, SUBSET OF WITH NOT EQUAL 
TO, a bunch of other mathematical characters and more than 300 pictographs.

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-03 Thread Michael Everson

> On 4 Apr 2017, at 02:01, Kent Karlsson <kent.karlsso...@telia.com> wrote:
> 
>>> Book formatting? Old style book formatting still cannot use as 
>>> sophisticated layouts as HTML can... (AFAIK).
>> 
>> Yeah, but come on, the chief use of chess characters is to cite them inline 
>> in text like any other symbol @ § % & and the other equally chief use of 
>> chess characters is to set 8 × 8 chessboards which float in space in the 
>> layout as figures. The layout requirement isn’t all that demanding that HTML 
>> offers a major advantage. 
> 
> In case you missed it, the statement I made above was in *SUPPORT* of your 
> proposal (in general, but not necessarily all details)…

It’s not easy to tell, because the counter-approaches suggested are not well 
specified and really don’t seem to be practical. 

It *is* important that there be an even number of characters in every row of 8 
squares for fallback display to be better rather than worse, I think. I don’t 
think it’s possible to ensure that the rendering engine of every app displays 
the fallback identically. (It seems that Word and LibreOffice and Pages and 
Quark display a little differently; this seems to be because they load glyphs 
from some fonts before glyphs from others.) 

I found while setting the tables that it was convenient to have to remember 
only that every one of the 64 characters had to have VS1 or VS2 along with it. 
Constructing a table from scratch and modifying an existing one both felt 
easier with uniform encoding.
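
A minimal sketch of what uniform encoding buys, under stated assumptions: VS1 
is taken to mark a light square and VS2 a dark square, and U+25A1 is only a 
hypothetical stand-in for whatever base character represents an empty square. 
Neither choice is taken from the proposal.

```python
VS1, VS2 = "\uFE00", "\uFE01"  # assumed: VS1 = light square, VS2 = dark square
EMPTY = "\u25A1"               # hypothetical stand-in for an empty square


def encode_row(pieces, rank_index):
    """Encode 8 squares; every square is a base character plus a selector.

    rank_index 0 is the top row (rank 8), so the first square (a8) is light.
    """
    if len(pieces) != 8:
        raise ValueError("a row has exactly 8 squares")
    out = []
    for file_index, piece in enumerate(pieces):
        dark = (rank_index + file_index) % 2 == 1
        out.append((piece or EMPTY) + (VS2 if dark else VS1))
    return "".join(out)


back_rank = ["♜", "♞", "♝", "♛", "♚", "♝", "♞", "♜"]
row = encode_row(back_rank, 0)
empty_row = encode_row([None] * 8, 1)
```

Because no square is ever left bare, every row comes out as 16 code points, 
which keeps the count per row even when the sequences fall back to plain 
glyphs.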

Michael Everson


Re: Proposal to add standardized variation sequences for chess notation

2017-04-03 Thread Michael Everson
> I'm trying to work out whether we need a variation sequence for
> "chesspiece in a sentence".

Of course! Haven’t you ever seen chess problem texts? Check out the Fairy Chess 
proposal for encoding additional characters. Plenty of examples there. 

Sorry, I meant “Of course **not**!” that is, chesspiece in a sentence is 
extremely common, and should be the default (not stylized) form. We can’t 
repurpose that to be “chesspiece on a white square” because it hasn’t been 
previously and changing that would affect the layout of existing data.

Michael Everson

Re: Proposal to add standardized variation sequences for chess notation

2017-04-03 Thread Michael Everson
On 4 Apr 2017, at 00:59, Richard Wordingham <richard.wording...@ntlworld.com> 
wrote:

> No, he wants two characters WHITE CHESS KNIGHT and WHITE CHESS KNIGHT ON DARK 
> BACKGROUND, and a variation selector, say VS2, that when applied to them 
> yields a glyph that works with block elements.
> 
> It might be simpler if WHITE CHESS KNIGHT ON DARK BACKGROUND was defined as a 
> character that worked with block elements. 

I can’t fathom how you would configure a font to do whatever it is you think 
you’re describing here. I don’t follow it. “Worked with” which block elements, 
to do what?

If it’s to draw a box around the board, I already said, the answer is to change 
the graphics terminal block elements because in a chess-font environment their 
positional function is used, not their graphics terminal glyph. 

>> Then you’re still stuck for a solution for non-em-square characters for 
>> inline text. 
> 
> No, WHITE CHESS KNIGHT should continue to fulfil that role.  My only worry is 
> that one might need a variation selector, say VS1, to force the choice of a 
> suitable glyph.

I don’t get what you’re on about. I’ve already solved this problem, and 
whatever it is you’re describing sure doesn’t sound intuitive. 

I’ve shown my implementations which do what I need them to do. I don’t know if 
you can do the same, but go ahead and make your font to prove it, and write it 
up clearly in a counter-proposal if you think it’s the right way to go. 

Michael Everson

