RE: Engmagate?

2013-12-13 Thread Whistler, Ken
Well, inconceivable? No. Inadvisable? Yes.

First of all, such “comments” are not actually “comments”—they are the result 
of a fairly cumbersome and drawn-out process of adding *normative* standardized 
variation sequences to the standard.

Second – although this is a nit – FE0E and FE0F would not be used for 
standardized variation sequences of this type, because of their special 
pre-emption for the emoji variants. But FE00, FE01, etc., are, of course, 
available.

More importantly, I consider it a *very* bad precedent to attempt to start down 
the road of mixing standardized variants with case pairing. That is almost 
guaranteed to create permanent implementation problems that would surely be 
worse than the perceived problem it would be attempting to solve. Case mapping 
could fail under circumstances mysterious to end users. Likewise, string 
matches would fail under mysterious circumstances, where implementations doing 
multi-level weighting and/or simply stripping variation selectors would match, 
but the same strings doing binary compares would fail, where they might not 
have before.
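
The matching failure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequence <014A FE00> is purely hypothetical (no such variation sequence is standardized), and the helper name strip_variation_selectors is invented here. It covers only the FE00..FE0F range.

```python
# Hypothetical: <U+014A, U+FE00> is NOT a standardized variation sequence.
CAPITAL_ENG = "\u014A"   # LATIN CAPITAL LETTER ENG
VS1 = "\uFE00"           # VARIATION SELECTOR-1


def strip_variation_selectors(text: str) -> str:
    """Drop VS1..VS16 (U+FE00..U+FE0F), as a lenient matcher might."""
    return "".join(ch for ch in text if not ("\uFE00" <= ch <= "\uFE0F"))


plain = CAPITAL_ENG + "ala"
marked = CAPITAL_ENG + VS1 + "ala"

# A binary compare sees two different strings...
binary_equal = plain == marked
# ...while a matcher that ignores variation selectors sees one.
lenient_equal = (
    strip_variation_selectors(plain) == strip_variation_selectors(marked)
)
# Lowercasing keeps the selector, which now follows the small letter.
lowered = marked.lower()

print(binary_equal, lenient_equal, [f"U+{ord(c):04X}" for c in lowered])
```

Whether the selector that survives lowercasing would still mean anything after the small letter is exactly the kind of question a case-pairing variation sequence would have to answer.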

Do such problems afflict the existing standardized variation sequences? Well, 
yes, to a certain extent. But to date they have not been simultaneously pulled 
into the maelstrom of case mapping and case folding. Mixing the two is just a 
recipe for a big mess, IMO.

--Ken


Would it be inconceivable to add comments such as:

014A LATIN CAPITAL LETTER ENG
* glyph may also have appearance of large form of the small letter
~ 014A FE0E N form
~ 014A FE0F n form
?


Re: Engmagate?

2013-12-13 Thread Karl Pentzlin
Am Freitag, 13. Dezember 2013 um 08:13 schrieb Jean-François Colson:

JFC>  Is that as wrong as if “ændern” was used instead of  “ändern” in German

Yes, this is a good example. You somehow recognize what the correct
character would be, and if you see this repeatedly in a longer text,
you get an impression between silliness and incompetence independent
of the actual content of that text. I know this feeling from texts
which continuously use a Greek beta (β) instead of a German sharp s (ß).

- Karl





Re: Engmagate?

2013-12-13 Thread Jean-François Colson

Le 12/12/13 23:52, Asmus Freytag a écrit :

On 12/12/2013 2:25 PM, Leo Broukhis wrote:
Hmmm... As a person with Russian as the first language I can assure 
you that from any literate Russian-speaking person's perspective 
italic ū is an unacceptable and *WRONG* representation of п (because 
in Russian, unlike Serbian, there is й). Should we bother disunifying?


This example adds the issue of font style - because for styles other 
than italic, the issue doesn't exist. I would take that as a stronger 
indication that this is an issue that belongs in glyph space.


The fact that the lowercase letter is the same in both cases proves 
that the difference between N-Eng and n-Eng is purely stylistic 
rather than semantic. Unicode shouldn't bother with those minutia.


What about the reverse case, where the uppercase is the same and the 
lower case isn't?


There are precedents in Unicode where these have been disunified.

U+00D0 LATIN CAPITAL LETTER ETH
U+0189 LATIN CAPITAL LETTER AFRICAN D

look exactly identical.


In this case, that’s the capitals which look the same while the 
lowercase letters are different.


There’s also Ə ə vs Ǝ ǝ.




Precedents like this make the issue considerably less than clear cut,


+1





> I suppose nothing will happen until the governments of eng-using 
countries come together with a proposal.


Let's hope so. I wish they never do.


Lets hope they come together and endorse a solution that takes into 
account not only rendering, but identifier security issues as well. 
And while they are at it, I wouldn't refuse if they squared the circle.


A./


Leo



On Thu, Dec 12, 2013 at 2:06 PM, Michael Everson wrote:


On 12 Dec 2013, at 15:29, Leo Broukhis wrote:

> Hasn't http://www.unicode.org/standard/where/#Variant_Shapes
explained it once and for all?

No, because users of N-shaped capital Eng consider n-shaped
capital Eng to be *WRONG*, not an acceptable variant. And because
n-shaped capital Eng consider N-shaped capital Eng to be *WRONG*,
not an acceptable variant.

Disunification is the best solution.

I suppose nothing will happen until the governments of eng-using
countries come together with a proposal.

Michael Everson * http://www.evertype.com/








Re: Engmagate?

2013-12-13 Thread Jean-François Colson

Le 13/12/13 08:58, Jean-François Colson a écrit :

Le 13/12/13 08:33, Denis Jacquerye a écrit :
On Thu, Dec 12, 2013 at 10:06 PM, Michael Everson wrote:

On 12 Dec 2013, at 15:29, Leo Broukhis  wrote:

Hasn't http://www.unicode.org/standard/where/#Variant_Shapes 
explained it once and for all?
No, because users of N-shaped capital Eng consider n-shaped capital 
Eng to be *WRONG*, not an acceptable variant. And because n-shaped 
capital Eng consider N-shaped capital Eng to be *WRONG*, not an 
acceptable variant.


Disunification is the best solution.

I suppose nothing will happen until the governments of eng-using 
countries come together with a proposal.
What if not every user of one form considers it wrong to use the 
other form?


What if there’s evidence of use of both forms in those languages?

What if the users who consider the other shape wrong are unaware of
the history or variation of their own orthographies?



All those problems could be solved with variation selectors.

In the case of a disunification, you are compelled to choose one of 
the two forms.


If variation sequences are preferred:
— those who consider a form is wrong simply use the VS associated with 
the other form,
— and those who’re not bothered by that matter and wish more variation 
might use the letter Ŋ without any VS.


Of course, the VS should be included in the keyboard driver.
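
As a rough sketch of what "included in the keyboard driver" could mean, the snippet below models a key that emits capital eng followed by a selector chosen from a user preference. Everything here is an assumption for illustration: the selector assignments are hypothetical (nothing of the kind is standardized), FE00 and FE01 are used because Ken's reply in this thread notes that FE0E and FE0F are reserved for emoji variants, and the names FORM_SELECTORS and capital_eng_keystroke are invented.

```python
CAPITAL_ENG = "\u014A"  # LATIN CAPITAL LETTER ENG

# Hypothetical selector assignments; nothing like this is standardized.
FORM_SELECTORS = {
    "N-form": "\uFE00",  # VARIATION SELECTOR-1
    "n-form": "\uFE01",  # VARIATION SELECTOR-2
    "any": "",           # no preference: bare letter, font decides
}


def capital_eng_keystroke(preference: str) -> str:
    """Return the code point sequence a layout would emit for capital eng."""
    return CAPITAL_ENG + FORM_SELECTORS[preference]


for preference in FORM_SELECTORS:
    emitted = capital_eng_keystroke(preference)
    print(preference, [f"U+{ord(ch):04X}" for ch in emitted])
```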



About the history question, should I print my curriculum vitæ in 
blackletter only because that writing style has been used in my country?








RE: Engmagate?

2013-12-13 Thread Jonathan Rosenne
All this just endorses  "Go to, let us go down, and there confound their 
language"

Best regards,
Jonathan (Jony) Rosenne

-----Original Message-----
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf 
Of Denis Jacquerye
Sent: Friday, December 13, 2013 9:33 AM
To: Michael Everson
Cc: Leo Broukhis; Don Osborn; unicode Unicode Discussion; Eng in the UCS
Subject: Re: Engmagate?

On Thu, Dec 12, 2013 at 10:06 PM, Michael Everson  wrote:
> On 12 Dec 2013, at 15:29, Leo Broukhis  wrote:
>
>> Hasn't http://www.unicode.org/standard/where/#Variant_Shapes explained it 
>> once and for all?
>
> No, because users of N-shaped capital Eng consider n-shaped capital Eng to be 
> *WRONG*, not an acceptable variant. And because n-shaped capital Eng consider 
> N-shaped capital Eng to be *WRONG*, not an acceptable variant.
>
> Disunification is the best solution.
>
> I suppose nothing will happen until the governments of eng-using countries 
> come together with a proposal.

What if not every user of one form considers it wrong to use the other form?

What if there’s evidence of use of both forms in those languages?

What if the users who consider the other shape wrong are unaware of the history 
or variation of their own orthographies?

--
Denis Moyogo Jacquerye
African Network for Localisation http://www.africanlocalisation.net/
Nkótá ya Kongó míbalé --- http://info-langues-congo.1sd.org/
DejaVu fonts --- http://www.dejavu-fonts.org/






Re: Engmagate?

2013-12-13 Thread Jean-François Colson

Le 13/12/13 08:33, Denis Jacquerye a écrit :

On Thu, Dec 12, 2013 at 10:06 PM, Michael Everson  wrote:

On 12 Dec 2013, at 15:29, Leo Broukhis  wrote:


Hasn't http://www.unicode.org/standard/where/#Variant_Shapes explained it once 
and for all?

No, because users of N-shaped capital Eng consider n-shaped capital Eng to be 
*WRONG*, not an acceptable variant. And because n-shaped capital Eng consider 
N-shaped capital Eng to be *WRONG*, not an acceptable variant.

Disunification is the best solution.

I suppose nothing will happen until the governments of eng-using countries come 
together with a proposal.

What if not every user of one form considers it wrong to use the other form?

What if there’s evidence of use of both forms in those languages?

What if the users who consider the other shape wrong are unaware of
the history or variation of their own orthographies?



All those problems could be solved with variation selectors.

In the case of a disunification, you are compelled to choose one of the 
two forms.


If variation sequences are preferred:
— those who consider a form is wrong simply use the VS associated with 
the other form,
— and those who’re not bothered by that matter and wish more variation 
might use the letter Ŋ without any VS.


About the history question, should I print my curriculum vitæ in 
blackletter only because that writing style has been used in my country?





Re: Engmagate?

2013-12-12 Thread Denis Jacquerye
On Thu, Dec 12, 2013 at 10:06 PM, Michael Everson  wrote:
> On 12 Dec 2013, at 15:29, Leo Broukhis  wrote:
>
>> Hasn't http://www.unicode.org/standard/where/#Variant_Shapes explained it 
>> once and for all?
>
> No, because users of N-shaped capital Eng consider n-shaped capital Eng to be 
> *WRONG*, not an acceptable variant. And because n-shaped capital Eng consider 
> N-shaped capital Eng to be *WRONG*, not an acceptable variant.
>
> Disunification is the best solution.
>
> I suppose nothing will happen until the governments of eng-using countries 
> come together with a proposal.

What if not every user of one form considers it wrong to use the other form?

What if there’s evidence of use of both forms in those languages?

What if the users who consider the other shape wrong are unaware of
the history or variation of their own orthographies?

-- 
Denis Moyogo Jacquerye
African Network for Localisation http://www.africanlocalisation.net/
Nkótá ya Kongó míbalé --- http://info-langues-congo.1sd.org/
DejaVu fonts --- http://www.dejavu-fonts.org/




Re: Engmagate?

2013-12-12 Thread Jean-François Colson

Le 13/12/13 00:10, Leo Broukhis a écrit :
In the case of ɖ vs ð vs đ, there are three different letters, as 
follows from their names, that happen to have identical capital glyphs 
(those you've mentioned plus U+0110 LATIN CAPITAL LETTER D WITH STROKE).


Speaking of đ, "an alternate glyph with the stroke through the bowl is 
used in Americanist orthographies" without any [loud] cries about 
disunification.


Have you ever made an inquiry about the fonts which might be rejected by 
Americanists because đ has the stroke through the ascender and about the 
fonts which might be rejected by Croatian/Sami/Vietnamese speakers 
because đ has the stroke through the bowl?




If N-Eng and n-Eng are disunified but small engs aren't (should 
they?), who keeps the "default" "toupper" conversion?


> And while they are at it, I wouldn't refuse if they squared the circle.

That's exactly right.

Leo






Re: Engmagate?

2013-12-12 Thread Jean-François Colson

Le 12/12/13 23:06, Michael Everson a écrit :

On 12 Dec 2013, at 15:29, Leo Broukhis  wrote:

Hasn't http://www.unicode.org/standard/where/#Variant_Shapes 
explained it once and for all?
No, because users of N-shaped capital Eng consider n-shaped capital 
Eng to be *WRONG*, not an acceptable variant. And because n-shaped 
capital Eng consider N-shaped capital Eng to be *WRONG*, not an 
acceptable variant.


Is that as wrong as if "ændern" was used instead of "ändern" in German 
or "Lätitia" instead of the surname "Lætitia" in French, based on the 
fact that "Händel" is also spelled "Hændel"?


BTW, why are there two separate characters for "ð" and "?"?




Disunification is the best solution.

I suppose nothing will happen until the governments of eng-using 
countries come together with a proposal.


Michael Everson * http://www.evertype.com/





I wonder whether disunification is THE solution.

For some characters there's already a standardized use of VSs:

0023 NUMBER SIGN
= pound sign, hash, crosshatch, octothorpe
x (l b bar symbol - 2114)
x (music sharp sign - 266F)
~ 0023 FE0E text style
~ 0023 FE0F emoji style
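
For readers who want to see what these two sequences are at the code point level, here is a minimal Python sketch. The constant and function names are made up for the example; the only facts relied on are the code points listed above and the character names in Python's unicodedata module.

```python
import unicodedata

NUMBER_SIGN = "\u0023"
TEXT_STYLE = NUMBER_SIGN + "\uFE0E"   # 0023 FE0E text style
EMOJI_STYLE = NUMBER_SIGN + "\uFE0F"  # 0023 FE0F emoji style


def describe(sequence: str) -> list:
    """List each code point of a sequence with its character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


print(describe(TEXT_STYLE))
print(describe(EMOJI_STYLE))
```

Each sequence is two code points long: the base character is unchanged and the selector is an ordinary code point that follows it in the text.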

Would it be inconceivable to add comments such as:

014A LATIN CAPITAL LETTER ENG
* glyph may also have appearance of large form of the small letter
~ 014A FE0E N form
~ 014A FE0F n form
?



Re: Engmagate?

2013-12-12 Thread Jean-François Colson

Le 13/12/13 00:10, Leo Broukhis a écrit :
In the case of ɖ vs ð vs đ, there are three different letters, as 
follows from their names, that happen to have identical capital glyphs 
(those you've mentioned plus U+0110 LATIN CAPITAL LETTER D WITH STROKE).


Speaking of đ, "an alternate glyph with the stroke through the bowl is 
used in Americanist orthographies" without any [loud] cries about 
disunification.


If N-Eng and n-Eng are disunified but small engs aren't (should 
they?), who keeps the "default" "toupper" conversion?


If ever the small engs were disunified, the capital ones should be 
disunified too, or that would lead to a problem à la Turkish where, in 
international databases, i’s lose their dot when Turkish names are 
capitalized while they shouldn’t.
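
The Turkish problem is easy to reproduce. The sketch below uses Python's built-in str.upper(), which applies the language-neutral mapping, next to a small tailored helper; the helper name turkish_upper is invented for the example and handles only the dotted and dotless i.

```python
def turkish_upper(text: str) -> str:
    """Uppercase with the Turkish dotted/dotless i rule applied first."""
    # i -> U+0130 (capital I with dot above), U+0131 (dotless i) -> I
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


name = "izmir"
default_result = name.upper()         # language-neutral mapping
turkish_result = turkish_upper(name)  # tailored mapping

print(default_result, turkish_result)
```

A database that only ever applies the default mapping has no way to recover the dots afterwards, which is the situation the paragraph above warns about.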




> And while they are at it, I wouldn't refuse if they squared the circle.

That's exactly right.

Leo




On Thu, Dec 12, 2013 at 2:52 PM, Asmus Freytag wrote:


On 12/12/2013 2:25 PM, Leo Broukhis wrote:

Hmmm... As a person with Russian as the first language I can
assure you that from any literate Russian-speaking person's
perspective italic ū is an unacceptable and *WRONG*
representation of п (because in Russian, unlike Serbian, there is
й). Should we bother disunifying?


This example adds the issue of font style - because for styles
other than italic, the issue doesn't exist. I would take that as a
stronger indication that this is an issue that belongs in glyph
space.



The fact that the lowercase letter is the same in both cases
proves that the difference between N-Eng and n-Eng is purely
stylistic rather than semantic. Unicode shouldn't bother with
those minutia.


What about the reverse case, where the uppercase is the same and
the lower case isn't?

There are precedents in Unicode where these have been disunified.

U+00D0 LATIN CAPITAL LETTER ETH
U+0189 LATIN CAPITAL LETTER AFRICAN D

look exactly identical.

Precedents like this make the issue considerably less than clear cut,




> I suppose nothing will happen until the governments of
eng-using countries come together with a proposal.

Let's hope so. I wish they never do.


Lets hope they come together and endorse a solution that takes
into account not only rendering, but identifier security issues as
well. And while they are at it, I wouldn't refuse if they squared
the circle.

A./



Leo



On Thu, Dec 12, 2013 at 2:06 PM, Michael Everson wrote:

On 12 Dec 2013, at 15:29, Leo Broukhis wrote:

> Hasn't
http://www.unicode.org/standard/where/#Variant_Shapes
explained it once and for all?

No, because users of N-shaped capital Eng consider n-shaped
capital Eng to be *WRONG*, not an acceptable variant. And
because n-shaped capital Eng consider N-shaped capital Eng to
be *WRONG*, not an acceptable variant.

Disunification is the best solution.

I suppose nothing will happen until the governments of
eng-using countries come together with a proposal.

Michael Everson * http://www.evertype.com/









Re: Engmagate?

2013-12-12 Thread Jean-François Colson

Le 13/12/13 03:24, Michael Everson a écrit :

On 12 Dec 2013, at 22:25, Leo Broukhis  wrote:

Hmmm... As a person with Russian as the first language I can assure 
you that from any literate Russian-speaking person's perspective 
italic ū is an unacceptable and *WRONG* representation of п (because 
in Russian, unlike Serbian, there is й). Should we bother disunifying?

Italic is not plain text.


Really???
Why?

If in Gedit I choose the font “Liberation Sans Italic” ( 
http://colson.eu/ItalicPlainText.png ) to display a plain text, isn’t 
that a plain text displayed with an italic font?
Gedit is a text editor (not a word processor) where you can choose a 
font for the whole document but not for a small part of the text.





I suppose nothing will happen until the governments of eng-using 
countries come together with a proposal.

Let's hope so. I wish they never do.
Yeah. To heck with the end user and their pathetic 
preferences.


Michael Everson * http://www.evertype.com/







Re: Engmagate?

2013-12-12 Thread Asmus Freytag

On 12/12/2013 6:38 PM, Leo Broukhis wrote:

> Italic is not plain text.

Is this the only thing that would have stopped you from advocating 
disunification?


> Yeah. To heck with the end user and their pathetic preferences.

Is a preference to have traditional and simplified CJK characters 
disunified more or less pathetic (and why) than the preference at hand?


The process has been plagued by the lack of knowledgeable and engaged 
members from the affected communities when it comes to certain regions.


Ultimately any unification decision makes tradeoffs. Without the 
communities involved, what you have is experts acting as self-appointed 
representatives. Because tradeoffs by their nature have not only 
benefits but also costs, I can fully understand a reluctance to change 
the tradeoff based on any form of arms-length interaction.


In the CJK case, the communities were very actively involved - their 
constituents were not unanimous, but they took part in the 
development and ultimately approved the tradeoffs involved after a 
lengthy and formal process.


Whether the governments are the best representatives of their 
communities depends very much on circumstances, so I'd like to support a 
restatement of the comment made earlier:


This issue will not be really settled until the eng-using communities 
get together and resolve which tradeoff works best for them. Let's hope 
they do, and soon.


A./


Leo


On Thu, Dec 12, 2013 at 6:24 PM, Michael Everson wrote:


On 12 Dec 2013, at 22:25, Leo Broukhis wrote:

> Hmmm... As a person with Russian as the first language I can
assure you that from any literate Russian-speaking person's
perspective italic ū is an unacceptable and *WRONG* representation
of п (because in Russian, unlike Serbian, there is й). Should we
bother disunifying?

Italic is not plain text.

> > I suppose nothing will happen until the governments of
eng-using countries come together with a proposal.
>
> Let's hope so. I wish they never do.

Yeah. To heck with the end user and their pathetic
preferences.

Michael Everson * http://www.evertype.com/






Re: Engmagate?

2013-12-12 Thread Leo Broukhis
> Italic is not plain text.

Is this the only thing that would have stopped you from advocating
disunification?

> Yeah. To heck with the end user and their pathetic preferences.

Is a preference to have traditional and simplified CJK characters
disunified more or less pathetic (and why) than the preference at hand?

Leo


On Thu, Dec 12, 2013 at 6:24 PM, Michael Everson wrote:

> On 12 Dec 2013, at 22:25, Leo Broukhis  wrote:
>
> > Hmmm... As a person with Russian as the first language I can assure you
> that from any literate Russian-speaking person's perspective italic ū is an
> unacceptable and *WRONG* representation of п (because in Russian, unlike
> Serbian, there is й). Should we bother disunifying?
>
> Italic is not plain text.
>
> > > I suppose nothing will happen until the governments of eng-using
> countries come together with a proposal.
> >
> > Let's hope so. I wish they never do.
>
> Yeah. To heck with the end user and their pathetic
> preferences.
>
> Michael Everson * http://www.evertype.com/
>
>


Re: Engmagate?

2013-12-12 Thread Michael Everson
On 12 Dec 2013, at 22:25, Leo Broukhis  wrote:

> Hmmm... As a person with Russian as the first language I can assure you that 
> from any literate Russian-speaking person's perspective italic ū is an 
> unacceptable and *WRONG* representation of п (because in Russian, unlike 
> Serbian, there is й). Should we bother disunifying?

Italic is not plain text. 

> > I suppose nothing will happen until the governments of eng-using countries 
> > come together with a proposal.
> 
> Let's hope so. I wish they never do.

Yeah. To heck with the end user and their pathetic preferences.

Michael Everson * http://www.evertype.com/





Re: Engmagate?

2013-12-12 Thread Asmus Freytag

On 12/12/2013 3:10 PM, Leo Broukhis wrote:
In the case of ɖ vs ð vs đ, there are three different letters, as 
follows from their names, that happen to have identical capital glyphs 
(those you've mentioned plus U+0110 LATIN CAPITAL LETTER D WITH STROKE).


Speaking of đ, "an alternate glyph with the stroke through the bowl is 
used in Americanist orthographies" without any [loud] cries about 
disunification.


If N-Eng and n-Eng are disunified but small engs aren't (should 
they?), who keeps the "default" "toupper" conversion?


> And while they are at it, I wouldn't refuse if they squared the circle.

That's exactly right.

Leo




On Thu, Dec 12, 2013 at 2:52 PM, Asmus Freytag wrote:


On 12/12/2013 2:25 PM, Leo Broukhis wrote:

Hmmm... As a person with Russian as the first language I can
assure you that from any literate Russian-speaking person's
perspective italic ū is an unacceptable and *WRONG*
representation of п (because in Russian, unlike Serbian, there is
й). Should we bother disunifying?


This example adds the issue of font style - because for styles
other than italic, the issue doesn't exist. I would take that as a
stronger indication that this is an issue that belongs in glyph
space.



The fact that the lowercase letter is the same in both cases
proves that the difference between N-Eng and n-Eng is purely
stylistic rather than semantic. Unicode shouldn't bother with
those minutia.


What about the reverse case, where the uppercase is the same and
the lower case isn't?

There are precedents in Unicode where these have been disunified.

U+00D0 LATIN CAPITAL LETTER ETH
U+0189 LATIN CAPITAL LETTER AFRICAN D

look exactly identical.

Precedents like this make the issue considerably less than clear cut,




> I suppose nothing will happen until the governments of
eng-using countries come together with a proposal.

Let's hope so. I wish they never do.


Lets hope they come together and endorse a solution that takes
into account not only rendering, but identifier security issues as
well. And while they are at it, I wouldn't refuse if they squared
the circle.



For some identifier systems it may be possible to institute a weak 
normalization that would reduce the impact of duplicating the lower case 
letter.
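
One way to picture such a weak normalization is a comparison key that folds a duplicated small letter back onto the existing one before identifiers are compared. The sketch below is only a guess at the idea: the second small eng does not exist, so a private-use code point stands in for it, and the names WEAK_FOLD and identifier_key are invented.

```python
import unicodedata

SMALL_ENG = "\u014B"                      # LATIN SMALL LETTER ENG
HYPOTHETICAL_SECOND_SMALL_ENG = "\uE001"  # private-use placeholder only

# Fold the hypothetical duplicate onto the existing letter.
WEAK_FOLD = {ord(HYPOTHETICAL_SECOND_SMALL_ENG): SMALL_ENG}


def identifier_key(identifier: str) -> str:
    """Comparison key: NFC, case folding, then the weak fold."""
    folded = unicodedata.normalize("NFC", identifier).casefold()
    return folded.translate(WEAK_FOLD)


same = identifier_key("so\u014Ba") == identifier_key("so\uE001a")
print(same)
```

The key is used only for comparison; the stored identifier keeps whichever code point the user typed.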


But disunifying any character at this stage may lead to either abandoned 
data, or data that gets miscoded w/o anybody being the wiser.


Hence it is not true that

"Disunification is the best solution."

under every scenario.

A./





Leo



On Thu, Dec 12, 2013 at 2:06 PM, Michael Everson wrote:

On 12 Dec 2013, at 15:29, Leo Broukhis wrote:

> Hasn't
http://www.unicode.org/standard/where/#Variant_Shapes
explained it once and for all?

No, because users of N-shaped capital Eng consider n-shaped
capital Eng to be *WRONG*, not an acceptable variant. And
because n-shaped capital Eng consider N-shaped capital Eng to
be *WRONG*, not an acceptable variant.

Disunification is the best solution.

I suppose nothing will happen until the governments of
eng-using countries come together with a proposal.

Michael Everson * http://www.evertype.com/









Re: Engmagate?

2013-12-12 Thread Leo Broukhis
In the case of ɖ vs ð vs đ, there are three different letters, as follows
from their names, that happen to have identical capital glyphs (those
you've mentioned plus U+0110 LATIN CAPITAL LETTER D WITH STROKE).

Speaking of đ, "an alternate glyph with the stroke through the bowl is used
in Americanist orthographies" without any [loud] cries about disunification.

If N-Eng and n-Eng are disunified but small engs aren't (should they?), who
keeps the "default" "toupper" conversion?
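
The "toupper" question can be modelled with two small tables. This is a sketch under loud assumptions: the second capital eng does not exist, so a private-use code point stands in for it, and the "first entry wins" rule for the default mapping is just one arbitrary choice.

```python
SMALL_ENG = "\u014B"                    # LATIN SMALL LETTER ENG
CAPITAL_ENG = "\u014A"                  # existing capital
HYPOTHETICAL_SECOND_CAPITAL = "\uE000"  # private-use placeholder only

# Both capitals would lowercase to the one shared small letter.
TO_LOWER = {
    CAPITAL_ENG: SMALL_ENG,
    HYPOTHETICAL_SECOND_CAPITAL: SMALL_ENG,
}

# A simple toupper is a function: one small letter, one default capital.
TO_UPPER = {}
for capital, small in TO_LOWER.items():
    TO_UPPER.setdefault(small, capital)  # first entry wins (arbitrary)


def round_trips(capital: str) -> bool:
    """True if lowercasing and then uppercasing returns the same capital."""
    return TO_UPPER[TO_LOWER[capital]] == capital


print(round_trips(CAPITAL_ENG), round_trips(HYPOTHETICAL_SECOND_CAPITAL))
```

Whichever capital is chosen as the default, the other one stops round-tripping through lowercase.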

> And while they are at it, I wouldn't refuse if they squared the circle.

That's exactly right.

Leo




On Thu, Dec 12, 2013 at 2:52 PM, Asmus Freytag  wrote:

>  On 12/12/2013 2:25 PM, Leo Broukhis wrote:
>
>   Hmmm... As a person with Russian as the first language I can assure you
> that from any literate Russian-speaking person's perspective italic ū is an
> unacceptable and *WRONG* representation of п (because in Russian, unlike
> Serbian, there is й). Should we bother disunifying?
>
>
> This example adds the issue of font style - because for styles other than
> italic, the issue doesn't exist. I would take that as a stronger indication
> that this is an issue that belongs in glyph space.
>
>
>  The fact that the lowercase letter is the same in both cases proves that
> the difference between N-Eng and n-Eng is purely stylistic rather than
> semantic. Unicode shouldn't bother with those minutia.
>
>
> What about the reverse case, where the uppercase is the same and the lower
> case isn't?
>
> There are precedents in Unicode where these have been disunified.
>
> U+00D0 LATIN CAPITAL LETTER ETH
> U+0189 LATIN CAPITAL LETTER AFRICAN D
>
> look exactly identical.
>
> Precedents like this make the issue considerably less than clear cut,
>
>
>
> > I suppose nothing will happen until the governments of eng-using
> countries come together with a proposal.
>
>  Let's hope so. I wish they never do.
>
>
> Lets hope they come together and endorse a solution that takes into
> account not only rendering, but identifier security issues as well. And
> while they are at it, I wouldn't refuse if they squared the circle.
>
> A./
>
>
>  Leo
>
>
>
> On Thu, Dec 12, 2013 at 2:06 PM, Michael Everson wrote:
>
>> On 12 Dec 2013, at 15:29, Leo Broukhis  wrote:
>>
>> > Hasn't http://www.unicode.org/standard/where/#Variant_Shapes explained
>> it once and for all?
>>
>>  No, because users of N-shaped capital Eng consider n-shaped capital Eng
>> to be *WRONG*, not an acceptable variant. And because n-shaped capital Eng
>> consider N-shaped capital Eng to be *WRONG*, not an acceptable variant.
>>
>> Disunification is the best solution.
>>
>> I suppose nothing will happen until the governments of eng-using
>> countries come together with a proposal.
>>
>> Michael Everson * http://www.evertype.com/
>>
>>
>
>


Re: Engmagate?

2013-12-12 Thread Asmus Freytag

On 12/12/2013 2:25 PM, Leo Broukhis wrote:
Hmmm... As a person with Russian as the first language I can assure 
you that from any literate Russian-speaking person's perspective 
italic ū is an unacceptable and *WRONG* representation of п (because 
in Russian, unlike Serbian, there is й). Should we bother disunifying?


This example adds the issue of font style - because for styles other 
than italic, the issue doesn't exist. I would take that as a stronger 
indication that this is an issue that belongs in glyph space.


The fact that the lowercase letter is the same in both cases proves 
that the difference between N-Eng and n-Eng is purely stylistic rather 
than semantic. Unicode shouldn't bother with those minutia.


What about the reverse case, where the uppercase is the same and the 
lower case isn't?


There are precedents in Unicode where these have been disunified.

U+00D0 LATIN CAPITAL LETTER ETH
U+0189 LATIN CAPITAL LETTER AFRICAN D

look exactly identical.
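
The precedent can be checked against the character database that ships with Python. The sketch only inspects the two code points named above; the helper name case_pair is made up for the example.

```python
import unicodedata

ETH = "\u00D0"        # LATIN CAPITAL LETTER ETH
AFRICAN_D = "\u0189"  # LATIN CAPITAL LETTER AFRICAN D


def case_pair(capital: str) -> tuple:
    """Return the names of a capital letter and of its lowercase mapping."""
    small = capital.lower()
    return unicodedata.name(capital), unicodedata.name(small)


print(case_pair(ETH))
print(case_pair(AFRICAN_D))
```

The capitals are separate code points and their lowercase mappings differ as well, which is the reverse of the eng situation.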

Precedents like this make the issue considerably less than clear cut,



> I suppose nothing will happen until the governments of eng-using 
countries come together with a proposal.


Let's hope so. I wish they never do.


Let's hope they come together and endorse a solution that takes into 
account not only rendering, but identifier security issues as well. And 
while they are at it, I wouldn't refuse if they squared the circle.


A./


Leo



On Thu, Dec 12, 2013 at 2:06 PM, Michael Everson wrote:


On 12 Dec 2013, at 15:29, Leo Broukhis wrote:

> Hasn't http://www.unicode.org/standard/where/#Variant_Shapes
explained it once and for all?

No, because users of N-shaped capital Eng consider n-shaped
capital Eng to be *WRONG*, not an acceptable variant. And because
n-shaped capital Eng consider N-shaped capital Eng to be *WRONG*,
not an acceptable variant.

Disunification is the best solution.

I suppose nothing will happen until the governments of eng-using
countries come together with a proposal.

Michael Everson * http://www.evertype.com/






Re: Engmagate?

2013-12-12 Thread Leo Broukhis
Hmmm... As a person with Russian as the first language I can assure you
that from any literate Russian-speaking person's perspective italic ū is an
unacceptable and *WRONG* representation of п (because in Russian, unlike
Serbian, there is й). Should we bother disunifying?

The fact that the lowercase letter is the same in both cases proves that
the difference between N-Eng and n-Eng is purely stylistic rather than
semantic. Unicode shouldn't bother with those minutiae.

> I suppose nothing will happen until the governments of eng-using
countries come together with a proposal.

Let's hope so. I wish they never do.

Leo



On Thu, Dec 12, 2013 at 2:06 PM, Michael Everson wrote:

> On 12 Dec 2013, at 15:29, Leo Broukhis  wrote:
>
> > Hasn't http://www.unicode.org/standard/where/#Variant_Shapes explained
> it once and for all?
>
> No, because users of N-shaped capital Eng consider n-shaped capital Eng to
> be *WRONG*, not an acceptable variant. And because n-shaped capital Eng
> consider N-shaped capital Eng to be *WRONG*, not an acceptable variant.
>
> Disunification is the best solution.
>
> I suppose nothing will happen until the governments of eng-using countries
> come together with a proposal.
>
> Michael Everson * http://www.evertype.com/
>
>


Re: Engmagate?

2013-12-12 Thread Michael Everson
On 12 Dec 2013, at 15:29, Leo Broukhis  wrote:

> Hasn't http://www.unicode.org/standard/where/#Variant_Shapes explained it 
> once and for all?

No, because users of N-shaped capital Eng consider n-shaped capital Eng to be 
*WRONG*, not an acceptable variant. And because users of n-shaped capital Eng 
consider N-shaped capital Eng to be *WRONG*, not an acceptable variant. 

Disunification is the best solution.

I suppose nothing will happen until the governments of eng-using countries come 
together with a proposal.

Michael Everson * http://www.evertype.com/





Re: Engmagate?

2013-12-12 Thread Philippe Verdy
However, you should have noted that this link just explains why the charts
cannot represent all possible shapes of a character. It presents some cases
(here we are in a situation exactly similar to the variant shapes of the italic
Cyrillic letter pe, whose preferred form differs greatly between Russian and
Serbian).

This small section in the standard is not enough. Notably it mixes several
very distinct cases:
- contextual shapes in Arabic are part of a separate normative
specification in TUS for joining types.
- contextual shapes adopted within specific *sequences* (e.g. in Indic
scripts, or alternative shapes that *should* be adopted when a base letter
is followed by some combining diacritics.)

The second set of variants would merit further normalization, notably for
Indic scripts. For now we can only find the relevant data outside TUS, for
example in OpenType specifications. This is not enough in my opinion,
because OpenType is not the only possible implementation of the
standard, and the way OpenType defines this is also not normative and
technically too far from the need: we need a clear and normalized
specification to know which *sequences* are expected (in various languages
or more generally in some scripts independently of the language), notably
for sequences involving joiner controls. These issues are too superficially
covered in the TUS chapters describing some scripts.

TUS has only standardized a few sequences by assigning normative names, but
without assigning them at least informative representative glyphs, and
nothing has been done to exhibit the representative glyphs expected in
some languages (e.g. the Serbian Cyrillic italic small letter pe).

We should think about extending the standard, starting with one or several
technical reports, which will later become part of the standard or be
integrated in the relevant chapters for each script, with an add-on after
the charts, which show only the representative glyphs for isolated
characters (which is just the "most common" shape expected in all
languages). Such specifications should expose the best practices that are
expected, many of which should become normative (even if there will still
be free space for variation, within the limits where differences between
the standardized sets of representative glyphs are preserved).

However, it requires that these add-on charts contain more identification
than just a single code point, as they should include other selectors:
language, style options like italic, sequences of code points, possibly
even a reference to some regular expression or similar (collating
element?) to infer the correct set of acceptable shapes.
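
A minimal sketch of what such a chart key could look like, assuming a
lookup that goes from the most specific selector to the bare code point
sequence. Every name here (GlyphKey, CHART, representative_glyph) and every
chart entry is hypothetical and only illustrates the idea; none of it
exists in TUS or in any published data file.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class GlyphKey:
    """Hypothetical chart key: more than just a single code point."""
    code_points: Tuple[int, ...]      # one code point or a sequence
    language: Optional[str] = None    # language tag; None means "any"
    italic: Optional[bool] = None     # style selector; None means "any"


# Hypothetical add-on chart. The values are descriptive labels, not glyphs.
CHART = {
    GlyphKey((0x043F,)): "pe, default form",
    GlyphKey((0x043F,), language="sr", italic=True): "pe, Serbian italic form",
    GlyphKey((0x014A,)): "ENG, chart form",
}


def representative_glyph(code_points: Sequence[int],
                         language: Optional[str] = None,
                         italic: bool = False) -> Optional[str]:
    """Return the most specific chart entry, falling back step by step."""
    cps = tuple(code_points)
    candidates = (
        GlyphKey(cps, language, italic),  # language and style both match
        GlyphKey(cps, language, None),    # language only
        GlyphKey(cps, None, italic),      # style only
        GlyphKey(cps),                    # bare code point sequence
    )
    for key in candidates:
        if key in CHART:
            return CHART[key]
    return None


print(representative_glyph([0x043F], language="sr", italic=True))
print(representative_glyph([0x043F], language="ru", italic=True))
```

The fallback order is one possible design choice; a real specification
would have to say which selector wins when several partially match.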



2013/12/12 Leo Broukhis 

> Hasn't http://www.unicode.org/standard/where/#Variant_Shapes explained it
> once and for all?
>
> Leo
>
>
> On Thu, Dec 12, 2013 at 4:42 AM,  wrote:
>
>> FWIW, a blog post prompted by discussions in the wake of a DejaVu font
>> use of N-form over n-form capital ŋ ("eng" or "engma"):
>>
>> "The 'eng' times for unified capital ŋ?"
>> http://niamey.blogspot.com/2013/12/the-eng-times-for-unified-capital.html
>>
>> It's not a new issue, but was leaving the two main forms of capital eng
>> as variants of one character the best course of action? In any event, it's
>> probably more complex to disunify now (if that were to be decided) than it
>> would have been, say, 10-12 years ago.
>>
>> Don Osborn
>>
>> Sent via BlackBerry by AT&T
>>
>>
>>
>


Re: Engmagate?

2013-12-12 Thread Philippe Verdy
No, this link only explains that variants are possible, expected, even
desirable, as long as they do not disrupt the languages with which these
variants are used or cause confusion.
But before thinking about disunifying the ENG/eng pair for these variants,
we need to find convincing opposed pairs where such variants cause
misinterpretation of the encoded texts.

If these variants are separated only by their *preferred* letter form, this
is not enough to justify disunification. The article does not demonstrate
any case where words/sentences would be misinterpreted when reading them
with one form or the other. Maybe these forms are not the preferred ones,
but this is not really a problem (for now, until it is demonstrated).

So the issue can be "easily" solved by managing fonts with
language-sensitive letterforms. OpenType already has the necessary support
for it (this is the same issue as between Simplified and Traditional
Chinese letter forms). This means that for best rendering of text, we need
language tagging (outside of the encoded texts, in metadata, or in rich
text formats by embedding explicit language attributes, which renderers
will consider to select which letter form to render).
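
A rough sketch of that selection step, assuming the renderer receives a
language tag from metadata or markup. The glyph names ("Eng.N", "Eng.n"),
the per-language preference table and the default are illustrative
assumptions, not data taken from any font or from the standard; in a real
font this kind of substitution would typically live in a
language-specific OpenType feature rather than in application code.

```python
ENG_CAPITAL = "\u014A"  # LATIN CAPITAL LETTER ENG, one code point for both forms

# Illustrative assumption: which language tag prefers which capital form.
PREFERRED_ENG_FORM = {
    "se": "Eng.N",  # hypothetical glyph name for the N-shaped form
    "ee": "Eng.n",  # hypothetical glyph name for the n-shaped form
}
DEFAULT_ENG_FORM = "Eng.N"  # assumption: whatever the font ships as default


def glyph_for(char, lang=None):
    """Pick a glyph name for one character, given an optional language tag."""
    if char != ENG_CAPITAL:
        return f"uni{ord(char):04X}"
    tag = (lang or "").lower()
    # Fall back from the full tag to its prefixes, e.g. "ee-gh" -> "ee".
    while tag:
        if tag in PREFERRED_ENG_FORM:
            return PREFERRED_ENG_FORM[tag]
        tag = tag.rpartition("-")[0]
    return DEFAULT_ENG_FORM


# The encoded text is identical in both cases; only the tagging differs.
print(glyph_for(ENG_CAPITAL, "ee-GH"))
print(glyph_for(ENG_CAPITAL, "se"))
# Case mapping stays on the single encoded pair U+014A / U+014B.
print(ENG_CAPITAL.lower() == "\u014B")
```

Untagged text falls through to the default, which is exactly the case
where a reader may see the form they consider wrong.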

2013/12/12 Leo Broukhis 

> Hasn't http://www.unicode.org/standard/where/#Variant_Shapes explained it
> once and for all?
>
> Leo
>
>
> On Thu, Dec 12, 2013 at 4:42 AM,  wrote:
>
>> FWIW, a blog post prompted by discussions in the wake of a DejaVu font
>> use of N-form over n-form capital ŋ ("eng" or "engma"):
>>
>> "The 'eng' times for unified capital ŋ?"
>> http://niamey.blogspot.com/2013/12/the-eng-times-for-unified-capital.html
>>
>> It's not a new issue, but was leaving the two main forms of capital eng
>> as variants of one character the best course of action? In any event, it's
>> probably more complex to disunify now (if that were to be decided) than it
>> would have been, say, 10-12 years ago.
>>
>> Don Osborn
>>
>> Sent via BlackBerry by AT&T
>>
>>
>>
>


Re: Engmagate?

2013-12-12 Thread Leo Broukhis
Hasn't http://www.unicode.org/standard/where/#Variant_Shapes explained it
once and for all?

Leo


On Thu, Dec 12, 2013 at 4:42 AM,  wrote:

> FWIW, a blog post prompted by discussions in the wake of a DejaVu font use
> of N-form over n-form capital ŋ ("eng" or "engma"):
>
> "The 'eng' times for unified capital ŋ?"
> http://niamey.blogspot.com/2013/12/the-eng-times-for-unified-capital.html
>
> It's not a new issue, but was leaving the two main forms of capital eng as
> variants of one character the best course of action? In any event, it's
> probably more complex to disunify now (if that were to be decided) than it
> would have been, say, 10-12 years ago.
>
> Don Osborn
>
> Sent via BlackBerry by AT&T
>
>
>