Re: A sketch with the best-known Swiss tongue twister

2018-03-09 Thread Mark Davis ☕️ via Unicode
> In summary you do not object to the fact that unqualified "gsw" language code

Whether I object or not makes no difference.

Whether for good or for bad, the gsw code (clearly originally for
German-Swiss from the code letters) has been expanded beyond the borders of
Switzerland. There are also separate codes for Schwäbisch and
Waliserdütsch, so outside of Switzerland 'gsw' mainly extends to Elsassisch
(Alsace, ~0.5M speakers). So gsw-CH works to limit the scope to Switzerland
(~4.5M speakers).
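
To illustrate how a region subtag narrows a language code, here is a minimal Python sketch. The function name `split_tag` and the lookup table are hypothetical, the speaker figures are just the approximate ones quoted above, and only the simple `language-REGION` shape is handled, not full BCP 47.

```python
def split_tag(tag):
    """Split a simple 'language' or 'language-REGION' tag into its parts.

    Only the two shapes used in this thread are handled; real BCP 47 tags
    can also carry script, variant, and extension subtags.
    """
    parts = tag.split("-")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 else None
    return language, region


# Approximate speaker counts as quoted in the message above (illustrative only).
APPROX_SPEAKERS = {
    ("gsw", "CH"): 4_500_000,  # Swiss German in Switzerland
    ("gsw", "FR"): 500_000,    # Elsassisch (Alsace)
}

for tag in ("gsw", "gsw-CH", "gsw-FR", "de-CH"):
    language, region = split_tag(tag)
    print(tag, "->", language, region, APPROX_SPEAKERS.get((language, region)))
```

The unqualified "gsw" deliberately has no entry in the table: without a region the scope stays the whole code, which is the point being made here.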

> My opinion is that even the Swiss variants should be preferably named
"Swiss Alemannic" collectively...

That's clearly also not going to happen for the English term. Good luck
with the French equivalent...

Mark

On Fri, Mar 9, 2018 at 3:52 PM, Philippe Verdy  wrote:

> In summary you do not object to the fact that unqualified "gsw" language code
> is not (and should not be) named "Swiss German" (as it is only for
> "gsw-CH", not for any other non-Swiss variants of Alemannic).
>
> The addition of "High" is optional, unneeded in fact, as it does not
> remove any ambiguity, in Germany for "de-DE", or in Switzerland for
> "de-CH", or in Italian South Tyrol for "de-IT", or in Austria for "de-AT",
> or even for "Standard German" (de)
>
> Note also that Alsatian itself ("gsw-FR") is considered part of the "High
> German" branch of Germanic languages !
>
> "High German" refers to the group that includes Standard German and its
> national variants ("de", "de-DE", "de-CH", "de-AT", "de-IT") as
> well as the Alemannic group ("gsw", "gsw-FR", "gsw-CH"), possibly extended
> (this is debatable) to Schwäbisch in Germany and Hungary.
>
> My opinion is that even the Swiss variants should be preferably named
> "Swiss Alemannic" collectively, and not "Swiss German" which causes
> constant confusion between "de-CH" and "gsw-CH".
>
>
> 2018-03-09 15:11 GMT+01:00 Mark Davis ☕️ via Unicode 
> :
>
>> Yes, the right English names are "Swiss High German" for de-CH, and
>> "Swiss German" for gsw-CH.
>>
>> Mark
>>
>> On Fri, Mar 9, 2018 at 2:40 PM, Tom Gewecke via Unicode <
>> unicode@unicode.org> wrote:
>>
>>>
>>> > On Mar 9, 2018, at 5:52 AM, Philippe Verdy via Unicode <
>>> unicode@unicode.org> wrote:
>>> >
>>> > So the "best-known Swiss tongue" is still not so much known, and still
>>> incorrectly referenced (frequently confused with "Swiss German", which is
>>> much like standard High German
>>>
>>> I think Swiss German is in fact the correct English name for the Swiss
>>> dialects, taken from the German Schweizerdeutsch.
>>>
>>> https://en.wikipedia.org/wiki/Swiss_German
>>>
>>
>>
>


Re: Unicode Emoji 11.0 characters now ready for adoption!

2018-03-09 Thread Ken Whistler via Unicode



On 3/9/2018 9:29 AM, via Unicode wrote:
> Documented increase such as scientific terms for new elements, flora
> and fauna, would seem to be not more than one or two dozen a year.


Indeed. Of the "urgently needed characters" added to the unified CJK 
ideographs for Unicode 11.0, two were obscure place name characters 
needed to complete mapping for the Japanese IT mandatory use of the Moji 
Joho collection.


The other three were newly standardized Chinese characters for 
superheavy elements that now have official designations by the IUPAC (as 
of December 2015): Nihonium (113), Tennessine (117) and Oganesson (118). 
The Chinese characters coined for those 3 were encoded at U+9FED, 
U+9FEC, and U+9FEB, respectively.
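
For readers who want to look at the three characters, a small Python sketch follows. The dictionary and the helper name `describe` are illustrative; the code points are the ones given above, and whether the glyphs display depends on the fonts installed.

```python
# Code points as given in the message above.
SUPERHEAVY = {
    "Nihonium (113)": 0x9FED,
    "Tennessine (117)": 0x9FEC,
    "Oganesson (118)": 0x9FEB,
}


def describe(code_point):
    """Return the 'U+XXXX' notation followed by the character itself."""
    return f"U+{code_point:04X} {chr(code_point)}"


for element, cp in SUPERHEAVY.items():
    print(element, describe(cp))
```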


Oganesson, in particular, is of interest, as the heaviest known element 
produced to date. It is the subject of thousands of hours of intense 
experimentation and of hundreds of scientific papers, but:


   ... since 2005, only five (possibly six) atoms of the nuclide ²⁹⁴Og
   have been detected.


But we already have a Chinese character (pronounced ào) for Og, and a 
standardized Unicode code point for it: U+9FEB.


Next up: unobtanium and hardtofindium

--Ken



Re: A sketch with the best-known Swiss tongue twister

2018-03-09 Thread Philippe Verdy via Unicode
Is that just for Switzerland, in one of the local dialectal variants? Or
more generally Alemannic (also in Northeastern France, South Germany,
Western Austria, Liechtenstein, Northern Italy)?

2018-03-09 12:09 GMT+01:00 Mark Davis ☕️ via Unicode:

> https://www.youtube.com/watch?v=QOwITNazUKg
>
> De Papscht hät z’Schpiäz s’Schpäkchbschtekch z’schpaat bschtellt.
> literally: The Pope has [in Spiez] [the bacon cutlery] [too late] ordered.
>
> Mark
>


Re: Unicode Emoji 11.0 characters now ready for adoption!

2018-03-09 Thread Martin J. Dürst via Unicode

On 2018/03/09 10:17, Philippe Verdy via Unicode wrote:

> This still leaves the question about how to write personal names !
> IDS alone cannot represent them without enabling some "reasonable"
> ligaturing (they don't have to match the exact strokes variants for optimal
> placement, or with all possible simplifications).
> I'm curious to know how China, Taiwan, Singapore or Japan handle this (for
> official records or in banks): like our personal signatures (as digital
> images), and then using a simplified official record (including the
> registration of romanized names)?


This question seems to assume more of a difference between alphabetic 
and ideographic traditions than there actually is. A name in ideographs, 
in the same way as a name in alphabetic characters, is defined by the 
characters that are used, not by stuff like stroke variants, etc. And 
virtually all names, even before the introduction of computers, and even 
more after that, use reasonably frequent characters.


The difference, at least in Japan, is that some people keep the 
ideograph before simplification in their official records, but they may 
or may not insist on its use in everyday practice. In most cases, both a 
traditional and a simplified variant are available. Examples are 広/廣, 
高/髙, 崎/﨑, and so on. I regularly hit such cases when grading, because 
our university database uses the formal (old) one, where students may 
not care about it and enter the new one on some system where they have 
to enter their name by themselves.
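
A minimal Python sketch of how such mismatches could be tolerated when comparing names: fold the formal (old) form onto the everyday form before comparing. The table, the function names, and the sample name are assumptions for illustration, covering only the three pairs mentioned above; this is not how any particular university or government system works.

```python
# Hypothetical folding table: formal (old) form -> everyday form.
# Each variant is a separate code point, hence the explicit table.
VARIANT_FOLD = str.maketrans({
    "\u5ee3": "\u5e83",  # 廣 -> 広
    "\u9ad9": "\u9ad8",  # 髙 -> 高
    "\ufa11": "\u5d0e",  # 﨑 -> 崎
})


def name_key(name):
    """Return a comparison key in which the listed variants are folded."""
    return name.translate(VARIANT_FOLD)


def same_name(a, b):
    """Compare two names while ignoring the listed variant differences."""
    return name_key(a) == name_key(b)


# Sample name only: the same surname written with 髙 and with 高.
print(same_name("\u9ad9\u6a4b", "\u9ad8\u6a4b"))  # True
```

The stored names stay untouched; only the comparison key is folded, so the formal form in the official record is preserved.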


Apart from that, at least in Japan, signatures are used extremely 
rarely; it's mostly stamped seals, which are also kept as images by 
banks,...


Regards,   Martin.



Re: Unicode Emoji 11.0 characters now ready for adoption!

2018-03-09 Thread Martin J. Dürst via Unicode

On 2018/03/09 10:22, Philippe Verdy via Unicode wrote:

> As well how Chinese/Japanese post offices handle addresses written with
> sinograms for personal names ? Is the expanded IDS form acceptable for
> them, or do they require using Romanized addresses, or phonetic
> approximations (Bopomofo in China, Kanas in Japan, Hangul in Korea) ?


They just see the printed form, not an encoding, and therefore no IDS. 
Many addresses use handwriting, which has its own variability. 
Variations such as those covered by IDSes are easily recognizable by 
people as being the same as the 'base' character, and OCR systems, if 
they are good enough to decipher handwriting, can handle such cases, 
too. Romanized addresses will be delivered because otherwise it would be 
difficult for foreigners to send anything. Pure Kana should work in 
Japan, although the postal employee will have a second look because it's 
extremely unusual. For Korea, these days, it will be mostly Hangul; I'm 
not sure whether addresses with Hanja would incur a delay. My guess 
would be that Bopomofo wouldn't work in mainland China (might work in 
Taiwan, not sure).


Regards,   Martin.


A sketch with the best-known Swiss tongue twister

2018-03-09 Thread Mark Davis ☕️ via Unicode
https://www.youtube.com/watch?v=QOwITNazUKg

De Papscht hät z’Schpiäz s’Schpäkchbschtekch z’schpaat bschtellt.
literally: The Pope has [in Spiez] [the bacon cutlery] [too late] ordered.

Mark


Re: Translating the standard

2018-03-09 Thread Ken Whistler via Unicode



On 3/9/2018 6:58 AM, Marcel Schneider via Unicode wrote:

> As of translating the Core spec as a whole, why did two recent attempts crash
> even before the maintenance stage, while the 3.1 project succeeded?


Essentially because both the Japanese and the Chinese attempts were 
conceived of as commercial projects, which ultimately did not cost out 
for the publishers, I think. Both projects attempted limiting the scope 
of their translation to a subset of the core spec that would focus on 
East Asian topics, but the core spec is complex enough that it does not 
abridge well. And I think both projects ran into difficulties in trying 
to figure out how to deal with fonts and figures.


The Unicode 3.0 translation (and the 3.1 update) by Patrick Andries was 
a labor of love. In this arena, a labor of love is far more likely to 
succeed than a commercial translation project, because it doesn't have 
to make financial sense.


By the way, as a kind of annotation to an annotated translation, people 
should know that the 3.1 translation on Patrick's site is not a straight 
translation of 3.1, but a kind of interpreted adaptation. In particular, 
it incorporated a translation of UAX #15, Unicode Normalization Forms, 
Version 3.1.0, as a Chapter 6 of the translation, which is not the 
actual structure of Unicode 3.1. And there are other abridgements and 
alterations, where they make sense -- compare the resources section of 
the Preface, for example. This is not a knock on Patrick's excellent 
translation work, but it does illustrate the inherent difficulties of 
trying to approach a complete translation project for *any* version of 
the Unicode Standard.


--Ken



Re: A sketch with the best-known Swiss tongue twister

2018-03-09 Thread Otto Stolz via Unicode

2018-03-09 12:09 GMT+01:00 Mark Davis ☕️ via Unicode:

> De Papscht hät z’Schpiäz s’Schpäkchbschtekch z’schpaat bschtellt.
> literally: The Pope has [in Spiez] [the bacon cutlery] [too late]
> ordered.


On 2018-03-09 at 12:52, Philippe Verdy via Unicode wrote:

> Is that just for Switzerland in one of the local dialectal variants ?


Basically the same in Central Swabian (I am from Stuttgart):
  I måen, mir häbet s Spätzles-Bsteck z spät bstellt.
  literally: I guess, we have ordered the noodle cutlery too late.

And when my niece married a guy with the Polish surname Brzeczek
and asked for cutlery as their wedding present, guess what we
told them. ☺

Otto

Solution:
  Zerst hemmer denkt, mir häbet für die Brzeczeks s Bsteck
  z spät bstellt, aber nå håts doch no glangt.


Re: A sketch with the best-known Swiss tongue twister

2018-03-09 Thread Tom Gewecke via Unicode

> On Mar 9, 2018, at 5:52 AM, Philippe Verdy via Unicode  
> wrote:
> 
> So the "best-known Swiss tongue" is still not so much known, and still 
> incorrectly referenced (frequently confused with "Swiss German", which is 
> much like standard High German

I think Swiss German is in fact the correct English name for the Swiss 
dialects, taken from the German Schweizerdeutsch.

https://en.wikipedia.org/wiki/Swiss_German


Re: A sketch with the best-known Swiss tongue twister

2018-03-09 Thread Philippe Verdy via Unicode
English Wikipedia is not a good reference for the name; the GSW wiki clearly
states another name, and "Alemannic" is attested and correct for the family
of dialects.
"Schweizerdeutsch" is also wrong, like "Swiss German", when it refers to
Alsatian (neither Swiss nor German for those speaking it): these
expressions only refer to "de-CH", not "gsw".

2018-03-09 14:40 GMT+01:00 Tom Gewecke via Unicode:

>
> > On Mar 9, 2018, at 5:52 AM, Philippe Verdy via Unicode <
> unicode@unicode.org> wrote:
> >
> > So the "best-known Swiss tongue" is still not so much known, and still
> incorrectly referenced (frequently confused with "Swiss German", which is
> much like standard High German
>
> I think Swiss German is in fact the correct English name for the Swiss
> dialects, taken from the German Schweizerdeutsch.
>
> https://en.wikipedia.org/wiki/Swiss_German
>


Re: A sketch with the best-known Swiss tongue twister

2018-03-09 Thread Philippe Verdy via Unicode
So the "best-known Swiss tongue" is still not so well known, and still
incorrectly referenced (frequently confused with "Swiss German", which is
much like standard High German, unifying with it in most respects, with only
minor orthographic preferences such as capitalization rules or a very few
Swiss-specific terms, but no alteration of the grammar and no specific
characters like in Alemannic dialects; the term "Swiss tongue" in the
context given by the video is obviously false).
Note that Schwäbisch is very far from it. What looks more like the Swiss
dialects of Alemannic is French Alsatian; it is not "Swiss", and don't tell
Alsatians that this is "German" when there are clear differences with the
language on the other side of the Rhine River, and a lot of differences with
Schwäbisch (which is much more a distinct language than a dialect of
Alemannic or German). The same remark applies to Tyrolean and Bavarian (they
are probably nearer to Schwäbisch than to Swiss or French Alemannic, or to
Standard High German; their difference from Schwäbisch is almost like the
difference between Standard Dutch and Limburgish or West Flemish; Standard
Dutch, Standard German, French/Swiss Alemannic, and Schwäbisch are
differentiated enough to be distinct languages). The term "Alemannic" is way
too broad, but calling it "Swiss German" is also wrong (even if its ISO 639-3
code is "gsw", probably taken from this incorrect name).

2018-03-09 13:23 GMT+01:00 Otto Stolz via Unicode:

> 2018-03-09 12:09 GMT+01:00 Mark Davis ☕️ via Unicode
> :
>
>> De Papscht hät z’Schpiäz s’Schpäkchbschtekch z’schpaat bschtellt.
>> literally: The Pope has [in Spiez] [the bacon cutlery] [too late]
>> ordered.
>>
>
> Am 2018-03-09 um 12:52 schrieb Philippe Verdy via Unicode:
>
>> Is that just for Switzerland in one of the local dialectal variants ?
>>
>
> Basically the same in Central Swabian (I am from Stuttgart):
>   I måen, mir häbet s Spätzles-Bsteck z spät bstellt.
>   literally: I guess, we have ordered the noodle cutlery too late.
>
> And when my niece married a guy with the Polish surname Brzeczek
> and had asked for cutlery for their wedding present, guess what we
> have told them. ☺
>
> Otto
>
> Solution:
>   Zerst hemmer denkt, mir häbet für die Brzeczeks s Bsteck
>   z spät bstellt, aber nå håts doch no glangt.
>


Re: A sketch with the best-known Swiss tongue twister

2018-03-09 Thread Mark Davis ☕️ via Unicode
There are definitely many dialects across Switzerland. I think that for
*this* phrase it would be roughly the same for most of the population, with
minor differences (e.g. 'het' vs 'hät'). But a native speaker like Martin
would be able to say for sure.

Mark

On Fri, Mar 9, 2018 at 12:52 PM, Philippe Verdy  wrote:

> Is that just for Switzerland in one of the local dialectal variants ? Or
> more generally Alemannic (also in Northeastern France, South Germany,
> Western Austria, Liechtenstein, Northern Italy).
>
> 2018-03-09 12:09 GMT+01:00 Mark Davis ☕️ via Unicode 
> :
>
>> https://www.youtube.com/watch?v=QOwITNazUKg
>>
>> De Papscht hät z’Schpiäz s’Schpäkchbschtekch z’schpaat bschtellt.
>> literally: The Pope has [in Spiez] [the bacon cutlery] [too late] ordered.
>>
>> Mark
>>
>
>


Re: A sketch with the best-known Swiss tongue twister

2018-03-09 Thread Mark Davis ☕️ via Unicode
Yes, the right English names are "Swiss High German" for de-CH, and "Swiss
German" for gsw-CH.

Mark

On Fri, Mar 9, 2018 at 2:40 PM, Tom Gewecke via Unicode  wrote:

>
> > On Mar 9, 2018, at 5:52 AM, Philippe Verdy via Unicode <
> unicode@unicode.org> wrote:
> >
> > So the "best-known Swiss tongue" is still not so much known, and still
> incorrectly referenced (frequently confused with "Swiss German", which is
> much like standard High German
>
> I think Swiss German is in fact the correct English name for the Swiss
> dialects, taken from the German Schweizerdeutsch.
>
> https://en.wikipedia.org/wiki/Swiss_German
>


Re: A sketch with the best-known Swiss tongue twister

2018-03-09 Thread Philippe Verdy via Unicode
In summary you do not object to the fact that unqualified "gsw" language code
is not (and should not be) named "Swiss German" (as it is only for
"gsw-CH", not for any other non-Swiss variants of Alemannic).

The addition of "High" is optional, unneeded in fact, as it does not remove
any ambiguity, in Germany for "de-DE", or in Switzerland for "de-CH", or in
Italian South Tyrol for "de-IT", or in Austria for "de-AT", or even for
"Standard German" (de).

Note also that Alsatian itself ("gsw-FR") is considered part of the "High
German" branch of Germanic languages!

"High German" refers to the group that includes Standard German and its
national variants ("de", "de-DE", "de-CH", "de-AT", "de-IT") as
well as the Alemannic group ("gsw", "gsw-FR", "gsw-CH"), possibly extended
(this is debatable) to Schwäbisch in Germany and Hungary.

My opinion is that even the Swiss variants should preferably be named
"Swiss Alemannic" collectively, and not "Swiss German", which causes
constant confusion between "de-CH" and "gsw-CH".


2018-03-09 15:11 GMT+01:00 Mark Davis ☕️ via Unicode:

> Yes, the right English names are "Swiss High German" for de-CH, and "Swiss
> German" for gsw-CH.
>
> Mark
>
> On Fri, Mar 9, 2018 at 2:40 PM, Tom Gewecke via Unicode <
> unicode@unicode.org> wrote:
>
>>
>> > On Mar 9, 2018, at 5:52 AM, Philippe Verdy via Unicode <
>> unicode@unicode.org> wrote:
>> >
>> > So the "best-known Swiss tongue" is still not so much known, and still
>> incorrectly referenced (frequently confused with "Swiss German", which is
>> much like standard High German
>>
>> I think Swiss German is in fact the correct English name for the Swiss
>> dialects, taken from the German Schweizerdeutsch.
>>
>> https://en.wikipedia.org/wiki/Swiss_German
>>
>
>


Re: Translating the standard (was: Re: Fonts and font sizes used in the Unicode)

2018-03-09 Thread Marcel Schneider via Unicode
On 08/03/18 19:33, Arthur Reutenauer  wrote:
> 
> On Thu, Mar 08, 2018 at 07:05:06PM +0100, Marcel Schneider via Unicode wrote:
> > https://www.amazon.fr/Unicode-5-0-pratique-Patrick-Andries/dp/2100511408/ref=pd_bbs_sr_1?ie=UTF8=books=1206989878=8-1
> 
> You’re linking to the wrong one of Patrick’s books :-) The
> translation he made of version 3.1 (not 5.0) of the core specification
> is available in full at http://hapax.qc.ca/ (“Unicode et ISO 10646 en
> français”, middle of page), as well as a few free sample chapters from
> his other book.
> 
> Best,
> 
> Arthur
> 

Indeed, thank you very much for the correction, and thanks for the link.

I can say that the free online chapters of Patrick Andriesʼ translation of
the Unicode standard were my first introduction to it, more precisely ch. 7
(Punctuation), which I even printed out to get acquainted with the various
dashes and spaces and to learn more about quotation marks. [I didnʼt have
internet and took the copy home from a library.] Based on this experience, I
think it isnʼt too much of an extrapolation to suppose that millions of
newcomers in all countries could use such a translation. Although the latest
version of TUS is obviously more up‐to‐date, version 3.1 isnʼt plain wrong at
all. Hence I warmly recommend translating at least v3.1, or those chapters of
v10.0 that are already in v3.1, while prompting the reader to seek further
information on the Unicode website.

We note too that Patrickʼs translation is annotated (footnotes in gray print)
with additional information of interest for the target locale. (Here one could
mention that Latin script requires preformatted superscript letters for an
interoperable representation of current text in some languages.)

Some Unicode terminology like “bidi‐mirroring” may be hard to adapt, but that
isnʼt more of a challenge than any tech/science writer faces when handling
content that was originally produced in the United States and/or, more
generally, in English. E.g. in French we may choose from a range of forms, from
more conservative to less usual, among which: “réflexion bidi”, “réflexion
bidirectionnelle”, “bidi‐reflexion” (hyphenated or not), “réflexible” or,
simply, “miroir”. Anyway, every locale is expected to localize the full range
of Unicode terminology, unless people agree to switch to English whenever the
topic is Unicode, even while discussing any other topic currently in Chinese or
in Japanese; doing so is not a problem, itʼs just ethically weird.

So we look forward to the concept of a “Unicode in Practice” textbook
implemented in Chinese and in Japanese and in any other non‐English and
non‐French locale, if it isnʼt already.

As for translating the Core spec as a whole, why did two recent attempts crash
even before the maintenance stage, while the 3.1 project succeeded?

Some pieces of the puzzle seem to be still missing.

Best regards,

Marcel



Re: Unicode Emoji 11.0 characters now ready for adoption!

2018-03-09 Thread via Unicode

Dear Richard,

On 09.03.2018 07:06, Richard Wordingham via Unicode wrote:

> On Thu, 08 Mar 2018 09:42:38 +0800
> via Unicode wrote:
>
>> to the best of my knowledge virtually no new characters used just for
>> names are under consideration, all the ones that are under
>> consideration are from before this century.
>
> What I was interested in was the rate of generation of new
> CJK characters in general, not just those for names.  I appreciate that
> encoding is dominated by the backlog of older characters.



Impossible to give an accurate answer or even a reasonable guess.

As to those that would be candidates for Unicode, my guess would be not 
more than a few dozen a year. New characters are not permitted in legal 
names. Fantasy Chinese characters used for an alien language or a mystery 
novel would not usually be suitable for encoding. Most new words in 
Chinese have more than one syllable and do not require any new 
characters. Documented increase such as scientific terms for new 
elements, flora and fauna, would seem to be not more than one or two 
dozen a year.


Regards
John Knightley



Richard.




Re: Unicode Emoji 11.0 characters now ready for adoption!

2018-03-09 Thread via Unicode

On 09.03.2018 09:17, Philippe Verdy via Unicode wrote:

> This still leaves the question about how to write personal names !
> IDS alone cannot represent them without enabling some "reasonable"
> ligaturing (they don't have to match the exact strokes variants for
> optimal placement, or with all possible simplifications).
> I'm curious to know how China, Taiwan, Singapore or Japan handle this
> (for official records or in banks): like our personal signatures (as
> digital images), and then using a simplified official record
> (including the registration of romanized names)?

2018-03-09 0:06 GMT+01:00 Richard Wordingham via Unicode
:

In mainland China the fallback is to use pinyin capitals without tone 
marks, so ASCII. Passports have names printed in both Chinese characters 
and capitalised pinyin; both are legally valid. ID cards, which people 
get when they turn 16, have the names in printed Chinese characters only. 
So these, I assume, must be printed using a system that has some 
characters not in UCS. Banks certainly don't have all these extra 
characters, so they use capitalised pinyin for any characters they 
cannot type.
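
As a rough illustration of the "pinyin capitals without tone marks" fallback, here is a minimal Python sketch. The function name and the sample name are hypothetical, and dropping every combining mark is a simplification: it also turns "ü" into "U", and how "ü" is actually rendered on official documents is not covered here.

```python
import unicodedata


def pinyin_fallback(name):
    """Strip tone marks from a pinyin string and upper-case the result.

    Simplification: every combining mark is dropped after NFD decomposition,
    so the diaeresis on 'ü' is removed along with the tone marks.
    """
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.upper()


# Sample name only, not taken from the thread.
print(pinyin_fallback("L\u01d0 Xi\u01ceom\u00edng"))  # LI XIAOMING
```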


Japan had 1,645 characters in CJK Ext F, which included all characters 
required for names of people and places. So there should be no need for 
a fallback system; Unicode is enough now.


John Knightley