Re: TIRONIAN SIGN ET

2018-01-27 Thread Janusz S. Bień via Unicode
On Sat, Jan 27 2018 at 21:59 CET, davidj_fau...@yahoo.ca writes:

[...]

> As far as I can tell, it was originally proposed in the document n1747 
> 'Contraction mark characters for the UCS’ by Everson. However, I
> cannot find that document anywhere.

Thank you very much for the reference.

On the page

http://www.evertype.com/formal.html

there is the link

http://unicode.org/wg2/docs/n1747.pdf

but it does not work. However, the page

http://www.unicode.org/wg2/WG2-registry.html

states

The archival document directory for WG2 is accessible here:
http://std.dkuug.dk/jtc1/sc2/wg2/ The archives contain all
available documents through 2014

and the document is at

ftp://std.dkuug.dk/ftp.anonymous/JTC1/SC2/WG2/docs/n1747.pdf

Actually the character is "inherited" from

 ISO 5426-2:1996 Information and documentation -- Extension of
 the Latin alphabet coded character set for bibliographic
 information interchange -- Part 2: Latin characters used in
 minor European languages and obsolete typography

Hence my curiosity is fully satisfied :-)

Thanks again!

Janusz

-- 
   ,   
Prof. dr hab. Janusz S. Bien -  Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
jsb...@uw.edu.pl, jsb...@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/



Re: 0027, 02BC, 2019, or a new character?

2018-01-27 Thread Marcel Schneider via Unicode
On Sun, 28 Jan 2018 05:02:47 +, Richard Wordingham via Unicode wrote:
> 
> On Sat, 27 Jan 2018 22:54:57 +0100 (CET)
> Marcel Schneider via Unicode  wrote:
> 
> > The US-Intl is so weird “you canʼt just leave it on all the time” as
> > reported in:
> > 
> > http://www.unicode.org/mail-arch/unicode-ml/Archives-Old/UML017/0558.html
> 
> I did (except when I was using a totally different writing system).
> One just has to remember that those punctuation marks need two key
> strokes, the first being the space key. Mark Davis's problem seems to
> be that he was using an Apple half the time.

Indeed, Appleʼs US-extended has lots of dead keys on the Option level, so that
Base-level ASCII symbols are left alone. Windowsʼ US-International hijacks
five of these symbols as dead keys (likewise, French hijacks two), which
disrupts UX relative to macOS and affects those using both.

And developers donʼt like to remember hitting space before a vowel to get
the (single/double/reverse) quote, or tilde or caret. On any layout, such a
complication is unacceptable to most coders.

But US-Intl isnʼt the only case. The Canadian Standard layout too is cheered
on Apple and disliked on Windows, obviously because beyond the first two
levels, there are many, many differences. That cannot really be a matter of
conformance to the CAN specs, as the Windows implementation leaves out
the '⅛' character, besides messing up the group modifier.

We can only hope that CLDR is now thoroughly re-engineering the way
international or otherwise extended keyboards are mapped.

Regards,

Marcel



Re: 0027, 02BC, 2019, or a new character?

2018-01-27 Thread Richard Wordingham via Unicode
On Sat, 27 Jan 2018 22:54:57 +0100 (CET)
Marcel Schneider via Unicode  wrote:

> The US-Intl is so weird “you canʼt just leave it on all the time” as
> reported in:
> 
> http://www.unicode.org/mail-arch/unicode-ml/Archives-Old/UML017/0558.html

I did (except when I was using a totally different writing system).
One just has to remember that those punctuation marks need two key
strokes, the first being the space key. Mark Davis's problem seems to
be that he was using an Apple half the time.

Richard.



Re: Internationalised Computer Science Exercises

2018-01-27 Thread Richard Wordingham via Unicode
On Sat, 27 Jan 2018 14:13:40 -0800
Shervin Afshar  wrote:

> On Mon, Jan 22, 2018 at 2:08 PM, Richard Wordingham via Unicode <
> unicode@unicode.org> wrote:  

> > On Mon, 22 Jan 2018 at 16:39:57, Andre Schappo via Unicode <  
> > unicode@unicode.org> wrote:  

> > > By way of example, one programming challenge I set to students a
> > > couple of weeks ago involves diacritics. Please see
> > > jsfiddle.net/coas/wda45gLp  

> > Did any of them come up with the idea of using traces instead of
> > strings?

> Care to elaborate? Are you referring to sequence alignment methods?

No, I'm thinking of the trace monoid (see e.g.
https://en.wikipedia.org/wiki/Trace_monoid).  One way of thinking of
strings is as concatenations of the NFD decompositions of their
constituent characters. Then the canonical equivalence classes of these
strings form the trace monoid of indecomposable characters.  The theory
of regular expressions (though you may not think that mathematical
regular expressions matter) extends to trace monoids, with the
disturbing exception that the Kleene star of a regular language is not
necessarily regular.  (The prototypical example is sequences (xy)^n
where x and y are distinct and commute, i.e. xy and yx are canonically
equivalent in Unicode terms.  A Unicode example is the set of strings
composed only of U+0F73 TIBETAN VOWEL SIGN II - there is no FSM that
will recognise canonically equivalent strings).
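
A minimal sketch of the U+0F73 example, assuming Python's standard
unicodedata module (the helper name in_star_of_ii is hypothetical).
U+0F73 decomposes canonically to U+0F71 U+0F72, whose combining classes
differ, so the NFD form of n copies of U+0F73 is n copies of U+0F71
followed by n copies of U+0F72. That is the a^n b^n pattern, which needs
counting rather than a finite-state machine.

```python
import unicodedata

II = "\u0f73"  # TIBETAN VOWEL SIGN II
AA = "\u0f71"  # TIBETAN VOWEL SIGN AA, combining class 129
I = "\u0f72"   # TIBETAN VOWEL SIGN I, combining class 130


def nfd(s: str) -> str:
    """Return the canonical decomposition (NFD) of s."""
    return unicodedata.normalize("NFD", s)


def in_star_of_ii(s: str) -> bool:
    """True if s is canonically equivalent to zero or more copies of U+0F73.

    Two strings are canonically equivalent exactly when their NFD forms
    are equal, so it is enough to compare against AA^n I^n.
    """
    d = nfd(s)
    n = len(d) // 2
    return d == AA * n + I * n


# NFD sorts the decomposed marks by combining class: (AA I)^n becomes AA^n I^n.
for n in range(4):
    assert nfd(II * n) == AA * n + I * n

print(in_star_of_ii(AA + I + AA + I))  # True: equivalent to U+0F73 U+0F73
print(in_star_of_ii(AA + AA + I))      # False: the counts do not match
```

The check has to count both marks, which is what a finite-state machine
cannot do for unbounded n.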

One consequence of this view is that one has to think of U+1EAD LATIN
SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW (ậ) as being both composed of
the Vietnamese vowel letter U+00E2 LATIN SMALL LETTER A WITH CIRCUMFLEX
(â) and tone mark U+0323 COMBINING DOT BELOW and also composed, in
the spirit of Thai ISO 11940 transliteration, of the transliterated Thai
vowel U+1EA1 LATIN SMALL LETTER A WITH DOT BELOW (ạ), corresponding to
U+0E31 THAI CHARACTER MAI HAN-AKAT, and the tone mark U+0302 COMBINING
CIRCUMFLEX ACCENT, corresponding to U+0E49 THAI CHARACTER MAI THO.  (In
ISO 11940 as specified, the tone mark is actually written on the
immediately preceding consonant, not on the vowel.)
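
The two readings of U+1EAD can be checked mechanically. This is only a
sketch, assuming Python's standard unicodedata module; the variable names
are illustrative. All three spellings share one NFD form because U+0323
(combining class 220) is reordered before U+0302 (class 230).

```python
import unicodedata

vietnamese = "\u00e2\u0323"  # â followed by COMBINING DOT BELOW
thai_style = "\u1ea1\u0302"  # ạ followed by COMBINING CIRCUMFLEX ACCENT
precomposed = "\u1ead"       # ậ as a single code point

# Canonically equivalent strings have identical NFD forms.
forms = [unicodedata.normalize("NFD", s)
         for s in (vietnamese, thai_style, precomposed)]

for f in forms:
    print(" ".join(f"U+{ord(c):04X}" for c in f))

same = len(set(forms)) == 1
print(same)  # True: all three are canonically equivalent
```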

Richard.



Re: Internationalised Computer Science Exercises

2018-01-27 Thread Shervin Afshar via Unicode
On Mon, Jan 22, 2018 at 2:08 PM, Richard Wordingham via Unicode <
unicode@unicode.org> wrote:

> On Mon, 22 Jan 2018 at 16:39:57, Andre Schappo via Unicode <
> unicode@unicode.org> wrote:


> > By way of example, one programming challenge I set to students a
> > couple of weeks ago involves diacritics. Please see
> > jsfiddle.net/coas/wda45gLp
>
> Did any of them come up with the idea of using traces instead of
> strings?
>

Care to elaborate? Are you referring to sequence alignment methods?


Re: 0027, 02BC, 2019, or a new character?

2018-01-27 Thread Marcel Schneider via Unicode
On Tue, 23 Jan 2018 21:52:46 +, Richard Wordingham wrote:
> 
> On Wed, 24 Jan 2018 03:22:37 +0800
> Phake Nick via Unicode  wrote:
> 
> > >I found the Windows 'US International' keyboard layout highly
> > >intuitive for accented Latin-1 characters. 
> > How common is the US International keyboard in real life..?
> 
> I thought it was two copies per new Windows PC - one for 32- and the
> other for 64-bit code. I was talking about the *layout*. […]

The US-Intl is so weird “you canʼt just leave it on all the time” as reported 
in:

http://www.unicode.org/mail-arch/unicode-ml/Archives-Old/UML017/0558.html

Now that CLDR is sorting out how to improve keyboard layouts, hopefully
something will come out of it to replace the *legacy* US-Intl.
As for how common the new one will become, I guess it depends on whether
it gets less weird than the old one, and to what extent.

Regards,

Marcel



Re: TIRONIAN SIGN ET

2018-01-27 Thread Janusz S. Bień via Unicode
On Sat, Jan 27 2018 at 20:53 CET, r...@unicode.org writes:
> Hello Janusz --
>
> Try this: http://www.unicode.org/L2/L2017/17300-n4841-tironian-et.pdf
>
> Regards,
>
> On 1/27/2018 11:40 AM, Janusz S. Bień via Unicode wrote:
>> Hi!
>>
>> I try to find in the UTC Document Register the proposals for characters
>> that interest me for one reason or another. I'm usually rather successful,
>> but I'm unable to find the proposal for TIRONIAN SIGN ET.

I've seen this document, but I'm looking for an earlier one. The
character was introduced in Unicode 3.0 in 1999, cf. e.g.

http://unicode.org/mail-arch/unicode-ml/Archives-Old/UML015/0250.html

Regards

Janusz




Re: TIRONIAN SIGN ET

2018-01-27 Thread Rick McGowan via Unicode

Hello Janusz --

Try this: http://www.unicode.org/L2/L2017/17300-n4841-tironian-et.pdf

Regards,

On 1/27/2018 11:40 AM, Janusz S. Bień via Unicode wrote:

> Hi!
>
> I try to find in the UTC Document Register the proposals for characters
> that interest me for one reason or another. I'm usually rather successful,
> but I'm unable to find the proposal for TIRONIAN SIGN ET.
>
> Any hints?
>
> Best regards
>
> Janusz





TIRONIAN SIGN ET

2018-01-27 Thread Janusz S. Bień via Unicode

Hi!

I try to find in the UTC Document Register the proposals for characters
that interest me for one reason or another. I'm usually rather successful,
but I'm unable to find the proposal for TIRONIAN SIGN ET.

Any hints?

Best regards

Janusz



Re: [HUMOR] Proof that emojis are useful

2018-01-27 Thread Mark Davis ☕️ via Unicode
Nice, thanks!

Mark

On Sat, Jan 27, 2018 at 7:31 AM, Stephane Bortzmeyer via Unicode <
unicode@unicode.org> wrote:

> Nice scientific info, and with emojis:
>
> https://twitter.com/biolojical/status/956953421130514432
>


[HUMOR] Proof that emojis are useful

2018-01-27 Thread Stephane Bortzmeyer via Unicode
Nice scientific info, and with emojis:

https://twitter.com/biolojical/status/956953421130514432


In the mean time, in France (was Re: 0027, 02BC, 2019, or a new character?)

2018-01-27 Thread Denis Jacquerye via Unicode
In the meantime, in France, a municipality is refusing to let a baby be
registered with an apostrophe in his Breton name, while several babies have
had apostrophes in their names in recent years: 2017 N'néné (F), 2017
Tu'iuvea (M), 2016 D'jessy (M), 2015 N'Guessan (F), 2015 Chem's (M), 2014
N'Khany (M), 2012 Manec'h (M).

https://www.connexionfrance.com/French-news/Rennes-mayor-to-challenge-ban-on-Breton-first-names

If only someone had told them it’s not necessarily an apostrophe but
can be U+02BC or U+02BB in some of these.
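
For reference, the candidate characters can be told apart by their Unicode
properties. A minimal sketch, assuming Python's standard unicodedata
module; the helper is_letter_like is hypothetical and only illustrates the
punctuation/letter distinction.

```python
import unicodedata

# The code points from the thread subject, plus U+02BB.
candidates = ["\u0027", "\u02bc", "\u02bb", "\u2019"]

for ch in candidates:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")


def is_letter_like(ch: str) -> bool:
    """True if the character's general category is a letter (L*)."""
    return unicodedata.category(ch).startswith("L")


# U+02BC and U+02BB are modifier letters (Lm); U+0027 and U+2019 are punctuation.
print([is_letter_like(ch) for ch in candidates])  # [False, True, True, False]
```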


Re: 0027, 02BC, 2019, or a new character?

2018-01-27 Thread Julian Bradfield via Unicode
On 2018-01-26, Richard Wordingham via Unicode  wrote:
> Some systems (or admins) have been totally defeated by even the ASCII
> version of ʹO’Sullivanʹ.  That bodes ill for Kazakhs.

The head (about to be ex-head) of my university is Sir Timothy O'Shea.
On the student record system, it is impossible to search for students
called O'Shea (I have one). I suppose it doesn't sanitize correctly -
I haven't tried looking for little Bobby Tables yet. It hadn't
occurred to me to check, but of course searching for O’Shea doesn't
work either, as they usually enter their own names into the initial
record, and use 0027.
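
How that record system is actually built is not known here. Purely as a
hypothetical sketch, a search that copes with both forms could fold
apostrophe-like characters to U+0027 and pass the name as a bound
parameter. The table name, column name, function names and the choice of
characters to fold are all assumptions made for illustration, using
Python's standard sqlite3 module.

```python
import sqlite3

# Assumption: fold these apostrophe-like characters to U+0027 before matching.
APOSTROPHE_LIKE = {0x2019: "'", 0x02BC: "'"}


def fold_apostrophes(name: str) -> str:
    """Map U+2019 and U+02BC to the ASCII apostrophe U+0027."""
    return name.translate(APOSTROPHE_LIKE)


def find_students(conn: sqlite3.Connection, surname: str) -> list:
    """Look up a surname; the ? placeholder passes the value as data,
    so an apostrophe in the name cannot break the SQL statement."""
    cur = conn.execute(
        "SELECT surname FROM students WHERE surname = ?",
        (fold_apostrophes(surname),),
    )
    return [row[0] for row in cur.fetchall()]


# Hypothetical in-memory table standing in for the student records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (surname TEXT)")
conn.executemany("INSERT INTO students VALUES (?)",
                 [("O'Shea",), ("Tables",)])

print(find_students(conn, "O'Shea"))       # stored form, U+0027
print(find_students(conn, "O\u2019Shea"))  # typed with U+2019
```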


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.