Re: Internationalised Computer Science Exercises

2018-01-28 Thread Philippe Verdy via Unicode
You may also wonder why I describe a regexp that would never match anything but would itself be handled as a successful match: it is a useful extension that allows the analysis to stop early and generalizes the concept of negation (defined in character classes with the minus operator). For exampl

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Philippe Verdy via Unicode
I made an error in the character class notation: "{?optionalquantifier[class]}" should be just "{optionalquantifier[class]}"... So "{?[abc]}" contains 1 item "[abc]" to choose from in any order; it is not quantified explicitly, so it matches by default 1 or more, but as there's only one item, it w

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Philippe Verdy via Unicode
2018-01-28 23:44 GMT+01:00 Richard Wordingham via Unicode < unicode@unicode.org>: > On Sun, 28 Jan 2018 20:29:28 +0100 > Philippe Verdy via Unicode wrote: > > > 2018-01-28 5:12 GMT+01:00 Richard Wordingham via Unicode < > > unicode@unicode.org>: > > > > > On Sat, 27 Jan 2018 14:13:40 -0800The the

Re: Keyboard layouts and CLDR (was: Re: 0027, 02BC, 2019, or a new character?)

2018-01-28 Thread Mark Davis ☕️ via Unicode
On Sun, Jan 28, 2018 at 3:20 PM, Doug Ewell wrote: > Mark Davis wrote: > > One addition: with the expansion of keyboards in >> http://blog.unicode.org/2018/01/unicode-ldml-keyboard-enhancements.html >> we are looking to expand the repository to not merely represent those, >> but to also serve as

Re: Keyboard layouts and CLDR (was: Re: 0027, 02BC, 2019, or a new character?)

2018-01-28 Thread Marcel Schneider via Unicode
On Sun, 28 Jan 2018 14:11:06 -0700, Doug Ewell wrote: > > Marcel Schneider wrote: > > > We can only hope that now, CLDR is thoroughly re-engineering the way > > international or otherwise extended keyboards are mapped. > > I suspect you already know this and just misspoke, but CLDR doesn't > pre

Re: Keyboard layouts and CLDR

2018-01-28 Thread Marcel Schneider via Unicode
On Sun, 28 Jan 2018 16:20:16 -0700, Doug Ewell wrote: > > Mark Davis wrote: > > > One addition: with the expansion of keyboards in > > http://blog.unicode.org/2018/01/unicode-ldml-keyboard-enhancements.html > > we are looking to expand the repository to not merely represent those, > > but to also

Re: Keyboard layouts and CLDR (was: Re: 0027, 02BC, 2019, or a new character?)

2018-01-28 Thread Doug Ewell via Unicode
Mark Davis wrote: One addition: with the expansion of keyboards in http://blog.unicode.org/2018/01/unicode-ldml-keyboard-enhancements.html we are looking to expand the repository to not merely represent those, but to also serve as a resource that vendors can draw on. Would you say, then, that

Re: Keyboard layouts and CLDR (was: Re: 0027, 02BC, 2019, or a new character?)

2018-01-28 Thread Mark Davis ☕️ via Unicode
One addition: with the expansion of keyboards in http://blog.unicode.org/2018/01/unicode-ldml-keyboard-enhancements.html we are looking to expand the repository to not merely represent those, but to also serve as a resource that vendors can draw on. Mark On Sun, Jan 28, 2018 at 1:11 PM, Doug Ewel

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Richard Wordingham via Unicode
On Sun, 28 Jan 2018 20:29:28 +0100 Philippe Verdy via Unicode wrote: > 2018-01-28 5:12 GMT+01:00 Richard Wordingham via Unicode < > unicode@unicode.org>: > > > On Sat, 27 Jan 2018 14:13:40 -0800The theory > > of regular expressions (though you may not think that mathematical > > regular expres

Keyboard layouts and CLDR (was: Re: 0027, 02BC, 2019, or a new character?)

2018-01-28 Thread Doug Ewell via Unicode
Marcel Schneider wrote: We can only hope that now, CLDR is thoroughly re-engineering the way international or otherwise extended keyboards are mapped. I suspect you already know this and just misspoke, but CLDR doesn't prescribe any vendor's keyboard layouts. CLDR mappings reflect what vendo

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Philippe Verdy via Unicode
Note that for finding occurrences of simpler combining sequences such as finding the regexp is simpler: [[ [^[[:cc=0:]]] - [[:cc=above:]] ]] * The central character class allows 53 distinct combining classes, and the maximum match length is 2+53=55 characters. If Unicode assigns new combining c
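
As a rough illustration of the character class in the message above (not code from the thread), the set "combining class not 0, minus combining class Above" can be approximated in Python with `unicodedata.combining`, which returns a character's canonical combining class. The function name `leading_non_above_marks` and the constant `ABOVE` are made up for this sketch, and it only covers the greedy starred class, not the surrounding parts of the full regexp.

```python
import unicodedata

# Canonical combining class 230 is "Above" in the Unicode Character Database.
ABOVE = 230

def leading_non_above_marks(text: str) -> str:
    """Return the longest prefix of `text` whose characters all have a
    canonical combining class that is neither 0 nor 230 (Above).

    This mirrors a greedy match of
    [[ [^[[:cc=0:]]] - [[:cc=above:]] ]] * anchored at the start of `text`.
    """
    matched = []
    for ch in text:
        ccc = unicodedata.combining(ch)
        if ccc == 0 or ccc == ABOVE:
            break  # a starter or an Above mark ends the run
        matched.append(ch)
    return "".join(matched)
```

For instance, given COMBINING DOT BELOW, COMBINING CEDILLA, COMBINING CIRCUMFLEX ACCENT and then a base letter, the run stops before the circumflex, because its combining class is Above.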

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Philippe Verdy via Unicode
Typo, the full regexp has undesired asterisks: [[ [^[[:cc=0:]]] - [[:cc=above:][:cc=below:]] ]] * ( [[ [^[[:cc=0:]]] - [[:cc=above:][:cc=below:]] ]] * | [[ [^[[:cc=0:]]] - [[:cc=above:][:cc=below:]] ]] * < COMBINING CIRCUMFLEX> 2018-01-28 20:29 GMT+01:00 Philippe Verdy : > > > 2018

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Philippe Verdy via Unicode
2018-01-28 5:12 GMT+01:00 Richard Wordingham via Unicode < unicode@unicode.org>: > On Sat, 27 Jan 2018 14:13:40 -0800The theory > of regular expressions (though you may not think that mathematical > regular expressions matter) extends to trace monoids, with the > disturbing exception that the Klee