Re: Internationalised Computer Science Exercises

2018-02-05 Thread Richard Wordingham via Unicode
On Thu, 1 Feb 2018 19:20:04 + Richard Wordingham via Unicode wrote: > A regular trace expression of the form > > [:ccc=1:][:ccc=2:]…[:ccc=n:] > > seems to require 2^n states in your scheme. As I effectively only > apply the regex to NFD input strings, I use fewer

Re: Internationalised Computer Science Exercises - Correction

2018-02-01 Thread Richard Wordingham via Unicode
On Thu, 1 Feb 2018 01:38:58 + Richard Wordingham via Unicode wrote: > I believe the concurrent star of a language A is (|A|)*, where > > |A| = {x ∊ A : {x}* is a regular language} > > (The definition works for the trace of fully decomposed Unicode > character strings

Re: Internationalised Computer Science Exercises

2018-02-01 Thread Richard Wordingham via Unicode
On Thu, 1 Feb 2018 08:03:31 +0100 Philippe Verdy via Unicode wrote: > 2018-02-01 2:38 GMT+01:00 Richard Wordingham via Unicode < > unicode@unicode.org>: >> On Wed, 31 Jan 2018 19:45:56 +0100 >> Philippe Verdy via Unicode wrote: >>> 2018-01-29

Re: Internationalised Computer Science Exercises

2018-02-01 Thread Philippe Verdy via Unicode
2018-02-01 8:03 GMT+01:00 Philippe Verdy : > > > 2018-02-01 2:38 GMT+01:00 Richard Wordingham via Unicode < > unicode@unicode.org>: > >> >> For example, in /a{2}/ has 4 nodes (still marked by leading apostrophes > here), and so k=2: > > '< > \ /

Re: Internationalised Computer Science Exercises

2018-01-31 Thread Philippe Verdy via Unicode
2018-02-01 2:38 GMT+01:00 Richard Wordingham via Unicode < unicode@unicode.org>: > On Wed, 31 Jan 2018 19:45:56 +0100 > Philippe Verdy via Unicode wrote: > > > 2018-01-29 21:53 GMT+01:00 Richard Wordingham via Unicode < > > unicode@unicode.org>: > > > > On Mon, 29 Jan 2018

Re: Internationalised Computer Science Exercises

2018-01-31 Thread Richard Wordingham via Unicode
On Wed, 31 Jan 2018 19:45:56 +0100 Philippe Verdy via Unicode wrote: > 2018-01-29 21:53 GMT+01:00 Richard Wordingham via Unicode < > unicode@unicode.org>: > > On Mon, 29 Jan 2018 14:15:04 +0100 > > was meant to be an example of a > > searched string. For example, >

Re: Internationalised Computer Science Exercises

2018-01-31 Thread Philippe Verdy via Unicode
2018-01-29 21:53 GMT+01:00 Richard Wordingham via Unicode < unicode@unicode.org>: > On Mon, 29 Jan 2018 14:15:04 +0100 > > The case of u with diaeresis and macron is simpler: it has two > > combining characters of the same combining class and they don't > > commute, still the regexp to match it
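The point in this snippet about two marks of the same combining class not commuting can be checked with Python's standard `unicodedata` module. The following is only a minimal sketch: the helper names `commute` and `nfd` are made up for illustration and do not come from the thread.

```python
import unicodedata

DIAERESIS, MACRON = "\u0308", "\u0304"      # both have combining class 230
DOT_BELOW, DOT_ABOVE = "\u0323", "\u0307"   # classes 220 and 230

def commute(a: str, b: str) -> bool:
    """True if two combining marks may be swapped under canonical equivalence.

    Illustrative rule: both marks must have a non-zero canonical combining
    class, and the two classes must differ.
    """
    ca, cb = unicodedata.combining(a), unicodedata.combining(b)
    return ca != 0 and cb != 0 and ca != cb

def nfd(s: str) -> str:
    """Return the canonical decomposition (NFD) of s."""
    return unicodedata.normalize("NFD", s)

# Same class: the order is significant, so the two spellings differ.
print(commute(DIAERESIS, MACRON))
print(nfd("u" + DIAERESIS + MACRON) == nfd("u" + MACRON + DIAERESIS))

# Different classes: NFD reorders the marks into one canonical order.
print(commute(DOT_BELOW, DOT_ABOVE))
print(nfd("s" + DOT_ABOVE + DOT_BELOW) == nfd("s" + DOT_BELOW + DOT_ABOVE))
```

U+01D6 (u with diaeresis and macron) and U+1E7B (u with macron and diaeresis) are therefore distinct under canonical equivalence, whereas the two orders of dot above and dot below on "s" are the same.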

Re: Internationalised Computer Science Exercises

2018-01-29 Thread Richard Wordingham via Unicode
On Mon, 29 Jan 2018 14:15:04 +0100 Philippe Verdy via Unicode wrote: > No, since the beginning we were talking about matching strings that are > canonically equivalent within regexps. So that searching for a regexp > containing precombined characters or decomposed characters

Re: Internationalised Computer Science Exercises

2018-01-29 Thread Philippe Verdy via Unicode
No, since the beginning we were talking about matching strings that are canonically equivalent within regexps. So that searching for a regexp containing precombined characters or decomposed characters would find them independently of the encoded form (normalized or not) of the input and
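The simplest case of the goal described here, a literal search that ignores whether the text is precomposed or decomposed, can be sketched by normalizing both sides to NFD before comparing. This is an assumption-laden illustration, not the regex scheme discussed in the thread: `canonically_contains` is a hypothetical helper, and it handles plain substrings only, not regular expressions.

```python
import unicodedata

def canonically_contains(haystack: str, needle: str) -> bool:
    """Literal substring search that ignores the normalization form.

    Both strings are converted to NFD, so precomposed and decomposed
    spellings of the same text compare equal.
    """
    nfd_haystack = unicodedata.normalize("NFD", haystack)
    nfd_needle = unicodedata.normalize("NFD", needle)
    return nfd_needle in nfd_haystack

# Precomposed e-acute in the text, decomposed e + combining acute in the query.
print(canonically_contains("caf\u00e9 au lait", "cafe\u0301"))
```

A plain substring test like this does not respect combining-sequence boundaries: searching for "e" also finds the base letter of a decomposed "é". It also cannot find a mark separated from its base by marks of other combining classes, which is the harder problem the thread is concerned with.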

Re: Internationalised Computer Science Exercises

2018-01-29 Thread Andre Schappo via Unicode
icode let alone heard of trace monoid ...and I confess, I knew nothing of trace monoid until I read the below Wikipedia article but then again my ignorance is profound BTW. these internationalised computer science exercises I have written and am writing are not part of any course or module a

Re: Internationalised Computer Science Exercises

2018-01-29 Thread Richard Wordingham via Unicode
On Mon, 29 Jan 2018 07:16:04 +0100 Philippe Verdy via Unicode wrote: > 2018-01-28 23:44 GMT+01:00 Richard Wordingham via Unicode < > unicode@unicode.org>: > > In the search you have in mind, the converted regex for use with NFD > > strings is actually intelligible and

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Philippe Verdy via Unicode
You may also wonder why I describe a regexp that would never match anything but would be handled itself as a successful match: it is a useful extension that allows stopping the analysis early and generalizes the concept of negation (defined in character classes with the minus operator). For

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Philippe Verdy via Unicode
I made an error for the character class notation: "{?optionalquantifier[class]}" should be just "{optionalquantifier[class]}"... So "{?[abc]}" contains 1 item "[abc]" to choose from in any order, it is not quantified explicitly so it matches by default 1 or more, but as there's only one item, it

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Philippe Verdy via Unicode
2018-01-28 23:44 GMT+01:00 Richard Wordingham via Unicode < unicode@unicode.org>: > On Sun, 28 Jan 2018 20:29:28 +0100 > Philippe Verdy via Unicode wrote: > > > 2018-01-28 5:12 GMT+01:00 Richard Wordingham via Unicode < > > unicode@unicode.org>: > > > > > On Sat, 27 Jan 2018

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Richard Wordingham via Unicode
On Sun, 28 Jan 2018 20:29:28 +0100 Philippe Verdy via Unicode wrote: > 2018-01-28 5:12 GMT+01:00 Richard Wordingham via Unicode < > unicode@unicode.org>: > > > On Sat, 27 Jan 2018 14:13:40 -0800 The theory > > of regular expressions (though you may not think that

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Philippe Verdy via Unicode
Note that for finding occurrences of simpler combining sequences such as finding the regexp is simpler: [[ [^[[:cc=0:]]] - [[:cc=above:]] ]] * The central character class allows 53 distinct combining classes, and the maximum match length is 2+53=55 characters. If Unicode assigns new combining

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Philippe Verdy via Unicode
Typo, the full regexp has undesired asterisks: [[ [^[[:cc=0:]]] - [[:cc=above:][:cc=below:]] ]] * ( [[ [^[[:cc=0:]]] - [[:cc=above:][:cc=below:]] ]] * | [[ [^[[:cc=0:]]] - [[:cc=above:][:cc=below:]] ]] * < COMBINING CIRCUMFLEX> 2018-01-28 20:29 GMT+01:00 Philippe Verdy

Re: Internationalised Computer Science Exercises

2018-01-28 Thread Philippe Verdy via Unicode
2018-01-28 5:12 GMT+01:00 Richard Wordingham via Unicode < unicode@unicode.org>: > On Sat, 27 Jan 2018 14:13:40 -0800 The theory > of regular expressions (though you may not think that mathematical > regular expressions matter) extends to trace monoids, with the > disturbing exception that the

Re: Internationalised Computer Science Exercises

2018-01-27 Thread Richard Wordingham via Unicode
On Sat, 27 Jan 2018 14:13:40 -0800 Shervin Afshar wrote: > On Mon, Jan 22, 2018 at 2:08 PM, Richard Wordingham via Unicode < > unicode@unicode.org> wrote: > > On Mon, 22 Jan 2018 at 16:39:57, Andre Schappo via Unicode < > > unicode@unicode.org> wrote: > > > By

Re: Internationalised Computer Science Exercises

2018-01-27 Thread Shervin Afshar via Unicode
On Mon, Jan 22, 2018 at 2:08 PM, Richard Wordingham via Unicode < unicode@unicode.org> wrote: > On Mon, 22 Jan 2018 at 16:39:57, Andre Schappo via Unicode < > unicode@unicode.org> wrote: > > By way of example, one programming challenge I set to students a > > couple of weeks ago involves

Re: Internationalised Computer Science Exercises

2018-01-22 Thread Richard Wordingham via Unicode
On Mon, 22 Jan 2018 18:55:16 +0100 Frédéric Grosshans via Unicode wrote: > A simple challenge is to write a function which localizes numbers in a > script having decimal digits or parses them (i.e. which have > characters with property Numeric_Type=Decimal, as explained in
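The challenge quoted here could be approached roughly as follows with Python's standard `unicodedata` module. This is a sketch only: the function names `parse_decimal` and `localize_number` are invented for illustration, and the code assumes a non-negative integer with no grouping separators or sign handling.

```python
import unicodedata

def parse_decimal(text: str) -> int:
    """Parse a run of decimal digits from any script into an int."""
    if not text:
        raise ValueError("empty string")
    value = 0
    for ch in text:
        # unicodedata.decimal returns the digit value for characters with
        # Numeric_Type=Decimal, or the supplied default otherwise.
        digit = unicodedata.decimal(ch, None)
        if digit is None:
            raise ValueError(f"not a decimal digit: {ch!r}")
        value = value * 10 + digit
    return value

def localize_number(n: int, zero: str) -> str:
    """Render non-negative n with the digits of the script whose zero is `zero`.

    Relies on each script's decimal digits 0-9 being encoded contiguously.
    """
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    if unicodedata.decimal(zero, None) != 0:
        raise ValueError(f"{zero!r} is not a decimal digit zero")
    return "".join(chr(ord(zero) + int(d)) for d in str(n))

print(parse_decimal("\u0661\u0662\u0663"))   # Arabic-Indic digits one, two, three
print(localize_number(2018, "\u0966"))       # Devanagari digits
```

Passing the script's zero digit, rather than a script name, keeps the sketch free of any lookup table; a fuller solution would also need to deal with signs, separators and scripts whose numerals are not positional decimal digits.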

Re: Internationalised Computer Science Exercises

2018-01-22 Thread Richard Wordingham via Unicode
On Mon, 22 Jan 2018 16:39:57 + Andre Schappo via Unicode wrote: > By way of example, one programming challenge I set to students a > couple of weeks ago involves diacritics. Please see > jsfiddle.net/coas/wda45gLp Did any of them

Re: Internationalised Computer Science Exercises

2018-01-22 Thread Frédéric Grosshans via Unicode
Le 22/01/2018 à 17:39, Andre Schappo via Unicode a écrit : By way of example, one programming challenge I set to students a couple of weeks ago involves diacritics. Please see jsfiddle.net/coas/wda45gLp There is huge potential for some really

Internationalised Computer Science Exercises

2018-01-22 Thread Andre Schappo via Unicode
will start in October but students will be choosing their project some time around June. The project involves producing a set of internationalised Computer Science exercises for both educators and students. Details at schappo.blogspot.co.uk/2018/01/computer-science-internationalization_21.html