Re: Accented Characters and Counting Syllables

2014-12-08 Thread Nordlöw
On Sunday, 7 December 2014 at 15:47:45 UTC, H. S. Teoh via Digitalmars-d-learn wrote: Ok, thanks. I just noticed that byGrapheme() lacks bidirectional access. Further, it also lacks a graphemeStrideBack() to complement graphemeStride(), similar to stride() and strideBack(). Is this difficult
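For readers following along, here is a minimal sketch of forward stepping with std.uni.graphemeStride, the function mentioned above. The helper name graphemeSlices is hypothetical (it is not part of Phobos), and the sample string is only an illustration.

```d
import std.uni : graphemeStride;

// Split s into one slice per grapheme cluster by stepping forward
// with graphemeStride. graphemeSlices is a hypothetical helper name.
string[] graphemeSlices(string s)
{
    string[] result;
    size_t i = 0;
    while (i < s.length)
    {
        // Number of code units in the grapheme that starts at index i.
        immutable n = graphemeStride(s, i);
        result ~= s[i .. i + n];
        i += n;
    }
    return result;
}

void main()
{
    // 'A' followed by U+030A COMBINING RING ABOVE, then "rhus".
    auto parts = graphemeSlices("A\u030Arhus");
    assert(parts.length == 5);
    assert(parts[0] == "A\u030A");
    assert(parts[$ - 1] == "s");
}
```

Without a backward stride function, one workaround is to collect the slices first and then walk the array in reverse, at the cost of O(n) extra memory.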

Re: Accented Characters and Counting Syllables

2014-12-08 Thread Nordlöw
On Monday, 8 December 2014 at 14:57:06 UTC, Nordlöw wrote: What's the best source of information for these algorithms? Is it certain that grapheme iteration is backwards-iterable by definition? I guess https://en.wikipedia.org/wiki/Combining_character could be a good start.

Re: Accented Characters and Counting Syllables

2014-12-07 Thread anonymous via Digitalmars-d-learn
On Saturday, 6 December 2014 at 22:37:19 UTC, Nordlöw wrote: Given the fact that static assert("é".length == 2); I was surprised that static assert("é".byCodeUnit.length == 2); static assert("é".byCodePoint.length == 2); string already iterates over code points. So byCodePoint doesn't

Re: Accented Characters and Counting Syllables

2014-12-07 Thread Marc Schütz via Digitalmars-d-learn
On Saturday, 6 December 2014 at 22:37:19 UTC, Nordlöw wrote: static assert("é".byCodePoint.length == 2); Huh? Why is byCodePoint.length even defined?

Re: Accented Characters and Counting Syllables

2014-12-07 Thread John Colvin via Digitalmars-d-learn
On Sunday, 7 December 2014 at 13:24:28 UTC, Marc Schütz wrote: On Saturday, 6 December 2014 at 22:37:19 UTC, Nordlöw wrote: static assert("é".byCodePoint.length == 2); Huh? Why is byCodePoint.length even defined? Because string has ElementType dchar (i.e. it already iterates by

Re: Accented Characters and Counting Syllables

2014-12-07 Thread via Digitalmars-d-learn
On Sunday, 7 December 2014 at 13:24:28 UTC, Marc Schütz wrote: On Saturday, 6 December 2014 at 22:37:19 UTC, Nordlöw wrote: static assert("é".byCodePoint.length == 2); Huh? Why is byCodePoint.length even defined? import std.uni; pragma(msg, typeof("é".byCodePoint)); = string Something's
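The messages above turn on the difference between a string's .length (UTF-8 code units) and the number of code points it decodes to. Here is a minimal sketch of that difference. The helper names codeUnitCount and codePointCount are hypothetical, and the sketch assumes the "é" in the thread is the precomposed code point U+00E9, which is consistent with the reported length of 2.

```d
import std.range : walkLength;
import std.uni : byCodePoint;

// UTF-8 code units: this is what .length reports for a D string.
size_t codeUnitCount(string s) { return s.length; }

// Code points: counted by walking the range, because a narrow string
// has no constant-time code point count.
size_t codePointCount(string s) { return s.byCodePoint.walkLength; }

void main()
{
    string precomposed = "\u00E9"; // é as the single code point U+00E9
    assert(codeUnitCount(precomposed) == 2);  // two bytes in UTF-8
    assert(codePointCount(precomposed) == 1); // one code point
}
```

If byCodePoint hands a string back unchanged, as the pragma(msg) output above suggests it did at the time, then .length on the result is still the code unit count. That would explain the 2 in the original assertion.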

Re: Accented Characters and Counting Syllables

2014-12-07 Thread Nordlöw
On Saturday, 6 December 2014 at 23:11:49 UTC, H. S. Teoh via Digitalmars-d-learn wrote: This is a Unicode issue. What you want is neither byCodeUnit nor byCodePoint, but byGrapheme. A grapheme is the Unicode equivalent of what lay people would call a character. A Unicode character (or more

Re: Accented Characters and Counting Syllables

2014-12-07 Thread H. S. Teoh via Digitalmars-d-learn
On Sun, Dec 07, 2014 at 02:30:13PM +, Nordlöw via Digitalmars-d-learn wrote: On Saturday, 6 December 2014 at 23:11:49 UTC, H. S. Teoh via Digitalmars-d-learn wrote: This is a Unicode issue. What you want is neither byCodeUnit nor byCodePoint, but byGrapheme. A grapheme is the Unicode
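A minimal sketch of the byGrapheme suggestion, for illustration only. The helper name graphemeCount is hypothetical, and the sample strings are assumptions chosen to show a precomposed and a decomposed "é".

```d
import std.range : walkLength;
import std.uni : byGrapheme;

// Number of grapheme clusters (user-perceived characters) in s.
size_t graphemeCount(string s) { return s.byGrapheme.walkLength; }

void main()
{
    string precomposed = "\u00E9"; // é as one code point
    string decomposed = "e\u0301"; // 'e' + U+0301 COMBINING ACUTE ACCENT

    assert(decomposed.length == 3);     // UTF-8 code units
    assert(decomposed.walkLength == 2); // code points
    assert(graphemeCount(precomposed) == 1);
    assert(graphemeCount(decomposed) == 1);
    assert(graphemeCount("cafe\u0301") == 4);
}
```

Both spellings of "é" count as one grapheme, even though their code unit and code point counts differ.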

Re: Accented Characters and Counting Syllables

2014-12-06 Thread H. S. Teoh via Digitalmars-d-learn
On Sat, Dec 06, 2014 at 10:37:17PM +, Nordlöw via Digitalmars-d-learn wrote: Given the fact that static assert("é".length == 2); I was surprised that static assert("é".byCodeUnit.length == 2); static assert("é".byCodePoint.length == 2); Isn't there a way to iterate over