Re: Grapheme clusters and east asian width

2015-09-17 Thread Richard Wordingham
On Thu, 17 Sep 2015 19:30:41 +0300 Eli Zaretskii wrote: > > Date: Thu, 17 Sep 2015 17:25:34 +0100 > > From: Daniel Bünzli > > Cc: richard.wording...@ntlworld.com, unicode@unicode.org > > > > On Thursday, 17 September 2015 at 17:24, Eli Zaretskii wrote: > > > > Is there a formal definition of the

Re: Grapheme clusters and east asian width

2015-09-17 Thread Eli Zaretskii
> Date: Thu, 17 Sep 2015 17:25:34 +0100 > From: Daniel Bünzli > Cc: richard.wording...@ntlworld.com, unicode@unicode.org > > On Thursday, 17 September 2015 at 17:24, Eli Zaretskii wrote: > > > Is there a formal definition of the algorithm used? This [1] is not very > > > helpful. > > > > They

Re: Grapheme clusters and east asian width

2015-09-17 Thread Daniel Bünzli
On Thursday, 17 September 2015 at 17:24, Eli Zaretskii wrote: > > Is there a formal definition of the algorithm used? This [1] is not very > > helpful. > > They just use a table of values, AFAIK. But is it standardized, or does everyone have their own table? Daniel

Re: Grapheme clusters and east asian width

2015-09-17 Thread Eli Zaretskii
> Date: Thu, 17 Sep 2015 16:51:03 +0100 > From: Daniel Bünzli > Cc: Richard Wordingham , unicode@unicode.org > > > > Date: Thu, 17 Sep 2015 13:27:31 +0100 > > > From: Richard Wordingham > > (mailto:richard.wording...@ntlworld.com)> > > > > > > The best estimator is probably the POSIX function

Re: Grapheme clusters and east asian width

2015-09-17 Thread Daniel Bünzli
On Thursday, 17 September 2015 at 15:47, Eli Zaretskii wrote: > > Date: Thu, 17 Sep 2015 13:27:31 +0100 > > From: Richard Wordingham > (mailto:richard.wording...@ntlworld.com)> > > > > The best estimator is probably the POSIX function wcswidth(). > Only on glibc-based systems, I'm quite sure.

Re: Grapheme clusters and east asian width

2015-09-17 Thread Eli Zaretskii
> Date: Thu, 17 Sep 2015 13:27:31 +0100 > From: Richard Wordingham > > The best estimator is probably the POSIX function wcswidth(). Only on glibc-based systems, I'm quite sure. > The > terminal emulator might actually use that function to do its layout. > Some do. If you need accuracy, you may

Re: Grapheme clusters and east asian width

2015-09-17 Thread Richard Wordingham
On Thu, 17 Sep 2015 10:00:29 +0100 Daniel Bünzli wrote: > On Thursday, 17 September 2015 at 02:25, Richard Wordingham wrote: > > If you're trying to work out what a particular emulator will do, the > > starting point is its documentation. > Unfortunately *many* emulators. The best estimator

Re: Grapheme clusters and east asian width

2015-09-17 Thread Daniel Bünzli
On Thursday, 17 September 2015 at 02:25, Richard Wordingham wrote: > Are you actually trying to work out how a terminal emulator someone else > wrote will position > characters? Yes. Basically, given a, let's say, single-line UTF-8 string to output to, let's say, an ANSI tty, I'd like to compute

Re: Grapheme clusters and east asian width

2015-09-16 Thread Richard Wordingham
On Wed, 16 Sep 2015 22:34:17 +0100 Daniel Bünzli wrote: > On Wednesday, 16 September 2015 at 20:33, Richard Wordingham wrote: > > Have you addressed the issue of Indic scripts? There are > > discontiguous grapheme clusters composed of indecomposable code > > points (e.g. U+17C4 KHMER VOWEL SIGN

Re: Grapheme clusters and east asian width

2015-09-16 Thread Richard Wordingham
On Wed, 16 Sep 2015 22:56:42 +0100 Daniel Bünzli wrote: > On Wednesday, 16 September 2015 at 22:14, Asmus Freytag (t) wrote: > > "N" doesn't mean "narrow" but "neutral" - that is, the width is > > given by other considerations. > > Ah right! Thanks. Narrow is Na. > > So a refined algorithm w

Re: Grapheme clusters and east asian width

2015-09-16 Thread Daniel Bünzli
On Wednesday, 16 September 2015 at 22:14, Asmus Freytag (t) wrote: > "N" doesn't mean "narrow" but "neutral" - that is, the width is given by > other considerations. Ah right! Thanks. Narrow is Na. So a refined algorithm would be to actually do the summation in each grapheme cluster as I ini
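
To make the N versus Na distinction and the per-scalar summation concrete, here is a rough Python sketch. It is only an illustration: the rest of the message above is truncated, so this is not the author's algorithm. The function names eaw_categories and summed_width are hypothetical, and counting N, Na, H and A as one column and W and F as two is an assumption rather than anything the thread settles.

```python
import unicodedata

def eaw_categories(cluster: str) -> list[str]:
    """Return the East Asian Width category of each scalar value in a cluster."""
    return [unicodedata.east_asian_width(ch) for ch in cluster]

def summed_width(cluster: str) -> int:
    """Sum a per-scalar column estimate over one grapheme cluster.

    Assumption: W (wide) and F (fullwidth) count as two columns; every other
    category, including N (neutral) and Na (narrow), counts as one.
    """
    return sum(2 if cat in ("W", "F") else 1 for cat in eaw_categories(cluster))

if __name__ == "__main__":
    # Print categories and the summed estimate for a few sample clusters.
    for sample in ["\uac01", "a", "\u1100\u1161\u11a8"]:
        print(repr(sample), eaw_categories(sample), summed_width(sample))
```

Because every non-wide scalar value adds a column, a summation like this can over-count multi-scalar clusters, which is the kind of result the thread reports for a (W, N, N) sequence.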

Re: Grapheme clusters and east asian width

2015-09-16 Thread Daniel Bünzli
On Wednesday, 16 September 2015 at 21:27, Dominikus Dittes Scherkl wrote: > Why add them up? > I think every grapheme cluster of hangul syllables would have simply > width 2 - that is the concept of CJK characters. I don't personally know how CJK characters behave in general w.r.t. width,

Re: Grapheme clusters and east asian width

2015-09-16 Thread Asmus Freytag (t)
On 9/15/2015 6:45 PM, Daniel Bünzli wrote: Hello, Is there any guidance on how to combine the information given by grapheme clusters and the east asian width property to do fixed-width layouts in terminal emulators? For example if we have: U+AC01 ( 각 ) HA

Re: Grapheme clusters and east asian width

2015-09-16 Thread Dominikus Dittes Scherkl
On 16.09.2015 at 03:45, Daniel Bünzli wrote: > Hello, > > Is there any guidance on how to combine the information given by > grapheme clusters and the east asian width property to do fixed-width > layouts in terminal emulators? > > For example if we have: > > U+AC01 ( 각 ) HANGUL SYLLABLE GAG >

Re: Grapheme clusters and east asian width

2015-09-16 Thread Richard Wordingham
On Wed, 16 Sep 2015 02:45:27 +0100 Daniel Bünzli wrote: > This will delimit a single grapheme cluster, but if I try to add up > their east asian widths (W, N, N), this would result in 4 columns. > Does something naïve like looking up only the east asian width of the > first scalar value in the g
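
A minimal Python sketch of the naive approach asked about above: look up the East Asian Width of only the first scalar value of each grapheme cluster. It assumes the clusters have already been segmented elsewhere (Python's standard library has no grapheme segmenter). Mapping W and F to two columns and everything else to one is an assumption, not something the thread establishes, and a given terminal emulator may well disagree. The names first_scalar_width and line_width are hypothetical.

```python
import unicodedata

def first_scalar_width(cluster: str) -> int:
    """Estimate the columns of one grapheme cluster from its first scalar value."""
    if not cluster:
        return 0
    category = unicodedata.east_asian_width(cluster[0])
    # Assumption: wide and fullwidth take two columns, everything else one.
    return 2 if category in ("W", "F") else 1

def line_width(clusters: list[str]) -> int:
    """Sum the estimates over pre-segmented grapheme clusters."""
    return sum(first_scalar_width(c) for c in clusters)

if __name__ == "__main__":
    # One precomposed syllable, one conjoining-jamo cluster, one ASCII letter.
    clusters = ["\uac01", "\u1100\u1161\u11a8", "a"]
    print(line_width(clusters))
```

With this rule a cluster that starts with a wide scalar value is counted once as two columns, regardless of how many scalar values follow it.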

Re: Grapheme clusters and east asian width

2015-09-16 Thread Daniel Bünzli
On Wednesday, 16 September 2015 at 18:10, Edwin Hoogerbeets wrote: > Have you looked into the Unicode Normalization Algorithm? Since in general a precomposed character cannot always be found, I'll still need to apply the Unicode segmentation algorithm for finding grapheme clusters and I'd rather

Grapheme clusters and east asian width

2015-09-15 Thread Daniel Bünzli
Hello, Is there any guidance on how to combine the information given by grapheme clusters and the east asian width property to do fixed-width layouts in terminal emulators? For example if we have: U+AC01 ( 각 ) HANGUL SYLLABLE GAG This will delimit a single grapheme cluster with east a