Re: Character folding in text editors

2016-02-21 Thread Eli Zaretskii
> From: "Doug Ewell"
> Cc:
> Date: Sun, 21 Feb 2016 10:53:23 -0700
>
> > Given that the feature can be turned off easily, do you think that it
> > will nonetheless be useful, even though language-dependent parts are
> > not available?
>
> It's probably a lot better than no folding. Just be prep

Re: Character folding in text editors

2016-02-21 Thread Asmus Freytag (t)
On 2/21/2016 8:22 AM, Eli Zaretskii wrote: From: "Asmus Freytag (t)" Date: Sat, 20 Feb 2016 14:10:04 -0800 What about language-independent character-folding: where in the Unicode database is the data for that? Unico

Re: Character folding in text editors

2016-02-21 Thread Doug Ewell
Eli Zaretskii wrote:
> > About the closest approximation you can get using Unicode data alone (not CLDR) is to normalize to NFD, then ignore the combining diacritics.
>
> This is what Emacs currently does, IIUC what you say. The NFD normalization uses the decomposition data included with UnicodeData.
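
The approximation described above (normalize to NFD, then ignore the combining diacritics) can be sketched in Python with the standard unicodedata module. This is only an illustration under stated assumptions: the function name fold_diacritics is hypothetical, "combining diacritics" is taken to mean characters with a nonzero canonical combining class, and none of this is meant to show how Emacs actually implements the feature.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Hypothetical helper: language-independent diacritic folding.

    Assumption: a "combining diacritic" is any character whose canonical
    combining class is nonzero after NFD normalization.
    """
    # Step 1: canonical decomposition, e.g. U+00E9 becomes U+0065 U+0301.
    decomposed = unicodedata.normalize("NFD", text)
    # Step 2: keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)


if __name__ == "__main__":
    for sample in ("résumé", "Ångström", "søster", "łódź"):
        print(sample, "->", fold_diacritics(sample))
```

Letters such as ø and ł have no canonical decomposition in UnicodeData.txt, so they pass through unchanged; only the precomposed letters that do decompose (é, Å, ö, ó, ź) lose their marks. That gap is one reason this is only an approximation of character folding.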

Additional decompositions in decomps.txt

2016-02-21 Thread Eli Zaretskii
This question is separate from, though related to, the "Character folding in text editors" thread.

The UCA database includes the file decomps.txt, which is said to be based on the normative properties:

# The decompositions used in the generation of DUCET are loosely based
# on the normative d

Re: Character folding in text editors

2016-02-21 Thread Eli Zaretskii
Btw, are there any editors out there which support similar features? If so, can someone please point to them, and perhaps provide a short summary of the features they provide and how they are implemented? Thanks.

Re: Character folding in text editors

2016-02-21 Thread Eli Zaretskii
> From: Mark Davis ☕️
> Date: Sun, 21 Feb 2016 11:47:28 +0100
> Cc: Unicode Public
>
> If you don't use ICU, you can also use the CLDR data directly, but you'll
> have to parse it yourself. You'd start with the root locale, then add in
> the mappings from the children (eg de.xml). The parsing is

Re: Character folding in text editors

2016-02-21 Thread Eli Zaretskii
> From: Philippe Verdy
> Date: Sun, 21 Feb 2016 00:19:19 +0100
> Cc: unicode Unicode Discussion
>
> Unless we have case folding tailored by language, you cannot do that based
> on the Unicode database alone.
>
> However CLDR provides tailored data about collation.
>
> From my point of view, i

Re: Character folding in text editors

2016-02-21 Thread Eli Zaretskii
> From: Philippe Verdy
> Date: Sun, 21 Feb 2016 00:19:19 +0100
> Cc: unicode Unicode Discussion
>
> It should also be noted that some kind of "folding" described/desired by
> Elias will likely fail his expectations, even when using collation data in
> CLDR tailored per language. I don't think t

Re: Character folding in text editors

2016-02-21 Thread Eli Zaretskii
> From: "Asmus Freytag (t)"
> Date: Sat, 20 Feb 2016 14:10:04 -0800
>
> > What about language-independent character-folding: where in the
> > Unicode database is the data for that?
>
> Unicode, even CLDR, doesn't nearly have enough data for the purpose.

This seems to contradict what others

Re: Character folding in text editors

2016-02-21 Thread Eli Zaretskii
> From: "Doug Ewell"
> Date: Sat, 20 Feb 2016 14:43:15 -0700
>
> > What about language-independent character-folding: where in the
> > Unicode database is the data for that?
>
> The OP kind of alluded to that: there is no such thing really as
> language-independent character folding. Emacs is c

Re: Character folding in text editors

2016-02-21 Thread Mark Davis ☕️
On Sat, Feb 20, 2016 at 11:10 PM, Asmus Freytag (t) wrote:
> Unicode, even CLDR, doesn't nearly have enough data for the purpose.
> (and as a corollary of what Elias points out, it's likely to annoy users
> of every language, in that it would fold essential and non-essential
> distinctions indisc