Ken Whistler <kenwhistler at att dot net> wrote:

>> A more
>> robust approach for their purposes might be to teach ρ to exclude
>> combining characters (gc=Mn) when counting the "size" of a string.
>
> And it seems to me that that is *very* unlikely to happen, precisely
> because ρ is so deeply embedded in the array and vector logic of APL.
>
> That is counting the data size of arrays of "characters" (i.e., code
> units). If somebody tried to somehow teach ρ to do something different
> about characters, changing the concept of array of code units into
> something more akin to what we think of as Unicode strings, that
> would end up being a *different* language -- not APL!
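
To make the two notions of "size" concrete, here is a minimal sketch in Python (not APL, and not a claim about how any APL implementation works). It assumes the underbar is represented as U+0332 COMBINING LOW LINE, which has gc=Mn; the function names are made up for illustration. Note that Python's len() counts code points rather than code units, which makes no difference for this BMP-only example.

```python
import unicodedata


def code_point_length(s: str) -> int:
    """Count every element of the string, combining marks included."""
    return len(s)


def length_without_nonspacing_marks(s: str) -> int:
    """Count elements, skipping characters whose General_Category is Mn."""
    return sum(1 for ch in s if unicodedata.category(ch) != "Mn")


# 'A' followed by U+0332 COMBINING LOW LINE (an assumed encoding of A-underbar)
a_underbar = "A\u0332"

print(code_point_length(a_underbar))                # 2
print(length_without_nonspacing_marks(a_underbar))  # 1
```

The first function mirrors the element-counting behavior described in the quoted text; the second is the "exclude gc=Mn" variant that was suggested and judged unlikely to be adopted.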
Then we're back to the central point that Alex Weiner originally
expressed, in arguing for the encoding of precomposed letters with
underbar:

> The string length functionality would view an 'A' code point combined
> with an '_' code point as an item that has two elements, while
> something that looks like 'A' Should be atomic, and return a length
> of one.

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸

