On Tue, 2014-07-08 at 12:38 +0800, Elias Mårtenson wrote:
> Right, but just having a "plain" collating order for Unicode would
> require me to pass a million-element array (⎕UCS¨⍳1114111) as left
> argument to grade.

I guess you could do that if you needed to impose a complete collating
order upon every code point. Most applications would be content, I
think, with sorting alphanumerics (in all the languages of interest)
plus common punctuation.

> That said, I can't even get dyadic grade to work at all, but that's a
> separate issue.

Here's a working example.

∇z←suffix CF⍙ls path;dir
  ⍝ Return a character matrix of directory entries. Left argument,
  ⍝ if present, filters entries by suffix.
  z←0 0⍴''
  dir←CF¯FILEIO[28] path
  dir←(⍳↑⍴dir) ⎕io⌷dir
  ⍎(0≠⎕nc 'suffix')/'dir←((⊂,suffix)≡¨(-⍴,suffix)↑¨dir)/dir'
  →(0=⍴dir)/0
  dir←⊃dir
  z←dir[(⎕ucs ⎕io-⍨⍳256)⍋dir;]
∇

CF¯FILEIO is the bound name of the lib_file_io native function. On the
last line, dir is a character matrix.

> Regards,
> Elias
>
> On 8 July 2014 12:27, David B. Lamkins <dlamk...@gmail.com> wrote:
>     The problem with generating a permutation vector for an
>     "arbitrary" Unicode string is still a problem of collating
>     order. There is no inherent order in Unicode; someone has to
>     decide on what makes sense as a collating order for the subset
>     of code points used by the application.
>
>     You should use ⎕ucs with a vector of code points to define your
>     own collating order for Unicode; any code points not explicitly
>     specified in the collating order will sort to the end.
>
>     For example (and this is an easy case) you can use this to
>     specify a default collating order (based upon ordinal value of
>     the code points themselves) for the 8-bit ASCII subset:
>
>           ⎕ucs ⎕io-⍨⍳256
>
>     On Tue, 2014-07-08 at 12:09 +0800, Elias Mårtenson wrote:
> > Dyadic grade doesn't make much sense in the context of Unicode
> > though. How do you grade an arbitrary Unicode string?
> >
> > That issue is there even if we completely disregard all the
> > other Unicode-related collating issues.
> >
> > Regards,
> > Elias
> >
> > On 8 July 2014 12:00, David B. Lamkins <dlamk...@gmail.com> wrote:
> >     Check my follow-up post.
> >
> >     I'm fairly certain that the issue is whether monadic grade
> >     applied to a list of strings should do anything but signal a
> >     domain error. The ISO spec says that monadic grade is defined
> >     only on numeric arguments.
> >
> >     My test case appears to have monadic grade treating strings as
> >     if they encode numbers in a sufficiently large base.
> >
> >     If you want to sort strings, use dyadic grade. The left
> >     argument specifies a collating sequence.
> >
> >     On Tue, 2014-07-08 at 11:43 +0800, Elias Mårtenson wrote:
> > > Ordering by size first makes very little sense to me. It makes
> > > it very hard to sort any list of strings.
> > >
> > > I was hoping that the following would have done so, but it also
> > > suffers from the "length first" issue:
> > >
> > >       z[⍋ ⎕UCS¨ z←'aa' 'xx' 'aaa' 'xxx']
> > > aa xx aaa xxx
> > >
> > > What is the proper way to sort strings given the existing
> > > semantics of grade?
> > >
> > > Regards,
> > > Elias
> > >
> > > On 8 July 2014 02:34, David Lamkins <da...@lamkins.net> wrote:
> > >     Looking at the spec, it seems that monadic grade is defined
> > >     only for numeric data.
> > >
> > >     That leaves open the question of whether my example should
> > >     have signaled a domain error.
> > >
> > >     On Mon, Jul 7, 2014 at 11:25 AM, David Lamkins
> > >     <da...@lamkins.net> wrote:
> > >         Given a list of character vectors (and scalars), grade
> > >         appears to generate the permutation vector first by
> > >         length then by content.
> > >
> > >               ⍋'aaa' 'xx' 'y' 'bbb' 'cc'
> > >         3 5 2 1 4
> > >
> > >         This seems counterintuitive. It seems as if ⍋ treats
> > >         character strings like numbers. Is this a bug?
> > >
> > >         --
> > >         "The secret to creativity is knowing how to hide your
> > >         sources."
> > >         Albert Einstein
> > >
> > >         http://soundcloud.com/davidlamkins
> > >         http://reverbnation.com/lamkins
> > >         http://reverbnation.com/lcw
> > >         http://lamkins-guitar.com/
> > >         http://lamkins.net/
> > >         http://successful-lisp.com/
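
P.S. For the "length first" list quoted above, here is a minimal sketch
of the same dyadic-grade technique applied to a nested vector of
strings. It assumes GNU APL, where ⊃ mixes the list into a blank-padded
character matrix (as dir←⊃dir does in the function above). The names
z, cs and m are only illustrative.

```apl
      z←'aa' 'xx' 'aaa' 'xxx'
      cs←⎕ucs ⎕io-⍨⍳256   ⍝ collating sequence: code points 0..255 in ordinal order
      m←⊃z                 ⍝ mix into a character matrix, short rows padded with blanks
      z[cs⍋m]              ⍝ grade the rows, then reorder the original strings
```

Because blank collates ahead of the letters in this sequence, shorter
strings should sort before their extensions (aa, aaa, xx, xxx) instead
of being grouped by length. Any character outside the 256 listed code
points would sort to the end, as noted in the quoted message.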