To clarify, I tried the following:

      (⎕UCS¨⍳1114111) ⍋ 'foo' 'bar' 'test'
DOMAIN ERROR
      (⎕UCS¨⍳1114111)⍋'foo' 'bar' 'test'
      ^              ^

Note of course that this is pretty insane, and there should be an easier
way to do this.
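
One possible way around the million-element left argument, offered as
an untested sketch rather than a known-good recipe: build the collating
sequence only from the characters that actually occur in the data. This
assumes APL2-style primitives as in GNU APL, where ⊃ pads the strings
into a simple character matrix and dyadic ⍋ expects a simple character
array as its right argument (which may also be why the nested argument
above was rejected). The names n and cs are placeholders, not anything
from the interpreter or the spec.

      z←'foo' 'bar' 'test'
      n←⎕UCS ' ',∊z     ⍝ code points in use, plus the blank that ⊃ pads with
      cs←⎕UCS n[⍋n]     ⍝ those characters in code point order
      z[cs⍋⊃z]          ⍝ grade the padded matrix, then reorder the strings

Because the blank has a lower code point than any other printable
character, a string should sort before any longer string that it is a
prefix of. Strings that differ only in trailing blanks would compare
equal under this approach.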

Regards,
Elias


On 8 July 2014 12:38, Elias Mårtenson <loke...@gmail.com> wrote:

> Right, but just having a "plain" collating order for Unicode would require
> me to pass a million-element array (⎕UCS¨⍳1114111) as left argument to
> grade.
>
> That said, I can't even get dyadic grade to work at all, but that's a
> separate issue.
>
> Regards,
> Elias
>
>
> On 8 July 2014 12:27, David B. Lamkins <dlamk...@gmail.com> wrote:
>
>> The problem with generating a permutation vector for an "arbitrary"
>> Unicode string is still a problem of collating order. There is no
>> inherent order in Unicode; someone has to decide on what makes sense as
>> a collating order for the subset of code points used by the application.
>>
>> You should use ⎕ucs with a vector of code points to define your own
>> collating order for Unicode; any code points not explicitly specified in
>> the collating order will sort to the end.
>>
>> For example (and this is an easy case) you can use this to specify a
>> default collating order (based upon ordinal value of the code points
>> themselves) for the 8-bit ASCII subset:
>>
>> ⎕ucs ⎕io-⍨⍳256
>>
>>
>>
>> On Tue, 2014-07-08 at 12:09 +0800, Elias Mårtenson wrote:
>> > Dyadic grade doesn't make much sense in the context of Unicode though.
>> > How do you grade an arbitrary Unicode string?
>> >
>> >
>> > That issue is there even if we completely disregard all the
>> > other Unicode-related collating issues.
>> >
>> >
>> > Regards,
>> > Elias
>> >
>> >
>> > On 8 July 2014 12:00, David B. Lamkins <dlamk...@gmail.com> wrote:
>> >         Check my follow-up post.
>> >
>> >         I'm fairly certain that the issue is whether monadic grade
>> >         applied to a list of strings should do anything but signal a
>> >         domain error. The ISO spec says that monadic grade is defined
>> >         only on numeric arguments.
>> >
>> >         My test case appears to have monadic grade treating strings as
>> >         if they encode numbers in a sufficiently large base.
>> >
>> >         If you want to sort strings, use dyadic grade. The left argument
>> >         specifies a collating sequence.
>> >
>> >         On Tue, 2014-07-08 at 11:43 +0800, Elias Mårtenson wrote:
>> >         > Ordering by size first makes very little sense to me. It
>> >         > makes it very hard to sort any list of strings.
>> >         >
>> >         >
>> >         > I was hoping that the following would have done so, but it
>> >         > also suffers from the "length first" issue:
>> >         >
>> >         >
>> >         >       z[⍋ ⎕UCS¨ z←'aa' 'xx' 'aaa' 'xxx']
>> >         >  aa xx aaa xxx
>> >         >
>> >         >
>> >         > What is the proper way to sort strings given the existing
>> >         > semantics of grade?
>> >         >
>> >         >
>> >         > Regards,
>> >         > Elias
>> >         >
>> >         >
>> >         > On 8 July 2014 02:34, David Lamkins <da...@lamkins.net> wrote:
>> >         >         Looking at the spec, it seems that monadic grade is
>> >         >         defined only for numeric data.
>> >         >
>> >         >
>> >         >         That leaves open the question of whether my example
>> >         >         should have signaled a domain error.
>> >         >
>> >         >
>> >         >
>> >         >         On Mon, Jul 7, 2014 at 11:25 AM, David Lamkins
>> >         >         <da...@lamkins.net> wrote:
>> >         >                 Given a list of character vectors (and scalars),
>> >         >                 grade appears to generate the permutation vector
>> >         >                 first by length then by content.
>> >         >
>> >         >                       ⍋'aaa' 'xx' 'y' 'bbb' 'cc'
>> >         >                 3 5 2 1 4
>> >         >
>> >         >
>> >         >                 This seems counterintuitive. It seems as if ⍋
>> >         >                 treats character strings like numbers. Is this a
>> >         >                 bug?
>> >         >
>> >         >                 --
>> >         >                 "The secret to creativity is knowing how to
>> >         >                 hide your sources."
>> >         >                    Albert Einstein
>> >         >
>> >         >
>> >         >                 http://soundcloud.com/davidlamkins
>> >         >                 http://reverbnation.com/lamkins
>> >         >                 http://reverbnation.com/lcw
>> >         >                 http://lamkins-guitar.com/
>> >         >                 http://lamkins.net/
>> >         >                 http://successful-lisp.com/
>> >         >
>> >         >
>> >         >
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>
