>>>>> "DC" == Damian Conway <[EMAIL PROTECTED]> writes:

  DC>      # Stringifically ascending...
  DC>      @sorted = sort @unsorted;

  DC> or with a single two-argument block/closure (to sort by whatever the
  DC> specified comparator is):

  DC>      # Numerically ascending...
  DC>      @sorted = sort {$^a <=> $^b} @unsorted;

so because that has 2 placeholders, it will match this signature:

     type Comparator   ::= Code(Any, Any) returns Int;

i have to remember that placeholders are really implied args to a code
block and not just variables in the expression.

  DC>      # Namewise stringifically descending case-insensitive...
  DC>      @sorted = sort {lc $^b.name cmp lc $^a.name}
  DC>                     @unsorted;
  DC>      # or...
  DC>      @sorted = sort {$^b.name cmp $^a.name} is insensitive
  DC>                     @unsorted;
  DC>      # or...
  DC>      @sorted = sort {$^a.name cmp $^b.name} is descending is insensitive
  DC>                     @unsorted;

TIMTOWTDI lives on!

  DC>      # Modtimewise numerically ascending...
  DC>      @sorted = sort {-M $^a <=> -M $^b} @unsorted;

  DC>      # Fuzz-ifically...
  DC>      sub fuzzy_cmp($x, $y) returns Int;
  DC>      @sorted = sort &fuzzy_cmp, @unsorted;

ok, so that is recognized as a compare sub due to the 2 arg sig. so
must the sub be defined/declared before the sort code is compiled?

  DC> or with a single one-argument block/closure (to sort according to
  DC> whatever the specified key extractor returns):

  DC>      # Numerically ascending...
  DC>      @sorted = sort {+ $^elem} @unsorted;
  DC>      @sorted = sort {+ $_} @unsorted;

is $^elem special? or just a regular placeholder? i see $_ will be set
to each record as we discussed.

  DC>      # Namewise stringifically descending case-insensitive...
  DC>      @sorted = sort {~ $^elem.name} is descending is insensitive @unsorted;
  DC>      @sorted = sort {lc $^elem.name} is descending @unsorted;
  DC>      @sorted = sort {lc .name} is descending @unsorted;

just getting my p6 chops back. .name is really $_.name so that makes
sense. and $^elem is just a named placeholder for $_ as before?

  DC>      # Key-ifically...
  DC>      sub get_key($elem) {...}
  DC>      @sorted = sort &get_key, @unsorted;

and that is parsed as an extractor code call due to the single arg
sig. again, it appears that it has to be seen before the sort code for
that to work.
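
to make the arity-based dispatch concrete, here is a rough sketch in
Python (not p6, since the syntax above is still a proposal). sort_by is
a made-up helper, and fuzzy_cmp and get_key are stand-ins with invented
bodies. it only shows that one parameter means "key extractor" and two
means "comparator"; it says nothing about whether p6 would resolve this
at compile time or at run time.

```python
import inspect
from functools import cmp_to_key


def sort_by(criterion, items):
    """Sort items with a one-arg key extractor or a two-arg comparator."""
    # The number of declared parameters decides how the callable is used.
    arity = len(inspect.signature(criterion).parameters)
    if arity == 1:
        return sorted(items, key=criterion)
    if arity == 2:
        return sorted(items, key=cmp_to_key(criterion))
    raise TypeError(f"expected 1 or 2 parameters, got {arity}")


def get_key(elem):
    # Hypothetical extractor: sort by string length.
    return len(elem)


def fuzzy_cmp(x, y):
    # Hypothetical comparator: case-insensitive three-way compare.
    a, b = x.lower(), y.lower()
    return (a > b) - (a < b)


words = ["pear", "Fig", "apple"]
by_key = sort_by(get_key, words)    # treated as a key extractor
by_cmp = sort_by(fuzzy_cmp, words)  # treated as a comparator
```

in this sketch the signature is inspected when sort_by is called, so
the callable only has to exist by then.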

  DC> or with a single extractor/comparator pair (to sort according to the
  DC> extracted key, using the specified comparator):

  DC>      # Modtimewise stringifically descending...
  DC>      @sorted = sort {-M}=>{$^b cmp $^a} @unsorted;

so that is a single pair of extractor/comparator. but there is no comma
before @unsorted. is that correct? see below for why i ask that.

  DC>      # Namewise fuzz-ifically...
  DC>      @sorted = sort {.name}=>&fuzzy_cmp @unsorted;

i first parsed that as being wrong and thought the {} should wrap the
whole thing. so that is a pair again of extractor/comparator.

  DC> or with an array of comparators and/or key extractors and/or
  DC> extractor-comparator pairs (to sort according to a cascading list of
  DC> criteria):

  DC>      # Numerically ascending
  DC>      # or else namewise stringifically descending case-insensitive
  DC>      # or else modtimewise numerically ascending
  DC>      # or else namewise fuzz-ifically
  DC>      # or else fuzz-ifically...
  DC>      @sorted = sort [ {+ $^elem},
  DC>                       {$^b.name cmp $^a.name} is insensitive,
  DC>                       {-M},
  DC>                       {.name}=>&fuzzy_cmp,
  DC>                       &fuzzy_cmp,

i see the need for commas in here as it is a list of criteria.

  DC>                     ],

but what about that comma? no other example seems to have one before the
@unsorted stuff.

  DC>                     @unsorted;

  DC> If a key-extractor block returns a number, then C<< <=> >> is used to
  DC> compare those keys. Otherwise C<cmp> is used. In either case, the keys
  DC> extracted by the block are cached within the call to C<sort>, to
  DC> optimize subsequent comparisons against the same element. That is, a
  DC> key-extractor block is only ever called once for each element being
  DC> sorted.

where does the int optimizer come in? just as you had it before in the
extractor code? that will need to be accessible to the optimizer if the
GRT is to work correctly.

i like that the key caching is defined here. we can implement it in
several different ways depending on optimization hints and such. we
could support the ST, GRT and orcish and select the best one for each
sort. or we could have one basic sort and load the others as pragmas or
modules.
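
as a reminder of what "called once for each element" buys us, here is
a minimal ST-style (decorate/sort/undecorate) sketch in Python.
cached_key_sort and counting_len are names made up for this example,
and this is only one of the possible caching strategies, not a claim
about how p6 will do it. (Python's own sorted(key=...) already caches
keys; it is written out by hand here to show the steps.)

```python
from functools import cmp_to_key


def cached_key_sort(items, extract, compare=None):
    """Sort items by extracted key, calling extract once per element."""
    # Decorate: pair each element with its key, computed exactly once.
    decorated = [(extract(item), item) for item in items]
    if compare is None:
        # Sort on the cached key only; the elements are never compared.
        decorated.sort(key=lambda pair: pair[0])
    else:
        # Apply the caller's comparator to the cached keys.
        decorated.sort(key=cmp_to_key(lambda p, q: compare(p[0], q[0])))
    # Undecorate: drop the keys and return the elements in sorted order.
    return [item for _, item in decorated]


calls = []


def counting_len(s):
    # Hypothetical extractor that records every call it receives.
    calls.append(s)
    return len(s)


names = ["delta", "pi", "omega", "mu", "eta"]
result = cached_key_sort(names, counting_len)
```

after the call, calls holds one entry per element, however many
comparisons the sort itself needed.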

  DC> The C<is descending> and C<is insensitive> traits on a key extractor
  DC> or a comparator are detected within the call to C<sort> (or possibly
  DC> by the compiler) and used to modify the case-sensitivity and
  DC> "direction" of any comparison operators used for the corresponding key
  DC> or in the corresponding comparator.

or by reversing the order of the args passed to the comparator code. i
see you want to keep the alpha order of placeholders as well as
descending. i just don't see the need for both but i can live with it.

so are those traits only allowed/meaningful on comparison blocks?
or will an extraction block take them (extracting in insensitive mode
makes as much sense as comparing in that mode, in fact it would be
faster). so that is a little side issue. the trait insensitive is on the
comparator block but is best used by the extractor (which may not be
provided). the internals will need to be able to handle that. i assume
they will be able to see the trait on the comparator block and use it
for extraction. same with descending which is needed for extraction in
the GRT but not for orcish or ST. you have examples which show the
traits on either the extractor or comparator code blocks. that implies
that the guts can get those flags from either and use them as needed.
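
here is a small Python sketch of what i mean by using the insensitive
flag at extraction time: the key is case-folded once when it is
extracted, rather than on every comparison. sort_with_traits and its
keyword flags are invented for illustration and are not the proposed
p6 interface. descending is simply handed to the sort here; in a GRT
it would have to be folded into the packed key itself, which this
sketch does not attempt.

```python
def sort_with_traits(items, extract, descending=False, insensitive=False):
    """Sort by extracted key, applying the traits during extraction."""

    def traited_key(item):
        key = extract(item)
        # Fold case once per element instead of once per comparison.
        if insensitive and isinstance(key, str):
            key = key.casefold()
        return key

    # Direction is left to the sort itself in this simplified version.
    return sorted(items, key=traited_key, reverse=descending)


names = ["bob", "Alice", "carol", "Dave"]
plain = sort_with_traits(names, lambda s: s)
folded = sort_with_traits(names, lambda s: s, insensitive=True)
folded_desc = sort_with_traits(
    names, lambda s: s, descending=True, insensitive=True
)
```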

  DC> Note that ambiguous cases like:

  DC>      @sorted = sort {-M}, {-M}, {-M};

  DC> will be dispatched according to the normal multiple dispatch semantics
  DC> (which will mean that they will mean):

  DC>      @sorted = sort {-M}          <== {-M}, {-M};

  DC> and so one would need to write:

  DC>      @sorted = sort <== {-M}, {-M}, {-M};

that clears up that one for me.

this is very good overall (notwithstanding my few nits and
questions). it will satisfy all sorts of sort users, even those who are
out of sorts.

thanx,

uri

-- 
Uri Guttman  ------  [EMAIL PROTECTED]  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org
