Uri Guttman writes:
> >>>>> "DC" == Damian Conway <[EMAIL PROTECTED]> writes:
> DC> # Modtimewise numerically ascending...
> DC> @sorted = sort {-M $^a <=> -M $^b} @unsorted;
>
> DC> # Fuzz-ifically...
> DC> sub fuzzy_cmp($x, $y) returns Int;
> DC> @sorted = sort &fuzzy_cmp, @unsorted;
>
> ok, so that is recognized as a compare sub due to the 2 arg sig. so must
> the sub be defined/declared before the sort code is compiled?
Nope. C<sort> is declared as a multimethod, so the variant is chosen at
run time and the sub need not be visible when the call is compiled.
This works, too:

    $code = sub ($a, $b) { -M $a <=> -M $b };
    @sorted = sort $code, @unsorted;

> DC> or with a single one-argument block/closure (to sort according to
> DC> whatever the specified key extractor returns):
>
> DC> # Numerically ascending...
> DC> @sorted = sort {+ $^elem} @unsorted;
> DC> @sorted = sort {+ $_} @unsorted;
>
> is $^elem special? or just a regular place holder? i see $_ will be set
> to each record as we discussed.
Those two statements are exactly the same in every way. Well, except
how they're written. $^elem is indeed a regular placeholder. $_
becomes an implicit parameter when it is referred to, in the absence of
placeholders or another type of signature.
> DC> # Key-ifically...
> DC> sub get_key($elem) {...}
> DC> @sorted = sort &get_key, @unsorted;
>
> and that is parsed as an extractor code call due to the single arg
> sig. again, it appears that it has to be seen before the sort code for
> that to work.
Nope. Runtime dispatch as before.
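
To make "runtime dispatch" concrete, here is a rough sketch in Python
rather than Perl 6, since none of the syntax above runs anywhere yet. It
only models the idea of choosing comparator versus key extractor from
the number of parameters at call time. The names sort_with, by_length
and descending are made up for illustration, and this is not a claim
about how C<sort> would actually be implemented.

```python
import inspect
from functools import cmp_to_key


def sort_with(criterion, items):
    """Pick the sorting strategy at call time from the criterion's arity."""
    arity = len(inspect.signature(criterion).parameters)
    if arity == 1:
        # One parameter: treat the callable as a key extractor.
        return sorted(items, key=criterion)
    if arity == 2:
        # Two parameters: treat it as a comparator returning <0, 0 or >0.
        return sorted(items, key=cmp_to_key(criterion))
    raise TypeError(f"expected a 1- or 2-argument callable, got arity {arity}")


# Both callables are created after sort_with and held only in variables,
# yet each is routed to the right strategy when the call happens.
by_length = lambda s: len(s)
descending = lambda a, b: (b > a) - (b < a)

print(sort_with(by_length, ["ccc", "a", "bb"]))  # ['a', 'bb', 'ccc']
print(sort_with(descending, [2, 3, 1]))          # [3, 2, 1]
```

Nothing about the callable has to be known when sort_with itself is
defined, which is the point of the two "Nope" answers above.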
> DC> or with a single extractor/comparator pair (to sort according to the
> DC> extracted key, using the specified comparator):
>
> DC> # Modtimewise stringifically descending...
> DC> @sorted = sort {-M}=>{$^b cmp $^a} @unsorted;
>
> so that is a single pair of extractor/comparator. but there is no comma
> before @unsorted. is that correct? see below for why i ask that.
Yes. Commas may be omitted on either side of a block when it is used as
an argument. I would argue that they should only be omitted on the right
side, so that this is unambiguous:

    if some_function { ... }
    { ... }

Which might otherwise be parsed as either:

    if (some_function { ... }) { ... }

Or:

    if (some_function()) {...}
    {...}   # Bare block

> DC> or with an array of comparators and/or key extractors and/or
> DC> extractor-comparator pairs (to sort according to a cascading list of
> DC> criteria):
>
> DC> # Numerically ascending
> DC> # or else namewise stringifically descending case-insensitive
> DC> # or else modtimewise numerically ascending
> DC> # or else namewise fuzz-ifically
> DC> # or else fuzz-ifically...
> DC> @sorted = sort [ {+ $^elem},
> DC> {$^b.name cmp $^a.name} is insensitive,
> DC> {-M},
> DC> {.name}=>&fuzzy_cmp,
> DC> &fuzzy_cmp,
>
> i see the need for commas in here as it is a list of criteria.
>
> DC> ],
>
> but what about that comma? no other example seems to have one before the
> @unsorted stuff.
The array is not a closure, so you need a comma after it.
> DC> @unsorted;
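
For what it's worth, the cascading list can be modelled in a few lines.
This is a Python sketch of the semantics only, under my own assumptions:
a criterion is a one-argument extractor, a two-argument comparator, or
an (extractor, comparator) tuple standing in for the pair syntax. The
names default_cmp, as_comparator and cascade_sort are hypothetical, and
the sketch leaves out key caching for brevity.

```python
import inspect
from functools import cmp_to_key


def default_cmp(a, b):
    """Three-way comparison: negative, zero or positive."""
    return (a > b) - (a < b)


def as_comparator(criterion):
    """Normalize any supported criterion to a two-argument comparator."""
    if isinstance(criterion, tuple):
        # (extractor, comparator) pair: compare the extracted keys.
        extract, compare = criterion
        return lambda a, b: compare(extract(a), extract(b))
    if len(inspect.signature(criterion).parameters) == 1:
        # Bare extractor: compare the extracted keys with the default.
        return lambda a, b: default_cmp(criterion(a), criterion(b))
    return criterion  # already a comparator


def cascade_sort(criteria, items):
    comparators = [as_comparator(c) for c in criteria]

    def compare(a, b):
        # The first criterion that tells the elements apart decides.
        for cmp in comparators:
            result = cmp(a, b)
            if result:
                return result
        return 0

    return sorted(items, key=cmp_to_key(compare))


files = [("b.txt", 10), ("a.txt", 10), ("c.txt", 5)]
criteria = [
    lambda f: f[1],                                     # size ascending
    (lambda f: f[0], lambda x, y: default_cmp(y, x)),   # or else name descending
]
print(cascade_sort(criteria, files))
# [('c.txt', 5), ('b.txt', 10), ('a.txt', 10)]
```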
>
> DC> If a key-extractor block returns a number, then C<< <=> >> is used to
> DC> compare those keys. Otherwise C<cmp> is used. In either case, the keys
> DC> extracted by the block are cached within the call to C<sort>, to
> DC> optimize subsequent comparisons against the same element. That is, a
> DC> key-extractor block is only ever called once for each element being
> DC> sorted.
>
> where does the int optimizer come in? just as you had it before in the
> extractor code? that will need to be accessible to the optimizer if the
> GRT is to work correctly.
If the block provably returns an int, C<sort> might be able to optimize
for ints. Several ways to provably return an int:

    my $extractor = sub ($arg) returns int { $arg.num };
    @sorted = sort $extractor, @unsorted;

Or with a smarter compiler:

    @sorted = sort { int .num } @unsorted;

Or C<sort> might even check whether all the return values are ints and
then optimize that way. No guarantees: it's not a language-level issue.
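
The "check the return values" variant might look roughly like this. It
is a Python sketch under my own assumptions, not a proposal for the
real implementation: sort_by_key is a made-up name, the strategy label
is returned only so the chosen path is visible, and the generic path
compares keys as strings to mimic C<cmp>.

```python
def sort_by_key(extract, items):
    """Extract every key once, then choose a comparison strategy from
    what actually came back."""
    keys = [extract(item) for item in items]
    order = range(len(items))
    if all(type(k) is int for k in keys):
        # Every key is an int: a real implementation could switch to a
        # specialized integer sort here.
        strategy = "int"
        decorated = sorted(zip(keys, order))
    else:
        # Otherwise fall back to string comparison of the keys.
        strategy = "generic"
        decorated = sorted(zip((str(k) for k in keys), order))
    # The original index breaks ties, so elements are never compared.
    return strategy, [items[i] for _, i in decorated]


print(sort_by_key(len, ["ccc", "a", "bb"]))     # ('int', ['a', 'bb', 'ccc'])
print(sort_by_key(str.upper, ["b", "A", "c"]))  # ('generic', ['A', 'b', 'c'])
```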
> i like that the key caching is defined here.
Yeah. This is a language-level issue, as the blocks might have
side-effects.
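
Here is a small demonstration of why the call count is observable. It
is Python, not Perl 6, and the names counting_extractor, calls and
words are made up for the example. The extractor has a side effect (it
bumps a counter), so caching the keys gives a different count from
re-extracting inside every comparison.

```python
from functools import cmp_to_key

calls = {"cached": 0, "uncached": 0}


def counting_extractor(tag):
    """Build a key extractor whose side effect is counting its own calls."""
    def extract(word):
        calls[tag] += 1
        return len(word)
    return extract


words = ["pear", "fig", "banana", "kiwi", "apple"]

# Cached: extract each key exactly once, sort the (key, index) pairs,
# then map the indices back to the elements.
cached = counting_extractor("cached")
keys = [(cached(w), i) for i, w in enumerate(words)]
cached_result = [words[i] for _, i in sorted(keys)]

# Uncached: the extractor runs twice for every comparison made.
uncached = counting_extractor("uncached")
uncached_result = sorted(
    words, key=cmp_to_key(lambda a, b: uncached(a) - uncached(b))
)

print(cached_result)    # ['fig', 'pear', 'kiwi', 'apple', 'banana']
print(calls["cached"])  # 5, one call per element
print(calls["uncached"] > calls["cached"])  # True
```

Both versions produce the same order. Only the number of extractor
calls differs, which is why the guarantee has to be part of the
language rather than left to the optimizer.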
> DC> Note that ambiguous cases like:
>
> DC> @sorted = sort {-M}, {-M}, {-M};
>
> DC> will be dispatched according to the normal multiple dispatch semantics
> DC> (which will mean that they will mean):
>
> DC> @sorted = sort {-M} <== {-M}, {-M};
>
> DC> and so one would need to write:
>
> DC> @sorted = sort <== {-M}, {-M}, {-M};
>
> that clears up that one for me.
>
> this is very good overall (notwithstanding my few nits and
> questions). it will satisfy all sorts of sort users, even those who are
> out of sorts.
Agreed. I'm very fond of it.
Luke