Peter Alexander wrote:
Personally, I would just expose all accelerated algorithms for sorted
ranges, and make the user call them explicitly e.g. if they know their
range is sorted, make them call binarySearch (or whatever) instead of find.
I have two primary reasons for this:
1. Having the function choose the right algorithm puts an undue burden
on the function writer, which translates into increased uncertainty for
the user of the function, "If I call find(assumeSorted(r)), is it going
to use binary search? What if I call SomeThirdParty.find? Will it know
about assumeSorted?"
2. While we're only discussing assumeSorted now, there are a myriad of
other assumeX wrappers that would be useful:
assumeUnimodal(range) - Allows ternary search for maxElement.
assumeDistinct(range) - uniq doesn't need to do anything.
assumePrime(int) - e.g. getFactors(assumePrime(x)) returns [1, x]
assumeAssociative(binOp) - power can work in O(lg(n)).
* power!(binOp)(a, b) := reduce!(binOp)(replicate(a, b))
Do we provide wrappers for these as well, and make the algorithms check
for all relevant ones?
---
I say no. Provide ternarySearch, powerAssociative etc., and let the user
choose them when he/she knows they are the right algorithm. That way,
function writers can focus on writing specialized functions, and
function users can rest assured that the function they are calling is
performing the algorithm that they expect (with the complexity bounds
that they expect).
I am quoting this in full because I agree with it in full! I think this
and opIn_r settle the matter in favor of making the additional goodies
members of Sorted!Range.
A few additional thoughts:
- Experience with the STL has shown that "specialized functions for
improved complexity with specialized syntax" (a la find as a member vs
nonmember) fare better than "best-effort functions" (a la STL's distance).
- I'm considering allowing assignment to .front, .back, [k] etc. with
runtime enforcement that the assignment doesn't break the ordering.
There is a lure to have them actually shuffle the assigned element in
the right place, but that would break many assumptions. Many algorithms
assume that after r.front = x they can assert(r.front == x).
- All binary search operators (lowerBound, upperBound, equalRange)
should be made members of Sorted!Range. So instead of writing:
// We know r is sorted, baby
auto r1 = upperBound(r, x);
you'd write:
auto r1 = assumeSorted(r).upperBound(x);
thus essentially moving the comment into the code. (I also have a cool
idea for statistically checking sortedness without deteriorating
complexity. Essentially assumeSorted(r) could check that r is sorted by
tossing a rigged coin with P(head)=1.0/r.length and deciding to check if
head comes up. That way, the amortized complexity of the check is O(1)).
This all raises the question: where should this Sorted(R) in the making
go? It's a range so it should go in std.range, but it's mostly a
motivator for algorithms, so it should go in std.algorithm. Tiebreakers?
I'm very excited about this development. Maybe I should include some of
this stuff in my Google talk on Friday.
Andrei