Sean Kelly wrote:
Georg Wrede wrote:
The distinction is not whether you or others write stuff. It's about whether it is for debugging *only*, as opposed to general input validation.

So I guess the real question is whether a function is expected to validate its parameters. I'd argue that it isn't, but then I'm from a C/C++ background. For me, validation is a debugging tool, or at least an optional feature for applications that want the added insurance.

Interesting. My policy is to favor validation whenever it doesn't impact performance. Imagine for example that strlen() validated its input for non-null. Would that show on the profiling chart of any C application? No, unless the application's core loop only called strlen() on a 1-character string or so.

One simple case that clarifies the necessary tradeoff is binary search. That assumes the range to be searched is sorted. If you actually checked for that, it would render binary search useless as a linear search would be in fact faster. So you need to assume. One way to do so is in the documentation. You write in the docs that findSorted expects a sorted range. Another way is to encode this information in the type of the sorted range. But that's onerous as most of the time you have an array you just sorted, not a SortedArray value.

The approach I took with the new phobos is:

int[] haystack;
int[] needle;
...
auto pos1 = find(haystack, needle); // linear
sort(haystack);
auto pos2 = find(assumeSorted(haystack), needle);

The assumeSorted function wraps the haystack in an AssumeSorted!(int[]) type without adding members or running extra code. It's there to clarify to everyone what's going on. And it's usable with other arguments or functions too, e.g.

auto pos3 = find(haystack, assumeSorted(needle));
setIntersection(assumeSorted(haystack), assumeSorted(needle));

Interestingly, assumeSorted can actually do checking without impacting the complexity of the search. In debug mode, it can arrange to run random isSorted tests every 1/N calls, where N is the average length of the incoming arrays, then its complexity impact is amortized constant.


Andrei

Reply via email to