On Monday, 2 March 2020 at 15:47:26 UTC, Steven Schveighoffer wrote:
On 3/2/20 6:52 AM, Andrea Fontana wrote:
On Saturday, 29 February 2020 at 20:11:24 UTC, Steven Schveighoffer wrote:
1. in is supposed to be O(lg(n)) or better. Generic code may depend on this property. Searching an array is O(n).

Probably it should work if we're using a "SortedRange".


int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
auto p = assumeSorted(a);

assert(3 in p);



That could work. Currently, you need to use p.contains(3). opIn could be added as a shortcut.

It only makes sense if you have it as a literal though, as p.contains(3) isn't that bad to use:

assert(3 in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].assumeSorted);

-Steve

There's no guarantee that checking if a value is in a sorted list is any faster than checking if it's in a non sorted list. It's why sort usually switches from a binary-esque algorithm to a linear one at a certain size. The list could potentially need to be _very_ large for p.contains to make a significant impact over canFind(p) AFAIK.

Here's a small test program, try playing with the numbers and see what happens:

import std.random;
import std.range;
import std.algorithm;
import std.datetime.stopwatch;
import std.stdio;

void main()
{
    auto count = 1_000;
    auto max = int.max;

    alias randoms = generate!(() => uniform(0, max));

    auto r1 = randoms.take(count).array;
    auto r2 = r1.dup.sort;
    auto elem = r1[uniform(0, count)];

    benchmark!(
        () => r1.canFind(elem),
        () => r2.contains(elem),
    )(1_000).writeln;
}

Use LDC and -O3 of course. I was hard pressed to get the sorted contains to be any faster than canFind.

This begs the question then: do these requirements on in make any sense? An algorithm can be log n (ala the sorted search) but still be a magnitude slower than a linear search... what has the world come to 🤦‍♂️

PS: Why is it named contains if it's on a SortedRange and canFind otherwise?

Reply via email to