On 09/30/2016 11:58 PM, Walter Bright wrote:
On 9/30/2016 7:31 PM, Andrei Alexandrescu wrote:
https://github.com/dlang/phobos/pull/4827 still allows that but
specifies that
phobos algorithms are not required to. -- Andrei

A couple instances where a length may be longer than the address space:

1. large files
2. sequences of mathematically generated values

I don't see a particular advantage to restricting hasLength() to be
size_t or ulong. Whether an algorithm requires the length to be within
the addressable capabilities of the CPU or not should be up to the
algorithm.

There is also cause for r.length to return an int even on a 64 bit
system - it can be more efficient.

Indeed the "I have a dream" cases are quite the lure. The reality in the field, however, is that Phobos thoroughly assumes length has type size_t. The problem is defining hasLength in ways that are at the same time general and that also work is extremely laborious. Just grep through Consider:

1. std.algorithm.comparison:1007 (abridged)

        const rc = mulu(r, c, overflow);
        if (_matrix.length < rc)

So here we assume length is comparable for inequality with a const size_t. That should probably rule out signed integrals.

2. ibid, 1694 (abridged)

    static if (hasLength!(Range1) && hasLength!(Range2))
    {
        return r1.length == r2.length;
    }

Here we assume any two ranges satisfying hasLength have lengths comparable for equality. That projects a rather tenuous definition of hasLength: "the range must define a length that is comparable for equality with any other range that also defines a length". Not impossible, but definitely not what we have right now, de facto or de jure.

3. ibid, 634 (abridged)

            immutable len = min(r1.length, r2.length);

So here two lengths must be comparable for ordering, have a common type, and have an immutable constructor.

4. ibid, 656 (abridged)

        return threeWay(r1.length, r2.length);

Here threeWay takes two size_t. So both lengths must be at least convertible implicitly to size_t.

I got to the end of the first file I scanned with grep, and I don't know what a robust definition of hasLength should be - outside of size_t itself. The more locations one examines, the more the capabilities required get closer to those of size_t - or the more latent bugs exist in phobos if we want to support something else.

We have two options here.

1. Support a range of types (e.g.: any integral and any user-defined type that supports the following list of primitives: ...) for length with hasLength, and then have each range-based algorithm define its own additional restrictions if they have such. That means right now a bunch of phobos code is incorrect and needs to be fixed.

2. Require lengths to be size_t.

I'm all for generality, but of the useful kind. It seems to me supporting a complicated definition of length is a protracted proposition with little upside. That's why I'm going with the engineering solution in the PR.


Andrei

Reply via email to