Re: "IndexType" for ranges
On Tue, 02 Oct 2012 19:23:48 +0200 "Jonathan M Davis" wrote:
> On Tuesday, October 02, 2012 19:10:53 monarch_dodra wrote:
>
> Ideally, only size_t would be allowed. Reality makes it so that we
> need ulong in some cases (e.g. iota). Given that fact, you'd ideally
> restrict it to size_t or ulong specifically (or at least
> IndexType.sizeof >= size_t.sizeof). The problem is that I'm quite
> sure that there are plenty of programmers out there who have been
> using int for length and indices even though it's a horribly bad
> idea. It's a classic mistake.

Yea, typing "int" tends to be automatic enough, and then the awkwardness of "size_t" on top of that tends to ensure it doesn't get used as much as it should.
Re: "IndexType" for ranges
On Tuesday, October 02, 2012 20:45:36 David Nadlinger wrote: > On Tuesday, 2 October 2012 at 17:24:32 UTC, Andrei Alexandrescu > > wrote: > > Yes. Unfortunately there are few, few cases in which size_t is > > insufficient (e.g. an input range from a file or a large iota, > > both on 32-bit builds). I personally think these are too few to > > need formal support. > > I'd throw bit arrays into the mix, where 32 bit can also be quite > small. There might also be some other clever hacks using custom > index types for representing non-linear data structures as ranges. > > The question is whether such ranges are likely to be used as > random access ranges. I can't come up with a compelling use case > right now, but I'd rather think twice before throwing support for > them out of the window and later regretting it. Also, one of the > simplest ranges (iota) not fitting the range concept has somewhat > of an odd aftertaste. If it were restricted to size_t, it would just mean that those types would be restricted to 32-bits for their length and index on 32-bit machines if you would want them to function as ranges (as iota would - if size_t is required, then it's going to use size_t, not become incompatible as a range). But it's types like this which muddy things a bit. Ideally, we'd insist that all ranges use size_t. It simplifies things and certainly using smaller than that doesn't really make sense. But if we really need to support ulong, then unfortunately, we really need to support ulong - in which case presumably length and indices would have to be size_t or ulong (either that or IndexType.sizeof >= size_t.sizeof, but allowing signed types also complicates things in nasty ways). It _would_ be great to be able to just insist on size_t though. The question is whether we can reasonably get away with that. - Jonathan M Davis
Re: "IndexType" for ranges
On Tuesday, 2 October 2012 at 18:45:24 UTC, David Nadlinger wrote: On Tuesday, 2 October 2012 at 17:24:32 UTC, Andrei Alexandrescu wrote: Yes. Unfortunately there are few, few cases in which size_t is insufficient (e.g. an input range from a file or a large iota, both on 32-bit builds). I personally think these are too few to need formal support. I'd throw bit arrays into the mix, where 32 bit can also be quite small. There might also be some other clever hacks using custom index types for representing non-linear data structures as ranges. The question is whether such ranges are likely to be used as random access ranges. I can't come up with a compelling use case right now, but I'd rather think twice before throwing support for them out of the window and later regretting it. Also, one of the simplest ranges (iota) not fitting the range concept has somewhat of an odd aftertaste. It's easy to think of random access ranges that could easily need more than size_t: - The cartesian product of several smaller ranges - a permutationsOf(r) range - a subsetsOf(r) range Any combinatoric range would easily use up 32-bit and 64-bit indexing. If 32-bit sometimes isn't enough, then neither is 64-bit. So, the question is, do we want to allow the use of BigInt indexing?
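To put rough numbers on the combinatoric case: a permutations range over n elements has n! entries, so its length outgrows fixed-width index types very quickly. The sketch below is only an illustration of that arithmetic; the helper name `firstOverflowingFactorial` is made up for this example and is not part of Phobos.

```d
// Smallest n for which n! no longer fits in the unsigned type T.
// Hypothetical helper, used here only to illustrate the arithmetic.
uint firstOverflowingFactorial(T)()
{
    T f = 1;      // holds (n - 1)! at the top of each iteration
    uint n = 2;
    while (f <= T.max / n) // true exactly when f * n still fits in T
    {
        f *= n;
        ++n;
    }
    return n;
}

void main()
{
    // 12! fits in 32 bits, 13! does not.
    assert(firstOverflowingFactorial!uint() == 13);
    // 20! fits in 64 bits, 21! does not.
    assert(firstOverflowingFactorial!ulong() == 21);
}
```

So a hypothetical permutationsOf over just 21 elements would already need an index wider than ulong, which is the point being made about 64-bit not being enough either.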
Re: "IndexType" for ranges
On Tuesday, 2 October 2012 at 17:24:32 UTC, Andrei Alexandrescu wrote: Yes. Unfortunately there are few, few cases in which size_t is insufficient (e.g. an input range from a file or a large iota, both on 32-bit builds). I personally think these are too few to need formal support. I'd throw bit arrays into the mix, where 32 bit can also be quite small. There might also be some other clever hacks using custom index types for representing non-linear data structures as ranges. The question is whether such ranges are likely to be used as random access ranges. I can't come up with a compelling use case right now, but I'd rather think twice before throwing support for them out of the window and later regretting it. Also, one of the simplest ranges (iota) not fitting the range concept has somewhat of an odd aftertaste. David
Re: "IndexType" for ranges
On Tuesday, October 02, 2012 19:37:18 monarch_dodra wrote:
> On Tuesday, 2 October 2012 at 17:07:19 UTC, monarch_dodra wrote:
> > [SNIP]
>
> You know what, I think I have a better idea. All of this came up
> because I've had iota break my compiles WAY more often than I'd
> have liked. But I think I know of another solution.
>
> I think it would be nice if we enforced that all ranges used
> size_t. Everywhere. And it was enforced.
>
> I'm sorry, I like extremes.

Personally, I'd love that. The problem is that iota was specifically changed to use ulong to support handling long and ulong properly on 32-bit systems. Without it, you can't actually use long or ulong with a step of 1 beyond uint.max (at least, I _think_ that that was the issue). Requiring that the length and indices be size_t undermines that.

Now, I have no idea how much of a problem that realistically is. After all, you can't have an array of length > uint.max on 32-bit systems, so restricting iota to a length of uint.max isn't necessarily all that unreasonable IMHO. And per that argument, we _could_ change iota to use size_t again and just outright require that length and indices be size_t.

- Jonathan M Davis
Re: "IndexType" for ranges
On Tuesday, 2 October 2012 at 17:07:19 UTC, monarch_dodra wrote: [SNIP] You know what, I think I have a better idea. All of this came up because I've had iota break my compiles WAY more often than I'd have liked. But I think I know of another solution. I think it would be nice if we enforced that all ranges used size_t. Everywhere. And it was enforced. I'm sorry, I like extremes.
Re: "IndexType" for ranges
On Tuesday, October 02, 2012 19:08:59 Piotr Szturmaj wrote:
> Jonathan M Davis wrote:
> > if length can be specifically ulong and the type is random access, then
> > its indices will need to be ulong), so unfortunately, the situation is
> > not so simple that you can always assume size_t (even if you should
> > arguably be able to).
>
> It seems that isRandomAccessRange doesn't check that opIndex parameter
> type and length() return type are the same. Do you think it should?

Definitely. It makes no sense to be able to have a length greater than you can index (beyond the fact that the last index is length - 1), and it makes no sense to be able to index anything greater than length as far as the size of types go.

- Jonathan M Davis
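A check along those lines could be sketched as follows. This is not how isRandomAccessRange is implemented; `hasMatchingIndexAndLength`, `Consistent` and `Mismatched` are hypothetical names, and the sketch assumes the range type has a single non-template opIndex and an @property length.

```d
import std.traits : Parameters;

// Hypothetical trait: true when the type returned by length is the same
// as the type of opIndex's first parameter.
template hasMatchingIndexAndLength(R)
{
    alias Index = Parameters!(R.opIndex)[0];
    alias Length = typeof(R.init.length);
    enum bool hasMatchingIndexAndLength = is(Index == Length);
}

// Toy type whose length and index types agree.
struct Consistent
{
    @property size_t length() { return 3; }
    int opIndex(size_t i) { return cast(int) i; }
}

// Toy type that reports a ushort length but is indexed with ulong.
struct Mismatched
{
    @property ushort length() { return 3; }
    int opIndex(ulong i) { return cast(int) i; }
}

void main()
{
    static assert( hasMatchingIndexAndLength!Consistent);
    static assert(!hasMatchingIndexAndLength!Mismatched);
}
```

A range concept that wanted to enforce the rule could add a condition like this to its constraint, at the cost of rejecting existing ranges shaped like `Mismatched`.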
Re: "IndexType" for ranges
On 10/2/12 1:07 PM, monarch_dodra wrote: I don't know, forcing an implementer on size_t is pretty gratuitous. Why can't he be free to choose his own index type? Too much freedom can be detrimental (as is in this case). Andrei
Re: "IndexType" for ranges
On Tuesday, 2 October 2012 at 17:13:48 UTC, Jonathan M Davis wrote:
> On Tuesday, October 02, 2012 15:17:58 monarch_dodra wrote:
> > You might think "just use typeof(length)" BUT:
> > *you aren't even guaranteed that "typeof(length)" will be
> > correct! Certain ranges, such as iota, will return a length
> > usually of type uint, but be indexed with ulong... :/
> > *Infinite ranges don't have length...
>
> I'd argue that that's a bug in iota. iota's length even specifically returns _IndexType_. It makes no sense for length, opIndex, or opSlice to vary in type at all. They should all use the same type (ideally size_t). The fact that it's not outright required to be size_t is bad enough (though IIRC iota had some good reasons for using ulong).

To be honest, I think I may have put too much stress on the "details". I agree we may want to enforce they have matching types (or at least, a smart hierarchy). That wasn't the root of the reason for IndexType.

The "big picture issue" here is writing wrapper ranges, such as "AssumeSorted". Or "take", or every other sweet-ass range adaptors we have in std.range.

If "take" doesn't know how to index the sub-range, how can it properly work with ranges that always use ulong, AND at the same time, support ranges that always use size_t (uint on x86)? Answer: It CAN'T. CAN'T CAN'T CAN'T.

Keep in mind, infinite ranges don't have length, so that's out of the equation...

> > These are not big changes I'm proposing, but they *may* break
> > some existing ranges. Those ranges are arguably retarded, and
> > these changes would enforce correctness, but they'd break none
> > the less. I'd like some feedback if you think this trait is worth
> > pushing?
>
> Requiring that length, opIndex, and opSlice all use the same index type would be very much the right way to go IMHO. If that's done however, I don't know if we'll really need IndexType (though it may still be a good idea to add it).
>
> In addition, I'd argue that they should require that they all be at least as large as size_t (ideally, they'd even have to be either size_t or ulong and that's it - no signed types allowed), but that may be too strict at this point given that it could break existing code that did stupid stuff like use int (which _way_ too many people seem inclined to do).
>
> - Jonathan M Davis

You'd still need IndexType for the reasons mentioned above, unless you wanted to write "auto opIndex(ParameterTypeTuple!(R.opIndex)[0] n)" in all your ranges. AND, you'd require the array specialization (which would default to size_t).

The actual support of things smaller than size_t, at that point, would become a non-issue. Just:

//
static if (isRandomAccessRange!R)
    auto opIndex(IndexType!R n)
    {
        return r[n];
    }
//

Clean, concise. Supports both size_t and ulong (and others).
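For readers who want to see the shape of such a trait, here is a minimal sketch. It is not the proposed Phobos implementation: `IndexType` as written here, `Doubled` and `Wrapper` are illustrative names, and the sketch assumes a non-array range exposes a single non-template opIndex.

```d
import std.traits : isDynamicArray, Parameters;

// Hypothetical trait: the type a random-access range R is indexed with.
template IndexType(R)
{
    static if (isDynamicArray!R)
        alias IndexType = size_t;                    // array specialization
    else
        alias IndexType = Parameters!(R.opIndex)[0]; // first parameter of opIndex
}

// Toy range-like type that is always indexed with ulong
// (only the member needed for the demonstration).
struct Doubled
{
    ulong opIndex(ulong n) { return n * 2; }
}

// Wrapper that forwards indexing with whatever type the wrapped range uses.
struct Wrapper(R)
{
    R r;
    auto opIndex(IndexType!R n) { return r[n]; }
}

void main()
{
    static assert(is(IndexType!(int[]) == size_t));
    static assert(is(IndexType!Doubled == ulong));

    auto a = Wrapper!(int[])([10, 20, 30]);
    auto d = Wrapper!Doubled(Doubled());
    assert(a[1] == 20);
    // An index above uint.max is passed through without truncation.
    assert(d[5_000_000_000UL] == 10_000_000_000UL);
}
```

The wrapper never names size_t or ulong itself, which is the property being argued for: the choice of index type stays with the wrapped range.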
Re: "IndexType" for ranges
On 10/2/12 12:45 PM, Peter Alexander wrote: On Tuesday, 2 October 2012 at 16:29:28 UTC, Simen Kjaeraas wrote: On 2012-10-02, 18:09, Peter Alexander wrote: On Tuesday, 2 October 2012 at 13:17:45 UTC, monarch_dodra wrote: If you've ever worked on a template that needs to index a range, you may have run into this problem: What is the type you should use to index an RA range? Forgive my ignorance. What's wrong with size_t? That not all ranges use it? If the range uses int, short, byte (I wonder why they'd do it, though), using size_t will not even compile. That's kind of my point. Unless there's a compelling reason not to, I'd suggest we standardise on size_t indexing (and length) and avoid this issue altogether. Yes. Unfortunately there are few, few cases in which size_t is insufficient (e.g. an input range from a file or a large iota, both on 32-bit builds). I personally think these are too few to need formal support. C++ containers have a size_type typedef. No one uses it. Agreed. Let's keep things simple instead of complicating things for the sake of unwanted "flexibility". Yes. We should curb some corner cases of current range design in the direction of simplifying things. Andrei
Re: "IndexType" for ranges
On Tuesday, October 02, 2012 19:10:53 monarch_dodra wrote:
> Given your stance of "I see _zero_ reason to support lengths or
> indices smaller than size_t" and "Types that do that are badly
> designed IMHO":
>
> Are you agreeing with my proposed type tightening? If anything,
> it weeds out the "bad designs" for which you had no wish to
> support, while allowing better support for those that do.

Ideally, only size_t would be allowed. Reality makes it so that we need ulong in some cases (e.g. iota). Given that fact, you'd ideally restrict it to size_t or ulong specifically (or at least IndexType.sizeof >= size_t.sizeof). The problem is that I'm quite sure that there are plenty of programmers out there who have been using int for length and indices even though it's a horribly bad idea. It's a classic mistake. So, while requiring size_t or ulong would be great, I'd be very surprised if it didn't break a fair bit of code out there. Given that fact and Andrei's increased resistance to potential code breakage, I don't know that we can make that change.

Still, I'd try to push for it though. It's bad enough that length and indices are allowed to be something other than size_t at all, but anything smaller than size_t (including using int specifically) _will_ cause problems for those who do that, if nothing else because size_t is ulong on 64-bit systems and using int will therefore mean that code using int for length will likely break when compiled on 64-bit systems (particularly when interacting with arrays). That's probably even a good argument for why we could restrict length and indices to size_t or greater even if it might break code (since it'll generally break when compiled on 64-bit systems anyway).

This sort of change is going to have to get past Andrei though, so we'll need his buy-in no matter what we do.

- Jonathan M Davis
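The 64-bit breakage mentioned here is easy to reproduce with a plain array. The following is a small sketch of that one point, not code from the thread; it uses __traits(compiles, ...) so the file builds on both 32-bit and 64-bit targets.

```d
void main()
{
    int[] data = [1, 2, 3];

    // size_t always matches the type of an array's length.
    size_t n = data.length;
    assert(n == 3);
    static assert(is(typeof(data.length) == size_t));

    // Storing the length in an int relies on an implicit conversion that
    // only exists where size_t is 32 bits wide.
    static if (size_t.sizeof == 8)
        static assert(!__traits(compiles, { int m = data.length; }));
    else
        static assert( __traits(compiles, { int m = data.length; }));
}
```

Code written with int lengths on a 32-bit build therefore stops compiling, unchanged, as soon as it is built for 64-bit.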
Re: "IndexType" for ranges
On Tuesday, 2 October 2012 at 16:59:38 UTC, Jonathan M Davis wrote:
> On Tuesday, October 02, 2012 18:45:50 Peter Alexander wrote:
> > On Tuesday, 2 October 2012 at 16:29:28 UTC, Simen Kjaeraas wrote:
> > > On 2012-10-02, 18:09, Peter Alexander wrote:
> > >> On Tuesday, 2 October 2012 at 13:17:45 UTC, monarch_dodra
> > >> wrote:
> > >>> If you've ever worked on a template that needs to index a
> > >>> range, you may have run into this problem: What is the type
> > >>> you should use to index an RA range?
> > >>
> > >> Forgive my ignorance. What's wrong with size_t?
> > >
> > > That not all ranges use it? If the range uses int, short, byte
> > > (I wonder why they'd do it, though), using size_t will not even
> > > compile.
> >
> > That's kind of my point. Unless there's a compelling reason not
> > to, I'd suggest we standardise on size_t indexing (and length)
> > and avoid this issue altogether.
> >
> > C++ containers have a size_type typedef. No one uses it.
> >
> > Let's keep things simple instead of complicating things for the
> > sake of unwanted "flexibility".
>
> In general, all ranges _should_ use size_t for both length and indexing, but a few range types in Phobos specifically use ulong (e.g. IIRC iota does in order to work properly with ranges of long and ulong on 32-bit systems). I see _zero_ reason to support lengths or indices smaller than size_t. Types that do that are badly designed IMHO. But we already have a precedent that you can't always assume size_t (at least for length - I'm not sure about indices - but if length can be specifically ulong and the type is random access, then its indices will need to be ulong), so unfortunately, the situation is not so simple that you can always assume size_t (even if you should arguably be able to).
>
> - Jonathan M Davis

Given your stance of "I see _zero_ reason to support lengths or indices smaller than size_t" and "Types that do that are badly designed IMHO":

Are you agreeing with my proposed type tightening? If anything, it weeds out the "bad designs" for which you had no wish to support, while allowing better support for those that do.
Re: "IndexType" for ranges
On Tuesday, October 02, 2012 15:17:58 monarch_dodra wrote: > You might think "just use typeof(length)" BUT: > *you aren't even guaranteed that "typeof(length)" will be > correct! Certain ranges, such as iota, will return a length > usually of type uint, but be indexed with ulong... :/ > *Infinite ranges don't have length... I'd argue that that's a bug in iota. iota's length even specifically returns _IndexType_. It makes no sense for length, opIndex, or opSlice to vary in type at all. They should all use the same type (ideally size_t). The fact that it's not outright required to be size_t is bad enough (though IIRC iota had some good reasons for using ulong). > These are not big changes I'm proposing, but they *may* break > some existing ranges. Those ranges are arguably retarded, and > these changes would enforce correctness, but they'd break none > the less. I'd like some feedback if you think this trait is worth > pushing? Requiring that length, opIndex, and opSlice all use the same index type would be very much the right way to go IMHO. If that's done however, I don't know if we'll really need IndexType (though it may still be a good idea to add it). In addition, I'd argue that they should require that they all be at least as large as size_t (ideally, they'd even have to be either size_t or ulong and that's it - no signed types allowed), but that may be too strict at this point given that it could break existing code that did stupid stuff like use int (which _way_ too many people seem inclined to do). - Jonathan M Davis
Re: "IndexType" for ranges
Jonathan M Davis wrote:
> if length can be specifically ulong and the type is random access, then its indices will need to be ulong), so unfortunately, the situation is not so simple that you can always assume size_t (even if you should arguably be able to).

It seems that isRandomAccessRange doesn't check that opIndex parameter type and length() return type are the same. Do you think it should?
Re: "IndexType" for ranges
On Tuesday, 2 October 2012 at 16:48:34 UTC, Peter Alexander wrote:
> Then don't create ranges that use ushort for indexing and length. There's no need to.
>
> To be clear, I'm suggesting that all random access ranges should use size_t, and they will not be random access ranges if they use anything else. Unless someone can give a compelling reason not to do this, I cannot see anything but benefits.

I don't know, forcing an implementer on size_t is pretty gratuitous. Why can't he be free to choose his own index type?

Besides, you'll still run into the problem for ranges that use ulong, such as iota.

Then what about ranges that use ulong? Are those wrong too? What about iota? Wrong?

//
import std.range;
import std.algorithm;

void main()
{
    auto r = assumeSorted(iota(0L, 1L)); //Still DERP!
}
//
src\phobos\std\range.d(6925): Error: cannot implicitly convert expression (this._input.length()) of type ulong to uint
src\phobos\std\range.d(7346): Error: template instance std.range.SortedRange!(Result,"a < b") error instantiating
//

And this time, you can't blame me for doing fishy code, it's all in phobos.

The end problem is this.

//
struct S(R)
{
    //...
    auto opIndex(some_type n)
    {
        return r[n];
    }
}
//

Regardless of what you do, you will encounter problems at the "boundaries" of S.opIndex. Either for calling it, because some_type is too small, or for implementing it, because some_type is too big.

The fact that uint, ulong and size_t are all valid indexers for a range means ANYTHING in Phobos can break. The trait I'm proposing should enable support for uint, ulong and size_t, and every other type as an added bonus.
Re: "IndexType" for ranges
On Tuesday, October 02, 2012 18:45:50 Peter Alexander wrote:
> On Tuesday, 2 October 2012 at 16:29:28 UTC, Simen Kjaeraas wrote:
> > On 2012-10-02, 18:09, Peter Alexander wrote:
> >> On Tuesday, 2 October 2012 at 13:17:45 UTC, monarch_dodra
> >> wrote:
> >>> If you've ever worked on a template that needs to index a
> >>> range, you may have run into this problem: What is the type
> >>> you should use to index an RA range?
> >>
> >> Forgive my ignorance. What's wrong with size_t?
> >
> > That not all ranges use it? If the range uses int, short, byte
> > (I wonder why they'd do it, though), using size_t will not even
> > compile.
>
> That's kind of my point. Unless there's a compelling reason not
> to, I'd suggest we standardise on size_t indexing (and length)
> and avoid this issue altogether.
>
> C++ containers have a size_type typedef. No one uses it.
>
> Let's keep things simple instead of complicating things for the
> sake of unwanted "flexibility".

In general, all ranges _should_ use size_t for both length and indexing, but a few range types in Phobos specifically use ulong (e.g. IIRC iota does in order to work properly with ranges of long and ulong on 32-bit systems). I see _zero_ reason to support lengths or indices smaller than size_t. Types that do that are badly designed IMHO. But we already have a precedent that you can't always assume size_t (at least for length - I'm not sure about indices - but if length can be specifically ulong and the type is random access, then its indices will need to be ulong), so unfortunately, the situation is not so simple that you can always assume size_t (even if you should arguably be able to).

- Jonathan M Davis
Re: "IndexType" for ranges
monarch_dodra wrote:
> On Tuesday, 2 October 2012 at 16:09:16 UTC, Peter Alexander wrote:
> > On Tuesday, 2 October 2012 at 13:17:45 UTC, monarch_dodra wrote:
> > > If you've ever worked on a template that needs to index a range, you may have run into this problem: What is the type you should use to index an RA range?
> >
> > Forgive my ignorance. What's wrong with size_t?
>
> This is what happens when you use size_t:
>
> //
> import std.range;
> import std.algorithm;
>
> struct ZeroToTen
> {
>     ushort first = 0;
>     ushort last = 10;
>     @property bool empty(){return first == last;}
>     @property ushort front(){return first;}
>     void popFront(){++first;}
>     @property ushort back(){return last;}
>     void popBack(){--last;}
>     @property ZeroToTen save(){return this;}
>     @property ushort length(){return cast(ushort)(last - first);}
>     ushort opIndex(ushort n){return cast(ushort)(first + n);}
> }

Why not use size_t or ulong as parameter? This way all smaller types will be implicitly converted.

    ushort opIndex(size_t n){return cast(ushort)(first + n);}
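The suggestion can be tried in isolation. The sketch below trims ZeroToTen down to the members needed for indexing (so it is not a full range) and only changes the opIndex parameter to size_t, as proposed.

```d
// Trimmed-down ZeroToTen: only what is needed to demonstrate indexing.
struct ZeroToTen
{
    ushort first = 0;
    ushort last = 10;

    @property ushort length() { return cast(ushort)(last - first); }

    // Parameter widened to size_t; smaller unsigned types convert implicitly.
    ushort opIndex(size_t n) { return cast(ushort)(first + n); }
}

void main()
{
    ZeroToTen ztt;
    ushort small = 3;
    uint medium = 4;
    size_t native = 5;

    assert(ztt[small] == 3);  // ushort widens to size_t
    assert(ztt[medium] == 4); // uint widens to size_t
    assert(ztt[native] == 5); // exact match
}
```

This fixes callers that index with size_t or anything narrower. It does not help a caller holding a ulong index on a 32-bit build, where size_t is uint and the conversion would narrow.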
Re: "IndexType" for ranges
On Tuesday, 2 October 2012 at 16:44:48 UTC, monarch_dodra wrote: On Tuesday, 2 October 2012 at 16:09:16 UTC, Peter Alexander wrote: On Tuesday, 2 October 2012 at 13:17:45 UTC, monarch_dodra wrote: If you've ever worked on a template that needs to index a range, you may have run into this problem: What is the type you should use to index an RA range? Forgive my ignorance. What's wrong with size_t? This is what happens when you use size_t: http://dpaste.dzfl.pl/d95ccb14 On a side note, SortedRange is pretty bold at assuming the range is slice-able. Gonna fix that.
Re: "IndexType" for ranges
On Tuesday, 2 October 2012 at 16:44:48 UTC, monarch_dodra wrote: On Tuesday, 2 October 2012 at 16:09:16 UTC, Peter Alexander wrote: On Tuesday, 2 October 2012 at 13:17:45 UTC, monarch_dodra wrote: If you've ever worked on a template that needs to index a range, you may have run into this problem: What is the type you should use to index an RA range? Forgive my ignorance. What's wrong with size_t? This is what happens when you use size_t: [snip] Then don't create ranges that use ushort for indexing and length. There's no need to. To be clear, I'm suggesting that all random access ranges should use size_t, and they will not be random access ranges if they use anything else. Unless someone can give a compelling reason not to do this, I cannot see anything but benefits.
Re: "IndexType" for ranges
On Tuesday, 2 October 2012 at 16:29:28 UTC, Simen Kjaeraas wrote: On 2012-10-02, 18:09, Peter Alexander wrote: On Tuesday, 2 October 2012 at 13:17:45 UTC, monarch_dodra wrote: If you've ever worked on a template that needs to index a range, you may have run into this problem: What is the type you should use to index an RA range? Forgive my ignorance. What's wrong with size_t? That not all ranges use it? If the range uses int, short, byte (I wonder why they'd do it, though), using size_t will not even compile. That's kind of my point. Unless there's a compelling reason not to, I'd suggest we standardise on size_t indexing (and length) and avoid this issue altogether. C++ containers have a size_type typedef. No one uses it. Let's keep things simple instead of complicating things for the sake of unwanted "flexibility".
Re: "IndexType" for ranges
On Tuesday, 2 October 2012 at 16:09:16 UTC, Peter Alexander wrote:
> On Tuesday, 2 October 2012 at 13:17:45 UTC, monarch_dodra wrote:
> > If you've ever worked on a template that needs to index a range, you may have run into this problem: What is the type you should use to index an RA range?
>
> Forgive my ignorance. What's wrong with size_t?

This is what happens when you use size_t:

//
import std.range;
import std.algorithm;

struct ZeroToTen
{
    ushort first = 0;
    ushort last = 10;
    @property bool empty(){return first == last;}
    @property ushort front(){return first;}
    void popFront(){++first;}
    @property ushort back(){return last;}
    void popBack(){--last;}
    @property ZeroToTen save(){return this;}
    @property ushort length(){return cast(ushort)(last - first);}
    ushort opIndex(ushort n){return cast(ushort)(first + n);}
}

void main()
{
    ZeroToTen ztt;
    static assert(hasLength!ZeroToTen); //OK: normal
    static assert(isRandomAccessRange!ZeroToTen); //Ok... But I don't like where this is going...
    auto r = assumeSorted(ztt); //DERP!
}
//
\src\phobos\std\range.d(6909): Error: function main.ZeroToTen.opIndex (ushort n) is not callable using argument types (uint)
\src\phobos\std\range.d(6909): Error: cannot implicitly convert expression (i) of type uint to ushort
\src\phobos\std\range.d(7346): Error: template instance std.range.SortedRange!(ZeroToTen,"a < b") error instantiating
//
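The first two errors boil down to one rule: an index of a wider type does not implicitly narrow to ushort. A reduced sketch of just that rule, using a hypothetical `UShortIndexed` type rather than the full ZeroToTen range, might look like this:

```d
// Hypothetical stand-in for a range whose opIndex takes ushort.
struct UShortIndexed
{
    ushort opIndex(ushort n) { return n; }
}

void main()
{
    UShortIndexed r;
    size_t i = 3;

    // What a generic wrapper does when it assumes size_t: rejected by the compiler.
    static assert(!__traits(compiles, r[i]));

    // An explicit cast compiles, but the wrapper then has to know the target type.
    static assert(__traits(compiles, r[cast(ushort) i]));
    assert(r[cast(ushort) i] == 3);
}
```

This is the situation a generic adaptor is in: without a way to ask the range for its index type, it can neither pass its own index through nor know what to cast it to.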
Re: "IndexType" for ranges
On 2012-10-02, 18:09, Peter Alexander wrote: On Tuesday, 2 October 2012 at 13:17:45 UTC, monarch_dodra wrote: If you've ever worked on a template that needs to index a range, you may have run into this problem: What is the type you should use to index an RA range? Forgive my ignorance. What's wrong with size_t? That not all ranges use it? If the range uses int, short, byte (I wonder why they'd do it, though), using size_t will not even compile. -- Simen
Re: "IndexType" for ranges
On Tuesday, 2 October 2012 at 13:17:45 UTC, monarch_dodra wrote: If you've ever worked on a template that needs to index a range, you may have run into this problem: What is the type you should use to index an RA range? Forgive my ignorance. What's wrong with size_t?