Re: range.save

2015-11-27 Thread Jonathan M Davis via Digitalmars-d
On Friday, 27 November 2015 at 12:06:02 UTC, Joseph Rushton 
Wakeling wrote:
On Friday, 27 November 2015 at 11:57:37 UTC, Jonathan M Davis 
wrote:
Well, you can have a pure input range which is lazy, but what 
you can't do is wrap it in another lazy range. A prime example 
would be something like


auto first5 = range.take(5);
range.popFront();
range.popFront();
// first5 now refers to elements 2 through 6 rather than 0 
through 4


Hmm well, I think it depends on how you approach the question 
of what is "correct" there.  If range is a RNG then that 
behaviour could arguably be OK; the 5 numbers extracted from 
the RNG are evaluated as you consume them, and that's all right.


Well, if you're dealing with a pseudo-random generator with a 
specific seed, I'm not sure that it's okay, though obviously for 
a fully random number generator, it wouldn't matter. The real 
problem is that this affects _any_ input range that's a reference 
type, not just random number generators, and the order very much 
matters for most range types. The problem can be fixed for 
forward ranges via save, but not for pure input ranges. And it's 
even worse with pure input ranges if they're not required to be 
full-on reference types, since then who knows what the state of 
the range is after it's copied to be passed into take. Right now, 
generic code can't use any range that's passed to take (or any 
function like it) after it's been passed in, because it's 
undefined behavior given the varying, possible semantics when 
copying a range, though calling save first obviously gets around 
that problem for forward ranges. But it's pretty certain that 
there's lots of code out there that actually depends on that 
undefined behavior acting like it does with dynamic arrays, 
because that's what most ranges do.


This is where I'm wishing I knew Haskell better, because I'm 
increasingly suspecting that InputRanges ought to be thought of 
in much the same way as Haskell considers IO.


Possibly, but because almost everything in Haskell is both 
functional and lazy, you don't really get the problem of popFront 
being called after the call to take.


- Jonathan M Davis


Re: range.save

2015-11-27 Thread Jonathan M Davis via Digitalmars-d
On Friday, 27 November 2015 at 09:20:23 UTC, Jonathan M Davis 
wrote:
I'm starting to think that it would be better to have pure 
input ranges have to be reference types and forward ranges have 
to be value types and then be very careful about which 
functions work with both rather than simply treating all 
forward ranges like pure input ranges that can also be copied 
via save.


Another piece of this puzzle to consider is that unless a range 
is a value type (or at least acts like a value type as long as 
you don't mutate its elements) or has save called on it, then it 
fundamentally doesn't work with lazy ranges. So, at minimum, we 
need to consider making it so that lazy ranges require forward 
ranges (and then, assuming that we continue to have save, the 
lazy ranges need to always call save on the range that they're 
given).


- Jonathan M Davis


Re: range.save

2015-11-27 Thread Joseph Rushton Wakeling via Digitalmars-d
On Friday, 27 November 2015 at 11:31:14 UTC, Jonathan M Davis 
wrote:
Another piece of this puzzle to consider is that unless a range 
is a value type (or at least acts like a value type as long as 
you don't mutate its elements) or has save called on it, then 
it fundamentally doesn't work with lazy ranges. So, at minimum, 
we need to consider making it so that lazy ranges require 
forward ranges (and then, assuming that we continue to have 
save, the lazy ranges need to always call save on the range 
that they're given).


Ah, interesting you should bring that up, as it's exactly the 
challenge of doing random number generation via a range interface 
;-)


I'm looking at this problem from a slightly different angle, 
which is that for a non-deterministic range (which is a subset of 
possible InputRanges) to be lazy, it matters that the value of 
.front is not evaluated until the first call to .front; and this 
"not yet determined" property needs to be restored after 
.popFront() is called.


Basically, you require _true_ laziness rather than the kind of 
pseudo-laziness that most Phobos ranges display, where the 
initial value of .front is determined in the constructor, and 
.popFront() determines the next value of .front "eagerly".


(N.B. "Non-deterministic" here includes pseudo-non-deterministic 
ranges, such as pseudo-RNGs.)


Re: range.save

2015-11-27 Thread Joseph Rushton Wakeling via Digitalmars-d
On Friday, 27 November 2015 at 11:57:37 UTC, Jonathan M Davis 
wrote:
Well, you can have a pure input range which is lazy, but what 
you can't do is wrap it in another lazy range. A prime example 
would be something like


auto first5 = range.take(5);
range.popFront();
range.popFront();
// first5 now refers to elements 2 through 6 rather than 0 
through 4


Hmm well, I think it depends on how you approach the question of 
what is "correct" there.  If range is a RNG then that behaviour 
could arguably be OK; the 5 numbers extracted from the RNG are 
evaluated as you consume them, and that's all right.


This is where I'm wishing I knew Haskell better, because I'm 
increasingly suspecting that InputRanges ought to be thought of 
in much the same way as Haskell considers IO.


Re: range.save

2015-11-27 Thread Jonathan M Davis via Digitalmars-d
On Friday, 27 November 2015 at 11:45:38 UTC, Joseph Rushton 
Wakeling wrote:
On Friday, 27 November 2015 at 11:31:14 UTC, Jonathan M Davis 
wrote:
Another piece of this puzzle to consider is that unless a 
range is a value type (or at least acts like a value type as 
long as you don't mutate its elements) or has save called on 
it, then it fundamentally doesn't work with lazy ranges. So, 
at minimum, we need to consider making it so that lazy ranges 
require forward ranges (and then, assuming that we continue to 
have save, the lazy ranges need to always call save on the 
range that they're given).


Ah, interesting you should bring that up, as it's exactly the 
challenge of doing random number generation via a range 
interface ;-)


I'm looking at this problem from a slightly different angle, 
which is that for a non-deterministic range (which is a subset 
of possible InputRanges) to be lazy, it matters that the value 
of .front is not evaluated until the first call to .front; and 
this "not yet determined" property needs to be restored after 
.popFront() is called.


Basically, you require _true_ laziness rather than the kind of 
pseudo-laziness that most Phobos ranges display, where the 
initial value of .front is determined in the constructor, and 
.popFront() determines the next value of .front "eagerly".


(N.B. "Non-deterministic" here includes 
pseudo-non-deterministic ranges, such as pseudo-RNGs.)


Well, you can have a pure input range which is lazy, but what you 
can't do is wrap it in another lazy range. A prime example would 
be something like


auto first5 = range.take(5);
range.popFront();
range.popFront();
// first5 now refers to elements 2 through 6 rather than 0 
through 4


Either take needs to actually get a separate copy of the range 
(i.e. use save), or the range can't be used after take has been 
called. So, wrapping the range in a lazy range does still work on 
some level - but only as long as you don't use the range for 
anything else other than through that lazy range, and I don't 
know of any way to restrict that except by either disallowing 
pure input ranges with lazy range wrappers (which is arguably 
over restrictive) or by simply telling people not to use a pure 
input range after passing it to a lazy range (which is obviously 
error-prone, because it's not enforced in any way).


Whether the original range was lazy or not doesn't really matter. 
It's the fact that it's not a value type that screws things.


- Jonathan M Davis


Re: range.save

2015-11-27 Thread Joseph Rushton Wakeling via Digitalmars-d
On Friday, 27 November 2015 at 12:16:34 UTC, Jonathan M Davis 
wrote:
Well, if you're dealing with a pseudo-random generator with a 
specific seed, I'm not sure that it's okay, though obviously 
for a fully random number generator, it wouldn't matter.


I think the distinction here is artificial.  If `range` is a 
random number generator, then `take` has no right to expect its 
output to be deterministic or consistent in any way.


When we come to input ranges in general, that's surely a factor 
we need to take into account.  Can the InputRange actually be 
relied on as a deterministic iteration, but one where we can't 
save our state?  Or should it be treated as simply a source of 
data, about which one can make no assumptions regarding its 
consistency?


I think answering that question -- and maybe distinguishing the 
two cases if possible -- feeds into how to address the subsequent 
problems you describe.


Possibly, but because almost everything in Haskell is both 
functional and lazy, you don't really get the problem of 
popFront being called after the call to take.


I'm not sure that's really true.  I'm reasonably sure that I can 
(given some source of IO) do something like (pseudo-code because 
my Haskell is rusty),


do
lazyData = 
something_that_lazily_reads_5_entries_from_the_IO_entity

read_some_data_from_IO
read_some_data_from_IO
do_something_with_lazyData

If the source of IO is the standard random number generator of 
Haskell, for example, it'll behave like the input range/popFront 
example.


Re: range.save

2015-11-27 Thread Joseph Rushton Wakeling via Digitalmars-d
On Friday, 27 November 2015 at 13:46:37 UTC, Joseph Rushton 
Wakeling wrote:
I think the distinction here is artificial.  If `range` is a 
random number generator


... should read, "If `range` is a random number generator (even a 
pseudo-random one)".


Re: range.save

2015-11-27 Thread Jonathan M Davis via Digitalmars-d
On Friday, 27 November 2015 at 08:53:26 UTC, Joseph Rushton 
Wakeling wrote:
On Thursday, 26 November 2015 at 17:23:07 UTC, Andrei 
Alexandrescu wrote:

So initially I thought a simple solution would be this:

struct MyForwardRange
{
enum bool isForward = true;
... empty, front, popFront ...
}

enum isForwardRange(R) =
  is(typeof(R.isForward)) && R.isForward;

Then there's no need for .save(), and isForward!R can be used 
for constraints etc. Reference forward ranges are not 
supported but then that's liberating (fewer complications), 
rare, and easy to work around by wrapping.


By "reference forward range" do you mean a reference type (per 
se) or any range that contains an internal reference type (e.g. 
a range whose private data includes a dynamic array)?


Obviously, Andrei will have to answer to know what he meant, but 
with regards to ranges, I would consider a reference type to be 
one where in


auto copy = range;

doing anything to copy or range does the exact same thing to the 
other, because they refer to the exact same state. Something like 
save is required to get a separate range where popping elements 
from one will not affect the other.


Contrast that with a value type where copy and range are 
completely separate. And then there are ranges where the copy 
shares _some_ state with the original but not all, which as far 
as save goes is pretty much the same as a reference type but 
means that you can't rely on mutating one having the same effect 
on the other one either (the easiest example would be a range 
that would otherwise be a reference type but caches front).


Based on that view of things, dynamic arrays definitely are value 
types. Obviously, if you start doing stuff that would mutate 
their elements, then they aren't really value types, but if all 
you're doing is iterating over them, then they're essentially 
value types.


What we lose without save (or something else to replace it) is 
the ability to have any range types that aren't value types (or 
which essentially behave like them). So, a range which is a 
dynamic array or a simple wrapper around a dynamic array is fine, 
because copying the range is enough, but anything where a copy is 
not the same as save would have been will no longer work. Using 
postblit constructors like Sonke suggested solves _some_ of that 
problem, but not all of it, since that doesn't work with classes, 
and const doesn't work with postblit constructors, whereas we 
could make it work with save. The loss of classes poses the 
problem that non-templated functions can't really be made to work 
with ranges. And the loss of reference type ranges in general 
definitely is a problem for some uses cases.


But the more I look at the semantics involved, the more inclined 
I am to argue that trying to have the same code work for both 
value types and reference types is questionable at best. On top 
of that, pure input ranges almost need to be reference types 
(though they can currently work as pseudo-reference types at 
least some of the time), and forward ranges really need to be 
value types (at least as much as dynamic arrays are anyway) if we 
don't want to have to call save everywhere.


I'm starting to think that it would be better to have pure input 
ranges have to be reference types and forward ranges have to be 
value types and then be very careful about which functions work 
with both rather than simply treating all forward ranges like 
pure input ranges that can also be copied via save.


- Jonathan M Davis


Re: range.save

2015-11-27 Thread Joseph Rushton Wakeling via Digitalmars-d
On Thursday, 26 November 2015 at 17:23:07 UTC, Andrei 
Alexandrescu wrote:

So initially I thought a simple solution would be this:

struct MyForwardRange
{
enum bool isForward = true;
... empty, front, popFront ...
}

enum isForwardRange(R) =
  is(typeof(R.isForward)) && R.isForward;

Then there's no need for .save(), and isForward!R can be used 
for constraints etc. Reference forward ranges are not supported 
but then that's liberating (fewer complications), rare, and 
easy to work around by wrapping.


By "reference forward range" do you mean a reference type (per 
se) or any range that contains an internal reference type (e.g. a 
range whose private data includes a dynamic array)?


Re: range.save

2015-11-27 Thread Joseph Rushton Wakeling via Digitalmars-d
On Friday, 27 November 2015 at 09:20:23 UTC, Jonathan M Davis 
wrote:
Obviously, Andrei will have to answer to know what he meant, 
but with regards to ranges, I would consider a reference type 
to be one where in


auto copy = range;

doing anything to copy or range does the exact same thing to 
the other, because they refer to the exact same state. 
Something like save is required to get a separate range where 
popping elements from one will not affect the other.


Unfortunately it's a bit more complicated than that, because it's 
readily possible to have ranges where


auto copy = range;

... will copy _some_ of the internals by value, and some by 
reference -- e.g. a range whose private data includes some 
integer values and a dynamic array.


That's not necessarily a problem if the reference-type data does 
not influence the range's behaviour (e.g. you're doing forward 
iteration over a container accessed by ref), but it's readily 
possible to imagine a range design where


auto copy = range;
copy.popFront();

... will affect range's state without updating it to the _same_ 
state as copy.


Re: range.save

2015-11-27 Thread Jonathan M Davis via Digitalmars-d
On Friday, 27 November 2015 at 10:17:46 UTC, Joseph Rushton 
Wakeling wrote:
On Friday, 27 November 2015 at 09:20:23 UTC, Jonathan M Davis 
wrote:
Obviously, Andrei will have to answer to know what he meant, 
but with regards to ranges, I would consider a reference type 
to be one where in


auto copy = range;

doing anything to copy or range does the exact same thing to 
the other, because they refer to the exact same state. 
Something like save is required to get a separate range where 
popping elements from one will not affect the other.


Unfortunately it's a bit more complicated than that, because 
it's readily possible to have ranges where


auto copy = range;

... will copy _some_ of the internals by value, and some by 
reference -- e.g. a range whose private data includes some 
integer values and a dynamic array.


That's not necessarily a problem if the reference-type data 
does not influence the range's behaviour (e.g. you're doing 
forward iteration over a container accessed by ref), but it's 
readily possible to imagine a range design where


auto copy = range;
copy.popFront();

... will affect range's state without updating it to the _same_ 
state as copy.


Yes. I discussed that in my post, though maybe I wasn't clear 
enough. But such a range requires save just as much as a full-on 
reference type does, so ultimately it's pretty much in the same 
boat as far as save goes. It's also a serious problem for pure 
input ranges to such ranges exist, since then even if you can 
assume that any range which can implement save does implement 
save, you still can't guarantee that an input range is a 
reference type, which becomes a serious problem when an input 
range is copied. e.g.


foreach(e; range)
{
if(cond)
break;
}
range.popFront();

could be fine if you can guarantee that the range is a reference 
type, but if you can't guarantee that (as is currently the case), 
then as soon as you use a pure input range with a foreach loop, 
you can't use it anymore, because it's copied as part of the 
lowering. To get around that, you can use a for loop directly, 
but the fact that pure input ranges aren't currently guaranteed 
to be reference types definitely makes it harder to write 
consistent, correct code with pure input ranges.


As it stands, forward ranges have the same problem, but by 
calling save, you can ensure that the copy is separate from the 
original and ensure that iterating the original after the loop is 
fine (though if you want reference semantics, you would still 
have to use a for loop), but pure input ranges don't have a way 
to ensure consistency like that other than maybe wrapping the 
range in another type which you can guarantee is a reference type.


Regardless, even if we require that pure input ranges be 
reference types and that forward ranges be value types (at least 
insomuch as copying them results in a separate range with the 
same state like with save), we then still have the problem that 
pure input ranges and forward ranges have different semantics. 
But as long as dynamic arrays are ranges, it's not like we could 
force all ranges to be reference types, and that wouldn't be good 
for efficiency anyway, since it would almost certainly mean more 
heap allocations just so that they can be reference types.


- Jonathan M Davis


Re: range.save

2015-11-26 Thread Jonathan M Davis via Digitalmars-d
On Thursday, 26 November 2015 at 17:23:07 UTC, Andrei 
Alexandrescu wrote:
As of right now the situation with ranges is suboptimal - you 
need to use .save() but if you don't the sheer copy works most 
of the time, just not always. It may be nice to improve on that 
a bit, for example to require input ranges that are not forward 
ranges to indeed have reference semantics. Or require forward 
ranges to define .save() but only with the body { return this; 
}. Either of these would break code.


As long as a range has semantics very similar to a dynamic array, 
it works great, but as soon as it doesn't, it tends to fall apart 
in subtle ways. The current situation is simply too error-prone 
and unwieldy IMHO. It works great in the common case but 
definitely falls apart at the edges. And even if you're diligent 
about getting it right, the number of calls to save that are 
required is ridiculous. So, while I completely agree that we want 
to avoid breaking code if we can, I'm also increasingly convinced 
that we should look at what would need to be done to make ranges 
clean to use without being error-prone and see if we can get 
there with minimal breaking changes. Regardless, if we _do_ make 
breaking changes with regards to ranges, we need to be very sure 
that we want to make them.


- Jonathan M Davis


Re: range.save

2015-11-26 Thread Andrei Alexandrescu via Digitalmars-d

On 11/20/2015 11:34 AM, Sönke Ludwig wrote:

Am 20.11.2015 um 16:37 schrieb Andrei Alexandrescu:

On 11/20/2015 04:42 AM, Sönke Ludwig wrote:

I think that .save is a serious design flaw in the range API


How would you redo it? -- Andrei


That's why I wrote that the alternatives had their own issues - I
unfortunately don't have a solution to this either. Making usage errors
fail loudly at runtime is the only one improvement I had in mind that
wouldn't result in unacceptable code breakage.

Now if we could exclude reference type ranges from the picture* and
ignore the fact that this would break tons of code, I think making input
ranges non-copyable and using the postblit constructor to do the job of
save() would be the right approach.


(had this in my other's laptop outbox for a while)

The baseline was STL's input iterators, which were quite bad - there's 
no syntactic distinction between input and forward iterators, which 
makes the entire matter a convention. That's not the case for any other 
iterators - code using their capabilities won't compile with weaker 
iterators (nice).


So initially I thought a simple solution would be this:

struct MyForwardRange
{
enum bool isForward = true;
... empty, front, popFront ...
}

enum isForwardRange(R) =
  is(typeof(R.isForward)) && R.isForward;

Then there's no need for .save(), and isForward!R can be used for 
constraints etc. Reference forward ranges are not supported but then 
that's liberating (fewer complications), rare, and easy to work around 
by wrapping.


Looking back, the only reason for which I didn't go that way was fear; I 
simply hadn't seen such an idiom and I was afraid it'd be too 
nonconformist. Introspection was new and back then quite arcane, and the 
idioms were unclear. So I went with a more conventional solution of 
adding one method.


Today that's the way I'd do it, and in fact it is the way the new 
collections will work exactly that way - if a type exposes 
isStdCollection then it commits to implementing the collection 
primitives as specified.


As of right now the situation with ranges is suboptimal - you need to 
use .save() but if you don't the sheer copy works most of the time, just 
not always. It may be nice to improve on that a bit, for example to 
require input ranges that are not forward ranges to indeed have 
reference semantics. Or require forward ranges to define .save() but 
only with the body { return this; }. Either of these would break code.



Andrei



Re: range.save

2015-11-24 Thread Freddy via Digitalmars-d
On Tuesday, 24 November 2015 at 01:53:39 UTC, Steven 
Schveighoffer wrote:

...
I can't quite think of an example right now but there was a 
thread about this a few years ago. 
http://forum.dlang.org/thread/twnymbxfdmqupsfjf...@forum.dlang.org


Re: range.save

2015-11-23 Thread Freddy via Digitalmars-d

On Thursday, 19 November 2015 at 21:30:23 UTC, Freddy wrote:

...
Another problem I noticed with ranges is that all functionality 
is unionized. Ranges are expected to be able to 
popFront,Index,popBack, randomly possibly forcing ranges to carry 
unneeded buffers if indexing or popBack in never used.


On possible solution is to have .retroRange and .indexRange On 
forward/input ranges.


Re: range.save

2015-11-23 Thread Jonathan M Davis via Digitalmars-d
On Tuesday, 24 November 2015 at 01:03:36 UTC, Steven 
Schveighoffer wrote:
But yes, a fundamental requirement is to be able to get the 
front element repeatedly. This necessitates a buffer or "saving 
of state".


Either that or the same operation/calculation is done every time 
you call front, and it results in the same value (e.g. the result 
of map calls its predicate every time that you cal front on it). 
In any case, regardless of how it's done, front needs to always 
return the same value until popFront is called, and how that's 
done can vary.


- Jonathan M Davis


Re: range.save

2015-11-23 Thread Steven Schveighoffer via Digitalmars-d

On 11/23/15 7:10 PM, Freddy wrote:

On Thursday, 19 November 2015 at 21:30:23 UTC, Freddy wrote:

...

Another problem I noticed with ranges is that all functionality is
unionized. Ranges are expected to be able to popFront,Index,popBack,
randomly possibly forcing ranges to carry unneeded buffers if indexing
or popBack in never used.


surely you mean opIndex? Note that ranges are required to implement 
front, popFront, and empty. That's it, then it is a range. Even save 
isn't required unless you want it to be a forward range.


But yes, a fundamental requirement is to be able to get the front 
element repeatedly. This necessitates a buffer or "saving of state".


-Steve


Re: range.save

2015-11-23 Thread Freddy via Digitalmars-d
On Tuesday, 24 November 2015 at 01:03:36 UTC, Steven 
Schveighoffer wrote:
surely you mean opIndex? Note that ranges are required to 
implement front, popFront, and empty. That's it, then it is a 
range. Even save isn't required unless you want it to be a 
forward range.


I meant .indexableRange, random access may require extra buffers 
or stack space that take space even when random access isn't 
used, and I'm asking for optional members(.retroRange, 
.indexableRange)


But yes, a fundamental requirement is to be able to get the 
front element repeatedly. This necessitates a buffer or "saving 
of state".


Not quite what I was thinking. I was saying that ranges that 
implement back,popBack may need to implement a backwards buffer 
along a forward buffer even if the backwards buffer is never used.


Re: range.save

2015-11-23 Thread Steven Schveighoffer via Digitalmars-d

On 11/23/15 8:38 PM, Freddy wrote:


Not quite what I was thinking. I was saying that ranges that implement
back,popBack may need to implement a backwards buffer along a forward
buffer even if the backwards buffer is never used.


Maybe you are saying that if you want to implement indexing, you must 
also implement back and popBack? Note that if you don't implement 
something, it just doesn't get qualified as that type of range, so it's 
definitely possible to have indexing, but not back and popBack (although 
if range[$-1] is possible, then wouldn't that easily qualify as back?). 
So I don't really understand still.


I think your issue may be better described with an example.

-Steve


Re: range.save

2015-11-20 Thread Sönke Ludwig via Digitalmars-d

Am 19.11.2015 um 22:30 schrieb Freddy:

Does anyone else think range.save is a hack? I often find myself
forgetting to call range.save in my generic code with my unittests
working fine. Also, working with a range of ranges may forget to call
range.save.(Ex: [1,2,4].repeat)


I think that .save is a serious design flaw in the range API and there 
must me countless theoretical or future bugs lurking in todays code 
bases, only working by chance due to undefined algorithm behavior.


The problem is that all of the alternatives that were discussed all had 
different trade-offs and solving the .save issue would lead to other 
issues/limitations (or at least AFAIR there hasn't been a suggestion 
where that wouldn't be the case). And since we now already have an API, 
that wouldn't help anyway.


Maybe we could inject code into existing ranges that, in debug mode, at 
least causes assertions to trigger whenever an operation is done on an 
outdated copy of an input range (or of a non '.save'd forward range). 
That could then be advertised as a standard pattern for user defined 
ranges ("mixin RangeSemantics;").


Re: range.save

2015-11-20 Thread Andrei Alexandrescu via Digitalmars-d

On 11/20/2015 04:42 AM, Sönke Ludwig wrote:

I think that .save is a serious design flaw in the range API


How would you redo it? -- Andrei


Re: range.save

2015-11-20 Thread Sönke Ludwig via Digitalmars-d

Am 20.11.2015 um 16:37 schrieb Andrei Alexandrescu:

On 11/20/2015 04:42 AM, Sönke Ludwig wrote:

I think that .save is a serious design flaw in the range API


How would you redo it? -- Andrei


That's why I wrote that the alternatives had their own issues - I 
unfortunately don't have a solution to this either. Making usage errors 
fail loudly at runtime is the only one improvement I had in mind that 
wouldn't result in unacceptable code breakage.


Now if we could exclude reference type ranges from the picture* and 
ignore the fact that this would break tons of code, I think making input 
ranges non-copyable and using the postblit constructor to do the job of 
save() would be the right approach. Recently this was mentioned 
somewhere and the counter argument was that this wouldn't work for 
wrapper ranges:


struct FilterRange(R) {
private R _input;

this(R input) {
_input = input; // error: R not copyable
}
// ...
}

But it does work!

struct FilterRange(R) {
private R _input;

this(R input) {
swap(_input, input); // fine!
}
   // ...
}

Passing a range into such a wrapper range can still be done directly in 
case of rvalues: FilterRange!MyRange(MyRange(...))


L-values require an explicit "move" call (which is a good thing):

MyRange r;
auto fr = FilterRange(r.move);

The lowering of "foreach" to "for" would also have to be adjusted 
accordingly.


The main drawback of using postblit for forward ranges is that it might 
happen that it gets invoked accidentally, resulting in hidden 
performance issues (IMO better than correctness issues, but anyway). 
That could either be mitigated by performing substantial work lazily 
instead of directly in this(this), or by keeping support for save() in 
addition to this(this), so that a valid forward range could still 
disable this(this) if it's costly and expose the copy functionality 
through save() instead.



* To still support reference type ranges, we could turn this around and 
define a copyref() method that creates a new range that references the 
same internal range object. The advantage is that this way around 
failure to call copyref() will result in an immediate error.


Re: range.save

2015-11-20 Thread Jonathan M Davis via Digitalmars-d

On Friday, 20 November 2015 at 16:35:01 UTC, Sönke Ludwig wrote:
The lowering of "foreach" to "for" would also have to be 
adjusted accordingly.


I'm actually wondering if we should change it even to support the 
way that we currently do things. The problem is that the 
semantics of copying a range are undefined. They depend entirely 
on how the range is implemented (value types, reference types, 
and pseudo-reference types all act differently when copied). The 
result is that if you want generic range-based code to work with 
all range types, you can never use a range after it's been copied 
unless it's assigned a new value - and foreach does a copy!


foreach(e; range)
{
// do stuff
}

->

for(auto __c = range; !__c.empty; __c.popFront())
{
auto e = __c.front;
// do stuff
}

So, in generic code, once you use foreach on a range, you can't 
use it again. Something like


foreach(e; range)
{
   if(cond)
   break;
}
// do something with range again

is inherently broken. You're pretty much forced to use a naked 
for loop, e.g.


for(; !range.empty; range.popFront())
{
auto e = range.front;
if(cond)
break;
}
// do something with range

So, what we really want is probably something more like

foreach(e; range)
{
// do stuff
}

->

for(; !range.empty; range.popFront())
{
auto e = range.front;
// do stuff
}

Unfortunately, that _does_ risk breaking some code with forward 
ranges - but only because they're being implicitly saved, which 
is arguably a bug. So, I'd be inclined to argue that the change 
should be made - if nothing else, because any attempts to break 
out of the loop are screwed with the currently lowering, and pure 
input ranges can't be saved to work around the problem, meaning 
that they're doubly screwed.


Now, that does have the downside of making foreach work 
differently for arrays, since they don't use the normal range 
lowering (rather, they're effectively saved), which means that 
we're still pretty much screwed...


So, maybe what we need to do is have the foreach lowering check 
for save and just call it if it exists so that pure input ranges 
get


for(; !range.empty; range.popFront())
{
auto e = range.front;
// do stuff
}

whereas forward ranges get

for(auto __c = range.save; !__c.empty; __c.popFront())
{
auto e = __c.front;
// do stuff
}

That's more complicated, but it avoids breaking code while still 
making it so that input ranges and foreach aren't quite so broken 
- though it means that input ranges and forward ranges act 
differently with foreach. But the alternative is basically tell 
folks that they need to call save explicitly when using foreach 
on forward ranges in generic code and to tell them that they 
should never use foreach on pure input ranges unless they intend 
to iterate over the whole range.


save tries to solve the copying problem with ranges, but the 
overall situation is _far_ worse than just trying to make it so 
that reference type ranges have a way to be copied properly.


- Jonathan M Davis


Re: range.save

2015-11-19 Thread Ilya via Digitalmars-d
On Friday, 20 November 2015 at 02:47:17 UTC, Rikki Cattermole 
wrote:

On 20/11/15 10:30 AM, Freddy wrote:
Does anyone else think range.save is a hack? I often find 
myself
forgetting to call range.save in my generic code with my 
unittests
working fine. Also, working with a range of ranges may forget 
to call

range.save.(Ex: [1,2,4].repeat)


Not really.
If you forget it, then things won't work since it'll be empty ;)


If range is class. For most of structs like Iota range.save 
returns just itself.


Re: range.save

2015-11-19 Thread Rikki Cattermole via Digitalmars-d

On 20/11/15 10:30 AM, Freddy wrote:

Does anyone else think range.save is a hack? I often find myself
forgetting to call range.save in my generic code with my unittests
working fine. Also, working with a range of ranges may forget to call
range.save.(Ex: [1,2,4].repeat)


Not really.
If you forget it, then things won't work since it'll be empty ;)