Re: Should range foreach be iterating over an implicit copy?

2012-06-07 Thread Ken

On Friday, 18 May 2012 at 06:17:13 UTC, Lars T. Kyllingstad wrote:
On Thursday, 17 May 2012 at 14:18:55 UTC, Steven Schveighoffer 
wrote:
[...]  I believe someone has created a byRef struct that wraps 
a range and iterates it byRef (maybe dsimcha?)


Nope, me. :)

https://github.com/kyllingstad/ltk/blob/master/ltk/range.d#L12

It only supports the input range primitives, because that was 
all I needed when I wrote it.  I offered to contribute it to 
Phobos (with support for the remaining range types, of course), 
but the interest for it seemed rather low.


-Lars


I've got a use for it right now.  I've implemented groupBy for my 
project, which is like std.algorithm.group except that instead of 
giving you the number of adjacent equivalent items, it gives you 
a lazy subrange of them.  So you end up with a range of subranges.


Trouble is, if you feed the inner range to foreach, it works on a 
copy of the inner range.  Which means that the outer range skips 
through the entire inner range you just iterated through in order 
to get to the next inner range.


Now if the range you feed to groupBy is a true non-ForwardRange, 
where copying it just creates an alias of some kind, this problem 
goes away.  Your wrapper looks like it would do the trick there.


At any rate, the whole thing makes an excellent use case for 
foreach(ref e; ref r).


Re: Should range foreach be iterating over an implicit copy?

2012-06-07 Thread Ken
On Wednesday, 16 May 2012 at 21:40:39 UTC, Andrei Alexandrescu 
wrote:

On 5/16/12 4:37 PM, Nick Sabalausky wrote:
One counter-argument that was raised is that TDPL has an 
example on page 381
that indicates foreach iterates over an implicit copy. I don't 
have a copy
handy ATM, so I can't look at it myself, but I'd love to see 
Andrei weigh in
on this: I'm curious if this example in TDPL made that copy 
deliberately, or

if the full implications of that copy were just an oversight.


It is deliberate and the intent is that millions of programmers 
used to foreach from other languages don't scream "where is my 
range???"


Andrei


According to the docs, however, an InputRange that is not a 
ForwardRange won't necessarily behave the way millions of 
programers used to foreach from other languages expect them to.  
Such an operation will most likely consume the underlying data, 
whether foreach makes an implicit copy or not.


Put another way, whether a non-ForwardRange is consumed by a 
foreach the way it currently works is unspecified.


The docs also say that the right way to checkpoint a ForwardRange 
is by calling save().


This tells me that the most intuitive way for foreach to work is:

a. If it's a ForwardRange, make a copy with save(), then consume 
the copy.

b. If it's not a ForwardRange, consume the original.


Re: Should range foreach be iterating over an implicit copy?

2012-05-17 Thread Lars T. Kyllingstad
On Thursday, 17 May 2012 at 14:18:55 UTC, Steven Schveighoffer 
wrote:
[...]  I believe someone has created a byRef struct that wraps 
a range and iterates it byRef (maybe dsimcha?)


Nope, me. :)

https://github.com/kyllingstad/ltk/blob/master/ltk/range.d#L12

It only supports the input range primitives, because that was all 
I needed when I wrote it.  I offered to contribute it to Phobos 
(with support for the remaining range types, of course), but the 
interest for it seemed rather low.


-Lars


Re: Should range foreach be iterating over an implicit copy?

2012-05-17 Thread Jonathan M Davis
On Friday, May 18, 2012 01:04:35 Erik Jensen wrote:
> I would be very much in support of having ranges and containers
> be distinct, with a standard way to get a range from a container.
> Something very similar is done in C#, where containers have a
> getEnumerator() method. The enumerator itself has a Current
> property and moveNext method (similar to front and popFront of a
> range), and thus is consumed as you use it. In my experience,
> this system works very well.

opSlice is the standard way to return a range from a container.

- Jonathan M Davis


Re: Should range foreach be iterating over an implicit copy?

2012-05-17 Thread Erik Jensen

On Thursday, 17 May 2012 at 05:48:44 UTC, Nick Sabalausky wrote:
- A range is not a collection, it's a *view* of a collection 
(or of
something else). This is a necessary distinction because ranges 
and
collections work in fundamentally different ways: A range is, 
*by necessity*

consumed as it's iterated - that's what popFront *does*, that's
fundamentally how ranges work. For many collections, OTOH, it 
makes *no*
sense to consume the collection while iterating it. Thus, range 
!=

collection, it's a view of a collection.

- Ranges are D's answer to iterators. I don't think people used 
to iterators
from other languages would expect their iterator to magically 
reset itself
after being used. So I see no reason why they would expect a 
range (ie, an

iterator-pair-with-benefits) to behave differently than that.

- D's arrays conflate the ideas of "collection" and "range", 
hence the odd
edge case Era pointed out, and hence the "need" for foreach to 
automatically
make a copy. But in my (not super-extensive) experience 
creating ranges,
I've found that to be a problematic pattern (due to the 
fundamental
distinction between a range and a collection), and learned to 
prefer making
my iterable things *return* a range, rather than actually *be* 
ranges.


- Admittedly, it would be annoying if foreach had to be used 
like this on
all collections: "foreach(e; myArray.rangeOf)", so I guess it 
would make
sense for a range to automatically be obtained when foreach-ing 
over a
collection. However, I'm still not 100% sold on the current 
method of doing
that (making foreach automatically copy struct-based ranges), 
partly because
of the questionable implications it has for input ranges, and 
partly because
(for struct-ranges) it leaves no way to access the range that's 
actually

being iterated.

- At the very least, perhaps input ranges just shouldn't be 
allowed to be
structs? After all, structs are intended to be copied around, 
but input

ranges, by definition, can't have their current state copied.


I would be very much in support of having ranges and containers 
be distinct, with a standard way to get a range from a container. 
Something very similar is done in C#, where containers have a 
getEnumerator() method. The enumerator itself has a Current 
property and moveNext method (similar to front and popFront of a 
range), and thus is consumed as you use it. In my experience, 
this system works very well.


Another advantage of giving arrays a method to obtain a range 
instead of them being a range themselves is that using foreach on 
a const/immutable array would work as expected without having to 
perform a slice to make it work. I would even go so far as to say 
that having an array BE a range really doesn't make any sense, 
conceptually.


Re: Should range foreach be iterating over an implicit copy?

2012-05-17 Thread Jonathan M Davis
On Thursday, May 17, 2012 13:23:22 Steven Schveighoffer wrote:
> On Thu, 17 May 2012 11:42:04 -0400, Jonathan M Davis 
> 
> wrote:
> > On Thursday, May 17, 2012 11:05:46 Steven Schveighoffer wrote:
> >> Hm... proposal:
> >> 
> >> foreach(e; ref r)
> >> {
> >> }
> >> 
> >> equates to your desired code. Would this help?
> > 
> > Or you could just do
> > 
> > for(; !r.empty; r.popFront())
> > {
> > 
> > auto e = r.front;
> > 
> > }
> > 
> > I really don't think that that's a big deal. I don't think that the
> > language
> > change would be worth having yet another thing in the language to
> > remember,
> > particularly when it's so easy to just use for to do the job.
> 
> Probably true. The only one I'd see as being impossible to duplicate is:
> 
> foreach(ref e; ref r)

But that only works if front returns a ref. If it doesn't, the ref would be to 
a local variable (so I don't know if that even compiles). And if front returns 
a ref, all you have to do is use front instead of e. So, no it wouldn't be 
_identical_, but close enough.

- Jonathan M Davis


Re: Should range foreach be iterating over an implicit copy?

2012-05-17 Thread Steven Schveighoffer
On Thu, 17 May 2012 11:42:04 -0400, Jonathan M Davis   
wrote:



On Thursday, May 17, 2012 11:05:46 Steven Schveighoffer wrote:

Hm... proposal:

foreach(e; ref r)
{
}

equates to your desired code.  Would this help?


Or you could just do

for(; !r.empty; r.popFront())
{
auto e = r.front;
}

I really don't think that that's a big deal. I don't think that the  
language
change would be worth having yet another thing in the language to  
remember,

particularly when it's so easy to just use for to do the job.


Probably true.  The only one I'd see as being impossible to duplicate is:

foreach(ref e; ref r)

-Steve


Re: Should range foreach be iterating over an implicit copy?

2012-05-17 Thread Jonathan M Davis
On Thursday, May 17, 2012 11:05:46 Steven Schveighoffer wrote:
> Hm... proposal:
> 
> foreach(e; ref r)
> {
> }
> 
> equates to your desired code.  Would this help?

Or you could just do

for(; !r.empty; r.popFront())
{
auto e = r.front;
}

I really don't think that that's a big deal. I don't think that the language 
change would be worth having yet another thing in the language to remember, 
particularly when it's so easy to just use for to do the job.

- Jonathan M Davis


Re: Should range foreach be iterating over an implicit copy?

2012-05-17 Thread Steven Schveighoffer
On Wed, 16 May 2012 17:37:14 -0400, Nick Sabalausky  
 wrote:



A small debate has broken out over on D.learn (
http://forum.dlang.org/thread/jovicg$jta$1...@digitalmars.com#post-jovicg:24jta:241:40digitalmars.com  
)

that I thought I should move here.

Basically, the issue is this: Currently, when you have a struct-based  
range,

foreach will iterate over a *copy* of it:

Range r;
auto save = r;
foreach(e; r)
assert(r == save);
assert(r == save);

One side of the argument is that this behavior is correct and expected  
since

structs are value types, and iterating shouldn't consume the range.

My argument is this:

First of all, foreach is conceptually a flow-control statement, not a
function. So I'd intuitively expect:

Range r;
foreach(e; r) {...}

To work like this:

Range r;
for(; !r.empty; r.popFront)
{
auto e = r.front;
...
}


Hm... proposal:

foreach(e; ref r)
{
}

equates to your desired code.  Would this help?

-Steve


Re: Should range foreach be iterating over an implicit copy?

2012-05-17 Thread Dmitry Olshansky

On 17.05.2012 18:18, Steven Schveighoffer wrote:

On Thu, 17 May 2012 01:48:06 -0400, Nick Sabalausky
 wrote:


I'm a little discouraged that my concern about "input ranges can't save
their state, and yet that's exactly what happens implicitly" hasn't been
addressed. I was hoping to at least get a "That's not really a problem
and
here's why..."


input ranges can save their state if they are also forward ranges.



s/ save state / copy themselves
Fixed :)
That's why I tried to come up with BacktrackRange or some such. An input 
range that you can "shake" to reset to some saved point, but can't have 
two separate states at _the same time_ (nor full copy, it's like ... file)




--
Dmitry Olshansky


Re: Should range foreach be iterating over an implicit copy?

2012-05-17 Thread Steven Schveighoffer
On Thu, 17 May 2012 01:48:06 -0400, Nick Sabalausky  
 wrote:



I'm a little discouraged that my concern about "input ranges can't save
their state, and yet that's exactly what happens implicitly" hasn't been
addressed. I was hoping to at least get a "That's not really a problem  
and

here's why..."


input ranges can save their state if they are also forward ranges.



However, that said...

"Andrei Alexandrescu" wrote...

It is deliberate and the intent is that millions of programmers used to
foreach from other languages don't scream "where is my range???"


"Era Scarecrow" wrote...

Range can refer to many things, and I'm assuming array is one of them.
So... If the array is consumed when using foreach, that seems dumb  
right?

(An array in the end is just a small struct afterall...)


I admit, those are fair points.

With those in mind, I've given it some more thought, and here are my
current...umm...thoughts:


[snip]

This is not a problem with ranges, this is an issue with value types  
versus reference types.  I believe someone has created a byRef struct that  
wraps a range and iterates it byRef (maybe dsimcha?)


Almost all ranges except true input-only ranges (i.e. input ranges that  
are only input ranges) should be value types.


If we *didn't* do this, you would need heap data for each range.

-Steve


Re: Should range foreach be iterating over an implicit copy?

2012-05-17 Thread bearophile

Nick Sabalausky:

One side of the argument is that this behavior is correct and 
expected since structs are value types, and iterating

shouldn't consume the range.


In that D.learn thread I've shown that iterating on a fixed-size
array, that is a value, doesn't perform a copy of the array.



Imagine, for example, an input range that read from stdin.
Leaving that range unconsumed would make no sense at all.
Actually, any input range can be expected to *not* leave an 
uniterated copy behind: if it *could* have an uniterated

copy left behind, it would be a forward range, not an input
range.


Seems right.

Bye,
bearophile


Re: Should range foreach be iterating over an implicit copy?

2012-05-16 Thread Nick Sabalausky
I'm a little discouraged that my concern about "input ranges can't save 
their state, and yet that's exactly what happens implicitly" hasn't been 
addressed. I was hoping to at least get a "That's not really a problem and 
here's why..."

However, that said...

"Andrei Alexandrescu" wrote...
> It is deliberate and the intent is that millions of programmers used to 
> foreach from other languages don't scream "where is my range???"

"Era Scarecrow" wrote...
> Range can refer to many things, and I'm assuming array is one of them. 
> So... If the array is consumed when using foreach, that seems dumb right? 
> (An array in the end is just a small struct afterall...)

I admit, those are fair points.

With those in mind, I've given it some more thought, and here are my 
current...umm...thoughts:

- A range is not a collection, it's a *view* of a collection (or of 
something else). This is a necessary distinction because ranges and 
collections work in fundamentally different ways: A range is, *by necessity* 
consumed as it's iterated - that's what popFront *does*, that's 
fundamentally how ranges work. For many collections, OTOH, it makes *no* 
sense to consume the collection while iterating it. Thus, range != 
collection, it's a view of a collection.

- Ranges are D's answer to iterators. I don't think people used to iterators 
from other languages would expect their iterator to magically reset itself 
after being used. So I see no reason why they would expect a range (ie, an 
iterator-pair-with-benefits) to behave differently than that.

- D's arrays conflate the ideas of "collection" and "range", hence the odd 
edge case Era pointed out, and hence the "need" for foreach to automatically 
make a copy. But in my (not super-extensive) experience creating ranges, 
I've found that to be a problematic pattern (due to the fundamental 
distinction between a range and a collection), and learned to prefer making 
my iterable things *return* a range, rather than actually *be* ranges.

- Admittedly, it would be annoying if foreach had to be used like this on 
all collections: "foreach(e; myArray.rangeOf)", so I guess it would make 
sense for a range to automatically be obtained when foreach-ing over a 
collection. However, I'm still not 100% sold on the current method of doing 
that (making foreach automatically copy struct-based ranges), partly because 
of the questionable implications it has for input ranges, and partly because 
(for struct-ranges) it leaves no way to access the range that's actually 
being iterated.

- At the very least, perhaps input ranges just shouldn't be allowed to be 
structs? After all, structs are intended to be copied around, but input 
ranges, by definition, can't have their current state copied.




Re: Should range foreach be iterating over an implicit copy?

2012-05-16 Thread Xinok
On Wednesday, 16 May 2012 at 21:40:39 UTC, Andrei Alexandrescu 
wrote:

On 5/16/12 4:37 PM, Nick Sabalausky wrote:
One counter-argument that was raised is that TDPL has an 
example on page 381
that indicates foreach iterates over an implicit copy. I don't 
have a copy
handy ATM, so I can't look at it myself, but I'd love to see 
Andrei weigh in
on this: I'm curious if this example in TDPL made that copy 
deliberately, or

if the full implications of that copy were just an oversight.


It is deliberate and the intent is that millions of programmers 
used to foreach from other languages don't scream "where is my 
range???"


Andrei


I think this is the correct behavior as well. Otherwise, we'd 
have to take extra care when writing generic code to ensure it 
doesn't consume the range but doesn't attempt to call .save on 
non-range types.


Re: Should range foreach be iterating over an implicit copy?

2012-05-16 Thread Era Scarecrow

On Wednesday, 16 May 2012 at 21:37:54 UTC, Nick Sabalausky wrote:

A small debate has broken out over on D.learn (
http://forum.dlang.org/thread/jovicg$jta$1...@digitalmars.com#post-jovicg:24jta:241:40digitalmars.com 
)

that I thought I should move here.

Basically, the issue is this: Currently, when you have a 
struct-based range, foreach will iterate over a *copy* of it:


http://forum.dlang.org/post/dskrijudairkopeoy...@forum.dlang.org

I've replied in the other one, but I'll repost it here, perhaps 
andrei can correct me.


---

On Wednesday, 16 May 2012 at 20:50:55 UTC, Nick Sabalausky wrote:

I was initially open to the idea I may have just misunderstood 
something, but I'm becoming more and more convinced that the 
current "foreach over an implicit copy" behavior is indeed a 
bad idea.


I'd love to see Andrei weigh in on this. I'm curious if the 
example you pointed out in TDPL made the copy deliberately, or 
if the effects of that copy were just an oversight.


 I remember going over hard and long over that section. I think 
it's deliberate.


 Range can refer to many things, and I'm assuming array is one of 
them. So... If the array is consumed when using foreach, that 
seems dumb right? (An array in the end is just a small struct 
afterall...)


---
int[] x = [1,2,3];
foreach(i; x) {
 //stuff
}

 assert(x.length == 0, "Should have been consumed!");
---

 Structs being value types are by value and so are copied, not 
referenced. Although referencing a copy does seem a bit silly...


int[] x = [1,2,3];
foreach(ref i; x) {
 i = 10;
}

assert(x == [10,10,10], "But i changed them to 10!!");

--

 That means that foreach(ref; ref) if it's considered (And I 
think it makes sense to) would work for structs in these few 
cases. (Course you'd still consume dynamic arrays that way)


 In std.stream are classes, so they are consumed. I'm probably 
just rambling...


Re: Should range foreach be iterating over an implicit copy?

2012-05-16 Thread Andrei Alexandrescu

On 5/16/12 4:37 PM, Nick Sabalausky wrote:

One counter-argument that was raised is that TDPL has an example on page 381
that indicates foreach iterates over an implicit copy. I don't have a copy
handy ATM, so I can't look at it myself, but I'd love to see Andrei weigh in
on this: I'm curious if this example in TDPL made that copy deliberately, or
if the full implications of that copy were just an oversight.


It is deliberate and the intent is that millions of programmers used to 
foreach from other languages don't scream "where is my range???"


Andrei