Re: called copy constructor in foreach with ref on Range

2020-06-23 Thread Steven Schveighoffer via Digitalmars-d-learn

On 6/23/20 9:47 AM, Sebastiaan Koppe wrote:

On Tuesday, 23 June 2020 at 07:30:29 UTC, Stanislav Blinov wrote:

On Tuesday, 23 June 2020 at 05:24:37 UTC, H. S. Teoh wrote:
I'm also wondering what's the motivation behind supporting 
non-copyable ranges, and whether it's worth the effort and inevitable 
complications to support it if it's a rare use-case.
 Do you have any compelling use-case in mind, that might make this 
worthwhile?


This compelling use case is staring at us from comments in this very 
thread.


I am a bit on the fence about having input ranges be non-copyable. 
Initially I liked the idea very much; I am a big fan of 'unique' semantics.


But just because most people don't expect hot/destructive reads, doesn't 
mean we should impose such a hard restriction on input ranges. There are 
use cases for popping from an input range from different places. An 
obvious one would be parsing data from the network, where you pass the 
input range into helper functions that parse sub-sections of the data.


You can pass a reference if you need to, there's nothing wrong with that.


They can be made to work for sure, but it doesn't make it cleaner.

On the other hand, most of the time, non-copyable is exactly what you 
need...


Most of the time algorithms return rvalue ranges anyway, so move is not 
necessary. I think it would actually affect calls very little, and where 
it does affect them, you may find it actually fixes issues that you 
aren't aware of.


-Steve


Re: called copy constructor in foreach with ref on Range

2020-06-23 Thread Paul Backus via Digitalmars-d-learn

On Tuesday, 23 June 2020 at 03:52:23 UTC, Jonathan M Davis wrote:


It is extremely common to wrap ranges in other ranges (and in 
fact, you basically have to in order to have lazy ranges). That 
really doesn't work very well - if at all - if you can't copy 
the range. It might be possible with a bunch of explicit calls 
to move, but that would result in range-based code in general 
being @system without a clean way to use @trusted


`move` isn't @system. Perhaps you are thinking of `moveEmplace`?




Re: called copy constructor in foreach with ref on Range

2020-06-23 Thread Sebastiaan Koppe via Digitalmars-d-learn

On Tuesday, 23 June 2020 at 07:30:29 UTC, Stanislav Blinov wrote:

On Tuesday, 23 June 2020 at 05:24:37 UTC, H. S. Teoh wrote:
I'm also wondering what's the motivation behind supporting 
non-copyable ranges, and whether it's worth the effort and 
inevitable complications to support it if it's a rare use-case.
 Do you have any compelling use-case in mind, that might make 
this worthwhile?


This compelling use case is staring at us from comments in this 
very thread.


I am a bit on the fence about having input ranges be 
non-copyable. Initially I liked the idea very much; I am a big 
fan of 'unique' semantics.


But just because most people don't expect hot/destructive reads, 
doesn't mean we should impose such a hard restriction on input 
ranges. There are use cases for popping from an input range from 
different places. An obvious one would be parsing data from the 
network, where you pass the input range into helper functions 
that parse sub-sections of the data.


They can be made to work for sure, but it doesn't make it cleaner.

On the other hand, most of the time, non-copyable is exactly what 
you need...


Re: called copy constructor in foreach with ref on Range

2020-06-23 Thread Stanislav Blinov via Digitalmars-d-learn

On Tuesday, 23 June 2020 at 05:24:37 UTC, H. S. Teoh wrote:
On Tue, Jun 23, 2020 at 03:25:55AM +, Stanislav Blinov via 
Digitalmars-d-learn wrote:
On Tuesday, 23 June 2020 at 02:41:55 UTC, Jonathan M Davis 
wrote:
> We'd need some major redesigning to make uncopyable ranges 
> work, and personally, I don't think that it's worth the 
> trouble.


Of course we would. Andrei is entertaining changing the whole 
input range API. Though he, like you, seems opposed to the 
idea of uncopyables.

[...]

I'm also wondering what's the motivation behind supporting 
non-copyable ranges, and whether it's worth the effort and 
inevitable complications to support it if it's a rare use-case.
 Do you have any compelling use-case in mind, that might make 
this worthwhile?


This compelling use case is staring at us from comments in this 
very thread. Like Jonathan said before - after you copied, you 
should only use the copy and not the original. Input range is a 
one-off iterable. You either consume it yourself, or give it away 
to someone else to consume. You can't do anything useful with it 
besides consuming it. But you can't give it away if it's copyable 
- you'd be giving someone a reference to mutable state which you 
also keep. Which brings about those very problems already talked 
about in this thread.  So if you "only should use the copy", well 
then duh, let's make it official and force you to write a move 
then (turning attempts to copy into compile errors)! Assuming, of 
course, that we do establish that .init must be an empty range. 
You'd be left with a valid state, callee would be given a valid 
state, neither of you would stomp on each other.


ByLine, just to become copyable, is made refcounted. It allocates 
its guts on the heap and molests an integer, all because it wants 
to remain copyable. But it is a call to "move" that is 
complication?


Or, perhaps, a complication would be partial consumption? E.g. 
chaining a take + map combo, but still wanting to consume 
remainder. That is solved by making wrapping ranges propagate the 
`source`, naturally, by moving it. Pretty straightforward.


Re: called copy constructor in foreach with ref on Range

2020-06-23 Thread Stanislav Blinov via Digitalmars-d-learn

On Tuesday, 23 June 2020 at 03:52:23 UTC, Jonathan M Davis wrote:
On Monday, June 22, 2020 9:25:55 PM MDT Stanislav Blinov via 
Digitalmars-d- learn wrote:
On Tuesday, 23 June 2020 at 02:41:55 UTC, Jonathan M Davis 
wrote:
> As things stand, uncopyable ranges aren't really a thing, 
> and common range idiomns rely on ranges being copyable.


Which idioms are those? I mean, genuine idioms, not design 
flaws like e.g. references.


It is extremely common to wrap ranges in other ranges (and in 
fact, you basically have to in order to have lazy ranges). That 
really doesn't work very well - if at all - if you can't copy 
the range. It might be possible with a bunch of explicit calls 
to move...


Where would "a bunch of explicit calls" come from???

See example implementations of map and filter here:
https://gist.github.com/radcapricorn/d76d29c6df6fa822d7889e799937f39d

If we disallow references as ranges, then any function that takes 
by value *owns that value*, and shouldn't needlessly copy it, 
only move. Yes, even when (in fact, especially when) that value 
is a mere wrapper over a pointer. Only when it would have to 
legitimately create a copy (i.e. fork the range) would it not 
move. At which point, if the range is non-copyable, it would be a 
compile error. Which would be a *good thing*.
That is what all generic code that works with values *should* be 
doing: moving, moving, moving. Copy ctors should only be called 
when *necessary*, not "meh, whenever".


Conceptually:

// assume OtherRange is a struct, which is what ranges should be
Range wrapRange(OtherRange)(OtherRange other)
{
// ...possibly pop some elements from `other`
return Range(other);
}

That's what Phobos is doing now. What's going on there? It makes 
a copy of `other` and then destructs `other`. There's no other 
use of `other`. So, it can simply (no, it *should*) just move 
`other`. Moving it would leave `other` in .init state, which is 
destructible, and construct Range with a valid value, without 
calling any copy ctors.
Walter's "moving, copying and forwarding" proposal, if it ever 
comes to fruition, would even let us drop the use of library 
moves in such cases.


...but that would result in range-based code in general being 
@system without a clean way to use @trusted, since


??? `move` infers.

whether it's really safe to mark such moves with @trusted would 
depend on the specific range implementation, which then is a 
big problem with generic code.


A one-argument `move` basically never should be inferred @system! 
(that is, if implementation isn't buggy, which it would seem it 
currently is in druntime...) Two-argument `move` might be 
@system, if the destructor is @system. Which means the range is 
@system anyway.


Regardless of whether it's actually possible to make it work 
though, it's a complete mess in comparison to simply copying.


And the fact that chaining range-based calls is extremely 
common makes the problem that much worse.


You mean this?

auto a = range.filter!foo.map!bar.array;

It's not a problem in any way. If `range` is non-copyable:

auto a = range.move.filter!foo.map!bar.array;

...which, if we do uphold the ".init is empty range" will *never* 
leave you with an invalid `range`, unlike copies that do 
who-knows-what.


The way that ranges are routinely passed around and wrapped 
works as well as it does, because ranges are copyable.


Now I see what you mean by "idioms". Copying things that 
shouldn't be copied :) Which is the bane of all Phobos.


The semantics of copying a variable or object vary wildly 
depending on the type regardless of whether we're talking about 
ranges. Copying a pointer or reference is still a copy even if 
it isn't a deep copy. Copying a range _always_ results in a 
copy, just like it does with any other type. It just doesn't 
necessarily result in a copy which can be independently 
iterated.


:) Which means:
- input ranges shouldn't be copyable -> no wild semantics 
anymore, statically enforced
- forward ranges should migrate save() into copy ctors -> 
guarantee a copy does what it should
- of course, .init must be an empty range, and all of this is 
moot so long as references are allowed to be ranges (all of this 
is one big "what if")


I feel like a broken record. Maybe I just should take that gist 
and make a more practical implementation out of it, as a demo 
(no, not with Andrei's `fetchNext`).




Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread H. S. Teoh via Digitalmars-d-learn
On Tue, Jun 23, 2020 at 03:25:55AM +, Stanislav Blinov via 
Digitalmars-d-learn wrote:
> On Tuesday, 23 June 2020 at 02:41:55 UTC, Jonathan M Davis wrote:
> 
> > As things stand, uncopyable ranges aren't really a thing, and common
> > range idiomns rely on ranges being copyable.
> 
> Which idioms are those? I mean, genuine idioms, not design flaws like
> e.g.  references.
> 
> > We'd need some major redesigning to make uncopyable ranges work, and
> > personally, I don't think that it's worth the trouble.
> 
> Of course we would. Andrei is entertaining changing the whole input
> range API. Though he, like you, seems opposed to the idea of
> uncopyables.
[...]

I'm also wondering what's the motivation behind supporting non-copyable
ranges, and whether it's worth the effort and inevitable complications
to support it if it's a rare use-case.  Do you have any compelling
use-case in mind, that might make this worthwhile?


T

-- 
Любишь кататься - люби и саночки возить. 


Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread Jonathan M Davis via Digitalmars-d-learn
On Monday, June 22, 2020 9:25:55 PM MDT Stanislav Blinov via Digitalmars-d-
learn wrote:
> On Tuesday, 23 June 2020 at 02:41:55 UTC, Jonathan M Davis wrote:
> > As things stand, uncopyable ranges aren't really a thing, and
> > common range idiomns rely on ranges being copyable.
>
> Which idioms are those? I mean, genuine idioms, not design flaws
> like e.g. references.

It is extremely common to wrap ranges in other ranges (and in fact, you
basically have to in order to have lazy ranges). That really doesn't work
very well - if at all - if you can't copy the range. It might be possible
with a bunch of explicit calls to move, but that would result in range-based
code in general being @system without a clean way to use @trusted, since
whether it's really safe to mark such moves with @trusted would depend on
the specific range implementation, which then is a big problem with generic
code. Regardless of whether it's actually possible to make it work though,
it's a complete mess in comparison to simply copying. And the fact that
chaining range-based calls is extremely common makes the problem that much
worse.

The way that ranges are routinely passed around and wrapped works as well as
it does, because ranges are copyable.

> > As things stand, it is _not_ true that it's safe to copy
> > forward ranges and then use the original. Sure, it will work
> > for some ranges, but for others it won't. The entire reason
> > that save exists is for ranges that are classes, because
> > copying them does not result in an independent range. The range
> > API does not require that copying a range result in an
> > independent copy. It's not even guaranteed that copying a range
> > that's a struct will result in an indpendent copy. Depending on
> > the range type, copying it could result in an independent copy,
> > or it could result in a reference to to the original range, or
> > it could even result in part of the state being copied and part
> > of it being a reference (e.g. if the current front is stored
> > directly in the range, but the iteration state is stored by
> > reference).
>
> If copying isn't making a copy, it's not copying.

The semantics of copying a variable or object vary wildly depending on the
type regardless of whether we're talking about ranges. Copying a pointer or
reference is still a copy even if it isn't a deep copy. Copying a range
_always_ results in a copy, just like it does with any other type. It just
doesn't necessarily result in a copy which can be independently iterated.

- Jonathan M Davis





Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread Stanislav Blinov via Digitalmars-d-learn

On Tuesday, 23 June 2020 at 02:41:55 UTC, Jonathan M Davis wrote:

As things stand, uncopyable ranges aren't really a thing, and 
common range idiomns rely on ranges being copyable.


Which idioms are those? I mean, genuine idioms, not design flaws 
like e.g. references.


We'd need some major redesigning to make uncopyable ranges 
work, and personally, I don't think that it's worth the trouble.


Of course we would. Andrei is entertaining changing the whole 
input range API. Though he, like you, seems opposed to the idea 
of uncopyables.


The range API has no requirement that the init value of a range 
be empty, and any generic code which relies on such behavior is 
buggy. In the case of classes, it would outright crash, because 
the init value would be null.


Yeah, that's a shame.

I agree that ideally the range API would require that the init 
state of a range be a valid, empty range, but that's simply not 
how it works right now. In order to make it work that way, we'd 
have to redesign the range API in at least some respects (e.g. 
getting rid of save and making it illegal for classes to be 
forward ranges).


Better yet - making it illegal for classes to be ranges.

As things stand, it is _not_ true that it's safe to copy 
forward ranges and then use the original. Sure, it will work 
for some ranges, but for others it won't. The entire reason 
that save exists is for ranges that are classes, because 
copying them does not result in an independent range. The range 
API does not require that copying a range result in an 
independent copy. It's not even guaranteed that copying a range 
that's a struct will result in an indpendent copy. Depending on 
the range type, copying it could result in an independent copy, 
or it could result in a reference to to the original range, or 
it could even result in part of the state being copied and part 
of it being a reference (e.g. if the current front is stored 
directly in the range, but the iteration state is stored by 
reference).


If copying isn't making a copy, it's not copying.

If you rely on copying a range resulting in an independent 
copy, you will have buggy code - even if it's a forward range. 
The _only_ way that the range API specifies that you can get an 
independent copy of a range is to use save, and generic code 
should never rely on any other mechanism for that.


That's exactly what I said. "Copying of forward ranges is 
absolutely fine. It's what the current `save()` primitive is 
supposed to do."
...Except, of course, that that shouldn't be the case. *Copying* 
should be creating a copy. :)


Now, ideally, we'd get rid of save and require that copying a 
forward range result in an independent copy (which would then 
require that a class be wrapped in a struct to be a range)


Yes!


but that's simply not how the current range API works.


No, it's not :(


Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread H. S. Teoh via Digitalmars-d-learn
On Mon, Jun 22, 2020 at 08:53:22PM -0600, Jonathan M Davis via 
Digitalmars-d-learn wrote:
[...]
> Exactly. It's because of issues like this that generic, range-based
> functions need to be tested with a variety of range types - including
> reference types. Without that, bugs are routinely missed.
[...]
> If we do actually rework the range API as has occasionally been
> discussed, it would be _really_ nice if we could more thoroughly
> restrict the copying semantics of ranges so that some of these
> problems go away, but without such a redesign, we're stuck with such
> problems.

Redesign or not, I think the new range API should be much more specific
in specifying exactly what behaviours ranges ought to have.  The current
API is under-specified, with two of the main problem points that I can
remember being:

1) Transient-ness: is the value return by .front guaranteed to remain
   valid, or does .popFront invalidate it?  This is not specified in the
   current spec, and we have ranges in Phobos of either type (e.g., iota
   and .byLine), with a lot of code just blindly assuming non-transience
   only to discover that things break down when handed a .byLine
   instance.

2) The exact semantics of copying (assigning) a range. What parts of a
   range's state is/isn't preserved when you assign a range to a local
   variable, for example?  What happens if you pass a range to a
   function and then continue to operate on the original?  All of this
   must be specified precisely in the range API so that we don't have
   one range acting one way and another range acting another way,
   thereby compromising the semantics of generic code.


T

-- 
Democracy: The triumph of popularity over principle. -- C.Bond


Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread Jonathan M Davis via Digitalmars-d-learn
On Monday, June 22, 2020 3:33:08 PM MDT H. S. Teoh via Digitalmars-d-learn 
wrote:
> On Mon, Jun 22, 2020 at 09:11:07PM +, Stanislav Blinov via Digitalmars-
d-learn wrote:
> > That is not true. Copying of forward ranges is absolutely fine. It's
> > what the current `save()` primitive is supposed to do. It's the
> > copying of input ranges should just be rejected, statically.
>
> Jonathan is coming from the POV of generic code.  The problem with move
> leaving the original range in its .init state isn't so much that it will
> crash or anything (even though as you said that does indicate a flaw in
> the range's implementation), but that the semantics of generic code
> changes in subtle ways. For example:
>
>   auto myGenericFunc(R)(R r) {
>   ...
>   foreach (x; r) {
>   doSomething(x);
>   }
>   if (!r.empty)
>   doSomethingElse(r);
>   ...
>   }
>
> Suppose for argument's sake that the above foreach/if structure is an
> essential part of whatever algorithm myGenericFunc is implementing. Now
> there's a problem, because if R has array-like semantics, then the
> algorithm will do one thing, but if R has reference-like or move
> semantics, then the behaviour of the algorithm will be different, even
> if both ranges represent the same sequence of input values.
>
> Note that in real-life code, this problem can be far more subtle than a
> blatant foreach loop and if statement like the above. For example,
> consider a function that drops the first n elements of a range. Your
> generic function might want to pop the first n elements then do
> something else with the rest of the range.  Well, if you write it the
> obvious way:
>
>   auto myAlgo(R)(R r) {
>   size_t n = ...;
>   dropFirstN(r, n);
>   ... // do something else with r
>   }
>
> then you have a subtle bug, because the state of r after the call to
> dropFirstN might be completely different depending on whether r behaves
> like an array or like a by-reference or move type.

Exactly. It's because of issues like this that generic, range-based
functions need to be tested with a variety of range types - including
reference types. Without that, bugs are routinely missed. I think that a
fairly large percentage of Phobos (maybe even all of it) gets that right
now, because we've added tests for reference type ranges, but there used to
be quite a few bugs in Phobos, because there were no such tests. It's
frequently the case that people write range-based code under the assumption
that ranges will act like dynamic arrays, and their code then also works
with many ranges which have similar copying behavior, but it doesn't work
with all range types. Such problems don't generally get caught without
extensive testing. It matters a lot less for code within a program that
doesn't use a large variety of range types, but for libraries using generic
functions, it's a must.

If we do actually rework the range API as has occasionally been discussed,
it would be _really_ nice if we could more thoroughly restrict the copying
semantics of ranges so that some of these problems go away, but without such
a redesign, we're stuck with such problems. And even with a redesign, the
best fix for it is far from obvious, because basic input ranges and forward
ranges inherently have different copying semantics. We can probably require
that copying forward ranges always results in an independent copy (thereby
essentially having value semantics), but basic input ranges can't really be
value types, because if they could, they could be forward ranges, meaning
that they're generally going to either be reference types or
pseudo-reference types, which of course have different copying semantics.
So, as long as the range API is set up so that the same code can operate on
both basic input and forward ranges, we pretty much inherently have a
problem with the copying semantics in ranges being inconsistent - though
requiring value semantics for forward ranges would still be a significant
improvement.

- Jonathan M Davis





Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread Jonathan M Davis via Digitalmars-d-learn
On Monday, June 22, 2020 3:11:07 PM MDT Stanislav Blinov via Digitalmars-d-
learn wrote:
> On Monday, 22 June 2020 at 20:51:37 UTC, Jonathan M Davis wrote:
> > You're unlikely to find much range-based code that does that
> > and there really isn't much point in doing that. Again, copying
> > isn't the problem. It's using the original after making the
> > copy that's the problem.
>
> Copy *is* the problem. If you can't make a copy (i.e. you get a
> compilation error) - you don't have a problem, the compiler just
> found a bug for you. Sadly, historically D's libraries were built
> around lots of redundant copying, even though there are very few
> cases where copies are actually required. The reason why we're
> "unlikely to find much range-based code that does that [move]" is
> (a) sloppy or shortsighted design and (b) poor language support.
> The latter hopefully stands to change.

As things stand, uncopyable ranges aren't really a thing, and common range
idiomns rely on ranges being copyable. We'd need some major redesigning to
make uncopyable ranges work, and personally, I don't think that it's worth
the trouble.

> > And moving doesn't fix anything, since the original variable is
> > still there
> > (just in its init state, which would still be invalid to use in
> > generic code and could outright crash in some cases if you
> > tried to use it - e.g. if it were a class reference, since it
> > would then be null).
>
> Eh? A range in 'init' state should be an empty range. If you get
> a crash from that then there's a bug in that range's
> implementation, not in user code.

The range API has no requirement that the init value of a range be empty,
and any generic code which relies on such behavior is buggy. In the case of
classes, it would outright crash, because the init value would be null. I
agree that ideally the range API would require that the init state of a
range be a valid, empty range, but that's simply not how it works right now.
In order to make it work that way, we'd have to redesign the range API in at
least some respects (e.g. getting rid of save and making it illegal for
classes to be forward ranges).

> > So, code that does a move could accidentally use the original
> > range after the move and have bugs just like code that copies
> > the range has bugs if the original is used after the copy has
> > been made. So, the rule of thumb is not that you should avoid
> > copying ranges. It's that once you've copied a range, you
> > should then use only the copy and not the original.
>
> That is not true. Copying of forward ranges is absolutely fine.
> It's what the current `save()` primitive is supposed to do. It's
> the copying of input ranges should just be rejected, statically.

As things stand, it is _not_ true that it's safe to copy forward ranges and
then use the original. Sure, it will work for some ranges, but for others it
won't. The entire reason that save exists is for ranges that are classes,
because copying them does not result in an independent range. The range API
does not require that copying a range result in an independent copy. It's
not even guaranteed that copying a range that's a struct will result in an
indpendent copy. Depending on the range type, copying it could result in an
independent copy, or it could result in a reference to to the original
range, or it could even result in part of the state being copied and part of
it being a reference (e.g. if the current front is stored directly in the
range, but the iteration state is stored by reference). If you rely on
copying a range resulting in an independent copy, you will have buggy code -
even if it's a forward range. The _only_ way that the range API specifies
that you can get an independent copy of a range is to use save, and generic
code should never rely on any other mechanism for that.

Now, ideally, we'd get rid of save and require that copying a forward range
result in an independent copy (which would then require that a class be
wrapped in a struct to be a range), but that's simply not how the current
range API works. Because the current range API does not make any guarantees
about the semantics of copying a range, generic code cannot assume that
those semantics are the same as save. As such, if you want an independent
copy of a forward range in generic code, you must use save, or the code will
be buggy. Similarly, generic code cannot use the original range after it's
been copied (unless it simply never does anything with the copy), because
mutating the copy may or may not mutate the original, and the original may
or may not even be in a valid state if the copy is mutated.

With non-generic code, you can rely on the behaviors of specific ranges and
what will happen when you copy them (e.g. not bothering to call save when
passing strings around), but with generic code, that's not true. And there's
plenty of D code out there that works correctly with a specific range type
but which would fail miserably if it were used 

Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread Jonathan M Davis via Digitalmars-d-learn
On Monday, June 22, 2020 3:38:02 PM MDT Paul Backus via Digitalmars-d-learn 
wrote:
> On Monday, 22 June 2020 at 21:33:08 UTC, H. S. Teoh wrote:
> > Jonathan is coming from the POV of generic code.  The problem
> > with move leaving the original range in its .init state isn't
> > so much that it will crash or anything (even though as you said
> > that does indicate a flaw in the range's implementation), but
> > that the semantics of generic code changes in subtle ways. For
> > example:
> >
> > [...]
>
> Seems to me like adding some unit tests with non-copyable input
> ranges to Phobos could expose a number of latent bugs.

At this point, non-copyable ranges really aren't a thing. foreach does not
support them and neither does Phobos. isInputRange doesn't actually reject
them, but they don't really work in practice, and it's unlikely that you're
going to find much in Phobos that happens to work with them. isForwardRange
does outright reject them though.

- Jonathan M Davis





Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread Stanislav Blinov via Digitalmars-d-learn

On Monday, 22 June 2020 at 21:33:08 UTC, H. S. Teoh wrote:

Don't be shocked when you find out how many Phobos ranges have 
.init states that are invalid (e.g., non-empty, but .front and 
.popFront will crash / return invalid values).


Which ones?

Jonathan is coming from the POV of generic code.  The problem 
with move leaving the original range in its .init state isn't 
so much that it will crash or anything (even though as you said 
that does indicate a flaw in the range's implementation), but 
that the semantics of generic code changes in subtle ways.


Well that means that the code is not generic, i.e. the bug 
originates in the design, not implementation.



auto myGenericFunc(R)(R r) {
...
foreach (x; r) {
doSomething(x);
}
if (!r.empty)
doSomethingElse(r);
...
}

Suppose for argument's sake that the above foreach/if structure 
is an essential part of whatever algorithm myGenericFunc is 
implementing. Now there's a problem, because if R has 
array-like semantics, then the algorithm will do one thing, but 
if R has reference-like or move semantics, then the behaviour 
of the algorithm will be different, even if both ranges 
represent the same sequence of input values.


Yep, definitely not generic. Which is exactly the kind of error 
that should be caught at compile time. Which we, sadly, can't do 
with the current range API.



consider a function that drops the first n elements of a range.
Your generic function might want to pop the first n elements 
then do something else with the rest of the range.  Well, if 
you write it the obvious way:


auto myAlgo(R)(R r) {
size_t n = ...;
dropFirstN(r, n);
... // do something else with r
}

then you have a subtle bug, because the state of r after the 
call to dropFirstN might be completely different depending on 
whether r behaves like an array or like a by-reference or move 
type.


Err... that one is actually fine. It would take the range by ref 
and pop off the elements. There already is such a function - the 
popFrontN. It's the functions that take ranges by value that 
present the not-so-subtle issue with reference types. For 
example, its chainable would-be-counterpart drop(). "Would-be" 
because that one takes the argument by value.
We should move toward disallowing reference types to be ranges. A 
good deal of the rest can be solved with API and design changes 
(like disallowing copying of input ranges).


It's kind of interesting how, with the ongoing discussion about 
range API in the general forum, a couple of range questions are 
brought up in learn. Something, as they say, is in the air.


Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread Paul Backus via Digitalmars-d-learn

On Monday, 22 June 2020 at 21:33:08 UTC, H. S. Teoh wrote:


Jonathan is coming from the POV of generic code.  The problem 
with move leaving the original range in its .init state isn't 
so much that it will crash or anything (even though as you said 
that does indicate a flaw in the range's implementation), but 
that the semantics of generic code changes in subtle ways. For 
example:


[...]


Seems to me like adding some unit tests with non-copyable input 
ranges to Phobos could expose a number of latent bugs.


Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread Jonathan M Davis via Digitalmars-d-learn
On Monday, June 22, 2020 3:10:28 PM MDT kinke via Digitalmars-d-learn wrote:
> On Monday, 22 June 2020 at 20:51:37 UTC, Jonathan M Davis wrote:
> > [...]
>
> That's why I put the struct in parantheses. Moving a class ref
> makes hardly any sense, but I've also never written a *class* to
> represent a range. Moving is the no-brainer solution for
> transferring ownership of struct ranges and invalidating the
> original instance.

Invalidating the instance doesn't actually prevent it from being misused
though. At best, the fact that you moved the instance rather than copying it
makes it more likely that accidentally using the instance will cause more
extreme bugs or crash. The core issue that the original is potentially
invalid but can still be used exists whether you copy the range or move it.

Also, since the issue here is generic code, you have to take classes into
account and cannot assume that the range is a struct. True, it's usually a
bad idea to use classes for ranges, and ideally, we'd alter the range API so
that classes weren't valid ranges, but the reality of the matter is that
they are, and generic code has to take that into account.

- Jonathan M Davis





Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread H. S. Teoh via Digitalmars-d-learn
On Mon, Jun 22, 2020 at 09:11:07PM +, Stanislav Blinov via 
Digitalmars-d-learn wrote:
> On Monday, 22 June 2020 at 20:51:37 UTC, Jonathan M Davis wrote:
[...]
> > And moving doesn't fix anything, since the original variable is
> > still there (just in its init state, which would still be invalid to
> > use in generic code and could outright crash in some cases if you
> > tried to use it - e.g. if it were a class reference, since it would
> > then be null).
> 
> Eh? A range in 'init' state should be an empty range. If you get a
> crash from that then there's a bug in that range's implementation, not
> in user code.

Don't be shocked when you find out how many Phobos ranges have .init
states that are invalid (e.g., non-empty, but .front and .popFront will
crash / return invalid values).


> > So, code that does a move could accidentally use the original range
> > after the move and have bugs just like code that copies the range
> > has bugs if the original is used after the copy has been made. So,
> > the rule of thumb is not that you should avoid copying ranges. It's
> > that once you've copied a range, you should then use only the copy
> > and not the original.
> 
> That is not true. Copying of forward ranges is absolutely fine. It's
> what the current `save()` primitive is supposed to do. It's the
> copying of input ranges should just be rejected, statically.

Jonathan is coming from the POV of generic code.  The problem with move
leaving the original range in its .init state isn't so much that it will
crash or anything (even though as you said that does indicate a flaw in
the range's implementation), but that the semantics of generic code
changes in subtle ways. For example:

auto myGenericFunc(R)(R r) {
...
foreach (x; r) {
doSomething(x);
}
if (!r.empty)
doSomethingElse(r);
...
}

Suppose for argument's sake that the above foreach/if structure is an
essential part of whatever algorithm myGenericFunc is implementing. Now
there's a problem, because if R has array-like semantics, then the
algorithm will do one thing, but if R has reference-like or move
semantics, then the behaviour of the algorithm will be different, even
if both ranges represent the same sequence of input values.

Note that in real-life code, this problem can be far more subtle than a
blatant foreach loop and if statement like the above. For example,
consider a function that drops the first n elements of a range. Your
generic function might want to pop the first n elements then do
something else with the rest of the range.  Well, if you write it the
obvious way:

auto myAlgo(R)(R r) {
size_t n = ...;
dropFirstN(r, n);
... // do something else with r
}

then you have a subtle bug, because the state of r after the call to
dropFirstN might be completely different depending on whether r behaves
like an array or like a by-reference or move type.


T

-- 
Truth, Sir, is a cow which will give [skeptics] no more milk, and so they are 
gone to milk the bull. -- Sam. Johnson


Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread Stanislav Blinov via Digitalmars-d-learn

On Monday, 22 June 2020 at 20:51:37 UTC, Jonathan M Davis wrote:

You're unlikely to find much range-based code that does that 
and there really isn't much point in doing that. Again, copying 
isn't the problem. It's using the original after making the 
copy that's the problem.


Copy *is* the problem. If you can't make a copy (i.e. you get a 
compilation error) - you don't have a problem, the compiler just 
found a bug for you. Sadly, historically D's libraries were built 
around lots of redundant copying, even though there are very few 
cases where copies are actually required. The reason why we're 
"unlikely to find much range-based code that does that [move]" is 
(a) sloppy or shortsighted design and (b) poor language support. 
The latter hopefully stands to change.


And moving doesn't fix anything, since the original variable is 
still there
(just in its init state, which would still be invalid to use in 
generic code and could outright crash in some cases if you 
tried to use it - e.g. if it were a class reference, since it 
would then be null).


Eh? A range in 'init' state should be an empty range. If you get 
a crash from that then there's a bug in that range's 
implementation, not in user code.


So, code that does a move could accidentally use the original 
range after the move and have bugs just like code that copies 
the range has bugs if the original is used after the copy has 
been made. So, the rule of thumb is not that you should avoid 
copying ranges. It's that once you've copied a range, you 
should then use only the copy and not the original.


That is not true. Copying of forward ranges is absolutely fine. 
It's what the current `save()` primitive is supposed to do. It's 
the copying of input ranges should just be rejected, statically.


Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread kinke via Digitalmars-d-learn

On Monday, 22 June 2020 at 20:51:37 UTC, Jonathan M Davis wrote:

[...]


That's why I put the struct in parantheses. Moving a class ref 
makes hardly any sense, but I've also never written a *class* to 
represent a range. Moving is the no-brainer solution for 
transferring ownership of struct ranges and invalidating the 
original instance.


Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread Jonathan M Davis via Digitalmars-d-learn
On Monday, June 22, 2020 1:41:34 PM MDT kinke via Digitalmars-d-learn wrote:
> On Monday, 22 June 2020 at 19:03:44 UTC, Jonathan M Davis wrote:
> > in practice, that means that generic code cannot use a range
> > once it's been copied
>
> Translating to a simple rule-of-thumb: never copy a (struct)
> range, always move.

You're unlikely to find much range-based code that does that and there
really isn't much point in doing that. Again, copying isn't the problem.
It's using the original after making the copy that's the problem. And moving
doesn't fix anything, since the original variable is still there (just in
its init state, which would still be invalid to use in generic code and
could outright crash in some cases if you tried to use it - e.g. if it were
a class reference, since it would then be null). So, code that does a move
could accidentally use the original range after the move and have bugs just
like code that copies the range has bugs if the original is used after the
copy has been made. So, the rule of thumb is not that you should avoid
copying ranges. It's that once you've copied a range, you should then use
only the copy and not the original.

- Jonathan M Davis





Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread kinke via Digitalmars-d-learn

On Monday, 22 June 2020 at 19:03:44 UTC, Jonathan M Davis wrote:
in practice, that means that generic code cannot use a range 
once it's been copied


Translating to a simple rule-of-thumb: never copy a (struct) 
range, always move.


Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread Jonathan M Davis via Digitalmars-d-learn
On Monday, June 22, 2020 10:59:45 AM MDT kinke via Digitalmars-d-learn wrote:
> If copying a range is considered to be generally unsafe and a
> common pitfall (vs. the save() abomination), maybe range-foreach
> shouldn't allow any lvalue ranges in the 1st place, thus not
> making any copies and forcing the user to specify some rvalue (as
> returned by `range.save()`, or `move(range)` if destructive
> iteration is indeed intended).

Copying ranges isn't a problem. Almost all range-based code copies ranges
quite a bit (e.g. almost all range-based functions take a range by value,
and chaining range-based function calls wouldn't work if it didn't). Rather,
it's using the original after a copy has been made that's a problem (at
least in generic code), because the semantics of copying aren't part of the
range API and can vary wildly depending on the type.

Really, the issues with copying were not properly taken into account when
the range API was created, and mistakes were made. If we were to rework the
range API at this point, we would probably get rid of save and require that
copying be equivalent to save for forward ranges (which would then make it
illegal for classes to be forward ranges without wrapping them in a struct).
That would fix the problem for forward ranges, but basic input ranges can't
have those semantics, or they could be forward ranges, so exactly what the
correct solution would be is debatable, and if generic code operates on both
basic input ranges and forward ranges, the copying semantics won't always be
the same, which is the problem we have now. So, if we were to rework the
range API, it's one of the open problems that needs to be sorted out.

Regardless, as things stand, because copying a range can have reference
semantics, value semantics, or something in between, in practice, that means
that generic code cannot use a range once it's been copied, because the
semantics will vary from type to type. The copy can be used (and most
range-based code relies on that), but the original needs to be left alone.
Non-generic code has more leeway, because it can rely on the behavior of
specific range types, but with generic code, you have to be careful. And
foreach is just one of the places where the issue of not using the original
after making a copy comes up.

- Jonathan M Davis





Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread kinke via Digitalmars-d-learn

On Monday, 22 June 2020 at 16:25:11 UTC, Jonathan M Davis wrote:
The reason that foreach copies the range is simply due to how 
the code is lowered. e.g.


foreach(e; range)
{
}

essentially becomes

for(auto r = range; !r.empty; r.popFront())
{
auto e = r.front;
}

And the fact that a copy is made is likely simply a result of 
it mimicking what happens with arrays.


I was trying to explain/guess the rationale for that copy (not in 
terms of how it's implemented, but why it's implemented that 
way). This 'mimicking-an-array' doesn't make any sense to me. If 
the original idea wasn't to make sure the range is reusable 
afterwards, I guess it's done for implementation simplicity, to 
promote an rvalue range to the required lvalue.


If copying a range is considered to be generally unsafe and a 
common pitfall (vs. the save() abomination), maybe range-foreach 
shouldn't allow any lvalue ranges in the 1st place, thus not 
making any copies and forcing the user to specify some rvalue (as 
returned by `range.save()`, or `move(range)` if destructive 
iteration is indeed intended).


Re: called copy constructor in foreach with ref on Range

2020-06-22 Thread Jonathan M Davis via Digitalmars-d-learn
On Sunday, June 21, 2020 2:25:37 PM MDT kinke via Digitalmars-d-learn wrote:
> A foreach over a custom range makes an initial copy, so that the
> original range is still usable after the foreach (not empty).

No, it's not so that the range will be useable afterwards. In fact, for
generic code, you must _never_ use a range again after passing it to
foreach, because the copying semantics of ranges are not specified, and you
can get radically different behavior depending on what the copying semantics
of a range are (e.g. if it's a class, then it's just copying the reference).
In general, you should never use a range after it's been copied unless you
know exactly what type of range you're dealing with and what its copying
behavior is. If you want an independent copy, you need to use save.

The reason that foreach copies the range is simply due to how the code is
lowered. e.g.

foreach(e; range)
{
}

essentially becomes

for(auto r = range; !r.empty; r.popFront())
{
auto e = r.front;
}

And the fact that a copy is made is likely simply a result of it mimicking
what happens with arrays. Either way, you should never be doing something
like

foreach(e; range)
{
}

auto v = range.front;

in generic code. It needs to be

foreach(e; range.save)
{
}

auto v = range.front;

instead.

- Jonathan M Davis





Re: called copy constructor in foreach with ref on Range

2020-06-21 Thread kinke via Digitalmars-d-learn
A foreach over a custom range makes an initial copy, so that the 
original range is still usable after the foreach (not empty).


called copy constructor in foreach with ref on Range

2020-06-21 Thread Rogni via Digitalmars-d-learn

https://pastebin.com/BYcRN8T2

why called copy constructor for ref Range struct type in foreach ?

```
struct MyRange {
private int current_row = 0;
private int rows_count = 5;
@disable this(this); // <===
bool empty () { return current_row == rows_count; }
int front() { return current_row*current_row; }
void popFront() { ++current_row; }
}

struct Container {
@disable this(this); // <===
int opApply(int delegate(int) dg) {
foreach (int row; 0..5) dg(row*row);
return 1;
}
}

void foreach_iterable(Type)(ref Type iterable) {
import std.stdio: writeln;
foreach (int row_value; iterable) { // test_range.d(20): 
Error: struct test_range.MyRange is not copyable because it is 
annotated with @disable

row_value.writeln;
}
}

void call_iterable(Type)() {
Type iterable;
foreach_iterable(iterable);
}

void main() {
call_iterable!Container(); // work
call_iterable!MyRange(); // does`t work
}
```

ldc2 version: version   1.8.0 (DMD v2.078.3, LLVM 5.0.1)