Re: Proposal for design of 'scope' (Was: Re: Opportunities for D)

via Digitalmars-d Sat, 12 Jul 2014 11:06:06 -0700

On Friday, 11 July 2014 at 21:04:05 UTC, H. S. Teoh via Digitalmars-d wrote:

On Thu, Jul 10, 2014 at 08:10:36PM +0000, via Digitalmars-d wrote:
Hmm. Seems that you're addressing a somewhat wider scope than what I had in mind. I was thinking mainly of 'scope' as "does not escape the body of this block", but you're talking about a more general case of being able to specify explicit lifetimes.

Indeed, but it includes what you're suggesting. For most use cases, just `scope` without an explicit lifetime annotation is fully sufficient.

[...]
A problem that has been discussed in a few places is safely returning a slice or a reference to an input parameter. This can be solved nicely:

    scope!haystack(string) findSubstring(
        scope string haystack,
        scope string needle
    );
Inside `findSubstring`, the compiler can make sure that no references to `haystack` or `needle` can escape (an unqualified `scope` can be used here, no need to specify an "owner"), but it will allow returning a slice from it, because the signature says: "The return value will not live longer than the parameter `haystack`."
This does seem to be quite a compelling argument for explicit scopes. It does make it more complex to implement, though.


[...]
An interesting application is the old `byLine` problem, where the function keeps an internal buffer which is reused for every line that is read, but a slice into it is returned. When a user naively stores these slices in an array, she will find that all of them have the same content, because they point to the same buffer. See how this is avoided with `scope!(const ...)`:
This seems to be something else now. I'll have to think about this a bit more, but my preliminary thought is that this adds yet another level of complexity to 'scope', which is not necessarily a bad thing, but we might want to start out with something simpler first.

It's definitely an extension and not as urgently necessary, although it fits well into the general topic of borrowing: `scope` by itself provides mutable borrowing, but `scope!(const ...)` provides const borrowing, in the sense that another object temporarily takes ownership of the value, so that the original owner can only read the object until it is "returned" by the borrowed value going out of scope. I mentioned it here because it seemed to be an easy extension that could solve an interesting long-standing problem for which we only have workarounds today (`byLineCopy` IIRC).
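
To make the `byLine` problem concrete, here is a minimal sketch in today's D. `LineReader`, `next`, `collectNaive` and `collectCopied` are hypothetical names invented for this illustration, not Phobos API; the sketch only shows that every returned slice aliases the same buffer, which is the situation `scope!(const ...)` is meant to reject at compile time.

```d
// Hypothetical stand-in for a reader that reuses one buffer (not Phobos API).
struct LineReader
{
    string[] lines;   // pretend file contents
    char[] buffer;    // reused for every line

    // Returns a slice into `buffer`, valid only until the next call.
    char[] next()
    {
        auto line = lines[0];
        lines = lines[1 .. $];
        if (buffer.length < line.length)
            buffer.length = line.length;
        buffer[0 .. line.length] = line[];
        return buffer[0 .. line.length];
    }
}

// Naive: stores the slices themselves, so they all alias `buffer`.
char[][] collectNaive(string[] input)
{
    auto reader = LineReader(input);
    char[][] stored;
    foreach (i; 0 .. input.length)
        stored ~= reader.next();
    return stored;
}

// Today's workaround: copy each line before storing it.
char[][] collectCopied(string[] input)
{
    auto reader = LineReader(input);
    char[][] stored;
    foreach (i; 0 .. input.length)
        stored ~= reader.next().dup;
    return stored;
}

void main()
{
    auto naive = collectNaive(["abc", "xyz"]);
    assert(naive[0] == "xyz" && naive[1] == "xyz");   // both alias the buffer

    auto copied = collectCopied(["abc", "xyz"]);
    assert(copied[0] == "abc" && copied[1] == "xyz"); // independent copies
}
```

The two input lines have the same length on purpose, so the buffer is never reallocated and both naive slices end up reading the last line. The `.dup` variant corresponds to the copy-based workaround mentioned above.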

And I have to add that it's not completely thought out yet. For example, might it make sense to have `scope!(immutable ...)`, `scope!(shared ...)`, and if yes, what would they mean...

[...]
An open question is whether there needs to be an explicit designation of GC'd values (for example by `scope!static` or `scope!GC`), to say that a given value lives as long as it's needed (or "forever").
Shouldn't unqualified values already serve this purpose?

Likely yes. It might however be useful to contemplate, especially with regards to allocators.

[...]
Now, for the problems:
Obviously, there is quite a bit of complexity involved. I can imagine that inferring the scope for templates (which is essential, just as for const and the other type modifiers) can be complicated.
I'm thinking of aiming for a design where the compiler can infer all lifetimes automatically, and the user doesn't have to. I'm not sure if this is possible, but based on what Walter said, it would be best if we infer as much as possible, since users are lazy and are unlikely to be thrilled at the idea of having to write additional annotations on their types.

I agree. It's already getting ugly with `const pure nothrow @safe @nogc`, adding another annotation should not be done lightheartedly. However, if the compiler could infer all the lifetimes (which I'm quite sure isn't possible, see the haystack-needle example), I don't see why we'd need `scope` at all. It would at most be a way not to break backward compatibility, but that would be another case where you could say that D has it backwards, like un-@safe by default...

My original proposal was aimed at this, that's why I didn't put in explicit lifetimes. I was hoping to find a way to define things such that the lifetime is unambiguous from the context in which 'scope' is used, so that users don't ever have to write anything more than that. This also makes the compiler's life easier, since we don't have to keep track of who owns what, and can just compute the lifetime from the surrounding context. This may require sacrificing some precision in lifetimes, but if it helps simplify things while still giving adequate functionality, I think it's a good compromise.

I agree it looks a bit intimidating at first glance, but as far as I can tell it should be relatively straightforward to implement. I'll explain how I think it could be done:

The obvious things: The parser needs to recognize the new syntax, and scope needs to be turned into a type modifier and stored in the internal data structures accordingly.

It is then possible to define a hierarchy of lifetimes. At the top are global and static variables and the GC heap (`scope!static` or just unannotated), then come the function parameters, then local variables in function bodies, and finally local variables in lower scopes like `if` blocks. This is purely based on lexical scope and order of declaration (local variables are destroyed in inverse order of construction, for example); it can be derived from the AST. Furthermore, it is a strict hierarchy; lifetimes higher in the hierarchy are strict supersets of lower lifetimes.

A variable's effective lifetime is then its place in this hierarchy, or the lifetime of its owner if one is specified.

Once that's done, the semantic phase needs to be extended to check for scope correctness. This seems complicated, but actually needs to touch only a few places. Any time a scope value is copied (by assignment, returning from a function, passing to a function, throwing, and whatever else I may have missed), the compiler needs to check that the destination's effective lifetime is not wider than that of the source.
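
The check can be modelled in a few lines of ordinary D. This is only a toy model of one possible reading of the rule: the `Lifetime` enum, its four ranks and the function `mayCopy` are made-up names for illustration, and a real implementation would presumably work on the scopes in the AST rather than on a fixed enum.

```d
// Toy model: a larger value means a wider (longer) lifetime.
// Names and ranks are assumptions for illustration only.
enum Lifetime
{
    innerBlock,   // locals in nested scopes such as `if` blocks
    local,        // locals in the function body
    parameter,    // function parameters
    global        // globals, statics, GC heap (`scope!static`)
}

// A scope value may be copied only if the destination's effective
// lifetime is not wider than that of the source.
bool mayCopy(Lifetime source, Lifetime destination)
{
    return destination <= source;
}

void main()
{
    // Borrowing a parameter into a local: allowed.
    assert(mayCopy(Lifetime.parameter, Lifetime.local));
    // Storing a local into a global: rejected.
    assert(!mayCopy(Lifetime.local, Lifetime.global));
    // Same lifetime on both sides: allowed.
    assert(mayCopy(Lifetime.local, Lifetime.local));
}
```

The comparison is purely local to the copy site, which matches the claim that no analysis across statements or functions is required.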

For function calls, an additional step is necessary, but it isn't really complicated either. Let's take `findSubstring` as an example:


    scope!haystack(string) findSubstring(
        scope string haystack,
        scope string needle
    );

    void foo() {
        string[$] h = "Hello, world!";
        auto found = findSubstring(h, ", ");
        // `typeof(found)` is now `scope!h`
    }

As owners in function signatures may refer to other parameters (or `this`), the compiler needs to match up these parameters with what is passed in, and substitute them accordingly for type deduction (only for `auto` return values).

And that's it, AFAICS. Notice that none of this requires control flow analysis or inter-procedural things, it can all be decided locally at the place of assignment/calling/etc.

[...]
I also have a few ideas about owned types and move semantics, but this is mostly independent from borrowing (although, of course, it integrates nicely with it). So, that's it, for now. Sorry for the long text. Thoughts?
It seems that you're addressing the full borrowed reference/pointer problem, which is something necessary. But I was thinking more in terms of the baseline functionality -- what is the simplest design for 'scope' that still gives useful semantics that covers most of the cases? I know there are some tricky corner cases, but I'm wondering if we can somehow find an easy solution for the easy parts (presumably the more common parts), while still allowing for a way to deal with the hard parts.
At least for now, I'm thinking in the direction of finding something with simple semantics that, at the same time, produces complex (interesting) effects when composed, that we can use to solve the borrowed pointer problem.

I already wrote this in a reply to Walter. I believe in some cases we can allow automatic borrowing without any annotation at all, not even bare `scope`. The most obvious examples are pure functions with signatures that guarantee that nothing can be escaped from them:

    void foo(int[] p) pure;         // obvious, function has no opportunity
                                    // to keep a reference to `p`
    int bar(int[] p) pure;          // returns an `int` but that's a value
                                    // type, and that's ok
    int[] baz(const(int)[] p) pure; // the return type is not `const` and thus
                                    // cannot come from `p`
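
As a rough check of the reasoning, the three signatures can be given bodies and compiled with a current D compiler. The bodies below are invented for illustration; only the signatures come from the text above. The `static assert` states the property the `baz` argument relies on, namely that `const(int)[]` does not implicitly convert to `int[]`.

```d
// Bodies are invented for illustration; only the signatures come from the text.

// Cannot stash `p` anywhere: a pure function has no access to mutable globals.
void foo(int[] p) pure
{
    foreach (ref x; p)
        x *= 2;
}

// Returns a value type, so nothing of `p` survives the call.
int bar(int[] p) pure
{
    int sum = 0;
    foreach (x; p)
        sum += x;
    return sum;
}

// `const(int)[]` does not convert to `int[]`, so the result must be new memory.
int[] baz(const(int)[] p) pure
{
    return p.dup;
}

// The conversion the `baz` argument depends on is indeed rejected.
static assert(!is(const(int)[] : int[]));

void main()
{
    int[] a = [1, 2, 3];
    foo(a);
    assert(a == [2, 4, 6]);
    assert(bar(a) == 12);
    auto b = baz(a);
    assert(b == a && b.ptr !is a.ptr);   // equal contents, separate memory
}
```

This only covers well-behaved code: an explicit cast inside `baz` would defeat the argument, so the rule presumably needs `@safe` as well.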

Maybe there are some cases with non-pure functions, too. But on the other hand, I also think that in the end we won't get around introducing explicit annotations, because the above rules can never cover enough cases to disregard the remaining ones.

Anyway, I don't believe that explicit annotations will be needed often enough to turn the users away. It will be mostly library writers who have to use them, and Phobos can set a good example there and work out a good style, just as it has done for other matters.

It also helps to take a glance at Rust's standard library, to see how frequent or infrequent lifetime annotations will be. They keep popping up here and there, but they are not littered all over the source code. They're frequent enough to confirm my suspicion that they cannot be disregarded, but they're also infrequent enough not to be an annoyance. (I only looked at a few modules, though.)
