DMD Symbol Reference Analysis Pass
Does DMD currently do any analysis of references to a symbol in a given scope? If not, where could this information be extracted (in which visitor/callback), and in what structure should it be stored?

Reason: After having read about Rust's data-flow (and in turn escape) analysis, I'm very curious about how difficult it would be to add more clever type inference of, for instance, mutability, based on this analysis.

Two cases come to mind:

A: Non-templated function: must be @safe (or perhaps @trusted) pure, and the parameter must be qualified as const (or in).

B: Templated function: usage of the parameter in the body must be non-mutating, meaning it never appears as the lhs of an assignment op (=, +=, ...), and calls to functions that take the parameter as an argument must transitively fulfill A and B.

I'm guessing

    Scope::insert(Dsymbol *s)
    {
        if (VarDeclaration *vd = s->isVarDeclaration())
        {
            // ..

is of interest. Is there another member function called every time a Dsymbol is referenced? I'm guessing MODFlags plays a role here as well.

I'm asking again because of the work recently done in DIP-25, which may be related to this problem.
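Case B can be roughly approximated today without any new compiler pass, by asking whether a template still instantiates when it is handed a const argument. The sketch below is only an illustration of that idea, not the analysis asked about above; `bump`, `doubled` and `acceptsConst` are made-up names.

```d
// Sketch only: approximates "the body never mutates its parameter" with
// facilities D already has. All names here are hypothetical.

// Mutates its parameter, so it cannot be instantiated with a const argument.
T bump(T)(ref T x) { x += 1; return x; }

// Only reads its parameter, so instantiating it with const(T) succeeds.
T doubled(T)(ref T x) { return x + x; }

// True if `fn` can be called with a const lvalue of type T.
enum bool acceptsConst(alias fn, T) =
    __traits(compiles, { const T v = T.init; auto r = fn(v); });

static assert(!acceptsConst!(bump, int));
static assert( acceptsConst!(doubled, int));

void main()
{
    int n = 20;
    assert(bump(n) == 21);    // fine on a mutable lvalue
    assert(doubled(n) == 42); // reads only
}
```

This checks one instantiation at a time and relies on the ordinary const type check, so it says nothing about where in the front end a per-symbol reference analysis would live.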
Re: DMD Symbol Reference Analysis Pass
On Monday, 25 May 2015 at 12:43:04 UTC, Per Nordlöw wrote:
> Does DMD currently do any analysis of references to a symbol in a given scope?

Sorry, I can't answer this, as I don't know enough about DMD's inner workings. But I noted down some ideas on this topic relating to my scope proposal.

Algorithm for scope inference:
http://wiki.dlang.org/User_talk:Schuetzm/scope2#Implementation

... to be used in templates and for enforcing these rules:
http://wiki.dlang.org/User:Schuetzm/scope3#.40safe-ty_violations_with_borrowing

Personally I don't think local inference of mutability (const) is of much help. However, a kind of "borrow checker" is necessary to avoid the safety problems that arise from borrowing (see the thread "RCArray is unsafe" [1]). This happens to involve very similar analysis to what you're thinking of. In contrast to Rust, it is relatively simple, because we don't support transfer of ownership (moving) as Rust does.

[1] http://forum.dlang.org/thread/huspgmeupgobjubts...@forum.dlang.org
Re: DMD Symbol Reference Analysis Pass
On Tuesday, 26 May 2015 at 10:19:52 UTC, Marc Schütz wrote:
> ... to be used in templates and for enforcing these rules:
> http://wiki.dlang.org/User:Schuetzm/scope3#.40safe-ty_violations_with_borrowing

There's at least a plan. Nice!

One thing, though. I'm missing a section in the document linked above on how `foreach` could be `scope`-enhanced so that an element reference of an aggregate doesn't escape its foreach scope:

    char[] saved_line;
    string saved_str;
    foreach (scope line; File("foo.txt").byLine)
    {
        saved_line = line;          // should give error
        saved_line = line.dup;      // should be ok
        saved_str = line.to!string; // should be ok
    }

provided that `byLine` returns a reference to a volatile internal buffer.
Re: DMD Symbol Reference Analysis Pass
On Tuesday, 26 May 2015 at 14:59:38 UTC, Per Nordlöw wrote:
>     char[] saved_line;
>     string saved_str;
>     foreach (scope line; File("foo.txt").byLine)
>     {
>         saved_line = line;          // should give error
>         saved_line = line.dup;      // should be ok
>         saved_str = line.to!string; // should be ok
>     }
>
> provided that `byLine` returns a reference to a volatile internal buffer.

Assuming you mean by "volatile" that the buffer is released upon destruction: The compiler is supposed to do that automatically, i.e. `scope` annotations on local variables are always inferred. In your example, it would figure out that you're assigning a reference to a value with shorter lifetime (i.e. the slice to the buffer) to a value with longer lifetime (saved_line), which it would disallow. (Btw, I don't think to!string is enough, because it is probably a no-op in this case: string -> string).

However, byLine has another problem, which boils down to the same cause as the problem with RCArray, namely that the content of the buffer is reused in each iteration. The reason is that the "owner" can be modified while references to it exist. For byLine, this is not a safety violation, but for RCArray it is. A solution applicable to both is to detect this and then either treat such a situation as @system, or make the owner `const` as long as the references are alive.
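The buffer-reuse problem can be reproduced without any file I/O. The following is a minimal sketch, assuming a hand-written range that hands out slices of one shared buffer in the way byLine is described above; `ReusingLines` and `firstLineTwice` are hypothetical names and are not part of Phobos.

```d
// Hypothetical stand-in for File.byLine: every element is a slice of one shared buffer.
struct ReusingLines
{
    private string[] pending; // lines still to be produced
    private char[] storage;   // heap buffer shared by all copies of the range
    private char[] current;   // slice of `storage` holding the current line

    this(string[] lines)
    {
        pending = lines;
        storage = new char[](64); // assumed large enough for this demo
        fill();
    }

    @property bool empty() const { return pending.length == 0; }
    @property char[] front() { return current; }

    void popFront()
    {
        pending = pending[1 .. $];
        fill();
    }

    private void fill()
    {
        if (pending.length == 0) return;
        current = storage[0 .. pending[0].length];
        current[] = pending[0][]; // overwrite the shared buffer in place
    }
}

// Returns the escaped slice and the copy, both taken during the first iteration.
auto firstLineTwice()
{
    import std.typecons : tuple;

    char[] savedLine;
    string savedStr;
    foreach (line; ReusingLines(["first", "other"]))
    {
        if (savedLine is null)
        {
            savedLine = line;     // escapes a slice of the reused buffer
            savedStr = line.idup; // independent immutable copy
        }
    }
    return tuple(savedLine, savedStr);
}

void main()
{
    auto r = firstLineTwice();
    assert(r[1] == "first"); // the copy still holds the first line
    assert(r[0] == "other"); // the escaped slice now shows the last line
}
```

Nothing here is memory-unsafe, which matches the remark above that for byLine the reuse only leads to surprising values; the same aliasing on a manually freed buffer would be a use after free.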
Re: DMD Symbol Reference Analysis Pass
On Tuesday, 26 May 2015 at 15:21:04 UTC, Marc Schütz wrote:
> Assuming you mean by "volatile" that the buffer is released upon destruction:

No, by "volatile" I mean that the buffer contents change with each iteration.

> The compiler is supposed to do that automatically, i.e. `scope` annotations on local variables are always inferred.

No, DMD cannot currently handle `scope` on foreach elements. It fails with

    Error: basic type expected, not scope

> In your example, it would figure out that you're assigning a reference to a value with shorter lifetime (i.e. the slice to the buffer) to a value with longer lifetime (saved_line), which it would disallow. (Btw, I don't think to!string is enough, because it is probably a no-op in this case: string -> string).

No, to!string is not a no-op in this case. It allocates, because it needs to create an immutable char array, that is: char[] -> string.

> However, byLine has another problem, which boils down to the same cause as the problem with RCArray, namely that the content of the buffer is reused in each iteration.

This is what I meant by volatile. Is there a better word for this?

> The reason is that the "owner" can be modified while references to it exist. For byLine, this is not a safety violation, but for RCArray it is. A solution applicable to both is to detect this and then either treat such a situation as @system, or make the owner `const` as long as the references are alive.

AFAIK: Allowing `scope` in foreach would solve this problem in my case.
Re: DMD Symbol Reference Analysis Pass
On Tuesday, 26 May 2015 at 21:22:38 UTC, Per Nordlöw wrote:
> No, DMD cannot currently handle scope on foreach elements. It errors as Error: basic type expected, not scope

Quite possible, I didn't test it. Anyway, my point was that it simply isn't necessary to ever mark a local variable as `scope`. The compiler sees all local variables and can figure things out by itself. It only needs help in function signatures, in the form of explicit `scope` and `return` annotations.

> No to!string is not a no-op in this case. It allocates but it needs to create an immutable char array that is: char[] -> string

I see, File.byLine returns a range of char[]; I thought it returned a range of strings. Then you're of course right.

> This is what I meant with volatile. Is there a better word for this?

I guess it's fine, and now I remember again that ranges with this property have been called "volatile ranges".

> AFAIK: Allowing scope in foreach would solve this problem in my case.

See above. Conceptually, you can of course treat it as if it were marked with `scope`, but an actual annotation should not be necessary.
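The difference between the two conversions discussed above can be checked directly. This is a small sketch using only `std.conv.to`; the variable names are arbitrary.

```d
import std.conv : to;

void main()
{
    string text = "line";
    char[] buffer = "line".dup;

    string fromString = text.to!string;   // same type: the argument is returned as-is
    string fromBuffer = buffer.to!string; // char[] -> string: needs a fresh immutable copy

    assert(fromString.ptr is text.ptr);    // no copy was made
    assert(fromBuffer.ptr !is buffer.ptr); // a copy was made

    buffer[0] = 'L';                       // mutate the original buffer
    assert(fromBuffer == "line");          // the converted string is unaffected
}
```

So for a range of char[] such as byLine, `line.to!string` does detach the saved value from the reused buffer, while for a range of string it would simply return the same slice.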
Re: DMD Symbol Reference Analysis Pass
On Wednesday, 27 May 2015 at 08:30:33 UTC, Marc Schütz wrote:
> See above. Conceptually, you can of course treat it as if it were marked with `scope`, but an actual annotation should not be necessary.

But now you're talking about an upcoming feature in DMD, right?

AFAIK, in current DMD, I can't get any help in avoiding patterns such as

    char[] saved_line;
    foreach (line; File("foo.txt").byLine)
    {
        saved_line = line; // should give error
    }

Right?

Are you saying that adding DMD support for qualifying `line` as `scope` is not the right way to solve this problem?
Re: DMD Symbol Reference Analysis Pass
On Wednesday, 27 May 2015 at 08:38:48 UTC, Per Nordlöw wrote:
> AFAIK, in current DMD, I can't get any help in avoiding patterns such as
>
>     char[] saved_line;
>     foreach (line; File("foo.txt").byLine)
>     {
>         saved_line = line; // should give error
>     }

If I understand you correctly, a new kind of qualifier for `line` may be motivated here. The semantic meaning of `scope` in D is not related to the volatile property. I guess the problem is somewhat related to reference counting and ownership, right?
Re: DMD Symbol Reference Analysis Pass
On Wednesday, 27 May 2015 at 08:38:48 UTC, Per Nordlöw wrote:
> But now you're talking about an upcoming feature in DMD, right?

Well, obviously, nothing of what we're talking about works with current DMD. Even `scope` doesn't do anything (except for delegates).

> AFAIK, in current DMD, I can't get any help in avoiding patterns such as
>
>     char[] saved_line;
>     foreach (line; File("foo.txt").byLine)
>     {
>         saved_line = line; // should give error
>     }
>
> Right?

Yes.

> Are you saying that adding DMD support for qualifying `line` as `scope` is not the right way to solve this problem?

Yes. First of all, `File.byLine.front` is the function that needs to get annotated, like this:

    char[] front() return
    {
        // ...
        return buffer;
    }

The `return` keyword here means the same thing as in DIP25, namely that the returned value is owned by `this`. In your example, the owner is the temporary returned by `File("foo.txt").byLine`, which means that the returned buffer must no longer be used when that temporary gets destroyed, i.e. the read strings must not escape the foreach (without being copied), which allows the buffer to be safely released then. This is the original problem that `scope` was meant to address.

Conceptually, you're right that now `line` needs to be annotated with `scope`, because otherwise you wouldn't be allowed to store the scoped slices there. But there really isn't any point in actually adding that annotation, because it can always be inferred by the compiler (it can add the annotation for you when it sees that you assign a scoped value to it).

This alone is then already enough to prevent the "volatility" problem in your example, because it is no longer possible for the individual lines to outlive the byLine() temporary that gets iterated over. However, it is not enough in the general case:

    auto lines = stdin.byLine;
    auto line1 = lines.front;
    lines.popFront();
    // line1 now changes
    auto line2 = lines.front;

In this case, the rule that `line1` must not outlive `lines` is fulfilled, but still it gets invalidated. With byLine(), this just leads to unexpected behaviour, but with e.g. reference counting, it can cause memory corruption (use after free). Therefore, any complete scope proposal needs to address this problem, too.

What I propose is the following: The compiler keeps track of outstanding "loans" to owned objects. As long as any such "loan" exists (in the above example, `line1` and `line2`), the owner (i.e. `lines`) will either become read-only (const), or alternatively, it will stay mutable, but mutating it will become @system. This effectively addresses both the safety problems as well as volatile ranges, because they are actually the same problem.

I hope it is now clear what I want to say. It is unfortunately a complicated topic...
Re: DMD Symbol Reference Analysis Pass
On Wednesday, 27 May 2015 at 08:43:07 UTC, Per Nordlöw wrote:
> If I understand you correctly, a new kind of qualifier for `line` may be motivated here. The semantic meaning of `scope` in D is not related to the volatile property. I guess the problem is somewhat related to reference counting and ownership, right?

See my other reply. Originally I thought so too, but it turns out they can't really be separated. It's basically an instance of Rust's restriction "exactly one mutable reference, or N immutable references, but not both at the same time".
Re: DMD Symbol Reference Analysis Pass
On Wednesday, 27 May 2015 at 09:54:33 UTC, Marc Schütz wrote:
> Yes. First of all, `File.byLine.front` is the function that needs to get annotated, like this:
>
>     char[] front() return
>     {
>         // ...
>         return buffer;
>     }
>
> The `return` keyword here means the same thing as in DIP25,

Is this supported in 2.067 with the -dip25 flag? If so, shouldn't we qualify `File.byLine.front` with `return` when DIP-25 becomes stable?
Re: DMD Symbol Reference Analysis Pass
On Wednesday, 27 May 2015 at 10:53:48 UTC, Per Nordlöw wrote:
> Is this supported in 2.067 with the -dip25 flag? If so, shouldn't we qualify `File.byLine.front` with `return` when DIP-25 becomes stable?

I qualified `front` with `return` here

https://github.com/nordlow/justd/blob/master/bylinefast.d#L84

but compiling this with DMD 2.067 along with the flag -dip25 doesn't complain about

https://github.com/nordlow/justd/blob/master/bylinefast.d#L188

Did you mean that this too is a planned feature?
Re: DMD Symbol Reference Analysis Pass
On Wednesday, 27 May 2015 at 11:02:24 UTC, Per Nordlöw wrote:
> I qualified `front` with `return` here https://github.com/nordlow/justd/blob/master/bylinefast.d#L84 but compiling this with DMD 2.067 along with flag -dip25 doesn't complain about https://github.com/nordlow/justd/blob/master/bylinefast.d#L188
>
> Did you mean that this too is a planned feature?

In general, `return` is supposed to work (with -dip25), but only in combination with `ref`, not slices or pointers. DMD probably ignores it here instead of printing an error message.
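For comparison, this is the `ref` shape that DIP25 does cover. It is a minimal sketch, assuming a compiler that accepts DIP25's `return ref` (with -dip25 on the compilers discussed here); `identity` and `Counter` are illustrative names only.

```d
// DIP25 shape: the result may refer to `x`, so it must not outlive `x`.
ref int identity(return ref int x) @safe
{
    return x;
}

struct Counter
{
    private int count;

    // `return` on a member function applies to the implicit `this` reference.
    ref int value() return @safe { return count; }
}

// Under DIP25 a function like the following is expected to be rejected,
// because it would return a reference to a dead local:
//
//     ref int escape() @safe { int local; return identity(local); }

void main() @safe
{
    int local = 1;
    identity(local) += 1; // fine: the reference is used while `local` is alive
    assert(local == 2);

    Counter c;
    c.value() = 5;        // writes through the returned reference
    assert(c.value == 5);
}
```

A `char[] front() return` returns a slice by value rather than a `ref`, which is why, as described above, the annotation has no checked meaning there under DIP25 alone.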
Re: DMD Symbol Reference Analysis Pass
On Wednesday, 27 May 2015 at 14:13:03 UTC, Marc Schütz wrote:
> In general, `return` is supposed to work (with -dip25), but only in combination with `ref`, not slices or pointers. DMD probably ignores it here instead of printing an error message.

So, *should* it error for slices or not?
Re: DMD Symbol Reference Analysis Pass
On Wednesday, 27 May 2015 at 11:02:24 UTC, Per Nordlöw wrote:
> I qualified `front` with `return` here https://github.com/nordlow/justd/blob/master/bylinefast.d#L84 but compiling this with DMD 2.067 along with flag -dip25 doesn't complain about https://github.com/nordlow/justd/blob/master/bylinefast.d#L188
>
> Did you mean that this too is a planned feature?

I might be wrong, but I thought dip25 was only enabled in @safe annotated code?
Re: DMD Symbol Reference Analysis Pass
On Wednesday, 27 May 2015 at 15:21:38 UTC, weaselcat wrote:
> I might be wrong, but I thought dip25 was only enabled in @safe annotated code?

Does this mean that I have to @safe-qualify `ByLineFast.front()`, or the function iterating over it, or both? Or does it suffice to @trusted-qualify `ByLineFast.front()` and @safe-qualify the function doing the iteration?
Re: DMD Symbol Reference Analysis Pass
On Wednesday, 27 May 2015 at 15:21:38 UTC, weaselcat wrote:
> I might be wrong, but I thought dip25 was only enabled in @safe annotated code?

I updated bylinefast at

https://github.com/nordlow/justd/blob/master/bylinefast.d#L188

to make the unittest @safe and members of @trusted. DMD (2.067 and git master) with -dip25 still doesn't complain about

https://github.com/nordlow/justd/blob/master/bylinefast.d#L198

nor

https://github.com/nordlow/justd/blob/master/bylinefast.d#L203

...
Re: DMD Symbol Reference Analysis Pass
On Thursday, 28 May 2015 at 11:38:25 UTC, Per Nordlöw wrote:
> DMD (2.067 and git master) with -dip25 still doesn't complain about ...

My guess is that this passes because the internal storage is GC-allocated. I'm sensing we need a new qualifier for this or that there is more logic to come in DMD regarding extensions to DIP-25...
Re: DMD Symbol Reference Analysis Pass
On 5/28/15 5:38 AM, Per Nordlöw wrote:
> I updated bylinefast at https://github.com/nordlow/justd/blob/master/bylinefast.d#L188

How much faster is bylinefast compared to byLine (after the recent improvements)? -- Andrei
Re: DMD Symbol Reference Analysis Pass
> How much faster is bylinefast compared to byLine (after the recent improvements)? -- Andrei

About 3 times in my measurements.
Re: DMD Symbol Reference Analysis Pass
On 5/28/15 2:05 PM, Per Nordlöw wrote:
> About 3 times in my measurements.

Cool! What are the incompatibilities keeping it from replacing byLine? -- Andrei
Re: DMD Symbol Reference Analysis Pass
On Thursday, 28 May 2015 at 20:54:59 UTC, Andrei Alexandrescu wrote:
> Cool! What are the incompatibilities keeping it from replacing byLine? -- Andrei

The speed-up varies between 2.0 and 2.7 according to recent experiments done using the new unittest at

https://github.com/nordlow/justd/blob/79cc8bf0766282368f05314d00566e7d234988bd/bylinefast.d#L207

which is currently deactivated.

It has worked flawlessly in my applications, so none AFAIK.

Note that I'm not the original author, though, so credit should go to someone else. I've only made some tweaks regarding indentation, symbol naming, @safe, @trusted, and changing the separator type from dchar to string, and probably some more I've forgotten about.

BTW, Andrei, there's a new lazy range PR for Phobos on GitHub awaiting review... ;)
Re: DMD Symbol Reference Analysis Pass
On Thursday, 28 May 2015 at 21:23:59 UTC, Per Nordlöw wrote:
> Speed-up varies between 2.0 and 2.7 according to recent experiments done using new unittest at

BTW: I'm sitting on a very recently bought (fast) laptop with a fast SSD.
Re: DMD Symbol Reference Analysis Pass
On Thursday, 28 May 2015 at 21:27:06 UTC, Per Nordlöw wrote:
> Speed-up varies between 2.0 and 2.7 according to recent experiments done using new unittest at

The test file

http://downloads.dbpedia.org/3.9/en/instance_types_en.nt.bz2

contains 15.9 Mlines :)

/Per
Re: DMD Symbol Reference Analysis Pass
On Thursday, 28 May 2015 at 21:23:59 UTC, Per Nordlöw wrote:
> https://github.com/nordlow/justd/blob/79cc8bf0766282368f05314d00566e7d234988bd/bylinefast.d#L207
>
> which is currently deactivated. It has worked flawlessly in my applications, so none AFAIK.

Could this replace the stuck https://github.com/D-Programming-Language/phobos/pull/2794?
Re: DMD Symbol Reference Analysis Pass
On Friday, 29 May 2015 at 08:48:59 UTC, Martin Nowak wrote:
> https://github.com/D-Programming-Language/phobos/pull/2794?

That's a massive discussion. Is it possible to describe in shorter terms what the problem is and how it relates to byLine? Please.
Re: DMD Symbol Reference Analysis Pass
On Friday, 29 May 2015 at 09:17:17 UTC, Per Nordlöw wrote:
> That's a massive discussion. Is it possible to describe in shorter terms what the problem is and how it relates to byLine?

Would the problem be solved if `byLine` was changed to not use `readln()`?
Re: DMD Symbol Reference Analysis Pass
On 5/28/15 2:23 PM, Per Nordlöw wrote:
> BTW, Andrei, there's a new lazy range PR for Phobos on GitHub awaiting review... ;)

Destroyed. -- Andrei
Re: DMD Symbol Reference Analysis Pass
On Monday, 1 June 2015 at 03:49:39 UTC, Andrei Alexandrescu wrote:
> Destroyed. -- Andrei

Thx