Re: cycle dependencies

2018-06-02 Thread Simen Kjærås via Digitalmars-d

On Saturday, 2 June 2018 at 17:17:02 UTC, Neia Neutuladh wrote:
On Friday, 1 June 2018 at 17:59:21 UTC, Steven Schveighoffer 
wrote:
The .di file is just an interface, it doesn't know what's 
actually compiled in the binary.


To put it another way, the compiler only generates a 
ModuleInfo (or dependency modules) for .d files. .di files are 
simply a public API for the .d files.


Yes, this is my point. (Communication is much harder than I 
thought.)


When you encounter a .di file, you can't rely on an automated 
tool to tell you what modules need to be initialized or figure 
out an order for them. You have to do it manually, and if you 
mess it up, you get undefined behavior.


I believe Steve's point was that with the suggested json file 
describing dependencies, that would be a part of the public 
interface, and so would be distributed alongside .di files. Of 
course, someone could forget to distribute those, and in such a 
case the runtime cycle test could be extended to do the 
conservative test if the compiler hasn't registered all modules 
as having json info.


--
  Simen


Re: cycle dependencies

2018-06-02 Thread Neia Neutuladh via Digitalmars-d
On Friday, 1 June 2018 at 17:59:21 UTC, Steven Schveighoffer 
wrote:
The .di file is just an interface, it doesn't know what's 
actually compiled in the binary.


To put it another way, the compiler only generates a ModuleInfo 
(or dependency modules) for .d files. .di files are simply a 
public API for the .d files.


Yes, this is my point. (Communication is much harder than I 
thought.)


When you encounter a .di file, you can't rely on an automated 
tool to tell you what modules need to be initialized or figure 
out an order for them. You have to do it manually, and if you 
mess it up, you get undefined behavior.


This is why I called it "here be dragons" -- it's fraught.

Unless your goal was to omit the depth-first search in the common 
case while preserving the rest of the current logic. I'm curious 
how much time that would save.


Re: cycle dependencies

2018-06-01 Thread Steven Schveighoffer via Digitalmars-d

On 6/1/18 11:42 AM, Neia Neutuladh wrote:

On Friday, 1 June 2018 at 12:50:31 UTC, Steven Schveighoffer wrote:
A fly in the ointment is .di files. This works today and does the 
right thing:


This actually is not a problem, because the dependency tree doesn't 
depend on whether the imported module has static ctors. It's how this 
works today -- each module only knows whether the modules has static 
ctors, and what imports it has. It doesn't go further than that.


Thanks to .di files, when I compile my executable, I don't always know 
which modules import which other modules. That was half the point of my 
example.


The .di file is just an interface, it doesn't know what's actually 
compiled in the binary.


To put it another way, the compiler only generates a ModuleInfo (or 
dependency modules) for .d files. .di files are simply a public API for 
the .d files.


Are you talking about giving the initialization order for a subset of 
modules? Because that wouldn't give much of a speed increase. Or do you 
need all of your dependencies to have their own dependency order, and 
the compiler can concatenate them?


Speed is one aspect, but the more important aspect is to do more 
fine-grained dependency tracking. Doing it on a function level vs. 
simply assuming all things in the module depend on all things in the 
dependent modules, allowing for instance module cycles where there is no 
real cycle between the data initialized. That's not going to work well 
with today's mechanisms because it would be much more expensive both in 
runtime and memory usage, and even further highlight the drawback that 
we are solving the same static problem on every run of a program.


As an alternative, you could produce the initialization order by running 
the program and seeing what order it uses.


Yes, but this still involves pushing all the dependency data into the 
binary. All we should care about at runtime is the order to execute 
ctors. We could even eliminate the needless storage of all the 
dependency modules that is there now.


-Steve


Re: cycle dependencies

2018-06-01 Thread Neia Neutuladh via Digitalmars-d
On Friday, 1 June 2018 at 12:50:31 UTC, Steven Schveighoffer 
wrote:
A fly in the ointment is .di files. This works today and does 
the right thing:


This actually is not a problem, because the dependency tree 
doesn't depend on whether the imported module has static ctors. 
It's how this works today -- each module only knows whether the 
modules has static ctors, and what imports it has. It doesn't 
go further than that.


Thanks to .di files, when I compile my executable, I don't always 
know which modules import which other modules. That was half the 
point of my example.


Are you talking about giving the initialization order for a 
subset of modules? Because that wouldn't give much of a speed 
increase. Or do you need all of your dependencies to have their 
own dependency order, and the compiler can concatenate them?


As an alternative, you could produce the initialization order by 
running the program and seeing what order it uses.


Re: cycle dependencies

2018-06-01 Thread Steven Schveighoffer via Digitalmars-d

On 5/31/18 8:49 PM, Neia Neutuladh wrote:

On Thursday, 31 May 2018 at 23:17:20 UTC, Steven Schveighoffer wrote:
Hm... I just had a crazy idea: what if D had a switch that allowed it 
to store a dependency graph of constructors into a json file, and then 
when you link, there is a wrapper which consumes all these files, runs 
the cycle detection and sort, and then compiles a perfectly sorted 
moduleinfo section to be included in the binary (obviously, written in 
D and compiled by the compiler).


This is how languages get custom object formats and custom linkers.


Yep. I know that this is something that we need to be wary of. The 
proposed file format, however, would not be something like an object or 
even a linker, just a text file that hints in how to link these together 
without having to do the sort.




A fly in the ointment is .di files. This works today and does the right 
thing:


This actually is not a problem, because the dependency tree doesn't 
depend on whether the imported module has static ctors. It's how this 
works today -- each module only knows whether the modules has static 
ctors, and what imports it has. It doesn't go further than that.


The dependency file would keep this same mechanism, we wouldn't be doing 
more interpretation.


If the dependency data were inserted into the object files instead 
(which it should be? that's what ModuleInfo is), then the compiler could 
potentially read this out before linking, and we could be sure that the 
resulting order is correct. Like, insert a weak symbol into each normal 
object file saying there's no dependency graph, and then the compiler 
can run on a set of object files to produce a new object file with the 
complete dependency graph as a strong symbol.


It's an interesting idea, the only benefit really being to avoid having 
separate files for the data. I don't know enough about object files to 
be able to speak coherently about it.


I have a much better feeling, however, about having a more 
readable/understandable file format than object files. It's definitely a 
lot easier to deal with something like a JSON file than an object file.



But that's more work.


There's nothing in this proposal that is less than a lot of work :) So 
putting a bit more on top is OK, if it makes things more useable.


This isn't something to flesh out buried in a sub-thread anyway. I'm 
going to put together a more complete description and proposal. Not a 
DIP yet, because I want to gauge interest first.


-Steve


Re: cycle dependencies

2018-05-31 Thread Neia Neutuladh via Digitalmars-d
On Thursday, 31 May 2018 at 23:17:20 UTC, Steven Schveighoffer 
wrote:
Hm... I just had a crazy idea: what if D had a switch that 
allowed it to store a dependency graph of constructors into a 
json file, and then when you link, there is a wrapper which 
consumes all these files, runs the cycle detection and sort, 
and then compiles a perfectly sorted moduleinfo section to be 
included in the binary (obviously, written in D and compiled by 
the compiler).


This is how languages get custom object formats and custom 
linkers.


A fly in the ointment is .di files. This works today and does the 
right thing:


// bar.d
import modulewithstaticctor;
shared static this() {}
string something = modulewithstaticctor.someValue;

// bar.di
// no static ctor, no imports
string something;

// foo.d
import bar, std.stdiod;
void main()
{
  writefln(something);
}

So mixing that dependency graph with .di files is "here be 
dragons" territory.


If the dependency data were inserted into the object files 
instead (which it should be? that's what ModuleInfo is), then the 
compiler could potentially read this out before linking, and we 
could be sure that the resulting order is correct. Like, insert a 
weak symbol into each normal object file saying there's no 
dependency graph, and then the compiler can run on a set of 
object files to produce a new object file with the complete 
dependency graph as a strong symbol.


But that's more work.


Re: cycle dependencies

2018-05-31 Thread Steven Schveighoffer via Digitalmars-d

On 5/31/18 5:12 PM, David Bennett wrote:

On Thursday, 31 May 2018 at 15:15:44 UTC, Steven Schveighoffer wrote:

On 5/31/18 2:14 AM, Simen Kjærås wrote:

On Wednesday, 30 May 2018 at 20:57:32 UTC, DigitalDesigns wrote:
Why is this a runtime issue? It is not as if the execution of static 
this are non-deterministic. The compiler and linker must order the 
calls in some way.


Because of separate compilation, the compiler can't do it. Because 
the generic linker doesn't do that sort of thing, the linker can't do 
it.


The first part is essentially intractable - e.g. module A's static 
this uses a global variable in module B that's set by module C. 
Module A may be compiled separately from module C, so the compiler 
can't see the dependency.


Yes, this is really the crux of it.

I want to stress that the cycle detection is not really for the sake 
of detecting cycles, it's because you need to sort the modules in the 
order their static ctors should be called. When you can't sort them in 
an order, you have to call them in an arbitrary order (probably just 
in the order they were linked).


It is really a shame we have to do this at runtime, but that's due to 
the compilation model we are stuck with, and the linker tools we deal 
with.




Thinking about this problem for a while I can up with something that 
could both reduce the work the runtime had to do and allow us to remove 
the error in the OP.


But now I see most of it was suggested in that pull request. [1]

Would the best way to implement this be to extend ModuleInfo and include 
a getter that   loads the dependencies like importedModules(), or should 
the ctor/dtor stuff be moved to a new tables?


It's really an interesting problem, probably one of the most puzzling 
algorithmic problems I've had to solve.


However, a lot of this is complicated because we don't do it at 
compile-time, but at runtime instead. We have way more time and memory 
and whatever we want to throw at it if we can move the sorting and cycle 
check to build time instead of runtime.


If we want to make the check more fine grained, we are talking metadata 
for every function call, and which static ctor-initialized items it may 
access, and which lines actually initialize them.


Also we might be able to get the compiler to insert a cluster_hash 
that's unique for each compiler run and a pre-ordered cluster_index. 
Then use this to speed up sorting in the runtime.


Yeah, I don't think the runtime speed is a huge problem. Initializing 
the hash doesn't take much time at all.


Hm... I just had a crazy idea: what if D had a switch that allowed it to 
store a dependency graph of constructors into a json file, and then when 
you link, there is a wrapper which consumes all these files, runs the 
cycle detection and sort, and then compiles a perfectly sorted 
moduleinfo section to be included in the binary (obviously, written in D 
and compiled by the compiler).


This doesn't affect the linker, and uses all the existing tools we have 
to create object files and binaries.


I'll have to bring this out into its own thread so it's not buried.

-Steve


Re: cycle dependencies

2018-05-31 Thread David Bennett via Digitalmars-d
On Thursday, 31 May 2018 at 15:15:44 UTC, Steven Schveighoffer 
wrote:

On 5/31/18 2:14 AM, Simen Kjærås wrote:
On Wednesday, 30 May 2018 at 20:57:32 UTC, DigitalDesigns 
wrote:
Why is this a runtime issue? It is not as if the execution of 
static this are non-deterministic. The compiler and linker 
must order the calls in some way.


Because of separate compilation, the compiler can't do it. 
Because the generic linker doesn't do that sort of thing, the 
linker can't do it.


The first part is essentially intractable - e.g. module A's 
static this uses a global variable in module B that's set by 
module C. Module A may be compiled separately from module C, 
so the compiler can't see the dependency.


Yes, this is really the crux of it.

I want to stress that the cycle detection is not really for the 
sake of detecting cycles, it's because you need to sort the 
modules in the order their static ctors should be called. When 
you can't sort them in an order, you have to call them in an 
arbitrary order (probably just in the order they were linked).


It is really a shame we have to do this at runtime, but that's 
due to the compilation model we are stuck with, and the linker 
tools we deal with.




Thinking about this problem for a while I can up with something 
that could both reduce the work the runtime had to do and allow 
us to remove the error in the OP.


But now I see most of it was suggested in that pull request. [1]

Would the best way to implement this be to extend ModuleInfo and 
include a getter that   loads the dependencies like 
importedModules(), or should the ctor/dtor stuff be moved to a 
new tables?


Also we might be able to get the compiler to insert a 
cluster_hash that's unique for each compiler run and a 
pre-ordered cluster_index. Then use this to speed up sorting in 
the runtime.


[1] 
https://github.com/dlang/druntime/pull/1602#issuecomment-231527759


Re: cycle dependencies

2018-05-31 Thread Steven Schveighoffer via Digitalmars-d

On 5/31/18 2:14 AM, Simen Kjærås wrote:

On Wednesday, 30 May 2018 at 20:57:32 UTC, DigitalDesigns wrote:
Why is this a runtime issue? It is not as if the execution of static 
this are non-deterministic. The compiler and linker must order the 
calls in some way. Maybe this is what you mean by own object/linker?  
But even then, they would only have to be checked once so why check 
every time the exe is ran when once it is ran it must remain 
statically oriented for all future.


Because of separate compilation, the compiler can't do it. Because the 
generic linker doesn't do that sort of thing, the linker can't do it.


The first part is essentially intractable - e.g. module A's static this 
uses a global variable in module B that's set by module C. Module A may 
be compiled separately from module C, so the compiler can't see the 
dependency.


If the linker is to do it, the compiler needs to encode the information 
in the object file, and the linker must be made specially to support 
this. Maybe this could be put in some optional section in the object 
file, and linkers that don't support it would just ignore the 
information. If a non-compliant linker is used, the runtime needs to 
have a fallback, so even this won't get us entirely out of the woods. 
Only supporting special linkers comes with its own set of problems.




Yes, this is really the crux of it.

I want to stress that the cycle detection is not really for the sake of 
detecting cycles, it's because you need to sort the modules in the order 
their static ctors should be called. When you can't sort them in an 
order, you have to call them in an arbitrary order (probably just in the 
order they were linked).


It is really a shame we have to do this at runtime, but that's due to 
the compilation model we are stuck with, and the linker tools we deal with.


However, once the binary is formed, it's possible we could *edit* the 
binary to hard-code the order -- at that point, everything is known. I 
have thrown that out there in terms of a nice project someone with the 
correct skills could do: 
https://forum.dlang.org/post/nl3du7$1as5$1...@digitalmars.com


One


Re: cycle dependencies

2018-05-31 Thread Simen Kjærås via Digitalmars-d

On Wednesday, 30 May 2018 at 20:57:32 UTC, DigitalDesigns wrote:
Why is this a runtime issue? It is not as if the execution of 
static this are non-deterministic. The compiler and linker must 
order the calls in some way. Maybe this is what you mean by own 
object/linker?  But even then, they would only have to be 
checked once so why check every time the exe is ran when once 
it is ran it must remain statically oriented for all future.


Because of separate compilation, the compiler can't do it. 
Because the generic linker doesn't do that sort of thing, the 
linker can't do it.


The first part is essentially intractable - e.g. module A's 
static this uses a global variable in module B that's set by 
module C. Module A may be compiled separately from module C, so 
the compiler can't see the dependency.


If the linker is to do it, the compiler needs to encode the 
information in the object file, and the linker must be made 
specially to support this. Maybe this could be put in some 
optional section in the object file, and linkers that don't 
support it would just ignore the information. If a non-compliant 
linker is used, the runtime needs to have a fallback, so even 
this won't get us entirely out of the woods. Only supporting 
special linkers comes with its own set of problems.


--
  Simen


Re: cycle dependencies

2018-05-30 Thread DigitalDesigns via Digitalmars-d
On Wednesday, 30 May 2018 at 18:49:40 UTC, Steven Schveighoffer 
wrote:

On 5/30/18 11:50 AM, Stefan wrote:
On Wednesday, 30 May 2018 at 13:26:53 UTC, Steven 
Schveighoffer wrote:

On 5/30/18 8:09 AM, DigitalDesigns wrote:
... it's really really hard to make it check real 
dependencies.



For serious dependency checking, you could try 
https://github.com/funkwerk/depend.

It's used in quite large commercial projects.


OK, so a couple things:

1. The cycle checking has to be really really fast, and run in 
a small amount of memory. It runs on startup of EVERY D 
program. I would LOVE to get rid of this problem, but unless we 
either invent our own object/linker system, or create some 
tools to work on the linked binary, we have to do this.


2. The cycle algorithm is fast and works correctly, given the 
dependencies that the compiler has provided. It's really not 
the cycle algorithm that is the problem, but the determination 
of dependencies. In other words, the problem is the compiler 
not tracing all the actual dependencies and outlining that in 
some structure to be parsed later during program startup.


As you increase the number of dependency points, and therefore 
the graph size, the cycle algorithm has to take longer to 
figure out if there are any cycles. So even if we can fix the 
problem outlined here, the cost may not make it worth the 
effort!


There are some ideas Martin and I have fleshed out a bit, which 
would help reduced the cyclic dependencies, but so far, nobody 
has had the time and/or skills to implement. For instance:


https://issues.dlang.org/show_bug.cgi?id=16265
https://github.com/dlang/druntime/pull/1602#issuecomment-231527759

-Steve


Why is this a runtime issue? It is not as if the execution of 
static this are non-deterministic. The compiler and linker must 
order the calls in some way. Maybe this is what you mean by own 
object/linker?  But even then, they would only have to be checked 
once so why check every time the exe is ran when once it is ran 
it must remain statically oriented for all future.


two exe's could be produced, one with only the static this code 
which is ran after linking and verified, if doesn't pass then a 
linker error is thrown.


These methods could be used for better alternatives but the real 
issue is dependencies of static constructors.


How bout instead require the user to deal with it? The user can 
specify the order to run and the compiler just blindly obeys.


pragma(cycle, 1)
static this()
{

}


...

pragma(cycle, 2)
static this()
{

}

It could require cycle to be unique or just share and let the 
compiler do what it wants(assumes no cycles).


This gives the programmer a fix but doesn't assume there are 
cycles, which most times there are not.


Later on, more intelligent cycle detecting code can be created 
and added and then can give a warning in which the user can go 
back and add the pragmas.


Alternatively, only require the pragmas on modules that 
cyclically import each other. Even if their is no cycle, the 
programmer just adds a pragma to satisfy the compiler/linker/rt.







Re: cycle dependencies

2018-05-30 Thread Steven Schveighoffer via Digitalmars-d

On 5/30/18 11:50 AM, Stefan wrote:

On Wednesday, 30 May 2018 at 13:26:53 UTC, Steven Schveighoffer wrote:

On 5/30/18 8:09 AM, DigitalDesigns wrote:
... it's really really hard to make it check real dependencies.



For serious dependency checking, you could try 
https://github.com/funkwerk/depend.

It's used in quite large commercial projects.


OK, so a couple things:

1. The cycle checking has to be really really fast, and run in a small 
amount of memory. It runs on startup of EVERY D program. I would LOVE to 
get rid of this problem, but unless we either invent our own 
object/linker system, or create some tools to work on the linked binary, 
we have to do this.


2. The cycle algorithm is fast and works correctly, given the 
dependencies that the compiler has provided. It's really not the cycle 
algorithm that is the problem, but the determination of dependencies. In 
other words, the problem is the compiler not tracing all the actual 
dependencies and outlining that in some structure to be parsed later 
during program startup.


As you increase the number of dependency points, and therefore the graph 
size, the cycle algorithm has to take longer to figure out if there are 
any cycles. So even if we can fix the problem outlined here, the cost 
may not make it worth the effort!


There are some ideas Martin and I have fleshed out a bit, which would 
help reduced the cyclic dependencies, but so far, nobody has had the 
time and/or skills to implement. For instance:


https://issues.dlang.org/show_bug.cgi?id=16265
https://github.com/dlang/druntime/pull/1602#issuecomment-231527759

-Steve


Re: cycle dependencies

2018-05-30 Thread Stefan via Digitalmars-d
On Wednesday, 30 May 2018 at 13:26:53 UTC, Steven Schveighoffer 
wrote:

On 5/30/18 8:09 AM, DigitalDesigns wrote:
... it's really really hard to make it check real dependencies.



For serious dependency checking, you could try 
https://github.com/funkwerk/depend.

It's used in quite large commercial projects.


Re: cycle dependencies

2018-05-30 Thread Steven Schveighoffer via Digitalmars-d

On 5/30/18 8:09 AM, DigitalDesigns wrote:

Seriously stupid bug here!

I had an enum an static this() { } (empty, forgot why I added it) one 
module and an struct in another module that converted values from the 
enum. I imported only that enum and guess what?!?! Cycle dependency! 
removed the static this() { } and worked!


The cycle dependency checking code is far too ignorant.


It is ignorant. It only checks for the presence of static constructors, 
not what they do. This is as designed -- there are too many ways for 
dependencies to leak between modules that it's really really hard to 
make it check real dependencies.


It just blindly 
checks without any rationale. Someone said this was because phobos had a 
bug in it so the cycles were checked, maybe it's time to address that 
issue rather than breaking code arbitrarily?




I don't know what was said, but the cycle code was very broken 
originally. A few years ago, I fixed it, and then it was re-broken by 
accident, and I re-fixed it.


It should work properly now. But that has nothing to do with adding an 
empty static ctor, it had to do with properly detecting cycles.


The cycle detection functions as designed. You can disable it via DRT 
flags, but this means that there is no guarantee that they run in the 
proper order. The compiler would have to build a much more sophisticated 
dependency tree in order to do anything more clever, but I doubt it 
would ever be able to be perfect in detecting which things truly depend 
on one another.


-Steve


Re: cycle dependencies

2018-05-30 Thread Andrea Fontana via Digitalmars-d

On Wednesday, 30 May 2018 at 12:09:21 UTC, DigitalDesigns wrote:

Seriously stupid bug here!

I had an enum an static this() { } (empty, forgot why I added 
it) one module and an struct in another module that converted 
values from the enum. I imported only that enum and guess 
what?!?! Cycle dependency! removed the static this() { } and 
worked!


The cycle dependency checking code is far too ignorant. It just 
blindly checks without any rationale. Someone said this was 
because phobos had a bug in it so the cycles were checked, 
maybe it's time to address that issue rather than breaking code 
arbitrarily?


I think an example would be useful. Anyway, if you think there's 
a bug there, write a report: https://issues.dlang.org


Andrea