Re: How templates might be improved
On Friday, 16 September 2016 at 08:51:24 UTC, Stefan Koch wrote: [...] I just found http://llvm.org/docs/doxygen/html/FoldingSet_8h_source.html, so it looks like the LLVM guys are already using the intern-everything approach. It makes sense, since in SSA-based forms this is pretty easy to do and a logical step. This further strengthens my belief that this is worthwhile. Aww, every time I think I came up with something really clever, I discover later that someone else has already done it ... Well, I guess that just means I am not too far off the mark :)
Re: How templates might be improved
On Saturday, 17 September 2016 at 12:02:47 UTC, Stefan Koch wrote: On Friday, 16 September 2016 at 23:44:42 UTC, Chris Wright wrote: On the other hand, in a change of behavior, this will be a cache miss and the template is instantiated twice: alias myint = int; alias TypeA = Typedef!int; alias TypeB = Typedef!myint; No, it would not be a miss; the type is the same. Correction: Typedef uses __traits(identifier, )? I did not take that use into account; that would miscompile :)
Re: How templates might be improved
On Friday, 16 September 2016 at 23:44:42 UTC, Chris Wright wrote: On the other hand, in a change of behavior, this will be a cache miss and the template is instantiated twice: alias myint = int; alias TypeA = Typedef!int; alias TypeB = Typedef!myint; No, it would not be a miss; the type is the same. And this may or may not be a cache hit: alias TypeA = Typedef!(int, 0, "A"); alias TypeB = Typedef!(int, 0, "A"); This would be a hit. If someone tries implementing the recursive form of the Fibonacci function with your change in place, they'll have unusably long compile times. However, in the typical case, compile times will be faster (and specific types can more easily receive special treatment as needed). If someone tries to implement Fibonacci as a recursive template ... well, there is no way that can be fast, with interning or without.
Re: The worst Phobos template (in binderoo)
On Wednesday, 14 September 2016 at 20:24:13 UTC, Stefan Koch wrote: I am going to submit a PR soon. https://github.com/dlang/dmd/pull/6134 Here it is.
Re: Lint D code while you type in Vim
On Friday, 16 September 2016 at 22:12:58 UTC, w0rp wrote: For maintainers of DMD, I would love it if an option to read source files via stdin could be added. It could be something like -stdin. Then it would be very easy to use DMD to check source files from stdin input. It might also be an idea to add a second option for suggesting what the filename of the stdin input should be, in case that matters. I know eslint has such an option, and it could matter in some cases. Please file an ER. (Enhancement Request)
Re: Subtle bug in ddox vs. ddoc macro handling
On Friday, 16 September 2016 at 18:40:03 UTC, Andrei Alexandrescu wrote: So, I've spent the better part of this morning tracing what seems to be a subtle bug in ddox' macro treatment. I defined these macros: IGNORESECOND = [$1] DOLLARZERO = dzbegin $0 dzend TEST = before $(IGNORESECOND $(DOLLARZERO one, two)) after I inserted $(TEST) in a Phobos module. When processing the module with ddoc, the macro expands to: before [dzbegin one, two dzend] after whereas with ddox it expands to: before [dzbegin one] after i.e. the comma "escapes" the confines of the macro DOLLARZERO. The "right" behavior is ddoc's, for several reasons. Is there a distinct macro engine powering ddox generation? Who is maintaining that? Thanks, Andrei I believe ddox is maintained by Sönke Ludwig.
Re: How templates might be improved
On Friday, 16 September 2016 at 08:51:24 UTC, Stefan Koch wrote: so big that the search for the saved instance if more expensive that dumb reinstanciation without looking for saved instance would be faster. That was supposed to say: "So big that the search for the saved instance _can be_ as expensive as dumb re-instantiation would be."
How templates might be improved
Hi Guys, I have decided to shed some light upon the plan I have for templates. First, let's focus on the problem(s) with them. The main problem is instantiation; here is how it works: When we encounter a TemplateInstance in an expression, we search for the declaration of that template. If there are multiple declarations, we initiate a search through them for the most fitting one, i.e. the most specialized one whose constraints evaluate to true and which can produce a valid body (yes, D has SFINAE). After we have found a match, we look for a match in the known instantiations; for that we first hash the template parameters and look for a match in a hashtable. That process is much more expensive than one would think, because in order to be sure the entry in the hashtable matches we have to do a deep comparison of the whole AST subsection of the template parameters. If another template is a parameter of this template, that includes its whole parameter sub-tree as well. This can touch many AST nodes. So many, in fact, that due to the cache misses the search for the saved instance can be as expensive as dumb re-instantiation without looking for a saved instance would be. So with that in mind, how can we avoid the deep comparison of AST subtrees? The answer is to make every template argument unique, such that it can be uniquely identified with a numeric id. Then the comparison would touch much less memory (AST nodes are really huge compared to a 64-bit id), and we don't have to recursively descend into template parameters which are themselves templates. That's it :) Please ask me questions if this was unclear, or point out errors in my reasoning if you find them. Cheers, Stefan
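To illustrate the difference, here is a rough sketch (made-up types, not DMD code) of what the cache lookup costs with deep AST comparison versus with interned ids:

class Node
{
    int kind;
    Node[] children;
    ulong uniqueId;   // assigned once, when the node is interned
}

// Deep comparison: touches every node of both subtrees, cache-unfriendly.
bool deepEquals(Node a, Node b)
{
    if (a.kind != b.kind || a.children.length != b.children.length)
        return false;
    foreach (i, c; a.children)
        if (!deepEquals(c, b.children[i]))
            return false;
    return true;
}

// Interned comparison: a single integer compare, no matter how deep the tree is.
bool internedEquals(Node a, Node b)
{
    return a.uniqueId == b.uniqueId;
}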
Re: The worst Phobos template (in binderoo)
On Thursday, 15 September 2016 at 23:25:34 UTC, Stefan Koch wrote: On Thursday, 15 September 2016 at 23:08:54 UTC, Andrei Alexandrescu wrote: Yes, that DIP.[http://wiki.dlang.org/DIP57] It would need some more formal work before defining an implementation. Stefan, would you want to lead that effort? -- Andrei I am not good at defining semantics, yet I would participate in a joint effort with Timon. I thought some more about this. static foreach is probably not a good idea at this point. We should focus on getting the existing features to do their job properly, and we won't have to add yet more meanings to the word static. Our code-generation facilities with CTFE and mixins are very good indeed. When polished, CTFE can totally be used as a serious workhorse for cases that would have fallen into the static foreach category.
Re: The worst Phobos template (in binderoo)
On Thursday, 15 September 2016 at 23:08:54 UTC, Andrei Alexandrescu wrote: Yes, that DIP. It would need some more formal work before defining an implementation. Stefan, would you want to lead that effort? -- Andrei I am not good at defining semantics, yet I would participate in a joint effort with Timon. Be aware though that I am currently working on improving templates, and there might be a significant win to be had.
Re: The worst Phobos template (in binderoo)
On Thursday, 15 September 2016 at 22:58:12 UTC, Stefan Koch wrote: The Performance-Penalty will be less than on templates. Let me add a disclaimer: I _think_ the performance penalty _can_ be less than the penalty on templates. Also, static if can sometimes lead to counter-intuitive situations; static foreach will likely be worse in that regard. Implementation issues aside, and the DIP aside: what would be the semantics you would want for this?
Re: The worst Phobos template (in binderoo)
On Thursday, 15 September 2016 at 14:43:16 UTC, Stefan Koch wrote: On Thursday, 15 September 2016 at 14:38:41 UTC, Andrei Alexandrescu wrote: On 09/15/2016 10:08 AM, Stefan Koch wrote: static foreach on the other hand is a whole new can of worms. As Walter will be able to tell you. It's time to open it. -- Andrei Please give me an example of how it should work. Are we talking about http://wiki.dlang.org/DIP57 ? I think I can give you a partial solution for that. The Performance-Penalty will be less than on templates. However, I can smell a bunch of unintuitive corner cases hiding in there.
Re: struct dynamic allocation error
On Thursday, 15 September 2016 at 20:38:45 UTC, Dechcaudron wrote: I believe there is some kind of weird issue that won't allow for struct instances to be dynamically allocated in a proper way via the 'new' keyword. It does actually allocate them and return a valid pointer to operate the instances, but whenever the program is exited I get the following exception: [...] I would think the GC tries to collect the object which you destroyed before. But I cannot be sure. I avoid the gc.
Re: The worst Phobos template (in binderoo)
On Thursday, 15 September 2016 at 14:38:41 UTC, Andrei Alexandrescu wrote: On 09/15/2016 10:08 AM, Stefan Koch wrote: static foreach on the other hand is a whole new can of worms. As Walter will be able to tell you. It's time to open it. -- Andrei Please give me an example of how it should work.
Re: The worst Phobos template (in binderoo)
On Thursday, 15 September 2016 at 13:49:46 UTC, Andrei Alexandrescu wrote: On 09/15/2016 09:27 AM, Stefan Koch wrote: On Thursday, 15 September 2016 at 13:20:16 UTC, Andrei Alexandrescu wrote: On 09/15/2016 08:35 AM, Stefan Koch wrote: On Thursday, 15 September 2016 at 12:26:08 UTC, Andrei Alexandrescu wrote: Apparently we need that static foreach iteration. -- Andrei What exactly do you mean? As long as we instantiate n templates for a member nested n levels deep, the overhead is massive! How would static foreach help? I thought staticMap is just a (simulated) loop that applies the same template in sequence. -- Andrei staticMap is a recursive variadic template. Can recursion be replaced with iteration? Assume you have static foreach at your disposal. -- Andrei You tell me, you are the expert on templates :o) I cannot be certain, but I think it would probably work. static foreach, on the other hand, is a whole new can of worms, as Walter will be able to tell you.
Re: The worst Phobos template (in binderoo)
On Thursday, 15 September 2016 at 13:20:16 UTC, Andrei Alexandrescu wrote: On 09/15/2016 08:35 AM, Stefan Koch wrote: On Thursday, 15 September 2016 at 12:26:08 UTC, Andrei Alexandrescu wrote: Apparently we need that static foreach iteration. -- Andrei What exactly do you mean? As long as we instantiate n templates for a member nested n levels deep, the overhead is massive! How would static foreach help? I thought staticMap is just a (simulated) loop that applies the same template in sequence. -- Andrei staticMap is a recursive variadic template, compile-time-wise the worst class a template can be in. It expands into a series of templates instantiating itself log(n) times, causing 2n*log(n) instances in total. It's not a pretty picture.
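For illustration, a simplified recursive staticMap in the spirit of (but not identical to) the Phobos implementation; every level of the recursion is a separate template instantiation. The actual Phobos version splits the argument list in half instead of peeling one element at a time, which is where the log(n) depth mentioned above comes from:

import std.meta : AliasSeq;

// Simplified sketch of a recursive staticMap: mapping n arguments costs one
// instantiation per remaining tail, i.e. n + 1 instances of this template alone.
template staticMapSketch(alias F, T...)
{
    static if (T.length == 0)
        alias staticMapSketch = AliasSeq!();
    else
        alias staticMapSketch = AliasSeq!(F!(T[0]), staticMapSketch!(F, T[1 .. $]));
}

alias ToPointer(T) = T*;
alias Ptrs = staticMapSketch!(ToPointer, int, long, double); // 4 instantiations of staticMapSketch
static assert(is(Ptrs[0] == int*) && is(Ptrs[2] == double*));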
Re: The worst Phobos template (in binderoo)
On Thursday, 15 September 2016 at 12:26:08 UTC, Andrei Alexandrescu wrote: Apparently we need that static foreach iteration. -- Andrei What exactly do you mean? As long as we instantiate n templates for a member nested n levels deep, the overhead is massive! How would static foreach help?
Re: The worst Phobos template (in binderoo)
On Thursday, 15 September 2016 at 06:08:33 UTC, Dicebot wrote: mixin("alias T2 = " ~ fqn!T); static assert(is(T2 == T)); In that sense builtin trait could be a better option. Yes, a builtin trait is the better option. I made a YouTube video about that, showing how bad fullyQualifiedName is and how much better a trait would be. https://www.youtube.com/watch?v=l1Ph3Nn0en0
Re: The worst Phobos template (in binderoo)
On Wednesday, 14 September 2016 at 21:06:10 UTC, Andrei Alexandrescu wrote: On 09/14/2016 04:52 PM, Stefan Koch wrote: [...] (Disclaimer: I didn't run any speed tests.) By looking at the definition of fullyQualifiedName it seems to me we can go a long way with traditional optimization techniques. For example consider: [...] staticMap is number 3 in the top-slow-templates list, and the code inside it really does not matter so much. What matters is recursive instantiation: the evaluation of the function is fast in comparison to the time the template subsystem takes. I believe this cannot be fixed by changing the library solution.
Re: The worst Phobos template (in binderoo)
On Wednesday, 14 September 2016 at 20:24:13 UTC, Stefan Koch wrote: It takes a whooping 500 milliseconds Ahm... I misread my performance graph: it's 138ms for the first instantiation and around 5ms for every following one. The total time spent on it was 500ms.
Re: The worst Phobos template (in binderoo)
On Wednesday, 14 September 2016 at 20:24:13 UTC, Stefan Koch wrote: I would like to see users of fullyQualifiedName because apart from binderoo code which seems to work, I have none. That was supposed to say: I would like to see [test cases from] users of fullyQualifiedName.
The worst Phobos template (in binderoo)
Hi Guys, I recently had a closer look at the templates that cost the most time to instantiate in the frontend, and there is one clear winner: fullyQualifiedName from std.traits. It takes a whopping 500 milliseconds (on my test case) for its semantic phase. This is because this template is recursive and because it instantiates std.format.format, which is in fourth place for slow templates. The functionality can be implemented with a __trait, and that implementation would be 500 times faster. I am going to submit a PR soon. However, I cannot guarantee that the newly introduced trait works the same in all cases, as templates can behave surprisingly sometimes. I would like to see users of fullyQualifiedName, because apart from binderoo code, which seems to work, I have none.
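For reference, this is the kind of use that is affected. The trait spelling below is only a placeholder to show the idea; the actual name would be decided by the PR:

import std.traits : fullyQualifiedName;

struct S {}

// Library template: recursive instantiation, expensive the first time.
enum viaTemplate = fullyQualifiedName!S;

// Hypothetical built-in replacement (placeholder name, not an existing trait):
// enum viaTrait = __traits(fullyQualifiedName, S);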
Re: Virtual Methods hurting performance in the DMD frontend
On Monday, 12 September 2016 at 08:03:45 UTC, Stefan Koch wrote: On Sunday, 11 September 2016 at 21:48:56 UTC, Stefan Koch wrote: Those are indirect class I meant indirect calls! @Jacob Yes that is my intended solution. Having a type-field in root-object. A small update on this: when dmd is compiled with ldc this problem seems to lessen. However, much of dmd's code, especially in dtemplate, could be simplified if it could just switch on a value instead of doing method calls and null checks.
Re: GC of const/immutable reference graphs
On Tuesday, 13 September 2016 at 15:27:23 UTC, John Colvin wrote: For the following, lifetimeEnd(x) is the time of freeing of x. Given a reference graph and an const/immutable node n, all nodes reachable via n (let's call them Q(n)) must also be const/immutable, as per the rules of D's type system. In order to avoid dangling pointers: For all q in Q(n), lifetimeEnd(q) >= lifetimeEnd(n) Therefore, there is never any need for the GC to mark or sweep anything in Q(n) during the lifetime of n. Does the GC take advantage of this in some way to reduce collection times? I am pretty sure it does not.
Re: iPhone vs Android
On Tuesday, 13 September 2016 at 10:58:50 UTC, Walter Bright wrote: Interestingly, Warp (the C preprocessor I developed in D) used a hybrid approach. The performance critical code was all hand-managed, while the rest was GC'd. Manual Memory management is key for performance oriented code.
Re: DIP1000
On Monday, 12 September 2016 at 14:00:53 UTC, Andrei Alexandrescu wrote: On 08/31/2016 08:05 AM, Kagamin wrote: On Tuesday, 30 August 2016 at 16:12:19 UTC, Andrei Alexandrescu wrote: http://erdani.com/d/DIP1000.html That link leads to a grammar. Isn't DIP1000 the scope and @safe DIP?
Re: Virtual Methods hurting performance in the DMD frontend
On Sunday, 11 September 2016 at 21:48:56 UTC, Stefan Koch wrote: Those are indirect class I meant indirect calls! @Jacob Yes that is my intended solution. Having a type-field in root-object.
Virtual Methods hurting performance in the DMD frontend
Hi, As you may know, I am currently optimizing template-related code inside of DMD. Inside DMD code quality is quite high; there is little low-hanging fruit. However, there is one thing that suspiciously often shows up on the profiler's display. Those are indirect calls which have a high number of L2(!) i-cache misses. These calls don't do a lot of work; they are needed either for downcasts or to verify the dynamic type of an AST node. Even without the i-cache-related stalls, the call overhead alone is something to think about. For template-heavy code, matching template parameters is one of the most frequent operations. Since template parameters can be types, expressions or symbols, the dynamic types are heavily queried. First experiments suggest that a speedup of around 12% is possible if the types were accessible directly. Since dmd uses visitors for many things now, the benefit of virtual methods is highly reduced. Please share your thoughts. Cheers, Stefan
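A sketch of the two styles of dynamic-type query, with made-up names rather than the actual DMD class hierarchy:

enum NodeKind { type, expression, symbol }

class RootObject
{
    NodeKind kind;   // proposed type field: one direct load instead of an indirect call
}

void matchParameter(RootObject o)
{
    // With a type field, matching becomes a plain switch: no virtual dispatch,
    // no null check after a failed downcast.
    final switch (o.kind)
    {
        case NodeKind.type:       break; // match a type parameter
        case NodeKind.expression: break; // match a value parameter
        case NodeKind.symbol:     break; // match an alias parameter
    }
}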
Re: [OT] Re: Let's kill 80bit real at CTFE
On Sunday, 11 September 2016 at 14:52:18 UTC, Manu wrote: That's cool, but surely unnecessary; the compiler should just hook these and do it directly... They're intrinsics in every compiler/language I've ever used! Just not DMD. If your results are compatible, why not PR this implementation for when ctfe in std.math? Incidentally, this is how we used to do these operations in early shader code, except cutting some corners for speed+imprecision ;) Hey I just made another PR, to fix PowExp! It might as well depend on phobos :)
Re: Templates do maybe not need to be that slow (no promises)
There is more news. I wrote about manual template inlining before, which is fairly effective at bringing down the compile time. Since templates are of course white-box, the compiler can do this automatically for you. Recursive templates will still incur a performance hit, but the effects will be lessened if that gets implemented. I am currently extending dmd's template code to support more efficient template caching.
Re: Templates do maybe not need to be that slow (no promises)
On Friday, 9 September 2016 at 07:56:04 UTC, Stefan Koch wrote: There is a direct linear relationship between the generated code and the template body. So if the range of change inside the template body can be linked to the changed range in the binary, and we link this to the template parameters, we can produce a pure function that can give us the change in the binary code when provided with the template parameters. And the need to rerun the instantiation and code-gen is reduced to just the changed sections. I am not yet sure if this is viable to implement. I think I have found a way to avoid subtree comparisons for the most part and speed them up significantly for the rest, at the expense of limiting the number of compile-time entities (types, expressions ... anything) to a maximum of 2^28 (when using a 64-bit id).
Re: Templates are slow.
On Friday, 9 September 2016 at 18:17:02 UTC, deadalnix wrote: You need to compare the string to unique them, so it doesn't change anything. It changes the frequency of comparisons.
Re: Templates do maybe not need to be that slow (no promises)
On Friday, 9 September 2016 at 15:08:26 UTC, Iakh wrote: On Friday, 9 September 2016 at 07:56:04 UTC, Stefan Koch wrote: I was thinking on adding "opaque" attribute for template arguments to force template to forget some information about type. E.g if you use class A(opaque T) {...} you can use only pointers/references to T. Probably compiler could determine it by itself is type used as opaque or not. you could use void* in this case and would not need a template at all.
Re: Templates are slow.
On Friday, 9 September 2016 at 12:09:32 UTC, Steven Schveighoffer wrote: On 9/8/16 6:57 PM, Stefan Koch wrote: Hi Guys, I have some more data. In the binderoo example the main time is spent in the backend, generating code and writing object files. If we ever get Rainer's patch to collapse repetitive templates, we may help this problem. https://github.com/dlang/dmd/pull/5855 The front-end spends most of its time comparing strings of unique type-names :) I thought the front end was changed to use the string pointer for symbol names as the match, so string comparisons aren't done? In this case the string is freshly generated and not available as a reference to an already lexed string. Hm... maybe intern the string? That kind of makes sense. Yes, that would be the way to go. I just had a thought. If you hash the string, and then compare the length of the string and first and last character along with the hash, what are the chances of it being a false positive? This depends entirely on the distribution of strings. It's probably quite high :)
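For what it's worth, a minimal sketch of interning (assuming a table owned by the front end); once every name goes through it, equality checks collapse to a pointer comparison:

// Return the canonical copy of s; callers then compare by pointer, not contents.
string intern(ref string[string] table, string s)
{
    if (auto p = s in table)
        return *p;
    table[s] = s;
    return s;
}

unittest
{
    string[string] table;
    auto a = intern(table, "std.range.primitives.isInputRange");
    auto prefix = "std.range.primitives.";
    auto b = intern(table, prefix ~ "isInputRange"); // freshly built string
    assert(a.ptr is b.ptr);                          // interned: pointer equality suffices
}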
Let's kill 80bit real at CTFE
Hi, In short: 80-bit reals are a real pain to support cross-platform. Emulating them in software is prohibitively slow and, more importantly, hard to get right. 64-bit floating-point numbers are supported on more architectures and are much better supported. They are also trivial to use at CTFE. I vote for killing the 80-bit handling in constant folding. Destroy!
Re: Templates do maybe not need to be that slow (no promises)
On Friday, 9 September 2016 at 09:31:37 UTC, Marco Leise wrote: Don't worry about this special case too much. At least GCC can turn padLength from a runtime argument into a compile-time argument itself, so the need for templates to do a poor man's const-folding is reduced. So in this case the advice is not to use a template. You said that there is a lot of code-gen and string comparisons going on. Is code-gen already invoked on-demand? I assume with "dmd -o-" code-gen is completely disabled, which is great for ddoc, .di and dependency graph generation. This is not what this is about. This is about cases where you cannot avoid templates because you do type-based operations. The code above was just an example to illustrate the problem.
Templates do maybe not need to be that slow (no promises)
Hi Guys, I'll keep this short. There seems to be much more headroom than I had thought. The idea is pretty simple. Consider: int fn(int padLength)(int a, int b, int c) { /** very long function body 1000+ lines */ return result * padLength; } This will produce roughly the same code for every instantiation except for one imul at the end. This problem is known as template bloat. There is a direct linear relationship between the generated code and the template body. So if the range of change inside the template body can be linked to the changed range in the binary, and we link this to the template parameters, we can produce a pure function that can give us the change in the binary code when provided with the template parameters. And the need to rerun the instantiation and code-gen is reduced to just the changed sections. I am not yet sure if this is viable to implement.
Re: Templates are slow.
On Friday, 9 September 2016 at 01:38:40 UTC, deadalnix wrote: On Thursday, 8 September 2016 at 20:10:01 UTC, Stefan Koch wrote: generating separate object files for each template instantiation and then only re-generating on change will only be effective if they do not change much from one build to the next. You'd have tens of thousands of files and a big IO problem. I already thought about that. The idea is to stuff the object code of all templates into one file with a bit of meta-data, make it do a hash lookup at instantiation, and write out the cached code if the instantiation is found. I agree, file I/O would kill any speed win many times over!
Re: Templates are slow.
Hi Guys, I have some more data. In the binderoo example the main time is spent in the backend, generating code and writing object files. The front-end spends most of its time comparing strings of unique type-names :) One particular outlier in the backend code is the function ecom, which eliminates common subexpressions. We would potentially save some time by not emitting those in the first place.
Re: Templates are slow.
On Thursday, 8 September 2016 at 19:49:38 UTC, Ethan Watson wrote: On Thursday, 8 September 2016 at 19:17:42 UTC, Lewis wrote: I can't help but wonder if there were some way to automatically cache template instantiations between runs of dmd? I'm running with Visual D, which has a "COMPILE ALL THE THINGS" mentality as the default. As part of the rapid iteration part of Binderoo, I plan on doing incremental linking. Of course, if all template instantiations go in to one object file, that really ruins it. Each template instantiation going in to a separate object file will actually make life significantly easier, as each compile will have less output. The only time those template instantiations need to recompile is if the invoking module changes; the template's dependencies change; or the module the template lives in changes. My opinion is that splitting up object files will do more to reduce compile time for me than anything else, the pipeline we had for Quantum Break was to compile and link in separate steps so it's not much effort at all for me to keep that idea running in Binderoo and make it incrementally link. But I don't know the DMD code and I'm not a compiler writer, so I cannot say that authoritatively. It sounds very reasonable to me at least. Generating separate object files for each template instantiation and then only re-generating on change will only be effective if they do not change much from one build to the next. For binderoo's purpose this could be rather effective, as long as no one adds fields at the beginning of the structs :) Without incremental linking, however, your compile times will shoot through the roof. And will probably damage the moon as well.
Re: CompileTime performance measurement
On Thursday, 8 September 2016 at 17:15:54 UTC, safety0ff wrote: On Thursday, 8 September 2016 at 17:03:30 UTC, Stefan Koch wrote: I thought of the same thing a while back. However, I have not had the time to decipher the gprof data format yet. Is there another profile format for which decent visualization tools exist? I was just using that as an example of what we might want to output as text, e.g. https://sourceware.org/binutils/docs/gprof/Flat-Profile.html I wasn't saying that we should mimic the gmon.out file format; I don't think that buys us much. I disagree: anything which allows us to use existing visualization and correlation tools will be a major win. If I am going to write a profiler, it should have pretty charts. Also, the GNU guys probably put a lot of thought into their format.
Re: CompileTime performance measurement
On Thursday, 8 September 2016 at 16:52:47 UTC, safety0ff wrote: On Sunday, 4 September 2016 at 00:04:16 UTC, Stefan Koch wrote: ... I have now implemented another pseudo function called __ctfeTicksMs. [Snip] This does allow meaningful compile-time performance tests to be written, spanning both CTFE and template-instantiation timings. Please tell me what you think. I think automated CTFE profiling would be much better and the byte-code interpreter seems like a great platform to build this onto. For example, using a command line switch to enable profiling which outputs something similar to gprof's flat profile. I thought of the same thing a while back. However, I have not had the time to decipher the gprof data format yet. Is there another profile format for which decent visualization tools exist?
Re: Templates are slow.
On Thursday, 8 September 2016 at 15:45:53 UTC, Jonathan M Davis wrote: It's critical that we do what we can to make templates fast. And if we can't make them fast enough, we'll definitely have to come up with techniques/guidelines for reducing their usage when they're not really needed. - Jonathan M Davis I agree. We need to make templates faster, but it will be like squeezing water out of stones. A few more obvious optimizations I have tried did not have the desired effect at all. @Andrei Also, we need to special-case ranges in general and try harder to inline calls to range functions, maybe even early in the frontend. Secondly, we need to inline range chains into each other whenever possible. If we do this the right way early on, we can reduce the symbol-name length as well. All we need for this is pattern-matching on a type-resolved call graph, which is something I am working on as part of my CTFE work.
Re: Templates are slow.
On Thursday, 8 September 2016 at 12:23:35 UTC, Andrei Alexandrescu wrote: Are there any situations that we can special-case away? Raising the roof. -- Andrei The rangefying functions in std.array come to mind. That will give a huge boost to everyone. (everyone who uses arrays anyway :))
Re: CompileTime performance measurement
On Wednesday, 7 September 2016 at 06:49:09 UTC, Rory McGuire wrote: Seriously Stefan, you make my day! My libraries will be so much easier to write! I am glad the time was not wasted. Let's hope it gets merged :)
Re: Templates are slow.
On Thursday, 8 September 2016 at 06:34:58 UTC, Sebastien Alaiwan wrote: On Thursday, 8 September 2016 at 05:02:38 UTC, Stefan Koch wrote: (Don't do this preemptively, ONLY when you know that this template is a problem!) How would you measure such things? Is there such a thing like a "compilation time profiler"? (Running oprofile on a dmd with debug info comes first to mind; however, this would only give me statistics on dmd's source code, not mine.) I use a special profiling build of dmd. oprofile on dmd can give you a good first impression of where you run into problems, and then you can write special profiling code for this. If you do not want to write such code, send me a message and I will look into it for you :)
Templates are slow.
Hi Guys, I have just hit a barrier trying to optimize the compile time in binderoo. Roughly 90% of the compile time is spent instantiating templates. The 10% for CTFE are small in comparison. I will write an article about why templates are slow. The gist will be, however: "Templates being slow is an inherent property of templates." (We are only talking about templates as defined by C++ and D.) That said: Templates are great! But you have to use them sanely. If you are instantiating a template inside another template, think very hard about the reason; often you can "inline" the template body of the inner template and get an instant speed win right there. (Don't do this preemptively, ONLY when you know that this template is a problem!) Phobos has many templates inside templates, in constraints for example. I have no idea how to cut down on template instantiations in Phobos while still maintaining the same user-friendliness. Of course myself and others will continue fighting on the compiler front, to give you the fastest implementation possible! Cheers, Stefan
Re: CompileTime performance measurement
On Tuesday, 6 September 2016 at 10:42:00 UTC, Martin Nowak wrote: On Sunday, 4 September 2016 at 00:04:16 UTC, Stefan Koch wrote: I recently implemented __ctfeWriteln. Nice, is it only for your interpreter or can we move https://trello.com/c/6nU0lbl2/24-ctfewrite to done? I think __ctfeWrite would be a better primitive. And we could actually consider to specialize std.stdio.write* for CTFE. It's only for the current engine and only for Strings! See: https://github.com/dlang/druntime/pull/1643 and https://github.com/dlang/dmd/pull/6101
Re: Promotion rules ... why no float?
On Tuesday, 6 September 2016 at 07:04:24 UTC, Sai wrote: Consider this: import std.stdio; void main() { byte a = 6, b = 7; auto c = a + b; auto d = a / b; writefln("%s, %s", typeof(c).stringof, c); writefln("%s, %s", typeof(d).stringof, d); } Output : int, 13 int, 0 I really wish d gets promoted to a float. Besides C compatibility, any reason why d got promoted only to int even at the risk of serious bugs and loss of precision? I know I could have typed "auto a = 6.0" instead, but still it feels like an half-baked promotion rules. Because implicit conversion to float on every division is bad bad bad.
Re: CompileTime performance measurement
On Sunday, 4 September 2016 at 12:38:05 UTC, Andrei Alexandrescu wrote: On 9/4/16 6:14 AM, Stefan Koch wrote: writeln and __ctfeWriteln are to be regarded as completely different things. __ctfeWriteln is a debugging tool only! It should not be used in any production code. Well I'm not sure how that would be reasonably enforced. -- Andrei One could enforce it by defining it inside a version or debug block. The reason I do not want to see this in production code is as follows: In the engine I am working on, communication between it and the rest of dmd is kept to a minimum, because : "The new CTFE engine abstracts away everything into bytecode, there is no guarantee that the bytecode-evaluator is run in the same process or even on the same machine."
Re: CompileTime performance measurement
On Sunday, 4 September 2016 at 04:31:09 UTC, Jonathan M Davis wrote: He didn't say that it _couldn't_ be done. He said that it _shouldn't_ be done. - Jonathan M Davis Yes exactly.
Re: CompileTime performance measurement
On Sunday, 4 September 2016 at 04:35:15 UTC, rikki cattermole wrote: void writeln(T...)(T args) { if (__ctfe){ debug { __ctfeWriteln(args); } } else { // ... current implementation } } That will not work. The signature is void __ctfeWriteln(const string s)
Re: CompileTime performance measurement
On Sunday, 4 September 2016 at 04:10:29 UTC, rikki cattermole wrote: On 04/09/2016 2:08 PM, Stefan Koch wrote: On Sunday, 4 September 2016 at 02:06:55 UTC, Stefan Koch wrote: This works already. Anything placed in a debug {} block will be considered pure regardless. Oops, your comment was about the debate. I would say that __ctfeWriteln and __ctfeTicksMs should not work outside of debug. Can we have writeln and writefln call into it if __ctfe is true? Just so that we have got some consistency between runtime and CTFE usage. No! writeln and __ctfeWriteln are to be regarded as completely different things. __ctfeWriteln is a debugging tool only! It should not be used in any production code.
Re: Usability of "allMembers and derivedMembers traits now only return visible symbols"
On Sunday, 4 September 2016 at 03:14:18 UTC, Martin Nowak wrote: It didn't slip, but I wish Walter had at least stated his opinion on the PR before merging. My thinking is that the plebes should be able to access things via the object.member syntax by obeying the usual visibility rules. But __traits(allMembers, T) should be the reflection backdoor that gives the savvy users total access, at the obvious cost of an awkward syntax. As explained several times here and in the announce thread, private members have never been accessible, other than introspecting attributes, and making them accessible comes with a performance cost and a fairly big language change. So the real question is, why do we need introspection without access, and can we handle that few cases with mixin templates. If we really need introspection of private members than we might need to go back to the drawing board and modify the visibility concept introduced with DIP22. While I do understand that there could be a potential performance benefit if private members could be rearranged because they are not visible from outside, I fail to see how we would take advantage of that without breaking our object model.
Re: CompileTime performance measurement
On Sunday, 4 September 2016 at 02:06:55 UTC, Stefan Koch wrote: This works already. Anything placed in a debug {} block will be considered pure regardless. Oops, your comment was about the debate. I would say that __ctfeWriteln and __ctfeTicksMs should not work outside of debug.
Re: CompileTime performance measurement
On Sunday, 4 September 2016 at 02:03:49 UTC, sarn wrote: On Sunday, 4 September 2016 at 01:53:21 UTC, Stefan Koch wrote: Pragma msg can only print compiletime constants. While __ctfeWriteln can print state while doing CTFE. Thanks, that makes a lot of sense. Just to check, it prints to standard error, right? Also, the issue of non-deterministic compilation reminds me of the debate about allowing logging statements in pure functions. Maybe there's a similar answer (i.e., making it only work in some kind of debug mode). This works already. Anything placed in a debug {} block will be considered pure regardless.
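A small example of that escape hatch (compile with -debug to actually see the output):

import std.stdio : writeln;

int square(int x) pure
{
    // Statements under a debug condition are exempt from purity checking,
    // so this impure call compiles inside a pure function.
    debug writeln("square(", x, ")");
    return x * x;
}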
Re: CompileTime performance measurement
On Sunday, 4 September 2016 at 01:44:40 UTC, sarn wrote: On Sunday, 4 September 2016 at 00:04:16 UTC, Stefan Koch wrote: I recently implemented __ctfeWriteln. Sounds like pragma msg. How does it compare? https://dlang.org/spec/pragma.html#msg Pragma msg can only print compile-time constants, while __ctfeWriteln can print state while doing CTFE. Example:

int fn(int n)
{
    import std.conv;
    __ctfeWriteln((n - 10).to!string);
    return n;
}
static assert(fn(22));

will print 12; whereas

int fn(int n)
{
    import std.conv;
    pragma(msg, n.to!string);
    return n;
}

will tell you that the symbol n is not available at compile time.
Re: CompileTime performance measurement
On Sunday, 4 September 2016 at 00:08:14 UTC, David Nadlinger wrote: Please don't. This makes CTFE non-deterministic. Please elaborate on why this would have a negative impact? If someone chooses to use a symbol called __ctfeTicksMs, they should know what they are doing. To write performance tests, just measure compilation of a whole program (possibly with -o-). The variance due to the startup/shutdown overhead can trivially be controlled by just executing the CTFE code in question often enough. That will only allow you to tell how much overall CTFE time you spent. It will not allow you to pinpoint and optimize the offenders.
Re: Quality of errors in DMD
On Saturday, 3 September 2016 at 22:53:25 UTC, Walter Bright wrote: On 9/3/2016 3:05 PM, Ethan Watson wrote: In the cases I've been bringing up here, it's all been user code that's been the problem *anyway*. Regardless of if the compiler author was expecting code to get to that point or not, erroring out with zero information is a bad user experience. Nobody is suggesting that asserts are a good user experience. I've asserted (!) over and over that this is why asserts have a high priority to get fixed. Adding more text to the assert message is not helpful to end users. Perhaps the best error message would be "Please post this as a bug to bugzilla."
CompileTime performance measurement
Hi Guys. I recently implemented __ctfeWriteln. Based on that experience I have now implemented another pseudo function called __ctfeTicksMs, which evaluates to a uint representing the number of milliseconds elapsed between the start of dmd and the time of semantic evaluation of this expression. This does allow meaningful compile-time performance tests to be written, spanning both CTFE and template-instantiation timings. Please tell me what you think.
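A sketch of the kind of test this enables, assuming the proposed __ctfeTicksMs pseudo-function is available (it is not part of released dmd):

long sumUpTo(long n)
{
    long sum = 0;
    foreach (i; 0 .. n)
        sum += i;
    return sum;
}

enum before = __ctfeTicksMs;          // proposed pseudo-function
enum result = sumUpTo(1_000_000);     // forces CTFE of the call
enum after  = __ctfeTicksMs;
pragma(msg, "CTFE of sumUpTo took about ", after - before, " milliseconds");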
Re: DIP1001: DoExpression
On Saturday, 3 September 2016 at 14:20:28 UTC, Dicebot wrote: DIP: https://github.com/dlang/DIPs/blob/master/DIPs/DIP1001.md Abstract A DoExpression do(x,y,z) is exactly equivalent to a CommaExpression (x,y,z), but doesn't emit a deprecation warning. == First community DIP has just landed in the review queue. Please express your opinion and suggestions - it will all be taken into consideration when the formal review time comes. The same can already be achieved using a function literal. Introducing an expression for this seems overkill.
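A sketch of that alternative: a function literal evaluates the expressions in order and yields the last one, which is what the proposed do(x, y, z) would express.

int sideEffect(ref int counter) { return ++counter; }

void main()
{
    int a, b;
    // instead of the deprecated comma expression (sideEffect(a), sideEffect(b), a + b):
    auto r = { sideEffect(a); sideEffect(b); return a + b; }();
    assert(r == 2);
}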
Re: ADL
On Saturday, 3 September 2016 at 10:11:05 UTC, Timon Gehr wrote: If ADL is done as a fallback, then it is only slower in those cases where it is either actually used, or __traits(compiles,...) is used to determine that some function overload does not exist. True. Still, it does complicate the implementation. AFAICS the point of ADL is importing function definitions automatically if they are referenced, thereby practically circumventing the guarantees imports give you. In particular: "I will only import what is in that module, and I will only transitively import what is imported publicly by this module" now becomes: "I will import what is in that module, transitively import public imports, and maybe more if a function called from that module requires it." Please do correct me if my interpretation is wrong. I haven't heard of ADL before this post.
Re: ADL
On Saturday, 3 September 2016 at 01:09:18 UTC, Walter Bright wrote: Essentially, ADL has awkward problems when getting beyond the simple cases. It isn't right for D. I could not agree more strongly! If this feature were supported, it would probably break our module system. Even if we could shoehorn it into the language it would make the compiler slower.
Re: Fallback 'catch-all' template functions
On Thursday, 1 September 2016 at 09:08:31 UTC, Manu wrote: This was my fallback plan, but it seems a bit shit. Hmm, I get your point. But there is not really another way within the current language. Also, overloading a lot of templates with template constraints will eat into your compile time!
Re: Fallback 'catch-all' template functions
On Thursday, 1 September 2016 at 08:44:38 UTC, Stefan Koch wrote: void f(T)(T t) { static if (is(fImpl(t) == void)) { f(t); } else { // default impl here } } Was meant to be void f(T)(T t) { static if (is(fImpl(t) == void)) { fImpl(t); } else { // default impl here } }
Re: Fallback 'catch-all' template functions
On Thursday, 1 September 2016 at 05:37:50 UTC, Manu wrote: So, consider a set of overloads: void f(T)(T t) if(isSomething!T) {} void f(T)(T t) if(isSomethingElse!T) {} void f(T)(T t) {} I have a recurring problem where I need a fallback function like the bottom one, which should be used in lieu of a more precise match. This is obviously an ambiguous call, but this is a pattern that comes up an awful lot. How to do it in D? I've asked this before, and people say: void f(T)(T t) if(!isSomething!T && !isSomethingElse!T) {} Consider that more overloads are being introduced by users spread out across many modules that define their own kind of T; this solution is no good. To my knowledge there is currently no clean way of doing this. The easiest workaround would be to introduce another name for the implementation. then it would look like void f(T)(T t) { static if (is(fImpl(t) == void)) { f(t); } else { // default impl here } }
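Spelled out so it actually compiles, the workaround might look like the sketch below (note the typeof in the condition and the call to fImpl rather than a recursive call to f, per the correction above):

// A compilable sketch of the "fallback via a differently named implementation" idea:
// fImpl provides the specialized overloads, f falls back otherwise.
void fImpl(T : int)(T t) { /* specialized behaviour for int */ }

void f(T)(T t)
{
    static if (is(typeof(fImpl(t))))
        fImpl(t);           // a specialized overload exists, forward to it
    else
    {
        // default implementation for everything else
    }
}

void main()
{
    f(42);       // dispatches to fImpl!int
    f("hello");  // falls back to the default branch
}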
Re: [OT] of [OT] I am a developer and I hate documentation
On Thursday, 18 August 2016 at 13:19:13 UTC, Chris wrote: Isn't there a way to auto-generate a minimal documentation with the help of the compiler? As in int myFunction(int a, int b) { if (a > -1) return a + b; return -1; } // auto-gen: Returns `int`, if `a` is greater than -1, else returns -1. Parameters: `int`, `int`; Returns `int`. Something like that. Would you like a patch? That would at least auto-generate: Returns `int`, Parameters: `int a`, `int b`; No automatic semantics though.
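Purely for illustration, a signature-only sketch of how such a minimal doc line could be built with existing traits (parameter names could be added via ParameterIdentifierTuple); this is an assumption about how a patch might work, not the patch itself:

import std.traits : Parameters, ReturnType;

string minimalDoc(alias fun)()
{
    string s = "Returns `" ~ ReturnType!fun.stringof ~ "`, Parameters:";
    foreach (P; Parameters!fun)      // unrolled at compile time over the parameter types
        s ~= " `" ~ P.stringof ~ "`";
    return s ~ ";";
}

int myFunction(int a, int b) { return a > -1 ? a + b : -1; }
pragma(msg, minimalDoc!myFunction()); // Returns `int`, Parameters: `int` `int`;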
Re: Cyclomatic complexity in D ?
On Monday, 15 August 2016 at 22:36:41 UTC, Basile B. wrote: Last night I've done a bit of documentation work on cyclomatic complexity. From this work it looks like the tools that do this static analysis are only using the AST (e.g. the CC is computed for each single function, and function calls inside the graph are considered as "connected components" with a constant weight of 1. [...] Possible - certainly. Doable - I am not sure, but I tend towards no.
Building CTFE - On Youtube
Hi, I decided to (semi-regularly) record the coding sessions I do on CTFE. I have now created a YouTube playlist with the most recent video in acceptable quality. https://www.youtube.com/playlist?list=PL_Hatz-fH1CcCRHREbuV8EC3jgudTvwm_ Please tell me what you think.
Re: Taking D to GDC Europe - let's make this tight
On Tuesday, 12 July 2016 at 16:24:07 UTC, Ola Fosheim Grøstad wrote: template<typename T, typename = void> struct has_equality : std::false_type { }; template<typename T> struct has_equality<T, std::void_t<decltype(std::declval<T>() == std::declval<T>())>> : std::true_type { }; It looks horrible. And in D it is much prettier.
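For comparison, a rough D equivalent of that detection idiom (a sketch, not a drop-in for the quoted C++):

enum hasEquality(T) = is(typeof(T.init == T.init));

struct S { int x; }
static assert( hasEquality!S);
static assert(!hasEquality!void);   // there are no values of type void to compare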
Re: Vision for the D language - stabilizing complexity?
On Friday, 8 July 2016 at 19:43:39 UTC, jmh530 wrote: On Friday, 8 July 2016 at 18:16:03 UTC, Andrei Alexandrescu wrote: You may well be literally the only person on Earth who dislikes the use of "static" in "static if". -- Andrei You have to admit that static is used in a lot of different places in D. It doesn't always mean something like compile-time either. For instance, a static member function is not a compile time member function. However, I doubt something like this is going to change, so it doesn't really bother me. I liked the way that the Sparrow language (from the presentation you posted a few weeks ago) did it. Instead of static if, they use if[ct]. I like static if :)
Re: Should we add another SmallArray to DMD?
On Thursday, 7 July 2016 at 06:39:19 UTC, Guillaume Chatelet wrote: On Tuesday, 5 July 2016 at 19:41:18 UTC, Guillaume Chatelet wrote: DMD currently provides the Array type: https://github.com/dlang/dmd/blob/master/src/root/array.d [...] Walter, Daniel, Thoughts? I guess a few numbers on the perf difference this can make would be helpful.
Re: Does a Interpretation-Engine fit in phobos ?
On Saturday, 2 July 2016 at 09:22:48 UTC, Joakim Brännström wrote: On Saturday, 2 July 2016 at 08:11:48 UTC, Stefan Koch wrote: The Bytecode interpreter is now CTFEable :) Good job Stefan. The PR looks really interesting. I'll be looking into using your engine in future projects, so I hope that it will be available in some way or another :) // Joakim Good to hear. I just pulled out a lot of the generation functions and made sure the "Disassembler" is CTFE-able as well.
Re: Does a Interpretation-Engine fit in phobos ?
On Thursday, 30 June 2016 at 23:27:10 UTC, lobo wrote: On Thursday, 30 June 2016 at 11:09:10 UTC, ketmar wrote: On Thursday, 30 June 2016 at 10:36:44 UTC, qznc wrote: Off-topic: Is it possible/feasible/desirable to let dmd use dub packages? please, no. not everybody out there is dub fan. You can always use dub to fetch packages into ~/.dub/packages and build them. The artifacts are then available for whatever build system you choose for your own project. I don't use DUB for my own projects because it drops garbage all through my source tree. The day it can do out of source builds is the day I'll revisit the DUB bazaar. bye, lobo The Bytecode interpreter is now CTFEable :)
Re: Do you want support for CTFE coverage ?
On Saturday, 2 July 2016 at 00:34:05 UTC, Walter Bright wrote: On 7/1/2016 1:29 PM, Stefan Koch wrote: Do you want to see coverage for code executed at CTFE ? It's not necessary since CTFE code can all be executed at runtime, and coverage tested that way. Fair enough :) except for code guarded by if (__ctfe)
Do you want support for CTFE coverage ?
Hi, Exactly as per title. Do you want to see coverage for code executed at CTFE? I ask because it is slightly tricky to support this and it needs to be factored in early in the design. And please no off-topic or thread hijacking this time. Thanks!
Does a Interpretation-Engine fit in phobos ?
Hi, I recently had a breakthrough in my CTFE work. Because habits die hard, I am writing the bytecode engine in a CTFE-able style. Therefore it can be used as a quite comfortable IR for CTFE things as well. It should be fairly easy to generate an inline-asm string from the bytecode at compile time, thereby creating the possibility of generating optimized code at compile time. I was wondering if such a thing would fit in Phobos. Please share your opinions with me :) Regards, Stefan
Re: static if enhancement
On Monday, 27 June 2016 at 22:56:41 UTC, Jonathan M Davis wrote: On Monday, June 27, 2016 18:55:48 deadalnix via Digitalmars-d wrote: On Monday, 27 June 2016 at 18:14:26 UTC, Timon Gehr wrote: > [...] Alright, I have to range myself with most here. While I'm all for not warning about unreachable code, I'm opposed to not compiling the rest of the code. This creates non-orthogonality between static if and control flow analysis, the kind that clearly does not pay for itself. Agreed. The code outside of the static if should be compiled regardless, because it's not part of the static if/else at all and therefore has not been marked as conditionally compilable. But if we don't warn about unreachable code, then the code after the static if can clearly be optimized out because it's unreachable. So, Andrei's code would become legal as long as the only problem with the code after the static if was that it was unreachable. - Jonathan M Davis It is true that such unreachable-code warnings can be annoying at times. However, they catch bugs. Especially in generic code those warnings can be a blessing rather than a curse. We should not swallow or gag errors!
Re: static if enhancement
On Friday, 24 June 2016 at 15:34:42 UTC, Stefan Koch wrote: On Friday, 24 June 2016 at 15:24:48 UTC, Andrei Alexandrescu wrote: Does anyone else find this annoying? https://issues.dlang.org/show_bug.cgi?id=16201 -- Andrei This would mean treating static ifs differently if they alter control flow in a scope larger than themselves. Special-casing a static if that returns would not be as bad, but with the current state of the compiler I would hold off on such complications. To elaborate: this requires control-flow analysis over all static if branches, just to find and verify one special case which we then treat specially. It would only be beneficial if we hit this case predominantly. However, I am not sure how much language complexity this adds.
Re: static if enhancement
On Friday, 24 June 2016 at 15:24:48 UTC, Andrei Alexandrescu wrote: Does anyone else find this annoying? https://issues.dlang.org/show_bug.cgi?id=16201 -- Andrei This would mean treating static ifs differently if they alter control flow in a scope larger than themselves. Special-casing a static if that returns would not be as bad, but with the current state of the compiler I would hold off on such complications.
Re: Please rid me of this goto
On Thursday, 23 June 2016 at 18:05:07 UTC, Andrei Alexandrescu wrote: On 06/23/2016 01:34 PM, H. S. Teoh via Digitalmars-d wrote: I don't understand why that goto is necessary. Eh, thank you all who set me straight! I've been in my head for too long. So where is the current implementation of "^^"? If it's not as fast as this, we should replace it. -- Andrei It should be in constfold ... Apparently it's in the optimizer as well :)
Re: D's memory-hungry templates
On Friday, 10 June 2016 at 11:09:58 UTC, maik klein wrote: Not in a presentable form, I still have a framework on my other machine. I basically generated new D files from within D and then compiled them using ldc/dmd. I could clean it up and upload it when I have some time. Yes please. Compile-time perf is always good to test :)
Re: Passing private functions to template algorithms
On Wednesday, 8 June 2016 at 00:58:41 UTC, deadalnix wrote: On Wednesday, 8 June 2016 at 00:48:00 UTC, Stefan Koch wrote: I agree with you. However something in the back of my mind tells me that we'll be hitting nasty corner cases if we change the behavior. Unless you can provide such corner case, this post is contributing nothing. If I had a specific case in mind I would have provided it.
Long Symbol names
Hi, I just wanted to tell you that I am taking a shot at solving this issue. It is critical not to produce such long mangles in the first place, instead of compressing after the fact. The name blow-up after this fix will still be exponential (I guess), however with a much, much smaller n. The idea is similar to how LZ compression works: keep positions of already-seen patterns and point to them when they are used. The speed-up comes from not having to look for patterns in a long string: inside the mangler the search for seen patterns boils down to a small number of pointer comparisons, as opposed to a search over a very long string.
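A toy sketch of the idea (hypothetical back-reference encoding, not D's actual mangling format): remember where each identifier was first emitted and point back to it instead of repeating the string.

import std.conv : to;

void appendIdentifier(ref char[] mangle, ref size_t[string] seen, string ident)
{
    if (auto p = ident in seen)
    {
        mangle ~= 'Q';                       // back-reference marker (made up)
        mangle ~= (*p).to!(char[]);          // offset of the first occurrence
    }
    else
    {
        seen[ident] = mangle.length;
        mangle ~= ident.length.to!(char[]);  // length-prefixed, as in D mangling
        mangle ~= ident;
    }
}
// e.g. calling this twice with "allocator" emits the full name once and then a
// short back-reference, so repeated components no longer blow up the mangle.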
Re: Passing private functions to template algorithms
On Tuesday, 7 June 2016 at 18:59:03 UTC, Timon Gehr wrote: I think it is obvious that this should work. Visibility checks need to happen during identifier lookup. This lookup is happening in the module where isFileNothrow is visible. I agree with you. However something in the back of my mind tells me that we'll be hitting nasty corner cases if we change the behavior.
Re: Andrei's list of barriers to D adoption
On Monday, 6 June 2016 at 22:25:34 UTC, Walter Bright wrote: On 6/6/2016 9:40 AM, jmh530 wrote: Also, it's not like all energy will be poured into one or two of them - ideally several could be worked on. Consider the recent gigantic thread on autodecoding. Is anyone working on any PRs for it? Adam did, I believe.
Re: Improving DMD's memory management
On Friday, 3 June 2016 at 15:04:59 UTC, Stefan Koch wrote: Hi, I just saw something in the Vision 2016 H2 Document which is very dear to me: improving the memory management in the compiler. I think we need at least three different allocation primitives. One for allocating temporary memory to which no one should keep references that have a longer lifetime than the memory block itself. One RC primitive. And one primitive for virtually immutable memory. One size fits all is not a solution that is going to scale. When the details of those primitives are worked out, replace _ALL_ malloc and free pairs with an appropriate allocation primitive. And I think of bump-the-pointer and never free as a perfectly fine allocation primitive, if the resources are virtually immutable. The temp-memory primitive can also be done with a bump-the-pointer method.
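A minimal sketch of such a temp-memory primitive (a bump-the-pointer region, not the compiler's actual code); nothing is freed individually, the whole block is reset or thrown away once the pass that owns it is done:

struct Region
{
    ubyte[] buffer;
    size_t used;

    void* allocate(size_t size)
    {
        size = (size + 15) & ~cast(size_t)15;            // keep 16-byte alignment
        assert(used + size <= buffer.length, "region exhausted");
        auto p = buffer.ptr + used;
        used += size;
        return p;
    }

    void reset() { used = 0; }   // "frees" every temporary at once
}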
Improving DMD's memory management
Hi, I just saw something in the Vision 2016 H2 Document which is very dear to me: improving the memory management in the compiler. I think we need at least three different allocation primitives. One for allocating temporary memory to which no one should keep references that have a longer lifetime than the memory block itself. One RC primitive. And one primitive for virtually immutable memory. One size fits all is not a solution that is going to scale. When the details of those primitives are worked out, replace _ALL_ malloc and free pairs with an appropriate allocation primitive.
Re: Lifetime tracking
On Friday, 3 June 2016 at 00:31:31 UTC, Walter Bright wrote: If they cover the cases that matter, it's good. Rust has the type system annotations you want, but Rust has a reputation for being difficult to write code for. I think we can incorporate typesafe borrowing without making it difficult to write.
Re: Lifetime tracking
On Thursday, 2 June 2016 at 23:05:40 UTC, Timon Gehr wrote: On 03.06.2016 00:29, Walter Bright wrote: On 6/2/2016 3:10 PM, Marco Leise wrote: we haven't looked into borrowing/scoped enough That's my fault. As for scoped, the idea is to make scope work analogously to DIP25's 'return ref'. I don't believe we need borrowing, we've worked out another solution that will work for ref counting. Please do not reply to this in this thread - start a new one if you wish to continue with this topic. I'd like to point out again why that design is inadequate: Whenever the type checker is using a certain piece of information to check validity of a program, there should be a way to pass that kind of information across function boundaries. Otherwise the type system is not modular. This is a serious defect. Seconded.
Re: Dealing with Autodecode
On Wednesday, 1 June 2016 at 00:46:04 UTC, Walter Bright wrote: It is not practical to just delete or deprecate autodecode - it is too embedded into things. Which things? The way to deal with it is to replace reliance on autodecode with .byDchar (.byDchar has a bonus of not throwing an exception on invalid UTF, but using the replacement dchar instead.) To that end, and this will be an incremental process: So does this mean we intend to carry the auto-decoding wart with us into the future, telling everyone: "The obvious way is broken; we just have it for backwards compatibility"? To come back to C++ [] vs. std::vector: they actually have valid reasons, mainly C compatibility, to keep [] as a pointer, I believe. As of now D is still flexible enough to make a radical change. We cannot keep putting this off! It is only going to get harder to remove it.
Re: Transient ranges
On Sunday, 29 May 2016 at 15:45:14 UTC, Joseph Rushton Wakeling wrote: What's the problem with introspecting that? There is none :) it could be implemented today.
Re: Transient ranges
On Saturday, 28 May 2016 at 17:27:17 UTC, Joseph Rushton Wakeling wrote: On Saturday, 28 May 2016 at 01:48:08 UTC, Jonathan M Davis wrote: On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote: So what about the convention to explicitely declare a `.transient` enum member on a range, if the front element value can change? Honestly, I don't think that supporting transient ranges is worth it. I have personally wondered if there was a case for a TransientRange concept where the only primitives defined are `empty` and `front`. `popFront()` is not defined because the whole point is that every single call to `front` will produce a different value. That is a rather sound idea.
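A sketch of what introspecting such a concept might look like, assuming the convention of providing only empty and front and deliberately omitting popFront (names and idiom are illustrative, not an existing Phobos trait):

enum isTransientRange(R) = is(typeof((R r)
{
    if (r.empty) {}
    auto e = r.front;
})) && !is(typeof((R r) { r.popFront(); }));

// Hypothetical example: a reader whose front reuses a shared buffer on every call.
struct LineReader
{
    bool empty() { return false; }
    char[] front() { return null; }   // imagine: refills and returns a shared buffer
}

static assert(isTransientRange!LineReader);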
Re: Need a Faster Compressor
On Tuesday, 24 May 2016 at 21:26:58 UTC, Walter Bright wrote: On 5/24/2016 2:06 PM, Stefan Koch wrote: A serious question, do these templates even need a mangle ? Is anyone ever going to call them from a different module ? I don't think you could do that because those are voldemort types. Those symbols are only temporary aren't they ? The types do exist in the compiler, and are used in template name generation, generating types, etc. But would one ever have to be able to reconstruct the mangle? Templates cannot be exported, nor can instantiations. Or can they?
Re: Need a Faster Compressor
On Tuesday, 24 May 2016 at 20:39:41 UTC, Walter Bright wrote: On 5/24/2016 1:27 PM, Guillaume Boucher wrote: There's no reason not to use the compression in the deco name. Just make sure the references are relative and you're set. Not totally, since you'd like to refer to types that are common across deco names. A serious question, do these templates even need a mangle ? Is anyone ever going to call them from a different module ? I don't think you could do that because those are voldemort types. Those symbols are only temporary aren't they ?
Re: Need a Faster Compressor
On Tuesday, 24 May 2016 at 16:22:56 UTC, Timon Gehr wrote: On 24.05.2016 01:02, Walter Bright wrote: [...] Speed. The title of this thread is "Need a faster compressor". [...] No. Just increase the recursion depth by a small number of levels to completely kill the speedups gained by using a faster compression algorithm. [...] Yes, it does. The compiler does not use exponential space to store the AST. [...] It's _exponential_ growth. We don't even want to spend the time and memory required to generate the strings. [...] The reason we have this discussion is that the worst case isn't rare enough to make this argument. Why compress in the first place if mangled names don't grow large in practice? I completely agree! However, such a thing will only work if dmd is used as a pre-linker and if we can stabilize the hash, i.e. every run of the compiler generates the same hash.
Re: Need a Faster Compressor
Just a heads-up on the LZ4. I have spent roughly 3 hours optimizing my decompressor, and while I had stunning success, a speed-up of about 400%, I am still about 600x slower than the C variant. It is still a mystery to me why that is :) since the generated code is both smaller and works almost without spills.
Re: Chat with Stefan Koch about CTFE reimplementation
On Monday, 23 May 2016 at 16:32:30 UTC, deadalnix wrote: It is like we have a car with square wheels, and you guys are reinventing a whole new car without even trying to maybe put round wheels on the existing one and see how it goes. In this particular model of car, the bolts of the wheels are on the inside, and you have to dismantle almost everything to change them.
Re: Chat with Stefan Koch about CTFE reimplementation
On Monday, 23 May 2016 at 16:32:30 UTC, deadalnix wrote: On Monday, 23 May 2016 at 15:57:42 UTC, rikki cattermole wrote: Hello! [...] Call me a party pooper or something, but this whole thing seems to get way out of control. In order to assess the quality of the new design, one needs to compare it to a baseline (aka the existing design), or one is basically going blind and unlikely to end up anywhere. There are obvious low-hanging fruits in the existing design, for instance using a region allocator and blasting it all after a CTFE run. It is like we have a car with square wheels, and you guys are reinventing a whole new car without even trying to maybe put round wheels on the existing one and see how it goes. Overall, that means you are embarking on an expensive project with 0 visibility on the actual value provided. I get your point. However, to stick with your car metaphor: we have a car with square wheels and a weak frame. As soon as round wheels enable it to go faster, something else will give. Also, I don't want a car. I want a rocket.
Re: Need a Faster Compressor
On Monday, 23 May 2016 at 15:33:45 UTC, Walter Bright wrote: Also, the LZ4 compressor posted here has a 64K string limit, which won't work for D because there are reported 8Mb identifier strings. This is only partially true. The 64k limit does not apply to the input string; it only applies to the dictionary. It would only be hit if we found 64k of identifier without repetition.