Re: Optimize away immediately-called delegate literals?
On Mon, 12 Mar 2012 20:28:15 -0400, H. S. Teoh wrote:

> Hmph. I tried this code:
>
>     import std.stdio;
>
>     struct A {
>         int[] data;
>
>         int opApply(int delegate(ref int) dg) {
>             foreach (d; data) {
>                 if (dg(d)) return 1;
>             }
>             return 0;
>         }
>     }
>
>     void main() {
>         A a;
>         int n = 0;
>
>         a.data = [1,2,3,4,5];
>         foreach (d; a) {
>             n++;
>         }
>     }
>
> With both dmd and gdc, the delegate is never inlined. :-( Compiling with
> gdc -O3 causes opApply to get inlined and loop-unrolled, but the call to
> the delegate is still there. With dmd -O, even opApply is not inlined,
> and the code is generally much longer per loop iteration than gdc -O3.

IIRC, ldc does inline opApply. But this is somewhat hearsay since I don't use ldc. I'm just remembering what others have posted here.

-Steve
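For comparison with the opApply version quoted above, here is a minimal sketch of the same loop written against a range instead. ARange, opSlice and count are names made up for this illustration, not anything from the posts. A foreach over a range is lowered to a plain for loop that calls empty, front and popFront directly, so no delegate is involved. Whether a particular compiler then inlines those three calls has not been measured here.

```d
import std.stdio;

// Hypothetical range type for this sketch (not from the thread).
struct ARange
{
    int[] data;

    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

struct A
{
    int[] data;

    // a[] hands out a range over the elements instead of using opApply.
    ARange opSlice() { return ARange(data); }
}

int count(A a)
{
    int n = 0;
    // Lowered to: for (auto r = a[]; !r.empty; r.popFront()) { auto d = r.front; ... }
    foreach (d; a[])
    {
        n++;
    }
    return n;
}

void main()
{
    auto a = A([1, 2, 3, 4, 5]);
    assert(count(a) == 5);
    writeln("visited ", count(a), " elements without a delegate");
}
```

The trade-off is that a range has to keep its iteration state in a struct, which is easy for an array but can be awkward for things like tree traversals, where opApply is often the simpler thing to write.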
Re: Optimize away immediately-called delegate literals?
On 3/12/2012 8:10 PM, Nick Sabalausky wrote:
> "Brad Roberts" wrote:
>>
>> See also: bug 4440
>>
>> The patch in there, if it hasn't bit rotten too badly (I suspect it has),
>> will handle _this_ case. But almost no other case of inlining delegates.
>>
>> It'd be a good area for someone who wants an interesting and non-trivial
>> problem to dig into. It wouldn't touch all that much of the codebase as
>> the inliner is fairly self-contained. At least, that's what I recall from
>> when I looked at this stuff a couple years ago.
>>
>
> Do you think that patch would be a good starting place for further work, or
> would a proper solution likely necessitate an entirely different approach?
>

I suspect that, to fix it for a broader set of cases, it'll need much larger changes. During those changes the simple version in that patch would likely disappear. But take that all with a huge grain of salt.
Re: Optimize away immediately-called delegate literals?
"Brad Roberts" wrote in message news:mailman.582.1331607753.4860.digitalmar...@puremagic.com...
> On 3/12/2012 4:15 PM, Peter Alexander wrote:
>> On Sunday, 11 March 2012 at 06:49:27 UTC, H. S. Teoh wrote:
>>> On Sun, Mar 11, 2012 at 01:29:01AM -0500, Nick Sabalausky wrote:
>>>> Suppose you have a delegate literal and immediately call it:
>>>>
>>>>     auto a = x + (){ doStuff(); return y; }() + z;
>>>>
>>>> Does DMD ever (or always?) optimize away a delegate if it's executed
>>>> immediately and never stored into a variable? If not, can it, and
>>>> would it be a simple change? Is something like this already on the
>>>> table?
>>> [...]
>>>
>>> I've always wondered about whether delegates passed to opApply ever get
>>> inlined.
>>
>> Don't wonder. Find out!
>>
>>     import std.stdio;
>>     void doStuff() { writeln("Howdy!"); }
>>     void main() {
>>         int x = 1, y = 2, z = 3;
>>         auto a = x + (){ doStuff(); return y; }() + z;
>>         writeln(a);
>>     }
>
> See also: bug 4440
>
> The patch in there, if it hasn't bit rotten too badly (I suspect it has),
> will handle _this_ case. But almost no other case of inlining delegates.
>
> It'd be a good area for someone who wants an interesting and non-trivial
> problem to dig into. It wouldn't touch all that much of the codebase as
> the inliner is fairly self-contained. At least, that's what I recall from
> when I looked at this stuff a couple years ago.
>

Do you think that patch would be a good starting place for further work, or would a proper solution likely necessitate an entirely different approach?
Re: Optimize away immediately-called delegate literals?
"Peter Alexander" wrote in message news:thetmhnnbeepmxgus...@forum.dlang.org...
> On Sunday, 11 March 2012 at 06:49:27 UTC, H. S. Teoh wrote:
>> On Sun, Mar 11, 2012 at 01:29:01AM -0500, Nick Sabalausky wrote:
>>> Suppose you have a delegate literal and immediately call it:
>>>
>>>     auto a = x + (){ doStuff(); return y; }() + z;
>>>
>>> Does DMD ever (or always?) optimize away a delegate if it's executed
>>> immediately and never stored into a variable? If not, can it, and
>>> would it be a simple change? Is something like this already on the
>>> table?
>> [...]
>>
>> I've always wondered about whether delegates passed to opApply ever get
>> inlined.
>
> Don't wonder. Find out!
>
>     import std.stdio;
>     void doStuff() { writeln("Howdy!"); }
>     void main() {
>         int x = 1, y = 2, z = 3;
>         auto a = x + (){ doStuff(); return y; }() + z;
>         writeln(a);
>     }
>
> $ dmd test.d -O -release -inline
>
> __Dmain:
> 0001106c pushq %rbp
[...]

I keep forgetting I can do that. ;)

>
> In short. No! It doesn't currently inline in this case.
>
> Even if the lambda just returns a constant, it doesn't get inlined.

Darn.

The reason this came up is I've gotten back into some more work on HaxeD (Haxe -> D converter). Haxe allows statements to be used as expressions. For example:

    x = 5 + if(cond) { foo(); 1; } else 2;

At the moment, I'm converting those by just tossing them into an immediately-called anonymous delegate:

    x = 5 + (){ ...blah... }();

But as I was afraid of, there's a performance hit with that. So eventually, I'll have to either do some fancier reworking:

    // Something like
    typeof(1) __tmp1;
    if(cond) {
        foo();
        __tmp1 = 1;
    } else
        __tmp1 = 2;
    x = 5 + __tmp1;

...which might even run into some problems with order-of-execution. Or delegate inlining would have to get added to DMD.

Now that I actually look (heh :) ), it looks like there's an old Bugzilla issue for it with an apparently overly-limited patch:

http://d.puremagic.com/issues/show_bug.cgi?id=4440

Unfortunately, according to Brad Roberts in the last comment: "Getting delegate inlining in dmd is going to take serious work."

(Ugh, I wish Windows had an easy way to copy/paste text *without* the styling.)
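To make the order-of-execution worry above concrete, here is a small sketch. The names left, inner, viaDelegate, viaNaiveHoist and viaFullHoist are invented for the illustration, and this is not how HaxeD does (or should do) the conversion. It records the order of side effects for three translations of something shaped like `left() + if(cond) { inner(); 1; } else 2`. Hoisting only the if-block into a temporary moves its side effects in front of left(); hoisting every operand to its left as well keeps the source order.

```d
import std.stdio;

string trace;                      // records the order of side effects

int left()  { trace ~= "L"; return 5; }
int inner() { trace ~= "I"; return 1; }

// Immediately-called delegate literal: the block stays in its place
// inside the expression.
int viaDelegate(bool cond)
{
    return left() + (){ if (cond) { inner(); return 1; } else return 2; }();
}

// Naive hoist: only the block is pulled out into a temporary,
// so inner() now runs before left().
int viaNaiveHoist(bool cond)
{
    int tmp;
    if (cond) { inner(); tmp = 1; } else tmp = 2;
    return left() + tmp;
}

// Full hoist: operands to the left of the block get temporaries too,
// which restores the left-to-right order of the source expression.
int viaFullHoist(bool cond)
{
    int t0 = left();
    int tmp;
    if (cond) { inner(); tmp = 1; } else tmp = 2;
    return t0 + tmp;
}

void main()
{
    trace = "";
    assert(viaNaiveHoist(true) == 6);
    assert(trace == "IL");         // side effects reordered

    trace = "";
    assert(viaFullHoist(true) == 6);
    assert(trace == "LI");         // source order preserved

    trace = "";
    assert(viaDelegate(true) == 6);
    writeln("delegate version trace: ", trace);
}
```

All three return the same value. Only the order of the calls differs, which matters as soon as left() and the block touch shared state.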
Re: Optimize away immediately-called delegate literals?
On 3/12/2012 4:15 PM, Peter Alexander wrote:
> On Sunday, 11 March 2012 at 06:49:27 UTC, H. S. Teoh wrote:
>> On Sun, Mar 11, 2012 at 01:29:01AM -0500, Nick Sabalausky wrote:
>>> Suppose you have a delegate literal and immediately call it:
>>>
>>>     auto a = x + (){ doStuff(); return y; }() + z;
>>>
>>> Does DMD ever (or always?) optimize away a delegate if it's executed
>>> immediately and never stored into a variable? If not, can it, and
>>> would it be a simple change? Is something like this already on the
>>> table?
>> [...]
>>
>> I've always wondered about whether delegates passed to opApply ever get
>> inlined.
>
> Don't wonder. Find out!
>
>     import std.stdio;
>     void doStuff() { writeln("Howdy!"); }
>     void main() {
>         int x = 1, y = 2, z = 3;
>         auto a = x + (){ doStuff(); return y; }() + z;
>         writeln(a);
>     }

See also: bug 4440

The patch in there, if it hasn't bit rotten too badly (I suspect it has), will handle _this_ case. But almost no other case of inlining delegates.

It'd be a good area for someone who wants an interesting and non-trivial problem to dig into. It wouldn't touch all that much of the codebase as the inliner is fairly self-contained. At least, that's what I recall from when I looked at this stuff a couple years ago.

Later,
Brad
Re: Optimize away immediately-called delegate literals?
On Tue, Mar 13, 2012 at 12:15:02AM +0100, Peter Alexander wrote:
> On Sunday, 11 March 2012 at 06:49:27 UTC, H. S. Teoh wrote:
> >On Sun, Mar 11, 2012 at 01:29:01AM -0500, Nick Sabalausky wrote:
> >>Suppose you have a delegate literal and immediately call it:
> >>
> >>    auto a = x + (){ doStuff(); return y; }() + z;
> >>
> >>Does DMD ever (or always?) optimize away a delegate if it's executed
> >>immediately and never stored into a variable? If not, can it, and
> >>would it be a simple change? Is something like this already on the
> >>table?
> >[...]
> >
> >I've always wondered about whether delegates passed to opApply ever
> >get inlined.
>
> Don't wonder. Find out!
>
>     import std.stdio;
>     void doStuff() { writeln("Howdy!"); }
>     void main() {
>         int x = 1, y = 2, z = 3;
>         auto a = x + (){ doStuff(); return y; }() + z;
>         writeln(a);
>     }
>
> $ dmd test.d -O -release -inline
>
> __Dmain:
> 0001106c  pushq  %rbp
> 0001106d  movq   %rsp,%rbp
> 00011070  pushq  %rax
> 00011071  pushq  %rbx
> 00011072  movq   $0x000c,%rdi
> 0001107c  callq  0x1000237f0 ; symbol stub for: __d_allocmemory
> 00011081  movq   %rax,%rbx
> 00011084  movq   $0x,(%rbx)
> 0001108b  movl   $0x0002,0x08(%rbx)
> 00011092  movq   %rbx,%rdi
> 00011095  call   *0x0002318c(%rip)
> 0001109c  leal   0x04(%rax),%edx
> 0001109f  movl   $0x000a,%esi
> 000110a4  leaq   0x00033eed(%rip),%rdi
> 000110ab  callq  0x10002319c ; symbol stub for: _D3std5stdio4File14__T5writeTiTaZ5writeMFiaZv
> 000110b0  xorl   %eax,%eax
> 000110b2  popq   %rbx
> 000110b3  movq   %rbp,%rsp
> 000110b6  popq   %rbp
> 000110b7  ret
>
> In short. No! It doesn't currently inline in this case.
>
> Even if the lambda just returns a constant, it doesn't get inlined.

Hmph. I tried this code:

    import std.stdio;

    struct A {
        int[] data;

        int opApply(int delegate(ref int) dg) {
            foreach (d; data) {
                if (dg(d)) return 1;
            }
            return 0;
        }
    }

    void main() {
        A a;
        int n = 0;

        a.data = [1,2,3,4,5];
        foreach (d; a) {
            n++;
        }
    }

With both dmd and gdc, the delegate is never inlined. :-( Compiling with gdc -O3 causes opApply to get inlined and loop-unrolled, but the call to the delegate is still there. With dmd -O, even opApply is not inlined, and the code is generally much longer per loop iteration than gdc -O3.

Here's the code generated by gdc for the foreach() loop in main():

    404839: 48 89 c3          mov    %rax,%rbx
    40483c: 48 8b 04 24       mov    (%rsp),%rax
    404840: 48 8d 74 24 3c    lea    0x3c(%rsp),%rsi
    404845: 48 8d 7c 24 20    lea    0x20(%rsp),%rdi
    40484a: 48 89 03          mov    %rax,(%rbx)
    40484d: 48 8b 44 24 08    mov    0x8(%rsp),%rax
    404852: 48 89 43 08       mov    %rax,0x8(%rbx)
    404856: 8b 44 24 10       mov    0x10(%rsp),%eax
    40485a: 89 43 10          mov    %eax,0x10(%rbx)
    40485d: 8b 03             mov    (%rbx),%eax
    40485f: 89 44 24 3c       mov    %eax,0x3c(%rsp)
    404863: ff d5             callq  *%rbp
    404865: 85 c0             test   %eax,%eax
    404867: 75 58             jne    4048c1 <_Dmain+0xd1>
    404869: 8b 43 04          mov    0x4(%rbx),%eax
    40486c: 48 8d 74 24 3c    lea    0x3c(%rsp),%rsi
    404871: 48 8d 7c 24 20    lea    0x20(%rsp),%rdi
    404876: 89 44 24 3c       mov    %eax,0x3c(%rsp)
    40487a: ff d5             callq  *%rbp
    40487c: 85 c0             test   %eax,%eax
    40487e: 75 41             jne    4048c1 <_Dmain+0xd1>
    404880: 8b 43 08          mov    0x8(%rbx),%eax
    404883: 48 8d 74 24 3c    lea    0x3c(%rsp),%rsi
    404888: 48 8d 7c 24 20    lea    0x20(%rsp),%rdi
    40488d: 89 44 24 3c       mov    %eax,0x3c(%rsp)
    404891: ff d5             callq  *%rbp
    404893: 85 c0             test   %eax,%eax
    404895: 75 2a             jne    4048c1 <_Dmain+0xd1>
    404897: 8b 43 0c          mov    0xc(%rbx),%eax
    40489a: 48 8d 74 24 3c    lea    0x3c(%rsp),%rsi
    40489f: 48 8d 7c 24 20    lea    0x20(%rsp),%rdi
    4048a4: 89 44 24 3c       mov    %eax,0x3c(%rsp)
    4048a8: ff d5             callq  *%rbp
    4048aa: 85 c0             test   %ea
Re: Optimize away immediately-called delegate literals?
On Sunday, 11 March 2012 at 06:49:27 UTC, H. S. Teoh wrote:
> On Sun, Mar 11, 2012 at 01:29:01AM -0500, Nick Sabalausky wrote:
>> Suppose you have a delegate literal and immediately call it:
>>
>>     auto a = x + (){ doStuff(); return y; }() + z;
>>
>> Does DMD ever (or always?) optimize away a delegate if it's executed
>> immediately and never stored into a variable? If not, can it, and
>> would it be a simple change? Is something like this already on the
>> table?
> [...]
>
> I've always wondered about whether delegates passed to opApply ever get
> inlined.

Don't wonder. Find out!

    import std.stdio;
    void doStuff() { writeln("Howdy!"); }
    void main() {
        int x = 1, y = 2, z = 3;
        auto a = x + (){ doStuff(); return y; }() + z;
        writeln(a);
    }

$ dmd test.d -O -release -inline

    __Dmain:
    0001106c  pushq  %rbp
    0001106d  movq   %rsp,%rbp
    00011070  pushq  %rax
    00011071  pushq  %rbx
    00011072  movq   $0x000c,%rdi
    0001107c  callq  0x1000237f0 ; symbol stub for: __d_allocmemory
    00011081  movq   %rax,%rbx
    00011084  movq   $0x,(%rbx)
    0001108b  movl   $0x0002,0x08(%rbx)
    00011092  movq   %rbx,%rdi
    00011095  call   *0x0002318c(%rip)
    0001109c  leal   0x04(%rax),%edx
    0001109f  movl   $0x000a,%esi
    000110a4  leaq   0x00033eed(%rip),%rdi
    000110ab  callq  0x10002319c ; symbol stub for: _D3std5stdio4File14__T5writeTiTaZ5writeMFiaZv
    000110b0  xorl   %eax,%eax
    000110b2  popq   %rbx
    000110b3  movq   %rbp,%rsp
    000110b6  popq   %rbp
    000110b7  ret

In short. No! It doesn't currently inline in this case.

Even if the lambda just returns a constant, it doesn't get inlined.
Re: [OT] Hanlon's Razor (Was: Optimize away immediately-called delegate literals?)
On Sun, Mar 11, 2012 at 03:03:39AM -0400, Nick Sabalausky wrote: > "H. S. Teoh" wrote in message > news:mailman.455.1331448575.4860.digitalmar...@puremagic.com... > > > > Never ascribe to malice that which is adequately explained by > > incompetence. -- Napoleon Bonaparte > > Pardon me veering offtopic at one of your taglines yet again, but I > have to say, this is one that I've believed very strongly in for > years. It's practically a mantra of mine, a big part of my own > personal philosophy. I've never heard it attributed to Napoleon, > though. I always knew it as "Hanlon's Razor", somewhat of a corollary > to Occam's Razor. [...] I guess my sources aren't that reliable. I just copy-n-pasted that quote from somebody else's sig. :-) T -- Some ideas are so stupid that only intellectuals could believe them. -- George Orwell
[OT] Hanlon's Razor (Was: Optimize away immediately-called delegate literals?)
"H. S. Teoh" wrote in message news:mailman.455.1331448575.4860.digitalmar...@puremagic.com... > > Never ascribe to malice that which is adequately explained by > incompetence. -- Napoleon Bonaparte Pardon me veering offtopic at one of your taglines yet again, but I have to say, this is one that I've believed very strongly in for years. It's practically a mantra of mine, a big part of my own personal philosophy. I've never heard it attributed to Napoleon, though. I always knew it as "Hanlon's Razor", somewhat of a corollary to Occam's Razor.
Re: Optimize away immediately-called delegate literals?
On Sun, Mar 11, 2012 at 01:29:01AM -0500, Nick Sabalausky wrote: > Suppose you have a delegate literal and immediately call it: > > auto a = x + (){ doStuff(); return y; }() + z; > > Does DMD ever (or always?) optimize away a delegate if it's executed > immediately and never stored into a variable? If not, can it, and > would it be a simple change? Is something like this already on the > table? [...] I've always wondered about whether delegates passed to opApply ever get inlined. Seems to be pretty silly to do the entire function pointer + context pointer and full function call thing, if both opApply and the delegate are very simple. Especially if opApply is not much more than a for loop of some sort, perhaps with a line or two of private member access code inserted. T -- Never ascribe to malice that which is adequately explained by incompetence. -- Napoleon Bonaparte
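For readers unfamiliar with why opApply involves "the entire function pointer + context pointer and full function call thing", here is a rough sketch of what a foreach over an opApply aggregate amounts to. The struct A mirrors the shape of the opApply posted elsewhere in this thread; countWithForeach and countWithExplicitDelegate are names invented for the illustration, and the hand-written version only approximates what the compiler generates.

```d
import std.stdio;

struct A
{
    int[] data;

    // The loop body arrives as a delegate and is called once per element.
    int opApply(int delegate(ref int) dg)
    {
        foreach (ref d; data)
        {
            // A nonzero result means the body wants to stop early.
            if (auto r = dg(d))
                return r;
        }
        return 0;
    }
}

int countWithForeach(A a)
{
    int n = 0;
    foreach (d; a) { n++; }
    return n;
}

// Approximately the same thing written out by hand: the body becomes a
// delegate (code pointer plus a context pointer giving access to n),
// and the loop itself becomes a call to opApply.
int countWithExplicitDelegate(A a)
{
    int n = 0;
    int delegate(ref int) dg = (ref int d) { n++; return 0; };
    a.opApply(dg);
    return n;
}

void main()
{
    auto a = A([1, 2, 3, 4, 5]);
    assert(countWithForeach(a) == 5);
    assert(countWithExplicitDelegate(a) == 5);
    writeln("both versions visited ", countWithForeach(a), " elements");
}
```

Each iteration therefore costs an indirect call through dg unless the optimizer can see through both opApply and the delegate, which is the inlining being asked about.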
Optimize away immediately-called delegate literals?
Suppose you have a delegate literal and immediately call it: auto a = x + (){ doStuff(); return y; }() + z; Does DMD ever (or always?) optimize away a delegate if it's executed immediately and never stored into a variable? If not, can it, and would it be a simple change? Is something like this already on the table?