long compile time question
The following takes nearly three minutes to compile.
The culprit is the line barr ~= B();
What is wrong with this?

Thanks,
Dan

    struct B {
      const size_t SIZE = 1024*64;
      int[SIZE] x;
    }

    void main() {
      B[] barr;
      barr ~= B();
    }
Re: long compile time question
On Tue, 23 Oct 2012 22:50:46 -0500, Dan wrote:

> The following takes nearly three minutes to compile.
> The culprit is the line barr ~= B();
> What is wrong with this?
>
> Thanks,
> Dan
>
> struct B {
>   const size_t SIZE = 1024*64;
>   int[SIZE] x;
> }
>
> void main() {
>   B[] barr;
>   barr ~= B();
> }

I have the same issue on Linux x64, 2.060.

So appending to a dynamic array isn't really that efficient, but this goes WAY over that line.

I'm timing your test now. It's still going...
Re: long compile time question
On Wednesday, 24 October 2012 at 04:49:19 UTC, 1100110 wrote:

> > The following takes nearly three minutes to compile.
> > The culprit is the line barr ~= B();
> > What is wrong with this?
>
> I have the same issue on Linux x64, 2.060.
>
> So appending to a dynamic array isn't really that efficient, but this goes WAY over that line.
>
> I'm timing your test now. It's still going...

It appears it's all happening while copying .init; why, I am not sure.

[code]
struct B {
  enum SIZE = 1024 * 64;
  int[SIZE] x;
}

//same timing issue, no array involved
void test(B b) {}

void main() {
  test(B());
}
[/code]

I've changed the *64 to various numbers and gotten curious results. The MB column is the (estimated) memory footprint of the compiler during compilation.

  N (SIZE = 1024 * N)   compile time   MB
   1                    0m0.725s       ??
   2                    0m1.414s       ??
   4                    0m2.620s       28
   8                    0m8.937s       30
  16                    0m35.869s      34
  32                    2m36.922s      42
  64                    9m27.353s      56
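The timings above can be turned into step-to-step growth factors with a few lines of D. This is only a sketch: the function name growthFactors is made up for illustration, and the numbers are the ones reported in the post, converted to seconds. Each step doubles the array size, so a factor near 2 would indicate linear growth and a factor near 4 quadratic growth.

```d
import std.stdio : writefln;

// Hypothetical helper: ratio of each timing to the previous one.
double[] growthFactors(const double[] secs) {
    double[] factors;
    foreach (i; 1 .. secs.length)
        factors ~= secs[i] / secs[i - 1];
    return factors;
}

void main() {
    // Multipliers N (SIZE = 1024 * N) and compile times in seconds,
    // copied from the post above (2m36.922s = 156.922s, 9m27.353s = 567.353s).
    immutable int[] n = [1, 2, 4, 8, 16, 32, 64];
    immutable double[] secs = [0.725, 1.414, 2.620, 8.937, 35.869, 156.922, 567.353];

    foreach (i, f; growthFactors(secs))
        writefln("N %2d -> %2d: x%.2f", n[i], n[i + 1], f);
}
```

For the larger sizes the factors land between roughly 3.4 and 4.4 per doubling, which looks closer to quadratic than linear compile time, although seven measurements from one machine are not much to go on.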
Re: long compile time question
On Wednesday, 24 October 2012 at 03:50:47 UTC, Dan wrote:

> The following takes nearly three minutes to compile.
> The culprit is the line barr ~= B();
> What is wrong with this?
>
> Thanks,
> Dan
>
> struct B {
>   const size_t SIZE = 1024*64;
>   int[SIZE] x;
> }
>
> void main() {
>   B[] barr;
>   barr ~= B();
> }

The code DMD generates for initializing the struct does not use loops, so it's

    xor ecx, ecx
    mov [eax], ecx
    mov [eax+4], ecx
    mov [eax+8], ecx
    mov [eax+0Ch], ecx
    mov [eax+10h], ecx
    mov [eax+14h], ecx
    mov [eax+18h], ecx
    mov [eax+1Ch], ecx
    mov [eax+20h], ecx
    mov [eax+24h], ecx
    mov [eax+28h], ecx
    mov [eax+2Ch], ecx
    mov [eax+30h], ecx
    mov [eax+34h], ecx
    mov [eax+38h], ecx
    ...

So your code creates a lot of work for the compiler.
Re: long compile time question
On Wednesday, 24 October 2012 at 09:50:38 UTC, Era Scarecrow wrote:

> On Wednesday, 24 October 2012 at 04:49:19 UTC, 1100110 wrote:
> > > The following takes nearly three minutes to compile.
> > > The culprit is the line barr ~= B();
> > > What is wrong with this?
> >
> > I have the same issue on Linux x64, 2.060.
> >
> > So appending to a dynamic array isn't really that efficient, but this goes WAY over that line.
> >
> > I'm timing your test now. It's still going...
>
> It appears it's all happening while copying .init; why, I am not sure.
>
> [code]
> struct B {
>   enum SIZE = 1024 * 64;
>   int[SIZE] x;
> }
>
> //same timing issue, no array involved
> void test(B b) {}
>
> void main() {
>   test(B());
> }
> [/code]
>
> I've changed the *64 to various numbers and gotten curious results. The MB column is the (estimated) memory footprint of the compiler during compilation.
>
>   N (SIZE = 1024 * N)   compile time   MB
>    1                    0m0.725s       ??
>    2                    0m1.414s       ??
>    4                    0m2.620s       28
>    8                    0m8.937s       30
>   16                    0m35.869s      34
>   32                    2m36.922s      42
>   64                    9m27.353s      56

According to the assembly, dmd just generates repetitive instructions to zero the memory instead of doing it in a loop. A workaround is to initialize the array to void and then zero it in a loop.
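A minimal sketch of that workaround, assuming the struct from the original post (written with enum, as in the reduced test case). The helper name clear is hypothetical. The array member is declared with = void so no default initializer is generated for it, and the zeroing is done explicitly at run time.

```d
struct B {
    enum SIZE = 1024 * 64;
    int[SIZE] x = void; // no default initialization: contents start out unspecified
}

// Hypothetical helper: zero the array with an explicit loop.
void clear(ref B b) {
    foreach (ref v; b.x)
        v = 0;
}

void main() {
    B[] barr;
    barr ~= B();          // the new element's x is not guaranteed to be zero yet
    clear(barr[$ - 1]);   // now every element of x is 0
    assert(barr[0].x[0] == 0 && barr[0].x[B.SIZE - 1] == 0);
}
```

The cost of this approach is that every place that creates a B has to remember to call the clearing routine, because B.init no longer guarantees zeroed contents.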
Re: long compile time question
On 24/10/12 17:39, thedeemon wrote:

> On Wednesday, 24 October 2012 at 03:50:47 UTC, Dan wrote:
> > The following takes nearly three minutes to compile.
> > The culprit is the line barr ~= B();
> > What is wrong with this?
> >
> > Thanks,
> > Dan
> >
> > struct B {
> >   const size_t SIZE = 1024*64;
> >   int[SIZE] x;
> > }
> >
> > void main() {
> >   B[] barr;
> >   barr ~= B();
> > }
>
> The code DMD generates for initializing the struct does not use loops, so it's
>
> xor ecx, ecx
> mov [eax], ecx
> mov [eax+4], ecx
> mov [eax+8], ecx
> mov [eax+0Ch], ecx
> mov [eax+10h], ecx
> mov [eax+14h], ecx
> mov [eax+18h], ecx
> mov [eax+1Ch], ecx
> mov [eax+20h], ecx
> mov [eax+24h], ecx
> mov [eax+28h], ecx
> mov [eax+2Ch], ecx
> mov [eax+30h], ecx
> mov [eax+34h], ecx
> mov [eax+38h], ecx
> ...
>
> So your code creates a lot of work for the compiler.

That's incredibly horrible, please add to bugzilla.
Re: long compile time question
On Wed, Oct 24, 2012 at 06:04:10PM +0200, Don Clugston wrote:

> On 24/10/12 17:39, thedeemon wrote:
> > On Wednesday, 24 October 2012 at 03:50:47 UTC, Dan wrote:
> > > The following takes nearly three minutes to compile.
> > > The culprit is the line barr ~= B();
> > > What is wrong with this?
> > >
> > > Thanks,
> > > Dan
> > >
> > > struct B {
> > >   const size_t SIZE = 1024*64;
> > >   int[SIZE] x;
> > > }
> > >
> > > void main() {
> > >   B[] barr;
> > >   barr ~= B();
> > > }
> >
> > The code DMD generates for initializing the struct does not use loops,
> > so it's
> > xor ecx, ecx
> > mov [eax], ecx
> > mov [eax+4], ecx
> > mov [eax+8], ecx
> > mov [eax+0Ch], ecx
> > mov [eax+10h], ecx
> > mov [eax+14h], ecx
> > mov [eax+18h], ecx
> > mov [eax+1Ch], ecx
[...]

Yikes!! Why aren't we using memset (or equivalent) here?!

> That's incredibly horrible, please add to bugzilla.
[...]

Yeah, no kidding! For comparison, GDC does a way better job in this department:

    $ time dmd test.d
    real    0m7.564s
    user    0m7.529s
    sys     0m0.029s
    $ time gdc test.d
    real    0m0.107s
    user    0m0.069s
    sys     0m0.036s
    $

This is with SIZE = 1024*16 (I didn't dare try it with 1024*64).

That's roughly a 70:1 ratio between dmd and gdc in wall-clock time, which is pretty horrible, since dmd is usually significantly faster than gdc.

Surprisingly, though, dmd still produces a smaller executable than gdc for this code! I'm guessing the optimizer cleans up that code afterwards? (Or maybe there are other factors at play here that I'm not aware of.)

T

-- 
EMACS = Extremely Massive And Cumbersome System
Re: long compile time question
On Wednesday, 24 October 2012 at 17:43:11 UTC, H. S. Teoh wrote:

> Surprisingly, though, dmd still produces a smaller executable than gdc for this code! I'm guessing the optimizer cleans up that code afterwards? (Or maybe there are other factors at play here that I'm not aware of.)

Must be other factors. "Optimized" code (generated by dmd with -release -O) looks like

    mov DWORD PTR [edx], 0
    mov DWORD PTR [edx+4], 0
    mov DWORD PTR [edx+8], 0
    ...

so it should be even bigger and probably slower.
Re: long compile time question
On Wednesday, October 24, 2012 18:04:10 Don Clugston wrote:

> That's incredibly horrible, please add to bugzilla.

There are at least a couple of potentially related bugs already:

http://d.puremagic.com/issues/show_bug.cgi?id=8828
http://d.puremagic.com/issues/show_bug.cgi?id=8449

though this may still merit a new report.

- Jonathan M Davis
Re: long compile time question
Dan wrote:

> The following takes nearly three minutes to compile.

... and this returns immediately:

    struct B {
      const size_t SIZE = 1024*64;
      int[SIZE] x= void; // !!!
    }

    void main() {
      B[] barr;
      barr ~= B();
    }

-manfred
Re: long compile time question
On Wednesday, October 24, 2012 21:35:22 Manfred Nowak wrote:

> Dan wrote:
> > The following takes nearly three minutes to compile.
>
> ... and this returns immediately:
>
> struct B {
>   const size_t SIZE = 1024*64;
>   int[SIZE] x= void; // !!!
> }
>
> void main() {
>   B[] barr;
>   barr ~= B();
> }
>
> -manfred

That's incredibly dangerous though, because then B.init has garbage in it. Still, it's interesting that it would affect the speed. Maybe the compiler recognizes that it's garbage and so doesn't bother copying it (meaning that you'll get different garbage for every use of B.init).

- Jonathan M Davis
Re: long compile time question
On Wednesday, 24 October 2012 at 15:39:19 UTC, thedeemon wrote:

> The code DMD generates for initializing the struct does not use loops, so it's
>
> xor ecx, ecx
> mov [eax], ecx
> mov [eax+4], ecx
> mov [eax+8], ecx
> ...
>
> So your code creates a lot of work for the compiler.

That seems silly. I would think after the struct's init/contents were known it would make a single block that holds the basic init for it and bulk copy every time it needed it (if it's beyond a certain size, say 32 bytes).

Also memset only works if the data can be defaulted to 0. Hmmm...
Re: long compile time question
On Thu, Oct 25, 2012 at 12:05:25AM +0200, Era Scarecrow wrote:

> On Wednesday, 24 October 2012 at 15:39:19 UTC, thedeemon wrote:
> > The code DMD generates for initializing the struct does not use
> > loops, so it's
> > xor ecx, ecx
> > mov [eax], ecx
> > mov [eax+4], ecx
> > mov [eax+8], ecx
> > ...
> >
> > So your code creates a lot of work for the compiler.
>
> That seems silly. I would think after the struct's init/contents
> were known it would make a single block that holds the basic init
> for it and bulk copy every time it needed it (if it's beyond a
> certain size, say 32 bytes).
>
> Also memset only works if the data can be defaulted to 0. Hmmm...

Not true. Memset is *usually* used to set memory to 0, but it can set memory to other byte values too. Although this doesn't help if the .init value isn't something consisting of repeated byte values.

In any case, though, separately initializing every member of an array is silly. That's what a loop is for. That, or a memcpy from an immutable copy of .init.

T

-- 
Life would be easier if I had the source code. -- YHL
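To make the point about byte values concrete, here is a small sketch using memset from the C library via core.stdc.string. The helper name filledWithByte is made up for this example. It shows that memset can write a non-zero byte, and also why that only helps when the wanted value is the same byte repeated.

```d
import core.stdc.string : memset;

// Hypothetical helper: an int[4] with every *byte* set to b.
int[4] filledWithByte(ubyte b) {
    int[4] a;
    memset(a.ptr, b, a.sizeof); // a.sizeof is 16 bytes for int[4]
    return a;
}

void main() {
    auto ones = filledWithByte(0xFF);
    assert(ones[0] == -1 && ones[3] == -1);   // 0xFFFFFFFF in every int

    auto odd = filledWithByte(0x01);
    assert(odd[0] == 0x01010101);             // not 1: each of the four bytes is 0x01

    int[4] filled;
    filled[] = 1;                             // element-wise fill works for any value
    assert(filled[0] == 1 && filled[3] == 1);
}
```

So an initializer such as int 1 or float NaN cannot be produced by memset alone, which is where a loop or a copy from a stored .init image comes in.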
Re: long compile time question
> In any case, though, separately initializing every member of an array is silly. That's what a loop is for. That, or a memcpy from an immutable copy of .init.

I think that the reasoning behind DMD's implementation is that for small structs, writing out the instructions is more efficient than a loop or a memcpy(); it's essentially the equivalent of loop unrolling and function inlining. However, that reasoning breaks down as soon as the struct's size goes beyond a certain value.

In my opinion, though, this behavior should be kept for small structs. For example, if you have a struct that just wraps a size_t, just generating a move instruction is _way_ faster than a call to memcpy().
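The size-based split described here can be imitated in user code, purely as an illustration of the idea and not as a description of what DMD does internally. In this sketch the names setToInit and inlineLimit are invented, and the 32-byte cutoff is just the figure floated earlier in the thread.

```d
import core.stdc.string : memcpy;

enum inlineLimit = 32; // bytes; hypothetical cutoff, not a DMD setting

// Hypothetical helper: reset target to T.init, choosing the strategy by size.
void setToInit(T)(ref T target) {
    static if (T.sizeof <= inlineLimit) {
        target = T.init;                    // small: a plain assignment
    } else {
        static immutable T image = T.init;  // large: one stored copy of .init
        memcpy(&target, &image, T.sizeof);  // bulk copy from that image
    }
}

struct Small { size_t n = 5; }
struct Large { int[4096] x = 1; }

void main() {
    Small a = void;
    Large b = void;
    setToInit(a);
    setToInit(b);
    assert(a.n == 5);
    assert(b.x[0] == 1 && b.x[4095] == 1);
}
```

Because the choice is made with static if, only one of the two branches is compiled for each type, so the small case pays no call overhead and the large case never expands into per-element stores in the source.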
Re: long compile time question
On Saturday, 27 October 2012 at 23:07:19 UTC, BLM768 wrote:

> > In any case, though, separately initializing every member of an array is silly. That's what a loop is for. That, or a memcpy from an immutable copy of .init.
>
> I think that the reasoning behind DMD's implementation is that for small structs, writing out the instructions is more efficient than a loop or a memcpy(); it's essentially the equivalent of loop unrolling and function inlining. However, that reasoning breaks down as soon as the struct's size goes beyond a certain value.
>
> In my opinion, though, this behavior should be kept for small structs. For example, if you have a struct that just wraps a size_t, just generating a move instruction is _way_ faster than a call to memcpy().

I just realized that this post is redundant; other posts have also mentioned optimization for small structs. That makes two relatively dumb posts from me in a day; maybe I should just stop for now. :)