long compile time question

2012-10-23 Thread Dan

The following takes nearly three minutes to compile.
The culprit is the line barr ~= B();
What is wrong with this?

Thanks,
Dan

struct B {
  const size_t SIZE = 1024*64;
  int[SIZE] x;
}

void main() {
  B[] barr;
  barr ~= B();
}
-


Re: long compile time question

2012-10-23 Thread 1100110

On Tue, 23 Oct 2012 22:50:46 -0500, Dan wrote:

> The following takes nearly three minutes to compile.
> The culprit is the line barr ~= B();
> What is wrong with this?
>
> Thanks,
> Dan
>
> struct B {
>    const size_t SIZE = 1024*64;
>    int[SIZE] x;
> }
>
> void main() {
>    B[] barr;
>    barr ~= B();
> }

I have the same issue on Linux x64 with 2.060.

So appending to a dynamic array isn't really that efficient.
But this goes WAY over that line.  I'm timing your test now.

It's still going...




Re: long compile time question

2012-10-24 Thread Era Scarecrow

On Wednesday, 24 October 2012 at 04:49:19 UTC, 1100110 wrote:
> > The following takes nearly three minutes to compile.
> > The culprit is the line barr ~= B();
> > What is wrong with this?
>
> I have the same issue on linux x64 2.060
>
> So appending to a dynamic array isn't really that efficient.
> But this goes WAY over that line.  I'm timing your test now.
>
> It's still going...

 It appears it's all happening while copying the init value; why,
I am not sure.


[code]
struct B {
  enum SIZE = 1024 * 64;
  int[SIZE] x;
}

//same timing issue, no array involved
void test(B b) {}

void main() {
  test(B());
}
[/code]

 I've changed the *64 to various numbers and got curious
results. The mb figure is the (estimated) memory footprint, in
megabytes, during compilation.


1:   0m0.725s   mb:??
2:   0m1.414s   mb:??
4:   0m2.620s   mb:28
8:   0m8.937s   mb:30
16:  0m35.869s  mb:34
32:  2m36.922s  mb:42
64:  9m27.353s  mb:56
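
How fast these timings grow can be checked with a few lines of D. The sketch below hard-codes the figures from the table (converted to seconds) and prints the ratio between consecutive rows; `growthRatios` is a helper name made up for this illustration.

```d
import std.stdio;

/// Ratio of each timing to the one before it.
double[] growthRatios(const double[] seconds)
{
    double[] ratios;
    foreach (i; 1 .. seconds.length)
        ratios ~= seconds[i] / seconds[i - 1];
    return ratios;
}

void main()
{
    // Multipliers and compile times taken from the table, in seconds.
    immutable int[] multiplier = [1, 2, 4, 8, 16, 32, 64];
    immutable double[] seconds =
        [0.725, 1.414, 2.620, 8.937, 35.869, 156.922, 567.353];

    auto ratios = growthRatios(seconds);
    foreach (i, r; ratios)
        writefln("%2d -> %2d: x%.2f", multiplier[i], multiplier[i + 1], r);
}
```

From the 4 to 8 step onward, each doubling multiplies the compile time by roughly 3.4 to 4.4, which is close to quadratic growth in the array size rather than linear.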


Re: long compile time question

2012-10-24 Thread thedeemon

On Wednesday, 24 October 2012 at 03:50:47 UTC, Dan wrote:
> The following takes nearly three minutes to compile.
> The culprit is the line barr ~= B();
> What is wrong with this?
>
> Thanks,
> Dan
>
> struct B {
>   const size_t SIZE = 1024*64;
>   int[SIZE] x;
> }
>
> void main() {
>   B[] barr;
>   barr ~= B();
> }


The code DMD generates for initializing the struct does not use
loops, so it looks like this:

xor ecx, ecx
mov [eax], ecx
mov [eax+4], ecx
mov [eax+8], ecx
mov [eax+0Ch], ecx
mov [eax+10h], ecx
mov [eax+14h], ecx
mov [eax+18h], ecx
mov [eax+1Ch], ecx
mov [eax+20h], ecx
mov [eax+24h], ecx
mov [eax+28h], ecx
mov [eax+2Ch], ecx
mov [eax+30h], ecx
mov [eax+34h], ecx
mov [eax+38h], ecx
...

So your code creates a lot of work for the compiler.


Re: long compile time question

2012-10-24 Thread Maxim Fomin
On Wednesday, 24 October 2012 at 09:50:38 UTC, Era Scarecrow wrote:
> On Wednesday, 24 October 2012 at 04:49:19 UTC, 1100110 wrote:
> > > The following takes nearly three minutes to compile.
> > > The culprit is the line barr ~= B();
> > > What is wrong with this?
> >
> > I have the same issue on linux x64 2.060
> >
> > So appending to a dynamic array isn't really that efficient.
> > But this goes WAY over that line.  I'm timing your test now.
> >
> > It's still going...
>
>  It appears it's all happening during copying init, why I am
> not sure.
>
> [code]
> struct B {
>   enum SIZE = 1024 * 64;
>   int[SIZE] x;
> }
>
> //same timing issue, no array involved
> void test(B b) {}
>
> void main() {
>   test(B());
> }
> [/code]
>
>  I've changed the *64 to various numbers and gotten curious
> results. The mb is the noted (estimated) memory footprint it
> used during compiling.
>
> 1:   0m0.725s   mb:??
> 2:   0m1.414s   mb:??
> 4:   0m2.620s   mb:28
> 8:   0m8.937s   mb:30
> 16:  0m35.869s  mb:34
> 32:  2m36.922s  mb:42
> 64:  9m27.353s  mb:56

According to the assembly, dmd just generates repetitive instructions
to zero the memory instead of doing it in a loop. A workaround is to
initialize the array to void and then zero it in a loop.
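
A minimal sketch of that workaround, assuming the struct from the original post. The `zeroed` helper is a hypothetical name, and the effect on compile time was not measured here.

```d
struct B {
    enum SIZE = 1024 * 64;
    int[SIZE] x = void;   // ask the compiler not to default-initialize x

    // Hypothetical helper: returns a B whose elements are zeroed in a loop.
    static B zeroed() {
        B b = void;        // skip default initialization of the local as well
        foreach (ref e; b.x)
            e = 0;
        return b;
    }
}

void main() {
    B[] barr;
    barr ~= B.zeroed();
    assert(barr[0].x[0] == 0 && barr[0].x[$ - 1] == 0);
}
```

Every B has to be created through `zeroed()` for this to be safe; a plain `B()` gives no guarantee about the contents of `x`.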




Re: long compile time question

2012-10-24 Thread Don Clugston

On 24/10/12 17:39, thedeemon wrote:
> On Wednesday, 24 October 2012 at 03:50:47 UTC, Dan wrote:
> > The following takes nearly three minutes to compile.
> > The culprit is the line barr ~= B();
> > What is wrong with this?
> >
> > Thanks,
> > Dan
> >
> > struct B {
> >   const size_t SIZE = 1024*64;
> >   int[SIZE] x;
> > }
> >
> > void main() {
> >   B[] barr;
> >   barr ~= B();
> > }
>
> The code DMD generates for initializing the struct does not use loops,
> so it's
> xor ecx, ecx
> mov [eax], ecx
> mov [eax+4], ecx
> mov [eax+8], ecx
> mov [eax+0Ch], ecx
> mov [eax+10h], ecx
> mov [eax+14h], ecx
> mov [eax+18h], ecx
> mov [eax+1Ch], ecx
> mov [eax+20h], ecx
> mov [eax+24h], ecx
> mov [eax+28h], ecx
> mov [eax+2Ch], ecx
> mov [eax+30h], ecx
> mov [eax+34h], ecx
> mov [eax+38h], ecx
> ...
>
> So your code creates a lot of work for the compiler.


That's incredibly horrible, please add to bugzilla.




Re: long compile time question

2012-10-24 Thread H. S. Teoh
On Wed, Oct 24, 2012 at 06:04:10PM +0200, Don Clugston wrote:
> On 24/10/12 17:39, thedeemon wrote:
> >On Wednesday, 24 October 2012 at 03:50:47 UTC, Dan wrote:
> >>The following takes nearly three minutes to compile.
> >>The culprit is the line barr ~= B();
> >>What is wrong with this?
> >>
> >>Thanks,
> >>Dan
> >>
> >>struct B {
> >>  const size_t SIZE = 1024*64;
> >>  int[SIZE] x;
> >>}
> >>
> >>void main() {
> >>  B[] barr;
> >>  barr ~= B();
> >>}
> >>-
> >
> >The code DMD generates for initializing the struct does not use loops,
> >so it's
> >xor ecx, ecx
> >mov [eax], ecx
> >mov [eax+4], ecx
> >mov [eax+8], ecx
> >mov [eax+0Ch], ecx
> >mov [eax+10h], ecx
> >mov [eax+14h], ecx
> >mov [eax+18h], ecx
> >mov [eax+1Ch], ecx
[...]

Yikes!! Why aren't we using memset (or equivalent) here?!


> That's incredibly horrible, please add to bugzilla.
[...]

Yeah, no kidding! For comparison, GDC does a way better job in this
department:

$ time dmd test.d

real    0m7.564s
user    0m7.529s
sys     0m0.029s
$ time gdc test.d

real    0m0.107s
user    0m0.069s
sys     0m0.036s
$

This is with SIZE = 1024*16 (I didn't dare try it with 1024*64).  That's
roughly a 70:1 ratio between dmd and gdc, which is pretty horrible, since
dmd is usually significantly faster than gdc.

Surprisingly, though, dmd still produces a smaller executable than gdc
for this code! I'm guessing the optimizer cleans up that code
afterwards? (Or maybe there are other factors at play here that I'm not
aware of.)


T

-- 
EMACS = Extremely Massive And Cumbersome System


Re: long compile time question

2012-10-24 Thread thedeemon

On Wednesday, 24 October 2012 at 17:43:11 UTC, H. S. Teoh wrote:
> Surprisingly, though, dmd still produces a smaller executable than gdc
> for this code! I'm guessing the optimizer cleans up that code
> afterwards? (Or maybe there are other factors at play here that I'm not
> aware of.)


Must be other factors. "Optimized" code (generated by dmd with 
-release -O) looks like

mov DWORD PTR [edx], 0
mov DWORD PTR [edx+4], 0
mov DWORD PTR [edx+8], 0
...
so it should be even bigger and probably slower.



Re: long compile time question

2012-10-24 Thread Jonathan M Davis
On Wednesday, October 24, 2012 18:04:10 Don Clugston wrote:
> That's incredibly horrible, please add to bugzilla.

There are at least a couple of potentially related bugs already:

http://d.puremagic.com/issues/show_bug.cgi?id=8828
http://d.puremagic.com/issues/show_bug.cgi?id=8449

though this may still merit a new report.

- Jonathan M Davis


Re: long compile time question

2012-10-24 Thread Manfred Nowak
Dan wrote:

> The following takes nearly three minutes to compile.

... and this returns immediately:


struct B {
   const size_t SIZE = 1024*64;
   int[SIZE] x= void; // !!!
}

void main() {
   B[] barr;
   barr ~= B();
}
---
- manfred


Re: long compile time question

2012-10-24 Thread Jonathan M Davis
On Wednesday, October 24, 2012 21:35:22 Manfred Nowak wrote:
> Dan wrote:
> > The following takes nearly three minutes to compile.
> 
> ... and this returns immediately:
> 
> 
> struct B {
> const size_t SIZE = 1024*64;
> int[SIZE] x= void; // !!!
> }
> 
> void main() {
> B[] barr;
> barr ~= B();
> }
> ---
> - manfred

That's incredibly dangerous though, because then B.init has garbage in it.
Still, it's interesting that it would affect the speed. Maybe the compiler
recognizes that it's garbage and so doesn't bother copying it (meaning that
you'll get different garbage for every use of B.init).

- Jonathan M Davis


Re: long compile time question

2012-10-24 Thread Era Scarecrow

On Wednesday, 24 October 2012 at 15:39:19 UTC, thedeemon wrote:
> The code DMD generates for initializing the struct does not use
> loops, so it's
>
> xor ecx, ecx
> mov [eax], ecx
> mov [eax+4], ecx
> mov [eax+8], ecx
> ...
>
> So your code creates a lot of work for the compiler.


 That seems silly. I would think after the struct's init/contents 
were known it would make a single block that holds the basic init 
for it and bulk copy every time it needed it (if it's beyond a 
certain size, say 32 bytes).


 Also memset only works if the data can be defaulted to 0. Hmmm...


Re: long compile time question

2012-10-24 Thread H. S. Teoh
On Thu, Oct 25, 2012 at 12:05:25AM +0200, Era Scarecrow wrote:
> On Wednesday, 24 October 2012 at 15:39:19 UTC, thedeemon wrote:
> >The code DMD generates for initializing the struct does not use
> >loops, so it's
> >xor ecx, ecx
> >mov [eax], ecx
> >mov [eax+4], ecx
> >mov [eax+8], ecx
> >...
> >
> >So your code creates a lot of work for the compiler.
> 
>  That seems silly. I would think after the struct's init/contents
> were known it would make a single block that holds the basic init
> for it and bulk copy every time it needed it (if it's beyond a
> certain size, say 32 bytes).
> 
>  Also memset only works if the data can be defaulted to 0. Hmmm...

Not true. Memset is *usually* used to set memory to 0, but it can set
memory to other byte values too. Although this doesn't help if the .init
value isn't something consisting of repeated byte values.

In any case, though, separately initializing every member of an array is
silly. That's what a loop is for. That, or a memcpy from an immutable
copy of .init.
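
Both points can be shown in a short D sketch. The struct `S`, the module-level `initImage`, and the two helper functions are names invented for this illustration; `memset` and `memcpy` are the C library functions as exposed by `core.stdc.string`.

```d
import core.stdc.string : memcpy, memset;

struct S {
    int[4] a = [7, 7, 7, 7];   // .init is not a repeated byte pattern
}

// One read-only copy of the initial value, used as the memcpy source.
immutable S initImage = S.init;

// memset writes the same byte into every byte of the buffer.
int[4] filledWith(ubyte b) {
    int[4] buf;
    memset(buf.ptr, b, buf.sizeof);
    return buf;
}

// No single byte value yields 7 per int, so copy the stored image instead.
S freshCopy() {
    S s = void;
    memcpy(&s, &initImage, S.sizeof);
    return s;
}

void main() {
    auto buf = filledWith(0x01);
    assert(buf[0] == 0x01010101);   // four 0x01 bytes, not the int value 1

    auto s = freshCopy();
    assert(s.a == [7, 7, 7, 7]);
}
```

Filling with the byte 0x01 gives 0x01010101 in each int, which is why memset only suits initial values made of one repeated byte, while the memcpy route works for any .init.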


T

-- 
Life would be easier if I had the source code. -- YHL


Re: long compile time question

2012-10-27 Thread BLM768




> In any case, though, separately initializing every member of an array is
> silly. That's what a loop is for. That, or a memcpy from an immutable
> copy of .init.


I think that the reasoning behind DMD's implementation is that 
for small structs, writing out the instructions is more efficient 
than a loop or a memcpy(); it's essentially the equivalent of 
loop unrolling and function inlining. However, that reasoning 
breaks down as soon as the struct's size goes beyond a certain 
value.
In my opinion, though, this behavior should be kept for small 
structs. For example, if you have a struct that just wraps a 
size_t, just generating a move instruction is _way_ faster than a 
call to memcpy().
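
The size-based choice described here could be sketched in user code as below. This is only an illustration of the idea, not what DMD does internally: `resetToZero`, `inlineLimit`, `Small` and `Big` are hypothetical names, and the 32-byte cutoff is simply the figure floated earlier in the thread.

```d
import core.stdc.string : memset;

enum inlineLimit = 32;   // hypothetical cutoff in bytes

/// Resets `value` to its initial state, picking the strategy at compile time.
void resetToZero(T)(ref T value) {
    static if (T.sizeof <= inlineLimit) {
        value = T.init;                  // small: a handful of stores
    } else {
        // The bulk path is only correct when .init is all zero bits.
        static assert(__traits(isZeroInit, T), "bulk path needs a zero .init");
        memset(&value, 0, T.sizeof);     // large: one bulk call
    }
}

struct Small { size_t n; }
struct Big   { int[1024] x; }

void main() {
    auto s = Small(42);
    Big b;
    b.x[] = 5;

    resetToZero(s);
    resetToZero(b);
    assert(s.n == 0 && b.x[0] == 0 && b.x[$ - 1] == 0);
}
```

Because the branch is a `static if`, each instantiation contains only one of the two strategies, so small structs keep the cheap inline stores.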





Re: long compile time question

2012-10-27 Thread BLM768

On Saturday, 27 October 2012 at 23:07:19 UTC, BLM768 wrote:
> > In any case, though, separately initializing every member of an array is
> > silly. That's what a loop is for. That, or a memcpy from an immutable
> > copy of .init.
>
> I think that the reasoning behind DMD's implementation is that
> for small structs, writing out the instructions is more
> efficient than a loop or a memcpy(); it's essentially the
> equivalent of loop unrolling and function inlining. However,
> that reasoning breaks down as soon as the struct's size goes
> beyond a certain value.
> In my opinion, though, this behavior should be kept for small
> structs. For example, if you have a struct that just wraps a
> size_t, just generating a move instruction is _way_ faster than
> a call to memcpy().


I just realized that this post is redundant; other posts have 
also mentioned optimization for small structs. That makes two 
relatively dumb posts from me in a day; maybe I should just stop 
for now. :)