On Feb 9, 2012, at 10:14 AM, Marco Leise wrote:

> Am 09.02.2012, 17:22 Uhr, schrieb dsimcha <dsim...@yahoo.com>:
> 
>> I wonder how much it helps to just optimize the GC a little.  How much does 
>> the performance gap close when you use DMD 2.058 beta instead of 2.057?  
>> This upcoming release has several new garbage collector optimizations.  If 
>> the GC is the bottleneck, then it's not surprising that anything that relies 
>> heavily on it is slow because D's GC is still fairly naive.
> 
> I did some OProfile-ing. The full report is attached, but for simplicity it 
> is without call graph this time. Here is an excerpt:
> 
> CPU: Core 2, speed 2001 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit 
> mask of 0x00 (Unhalted core cycles) count 100000
> samples  %        linenr info                 symbol name
> 13838    18.8416  gcx.d:426                   void* gc.gcx.GC.malloc(ulong, uint, ulong*)
> 4465      6.0795  gcx.d:2454                  ulong gc.gcx.Gcx.fullcollect(void*)

One random thing that just occurred to me… if the standard receive pattern is:

receive((int x) { … });

then there's a good chance that a stack frame is being dynamically allocated for the 
delegate when it's passed to receive (since I don't believe there's any way to 
declare the parameters to receive as "scope").  I'll have to check this, and 
maybe consider changing receive to use alias template parameters instead of 
normal function parameters.
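
For reference, here's a minimal sketch of the difference, using a hypothetical 
handleMessage function in place of receive (this is not std.concurrency's actual 
signature, just an illustration of the three parameter styles):

void handleMessage(void delegate(int) handler)
{
    // Plain delegate parameter: a capturing lambda passed here may force the
    // caller's frame onto the GC heap, since the delegate could escape.
    handler(42);
}

void handleMessageScoped(scope void delegate(int) handler)
{
    // scope delegate parameter: the delegate is guaranteed not to escape, so
    // the captured frame can stay on the caller's stack (no GC allocation).
    handler(42);
}

void handleMessageAlias(alias handler)()
{
    // alias template parameter: the handler is bound at compile time, so no
    // delegate needs to escape and nothing forces the frame onto the heap.
    handler(42);
}

void demoPlain()
{
    int total;
    handleMessage((int x) { total += x; });  // likely heap-allocates the frame holding 'total'
}

void demoScoped()
{
    int total;
    handleMessageScoped((int x) { total += x; });  // frame can remain on the stack
}

void demoAlias()
{
    int total;
    handleMessageAlias!((int x) { total += x; })();  // no runtime delegate built at the call site
}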
