Thanks for the suggestions.  I'll try them and report back.  That said,
I've since found that, of three non-identical systems, the problem
only occurs on one.  So I may try different kernel/system libs and see
where that gets me.

-qg is funny.  My interpretation of the results so far is that, when
the parallel collector doesn't get stalled, it's a big win.  But when
the parallel GC does stall, it's slower than disabling parallel GC
entirely.

I had thought the last-core parallel slowdown problem was fixed a
while ago, but apparently not?

Thanks,
John

On Tue, Jun 19, 2012 at 8:49 AM, Ben Lippmeier wrote:
>
> On 19/06/2012, at 24:48 , Tyson Whitehead wrote:
>
>> On June 18, 2012 04:20:51 John Lato wrote:
>>> Given this, can anyone suggest any likely causes of this issue, or
>>> anything I might want to look for?  Also, should I be concerned about
>>> the much larger gc_alloc_block_sync level for the slow run?  Does that
>>> indicate the allocator waiting to alloc a new block, or is it
>>> something else?  Am I on completely the wrong track?
>>
>> A total shot in the dark here, but wasn't there something about really bad
>> performance when you used all the CPUs on your machine under Linux?
>>
>> Presumably very tight coupling that is causing all the threads to stall
>> every time the OS needs to do something or something?
>
> This can be a problem for data parallel computations (like in Repa). In Repa 
> all threads in the gang are supposed to run for the same time, but if one 
> gets swapped out by the OS then the whole gang is stalled.
>
> I tend to get best results using -N7 for an 8 core machine.
>
> It is also important to enable thread affinity (with the -qa flag).
>
> For a Repa program on an 8 core machine I use +RTS -N7 -qa -qg
>
> Ben.
>
>
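
For anyone wanting to try Ben's settings on machines with different core counts, here is a minimal sketch that builds the flag string with one core left free. It assumes the program was built with -threaded and -rtsopts; `rts_flags` and `./my-repa-program` are hypothetical names used only for illustration, and `_NPROCESSORS_ONLN` is a common `getconf` extension rather than something every system guarantees.

```shell
#!/bin/sh
# rts_flags: hypothetical helper (not part of GHC) that prints RTS flags
# using one fewer capability than the given core count, minimum 1.
rts_flags() {
    cores=$1
    if [ "$cores" -gt 1 ]; then
        n=$((cores - 1))
    else
        n=1
    fi
    # -qa enables thread affinity; -qg disables the parallel GC
    # (both flags are the ones quoted in the thread above).
    printf '%s\n' "+RTS -N${n} -qa -qg -RTS"
}

# Detect the number of online cores, falling back to 1 if getconf
# does not support this variable on the current system.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Print the command line rather than running it; the program name is made up.
echo "./my-repa-program $(rts_flags "$cores")"
```

On an 8 core machine this prints the same flags as in Ben's message (-N7 -qa -qg). Dropping -qg from the string is an easy way to compare runs with and without the parallel collector.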
