Re: [Bug-apl] OpenMP performance: first result

2014-03-12 Thread Elias Mårtenson
To clarify, the appealing feature of TBB that made me interested (apart
from it being very fast) is that its algorithms implement task stealing.
This should make the dispatch quite effective, even if some subtasks are
slower than others. I.e. it may actually address some of the concerns raised
regarding coalescing the jobs.

Regards,
Elias
On 13 Mar 2014 00:03, "Elias Mårtenson" wrote:

> I've done some experiments with Intel's Threading Building Blocks, and
> based on my initial tests, it seems incredibly light-weight, and also easy
> to use.
>
> I haven't tested with actual GNU APL code yet though (I've written
> separate test programs to experiment). My next tests will be on the real
> thing.
>
> I will report back later when I have some results to share.
>
> Regards,
> Elias
>
>
> On 12 March 2014 20:02, Juergen Sauermann wrote:
>
>> Hi David,
>>
>> I guess the circle functions and ⋆/⍟ might do a better job in raising
>> your motivation!
>>
>> If I remember correctly then in 1990 we got a speedup of 5-6 on our 32
>> processor machine,
>> which means that the break-even point is at about 6 cores.
>>
>> Unfortunately my own machine has only 2 cores.
>>
>> /// Jürgen
>>
>>
>>
>>
>> On 03/12/2014 07:04 AM, David Lamkins wrote:
>>
>>> With the OpenMP patch I put together last night, adding two
>>> million-element vectors went from about 45ms without OpenMP to about 32ms
>>> with OpenMP.
>>>
>>> That's evidence that OpenMP is doing *something*, even though the
>>> speedup is nowhere near commensurate with the number of cores (8 on this
>>> machine).
>>>
>>> I wasn't expecting much at this early stage. I am, however, encouraged
>>> by the measurable result.
>>>
>>> --
>>> "The secret to creativity is knowing how to hide your sources."
>>> Albert Einstein
>>>
>>>
>>> http://soundcloud.com/davidlamkins
>>> http://reverbnation.com/lamkins
>>> http://reverbnation.com/lcw
>>> http://lamkins-guitar.com/
>>> http://lamkins.net/
>>> http://successful-lisp.com/
>>>
>>
>>
>>
>


Re: [Bug-apl] OpenMP performance: first result

2014-03-12 Thread Elias Mårtenson
I've done some experiments with Intel's Threading Building Blocks, and
based on my initial tests, it seems incredibly light-weight, and also easy
to use.

I haven't tested with actual GNU APL code yet though (I've written separate
test programs to experiment). My next tests will be on the real thing.

I will report back later when I have some results to share.

Regards,
Elias


On 12 March 2014 20:02, Juergen Sauermann wrote:

> Hi David,
>
> I guess the circle functions and ⋆/⍟ might do a better job in raising your
> motivation!
>
> If I remember correctly then in 1990 we got a speedup of 5-6 on our 32
> processor machine,
> which means that the break-even point is at about 6 cores.
>
> Unfortunately my own machine has only 2 cores.
>
> /// Jürgen
>
>
>
>
> On 03/12/2014 07:04 AM, David Lamkins wrote:
>
>> With the OpenMP patch I put together last night, adding two
>> million-element vectors went from about 45ms without OpenMP to about 32ms
>> with OpenMP.
>>
>> That's evidence that OpenMP is doing *something*, even though the speedup
>> is nowhere near commensurate with the number of cores (8 on this machine).
>>
>> I wasn't expecting much at this early stage. I am, however, encouraged by
>> the measurable result.
>>
>> --
>> "The secret to creativity is knowing how to hide your sources."
>> Albert Einstein
>>
>>
>> http://soundcloud.com/davidlamkins
>> http://reverbnation.com/lamkins
>> http://reverbnation.com/lcw
>> http://lamkins-guitar.com/
>> http://lamkins.net/
>> http://successful-lisp.com/
>>
>
>
>


Re: [Bug-apl] OpenMP performance: first result

2014-03-12 Thread Juergen Sauermann

Hi David,

I guess the circle functions and ⋆/⍟ might do a better job in raising 
your motivation!


If I remember correctly then in 1990 we got a speedup of 5-6 on our 32
processor machine, which means that the break-even point is at about 6 cores.

Unfortunately my own machine has only 2 cores.

/// Jürgen



On 03/12/2014 07:04 AM, David Lamkins wrote:

> With the OpenMP patch I put together last night, adding two
> million-element vectors went from about 45ms without OpenMP to about
> 32ms with OpenMP.
>
> That's evidence that OpenMP is doing *something*, even though the
> speedup is nowhere near commensurate with the number of cores (8 on
> this machine).
>
> I wasn't expecting much at this early stage. I am, however, encouraged
> by the measurable result.
>
> --
> "The secret to creativity is knowing how to hide your sources."
> Albert Einstein
>
>
> http://soundcloud.com/davidlamkins
> http://reverbnation.com/lamkins
> http://reverbnation.com/lcw
> http://lamkins-guitar.com/
> http://lamkins.net/
> http://successful-lisp.com/





[Bug-apl] OpenMP performance: first result

2014-03-11 Thread David Lamkins
With the OpenMP patch I put together last night, adding two million-element
vectors went from about 45ms without OpenMP to about 32ms with OpenMP.

That's evidence that OpenMP is doing *something*, even though the speedup
is nowhere near commensurate with the number of cores (8 on this machine).

I wasn't expecting much at this early stage. I am, however, encouraged by
the measurable result.

-- 
"The secret to creativity is knowing how to hide your sources."
   Albert Einstein


http://soundcloud.com/davidlamkins
http://reverbnation.com/lamkins
http://reverbnation.com/lcw
http://lamkins-guitar.com/
http://lamkins.net/
http://successful-lisp.com/