Robert,

This is very nice. It confirms that if every single variable is properly 
declared, so that the compiler can apply all its optimizations, then the 
loops have a chance of being fast.
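
To make the point concrete, here is a minimal sketch of the kind of loop 
I have in mind. It is not taken from either gist, and the names matmul_vec 
and matmul_loops! are made up for illustration. The assumptions are that 
the arguments have concrete types, that the output is preallocated, and 
that the index ranges are known to be valid so that @inbounds is safe:

```julia
# Hypothetical helper names; not taken from either gist.

# Vectorized reference: allocates a new result matrix on every call.
matmul_vec(A::Matrix{Float64}, B::Matrix{Float64}) = A * B

# Devectorized version: writes into a preallocated C. The concrete
# argument types let the compiler specialize the loops.
function matmul_loops!(C::Matrix{Float64}, A::Matrix{Float64}, B::Matrix{Float64})
    m, n = size(A)
    p = size(B, 2)
    size(B, 1) == n || error("inner dimensions do not match")
    size(C) == (m, p) || error("output has the wrong size")
    fill!(C, 0.0)
    # Sizes were checked above, so skipping bounds checks is safe here.
    @inbounds for j in 1:p, k in 1:n
        b = B[k, j]
        for i in 1:m          # innermost loop runs down a column
            C[i, j] += A[i, k] * b
        end
    end
    return C
end

A = [1.0 2.0; 3.0 4.0]
B = [5.0 6.0; 7.0 8.0]
C = zeros(2, 2)
matmul_loops!(C, A, B)
```

The innermost loop runs over the first index because Julia stores 
matrices column by column. Whether this beats A * B will depend on the 
matrix sizes, so it should be timed on the actual problem.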

I got a bit lost in the follow-up discussion: I think the message chain 
might have been broken.

Petr


On Thursday, December 11, 2014 2:05:40 PM UTC-8, Robert Gates wrote:
>
> Hi Petr,
>
> I just tried the devectorized problem, although I chose to take a 
> slightly different route: 
> https://gist.github.com/rleegates/2d99e6251fe246b017ac
> I am not sure that this is what you intended; however, using the 
> vectorized code as a reference, I obtain the same results up to machine 
> epsilon.
>
> Anyway, I got:
>
> In  [4]: keTest(200_000)
> Vectorized:
> elapsed time: 0.426404203 seconds (140804768 bytes allocated, 22.42% gc time)
> DeVectorized:
> elapsed time: 0.078519349 seconds (128 bytes allocated)
> DeVectorized InBounds:
> elapsed time: 0.032812311 seconds (128 bytes allocated)
> Error norm deVec: 0.0
> Error norm inBnd: 0.0
>
> On Thursday, December 11, 2014 5:47:01 PM UTC+1, Petr Krysl wrote:
>>
>> Acting upon the advice that replacing matrix-matrix multiplications in 
>> vectorized form with loops would help with performance, I chopped out a 
>> piece of code from my finite element solver (
>> https://gist.github.com/anonymous/4ec426096c02faa4354d) and ran some 
>> tests with the following results:
>>
>> Vectorized code:
>> elapsed time: 0.326802682 seconds (134490340 bytes allocated, 17.06% gc time)
>>
>> Loops code:
>> elapsed time: 4.681451441 seconds (997454276 bytes allocated, 9.05% gc time)
>>
>> SLOWER and using MORE memory?!
>>
>> I must be doing something terribly wrong.
>>
>> Petr
>>
>>
