Hi Petr,

I just tried devectorizing the problem, although I took a slightly
different route: https://gist.github.com/rleegates/2d99e6251fe246b017ac
I am not sure this is what you intended. However, using the vectorized
code as a reference, I obtain the same results up to machine epsilon.

Anyways, I got:

In  [4]: keTest(200_000)
Vectorized:
elapsed time: 0.426404203 seconds (140804768 bytes allocated, 22.42% gc time)
DeVectorized:
elapsed time: 0.078519349 seconds (128 bytes allocated)
DeVectorized InBounds:
elapsed time: 0.032812311 seconds (128 bytes allocated)
Error norm deVec: 0.0
Error norm inBnd: 0.0
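
For anyone skimming the thread, here is a minimal sketch of the general pattern behind the "DeVectorized InBounds" variant. It is not the code from the gist: the function name add_btdb!, the scratch buffer DB, and the scale factor s are made up for illustration, and it assumes the kernel has the shape Ke += s * B' * D * B. It is written for a current Julia version. The idea is to write into preallocated arrays inside explicit loops, so that no temporaries are allocated per call, and to skip bounds checks with @inbounds.

```julia
# Hypothetical sketch (not the gist code): accumulate Ke += s * B' * D * B
# using explicit loops and a preallocated scratch buffer DB.
# Assumed sizes: B is m x n, D is m x m, DB is m x n, Ke is n x n.
function add_btdb!(Ke, B, D, s, DB)
    m, n = size(B)
    # Step 1: DB = D * B, written into the scratch buffer.
    # The row index i is the innermost loop index, matching column-major storage.
    @inbounds for j in 1:n, i in 1:m
        acc = zero(eltype(DB))
        for k in 1:m
            acc += D[i, k] * B[k, j]
        end
        DB[i, j] = acc
    end
    # Step 2: Ke += s * B' * DB, accumulated in place.
    @inbounds for j in 1:n, i in 1:n
        acc = zero(eltype(Ke))
        for k in 1:m
            acc += B[k, i] * DB[k, j]
        end
        Ke[i, j] += s * acc
    end
    return Ke
end
```

Since @inbounds removes the bounds checks, the array sizes have to be right before the loops run. Checking the result against the vectorized expression, as in the error norms above, is a cheap safeguard.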

On Thursday, December 11, 2014 5:47:01 PM UTC+1, Petr Krysl wrote:
>
> Acting upon the advice that replacing matrix-matrix multiplications in 
> vectorized form with loops would help with performance, I chopped out a 
> piece of code from my finite element solver (
> https://gist.github.com/anonymous/4ec426096c02faa4354d) and ran some 
> tests with the following results:
>
> Vectorized code:
> elapsed time: 0.326802682 seconds (134490340 bytes allocated, 17.06% gc time)
>
> Loops code:
> elapsed time: 4.681451441 seconds (997454276 bytes allocated, 9.05% gc time)
>
> SLOWER and using MORE memory?!
>
> I must be doing something terribly wrong.
>
> Petr
>
>
