I posted a message yesterday about a simple port of fragments of a finite element code for solving the heat conduction equation. The Julia port was compared to the original Matlab toolkit FinEALE and to the Julia code presented last year by Amuthan:
https://groups.google.com/forum/?fromgroups=#!searchin/julia-users/Krysl%7Csort:relevance/julia-users/3tTljDSQ6cs/-8UCPnNmzn4J

I have now sped the code up some more, so that we have the following table (on my laptop: i7, 16 GB of memory):

    Amuthan's code    29 seconds
    J FinEALE         58 seconds
    FinEALE          810 seconds

Given that Amuthan reports his code to be slower than a general-purpose C++ code by a factor of around 1.36, J FinEALE is presumably slower than an equivalent FE solver coded in C++ by a factor of about 2.7. So far so good.

The curious thing is that @time reports huge amounts of memory allocated (with something like 10-20% of the time spent in GC). One particular source of wild memory allocation was this line (executed in this case 2 million times):

    Fe += Ns[j] .* (f * Jac * w[j]);

where

    Fe        = 3 x 1 matrix
    Ns[j]     = 3 x 1 matrix
    f         = 1 x 1 matrix
    Jac, w[j] = scalars

The cost of the operation that encloses this line (among many others):

    19.835263928 seconds (4162094480 bytes allocated, 16.79% gc time)

Changing the 1 x 1 matrix f into a scalar (and replacing .* with *)

    Fe += Ns[j] * (f * Jac * w[j]);

changed the cost quite drastically:

    10.105620394 seconds (2738120272 bytes allocated, 21.33% gc time)

Any ideas, you Julian wizards?

Thanks,
Petr
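A self-contained sketch of the two variants being compared, with made-up values (Ns_j, f, Jac and w_j are stand-ins for the post's variables), plus a fused in-place broadcast that in later Julia versions avoids the temporaries altogether:

```julia
# Stand-in values for the post's quantities (hypothetical numbers).
Ns_j = reshape([1.0, 2.0, 4.0], 3, 1)   # 3 x 1 matrix, like Ns[j] in the post
Jac  = 0.5                              # scalar Jacobian
w_j  = 2.0                              # scalar quadrature weight

# Variant 1: f as a 1 x 1 matrix. Each subexpression (f_mat * Jac,
# the broadcast .*, and the += ) builds a fresh temporary array.
f_mat = fill(3.0, 1, 1)
Fe1 = zeros(3, 1)
Fe1 += Ns_j .* (f_mat * Jac * w_j)

# Variant 2: f as a plain scalar, as in the faster version of the line.
f = 3.0
Fe2 = zeros(3, 1)
Fe2 += Ns_j * (f * Jac * w_j)

# Variant 3: fused, in-place broadcast (Julia >= 0.6 dot fusion);
# updates Fe3 without allocating a temporary for the right-hand side.
Fe3 = zeros(3, 1)
Fe3 .+= Ns_j .* (f * Jac * w_j)

# All three compute the same result.
@assert Fe1 == Fe2 == Fe3
```

Allocation counts can be inspected per call with @allocated once the code is wrapped in a function; the exact byte counts depend on the Julia version, so none are asserted here.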