[petsc-dev] KNL MatMult performance and unrolling.

Barry Smith Wed, 28 Sep 2016 13:42:04 -0700


   Mr Hong Zhang has found that removing the manual unrolling from 
MatMult_SeqAIJ_Inode() (at least with inode size 2) results in a good bump in 
performance on KNL, and pointed me to the Intel gospel 
https://software.intel.com/en-us/articles/avoid-manual-loop-unrolling which 
we've always ignored in the past. It would be good to try both the unrolled 
and non-unrolled versions on Xeon as well.
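
   For anyone who hasn't looked at this kind of kernel, here is a rough sketch 
of the two styles being compared. This is NOT the actual code in 
MatMult_SeqAIJ_Inode(); the function names, argument lists and the 
unroll-by-two choice are all made up for illustration. It assumes two rows 
that share the same column indices (an inode of size 2) and computes their 
two entries of y = A*x.

```c
#include <assert.h>

/* Hypothetical sketch, not PETSc source. Two rows share the column
   indices idx[0..n-1]; v1 and v2 hold the nonzero values of each row. */

/* Plain loop: unrolling and vectorization are left to the compiler. */
static void mult_inode2_plain(int n, const int *idx, const double *v1,
                              const double *v2, const double *x,
                              double *y1, double *y2)
{
  double sum1 = 0.0, sum2 = 0.0;
  assert(n >= 0);
  for (int j = 0; j < n; j++) {
    double xj = x[idx[j]];   /* gather once, reuse for both rows */
    sum1 += v1[j] * xj;
    sum2 += v2[j] * xj;
  }
  *y1 = sum1;
  *y2 = sum2;
}

/* Manually unrolled by two over the columns, with a remainder step. */
static void mult_inode2_unrolled(int n, const int *idx, const double *v1,
                                 const double *v2, const double *x,
                                 double *y1, double *y2)
{
  double sum1 = 0.0, sum2 = 0.0;
  int    j = 0;
  assert(n >= 0);
  if (n & 1) {               /* odd count: peel off the first column */
    double xj = x[idx[0]];
    sum1 += v1[0] * xj;
    sum2 += v2[0] * xj;
    j = 1;
  }
  for (; j < n; j += 2) {    /* remaining columns, two per iteration */
    double xa = x[idx[j]], xb = x[idx[j + 1]];
    sum1 += v1[j] * xa + v1[j + 1] * xb;
    sum2 += v2[j] * xa + v2[j + 1] * xb;
  }
  *y1 = sum1;
  *y2 = sum2;
}
```

   Both routines compute the same two dot products, though in floating point 
the different summation order can change the last bits. Whether the plain loop 
actually wins depends on the compiler, its flags and the target, which is why 
it needs to be measured on each machine.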


   We've never done a good job of managing our unrolling: where, how and when 
we do it, and how we use unrolling macros such as PetscSparseDensePlusDot. 
Intel would say just throw it all away.

   Barry
