On Mar 1, 2019, at 11:00 AM, Sajid Ali <sajidsyed2...@u.northwestern.edu> 
wrote:


Hi Hong,

So, the speedup was coming from increased DRAM bandwidth and not the usage of 
MCDRAM.

Certainly the speedup was coming from the usage of MCDRAM (which has much 
higher bandwidth than DRAM). What I meant is that your code is still using 
MCDRAM, but in cache mode MCDRAM acts like an L3 cache.

Hong



There is moderate MPI imbalance, a large number of back-end stalls, and good 
vectorization.

I'm attaching my submit script, PETSc log file, and Intel APS summary (all as 
non-HTML text). I can give a more detailed analysis via Intel VTune if needed.


Thank You,
Sajid Ali
Applied Physics
Northwestern University
<submit_script><intel_aps_report><knl_petsc>
