On Feb 28, 2019, at 6:10 AM, Sajid Ali <sajidsyed2...@u.northwestern.edu> wrote:
Hi Hong,

Thanks for the advice. I see that the example takes ~180 seconds to run, but I can't see the DRAM vs MCDRAM info from Intel APS. I'll try to fix the profiling and get back with further questions.

MCDRAM has 4x higher bandwidth than DRAM, so the improvement you see from your example looks very reasonable. Note that in cache mode MCDRAM acts as L3 cache, while in flat mode it is used as another level of memory.

Hong (Mr.)

Also, the intel-mpi manpages say that the use of tmi is now deprecated: https://software.intel.com/en-us/mpi-developer-guide-linux-fabrics-control

Thank You,
Sajid Ali
Applied Physics
Northwestern University
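
For anyone following the thread, here is a rough sketch of why a 4x bandwidth ratio makes the observed improvement plausible. It is only an Amdahl-style estimate, not something taken from the example or from PETSc: the function name `expected_speedup` and the parameter `bw_bound_fraction` (the share of runtime assumed to be limited by memory bandwidth) are hypothetical, and the default ratio of 4.0 is simply the figure quoted above.

```python
def expected_speedup(bw_bound_fraction, bandwidth_ratio=4.0):
    """Amdahl-style upper bound on speedup when moving data from DRAM to MCDRAM.

    bw_bound_fraction: assumed share of the original runtime that is limited
        by memory bandwidth (hypothetical input, between 0 and 1).
    bandwidth_ratio: assumed MCDRAM/DRAM bandwidth ratio (4.0 as quoted in
        the thread, not a measured value).
    """
    if not 0.0 <= bw_bound_fraction <= 1.0:
        raise ValueError("bw_bound_fraction must be between 0 and 1")
    if bandwidth_ratio <= 0.0:
        raise ValueError("bandwidth_ratio must be positive")
    # Only the bandwidth-bound part of the runtime shrinks by the ratio;
    # the remainder is assumed unchanged.
    new_time = (1.0 - bw_bound_fraction) + bw_bound_fraction / bandwidth_ratio
    return 1.0 / new_time


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 0.9, 1.0):
        print(f"bandwidth-bound fraction {fraction:.1f}: "
              f"speedup up to {expected_speedup(fraction):.2f}x")
```

Under these assumptions the speedup never exceeds the bandwidth ratio itself, and it approaches that limit only when nearly all of the runtime is bandwidth-bound.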