Hello,

I put Graph500 in the Ubuntu image, then compiled and ran it on the simulator using the single-core Xeon (E5620) config file provided in the repo. As far as I understand, this core configuration matches my real-world Westmere-EX (E7-2860) machine except for the cache capacities (please let me know if I am wrong).

The problem is that when I compare the ooo_0_0 cycles with the total cycles counted by papiex, I see a 70% to 100% difference! I expected an error of around 1% to 5%. Does anybody have any idea where this huge error comes from?

I did not add any ptlcall to the Graph500 source or do anything else to make the measurement more accurate. Does this mean that by running

./start_sim; ./omp-csr -s 12; ./stop_sim

I am counting more than just my benchmark, such as unexpected OS overhead or anything else that papiex does not count but that full-system simulation forces the simulator to execute? Any ideas are appreciated.
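To be explicit about what I mean by "difference", here is a minimal sketch of the comparison. It assumes the error is taken relative to the papiex count; the function name and the cycle values are hypothetical placeholders, not my actual measurements.

```python
def relative_error_percent(simulated_cycles, measured_cycles):
    """Signed difference of the simulated count relative to the measured one, in percent.

    Assumption: the hardware measurement (papiex) is the reference value.
    """
    if measured_cycles <= 0:
        raise ValueError("measured_cycles must be positive")
    return 100.0 * (simulated_cycles - measured_cycles) / measured_cycles


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not real results.
    simulated = 1_700_000   # e.g. the ooo_0_0 cycle count from the simulator stats
    measured = 1_000_000    # e.g. the total cycles reported by papiex
    print(f"{relative_error_percent(simulated, measured):.1f}%")
```

A positive result means the simulator reports more cycles than the real machine.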
Thank you,
Alireza
_______________________________________________ http://www.marss86.org Marss86-Devel mailing list [email protected] https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
