Hi all,
I recently compared micro-architectural metrics, such as L1 cache
misses, collected by gem5 with those collected by hardware performance
counters on a real ARM platform, and found a large difference. For example,
the Icache misses per 1k instructions for BBench were about 30 when measured
with hardware performance counters (see the paper published by Anthony
Gutierrez in IISWC 2011), but only about 3 in gem5. That is roughly a 10x
difference.
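To make the metric above concrete, here is a minimal sketch of the misses-per-1k-instructions (MPKI) calculation. It assumes the gem5 stats file exposes the counters under the names `sim_insts` and `system.cpu.icache.overall_misses::total`; these names depend on the gem5 version and configuration, so check your own stats.txt. The helper names `mpki` and `read_stat` and the sample numbers are hypothetical and chosen only to reproduce the rough values quoted above.

```python
def mpki(misses, instructions):
    """Return cache misses per 1,000 instructions."""
    if instructions <= 0:
        raise ValueError("instruction count must be positive")
    return 1000.0 * misses / instructions


def read_stat(stats_text, name):
    """Return the first numeric value listed for stat `name`.

    Expects whitespace-separated lines of the form: <name> <value> [# comment]
    """
    for line in stats_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == name:
            return float(fields[1])
    raise KeyError(name)


# Hypothetical stats excerpt; the stat names are assumptions, the values are made up.
sample_stats = """
sim_insts                                   1000000   # instructions simulated
system.cpu.icache.overall_misses::total        3000   # overall Icache misses
"""

gem5_mpki = mpki(
    read_stat(sample_stats, "system.cpu.icache.overall_misses::total"),
    read_stat(sample_stats, "sim_insts"),
)

# Hypothetical hardware counter readings for the same instruction count.
hw_mpki = mpki(misses=30000, instructions=1000000)

print(gem5_mpki, hw_mpki, hw_mpki / gem5_mpki)
```

When comparing against hardware, both sides need the same numerator and denominator definitions (for example, committed versus fetched instructions, user-only versus user plus kernel), otherwise the two MPKI values are not directly comparable.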
There were several differences between the two experimental environments:
1. The real ARM platform was a Cortex-A9, whereas I simulated the RealView
platform in gem5, because I ran into errors when running BBench on the
Versatile Express platform. However, the cache configuration was the same.
2. The Linux kernel versions might not be the same; the real platform ran
2.6.32.
3. Others.
Intuitively, I don't think 1 and 2 can account for a 10x difference in
Icache miss rate. Has anybody tested the ARM platform simulated by gem5?
And what should I do to track down the sources of the 10x difference?
Thanks.
Best regards,
Yongbing Huang