Hi Yongbing,

I don't have a definitive solution for you, but here are a few questions that
may help you localize the problem:

(1) Which CPU model (functional or arm_detailed) did you run to collect the
stats?
    In principle you should run 'arm_detailed', so that misses caused by
speculative execution are taken into account.

(2) 'arm_detailed' uses a 64B cache line, while the Cortex-A9 has a 32B cache
line.
    Did you take this into account (i.e. change the 'arm_detailed' cache line
size to 32B)?

(3) I'm not sure about Chromium, but browsers in general may use the GPU for
compositing on real HW, so the execution path will differ from the SW-only
BBench run in gem5.
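
To put question (2) in numbers, here is a minimal Python sketch (not gem5
code; the function name and the 4 KiB footprint are made up for illustration).
It assumes an idealized, purely sequential instruction stream over an aligned
region with no reuse, so each new cache line costs exactly one cold miss:

```python
def cold_misses(footprint_bytes: int, line_bytes: int) -> int:
    """Cold misses for one sequential pass over an aligned region."""
    # Ceiling division: a partially used last line still costs one miss.
    return -(-footprint_bytes // line_bytes)


if __name__ == "__main__":
    footprint = 4096  # hypothetical code footprint in bytes
    for line in (32, 64):
        print(f"{line}B lines: {cold_misses(footprint, line)} cold misses")
```

In this idealized case halving the line size doubles the miss count, so the
line-size mismatch on its own would contribute a factor of about two. Real
workloads with reuse and conflict misses can behave differently.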

Orangeade

Yongbing wrote:

Hi all,

I recently compared micro-architectural metrics, such as L1 cache misses,
collected by gem5 with those collected by performance counters on a real ARM
platform, and found a large difference. For example, the Icache misses per 1k
instructions for bbench are about 30 when collected by hardware performance
counters (referring to the paper published by Anthony Gutierrez in
IISWC'2011), but only about 3 in gem5. That is about a 10x difference.
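
The metric being compared is misses per 1k instructions (MPKI). A minimal
Python sketch of the calculation follows; the raw counts are hypothetical,
chosen only to reproduce the two figures quoted in this message, and are not
measured values:

```python
def mpki(misses: int, instructions: int) -> float:
    """Misses per 1,000 instructions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return misses * 1000.0 / instructions


if __name__ == "__main__":
    insts = 1_000_000         # hypothetical instruction count
    hw = mpki(30_000, insts)  # hypothetical hardware counter reading
    sim = mpki(3_000, insts)  # hypothetical gem5 miss count
    print(hw, sim, hw / sim)
```

For the comparison to be meaningful, the miss count and the instruction count
on each side should cover the same region of the workload.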
                                          
