Re: [m5-users] How does Timing CPU count number of instructions?

Korey Sewell Mon, 24 Jan 2011 18:14:06 -0800

If I'm not mistaken, Freqmine (like any multithreaded benchmark) should have
both parallel and serial sections. Since programs typically run in distinct
phases, it wouldn't necessarily be out of line for a 100,000-instruction
snippet to take particularly long, especially in a phase where there are a
lot of cache misses.
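
To make the phase argument concrete, here is a rough Python sketch that
turns the sim_ticks values quoted below for the 1MB 4-way L2 run into ticks
per instruction for each dump. It assumes each dump covers exactly 100,000
instructions and that sim_ticks is a per-dump value rather than a running
total (the values are not monotonic, which suggests this). The function
names are made up for illustration and are not part of M5.

```python
# sim_ticks per stats dump, copied from the 1MB 4-way L2 run quoted below.
# Assumption: each dump covers 100,000 instructions and sim_ticks is reset
# at every dump (the quoted values are not monotonic).
INSTS_PER_INTERVAL = 100_000

ticks_1mb = {2: 196_940_000, 3: 227_453_000, 4: 154_064_000, 5: 155_779_000}


def ticks_per_inst(ticks_by_interval, insts=INSTS_PER_INTERVAL):
    """Return {interval: average ticks per instruction}."""
    return {k: t / insts for k, t in ticks_by_interval.items()}


def slowest_interval(ticks_by_interval):
    """Return the label of the interval that took the most ticks."""
    return max(ticks_by_interval, key=ticks_by_interval.get)


if __name__ == "__main__":
    for interval, tpi in ticks_per_inst(ticks_1mb).items():
        print(f"interval {interval}: {tpi:.2f} ticks/inst")
    print("slowest interval:", slowest_interval(ticks_1mb))
```

An interval that stands out here is a candidate for a different program
phase, which is worth checking before blaming the instruction counter.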


Also, if you are running a parallel benchmark on only a single processor,
you might expect a lot of cache thrashing as threads fight over their
shared data as well as their own private data.

Looks like you'll have to dig a bit into what the expected output is for
Freqmine (or any benchmark) and then compare it against what M5 is
simulating to be sure.
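
As a starting point for that comparison, a small Python sketch like the one
below lines up the two runs interval by interval. The (accesses, hits)
pairs are copied from the system.l2.overall_accesses and
system.l2.overall_hits values quoted below. The helper names are
hypothetical, and it assumes that interval N in one run covers roughly the
same instructions as interval N in the other.

```python
# (overall_accesses, overall_hits) per stats dump, copied from the quoted runs.
l2_1mb = {2: (3231, 2515), 3: (4656, 3434), 4: (1078, 722), 5: (1575, 1154)}
l2_2mb = {2: (2936, 1163), 3: (1496, 803), 4: (2290, 1672), 5: (4554, 3871)}


def hit_rate(accesses, hits):
    """Fraction of L2 accesses that hit; 0.0 if there were no accesses."""
    return hits / accesses if accesses else 0.0


def compare(run_a, run_b):
    """Return (interval, hit rate A, hit rate B, accesses B / accesses A)
    for every interval present in both runs."""
    rows = []
    for k in sorted(run_a.keys() & run_b.keys()):
        (acc_a, hit_a), (acc_b, hit_b) = run_a[k], run_b[k]
        rows.append((k, hit_rate(acc_a, hit_a), hit_rate(acc_b, hit_b),
                     acc_b / acc_a))
    return rows


if __name__ == "__main__":
    for k, hr_a, hr_b, ratio in compare(l2_1mb, l2_2mb):
        print(f"interval {k}: 1MB hit rate {hr_a:.1%}, "
              f"2MB hit rate {hr_b:.1%}, access ratio {ratio:.2f}")
```

If the access ratio swings well above and below 1.0 from one interval to
the next, that points toward the two runs being in different phases at the
same instruction count rather than toward a counting problem.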

On Mon, Jan 24, 2011 at 8:51 PM, Stevenson Jian <[email protected]> wrote:

> Thanks for replying, Steve. I only used a single processor in both
> simulations. What is shown is not the output from individual processors, but
> that of the same processor at the end of every 100,000 instructions (note
> that sim_insts increments by 100,000 each time).
>
>
> On Mon, Jan 24, 2011 at 7:14 PM, Steve Reinhardt <[email protected]> wrote:
>
>> With a multiprocessor, seemingly small changes in configuration can have a
>> significant impact if it changes the order in which threads grab a lock, or
>> something like that. So in particular, for the stats you have below, it
>> seems likely that there's some serialized computation going on that happened
>> on processor 3 in the first case and on processor 5 in the second case.
>>
>> Steve
>>
>> On Mon, Jan 24, 2011 at 1:30 PM, Stevenson Jian
>> <[email protected]> wrote:
>>
>>> Hi,
>>> How does the Timing CPU count the number of instructions? If it stalls on
>>> a cache miss, do the Nops count as instructions as well? The reason I ask
>>> is that by simply changing the size of the cache, the total number of
>>> instructions when the benchmark completes varies by about 0.01 - 0.1%.
>>>
>>> Another anomaly I am observing is that, again by simply changing the size
>>> of the L2, the number of overall L2 accesses per, let's say, 100,000
>>> instructions can vary by over 100%.
>>>
>>> The following are 2 runs that I did on M5 with the Freqmine benchmark.
>>> The first simulation uses a 1MB 4-way L2 with a latency of 6ns, while the
>>> second simulation uses a 2MB 8-way L2 with a latency of 4.5ns. The overall
>>> accesses per 100,000 instructions are shown.
>>>
>>> ---------------------------------------------------------------------------------------------
>>> 1MB 4Way L2:
>>> 2:
>>> sim_insts                                   100200001  # Number of instructions simulated
>>> sim_ticks                                   196940000  # Number of ticks simulated
>>> system.l2.overall_accesses                       3231  # number of overall (read+write) accesses
>>> system.l2.overall_hits                           2515  # number of overall hits
>>>
>>> 3:
>>> sim_insts                                   100300001  # Number of instructions simulated
>>> sim_ticks                                   227453000  # Number of ticks simulated
>>> system.l2.overall_accesses                       4656  # number of overall (read+write) accesses
>>> system.l2.overall_hits                           3434  # number of overall hits
>>>
>>> 4:
>>> sim_insts                                   100400001  # Number of instructions simulated
>>> sim_ticks                                   154064000  # Number of ticks simulated
>>> system.l2.overall_accesses                       1078  # number of overall (read+write) accesses
>>> system.l2.overall_hits                            722  # number of overall hits
>>>
>>> 5:
>>> sim_insts                                   100500001  # Number of instructions simulated
>>> sim_ticks                                   155779000  # Number of ticks simulated
>>> system.l2.overall_accesses                       1575  # number of overall (read+write) accesses
>>> system.l2.overall_hits                           1154  # number of overall hits
>>>
>>> ....
>>>
>>> 2MB 8Way L2:
>>> 2:
>>> sim_insts                                   100200001  # Number of instructions simulated
>>> sim_ticks                                   234810000  # Number of ticks simulated
>>> system.l2.overall_accesses                       2936  # number of overall (read+write) accesses
>>> system.l2.overall_hits                           1163  # number of overall hits
>>>
>>> 3:
>>> sim_insts                                   100300000  # Number of instructions simulated
>>> sim_ticks                                   174173000  # Number of ticks simulated
>>> system.l2.overall_accesses                       1496  # number of overall (read+write) accesses
>>> system.l2.overall_hits                            803  # number of overall hits
>>>
>>> 4:
>>> sim_insts                                   100400000  # Number of instructions simulated
>>> sim_ticks                                   190135000  # Number of ticks simulated
>>> system.l2.overall_accesses                       2290  # number of overall (read+write) accesses
>>> system.l2.overall_hits                           1672  # number of overall hits
>>>
>>> 5:
>>> sim_insts                                   100500000  # Number of instructions simulated
>>> sim_ticks                                   213086000  # Number of ticks simulated
>>> system.l2.overall_accesses                       4554  # number of overall (read+write) accesses
>>> system.l2.overall_hits                           3871  # number of overall hits
>>> .....
>>>
>>> ----------------------------------------------------------------------------
>>> Even if Nops are counted as instructions, I don't see how that would
>>> make overall accesses per 100,000 instructions vary by as much as 200%.
>>> How does M5 count the number of instructions?
>>> Thanks,
>>> Steve
>>>
>>> _______________________________________________
>>> m5-users mailing list
>>> [email protected]
>>> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
>>>



-- 
- Korey

