Hi Korey,
I am not worried about how long each interval takes. What confuses me is
that, if you compare the output I posted interval by interval (I labeled
the intervals 1, 2, 3, etc.), the number of L2 overall accesses for the
two simulations, which differ only in cache configuration, differs by as
much as 200 to 300%. What types of accesses are included under overall
accesses?
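
A minimal sketch of the interval-by-interval comparison described above,
using only the access and hit counts from the original post quoted below.
The dictionary and function names are made up for illustration and are not
part of M5. Only the four intervals shown in the post are included, since
the rest were elided.

```python
# Per-interval L2 stats copied from the two runs in the original post.
# Each value is (overall_accesses, overall_hits) for one 100,000-instruction
# interval; the keys are the interval labels used in the post.
RUN_1MB = {2: (3231, 2515), 3: (4656, 3434), 4: (1078, 722), 5: (1575, 1154)}
RUN_2MB = {2: (2936, 1163), 3: (1496, 803), 4: (2290, 1672), 5: (4554, 3871)}


def access_ratios(base, other):
    """Return {interval: other_accesses / base_accesses} for shared intervals."""
    return {k: other[k][0] / base[k][0] for k in sorted(base.keys() & other.keys())}


def total_accesses(run):
    """Sum overall_accesses over all intervals of one run."""
    return sum(accesses for accesses, _hits in run.values())


if __name__ == "__main__":
    for interval, ratio in access_ratios(RUN_1MB, RUN_2MB).items():
        print(f"interval {interval}: 2MB/1MB accesses = {ratio:.2f}")
    print("total accesses, 1MB run:", total_accesses(RUN_1MB))
    print("total accesses, 2MB run:", total_accesses(RUN_2MB))
```

On these four intervals the per-interval ratio ranges from about 0.32 to
2.89, while the summed accesses (10540 vs. 11276) are within roughly 7% of
each other.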
Thanks,
Steve

On Mon, Jan 24, 2011 at 8:13 PM, Korey Sewell <[email protected]> wrote:

> If I'm not mistaken, there should be both parallel and serial sections of
> Freqmine (or any multithreaded benchmark). So since programs typically run
> in different phases, it wouldn't necessarily be out-of-line if there was a
> 100,000 instruction snippet that took particularly long, especially if that
> was a phase where there are a lot of cache misses.
>
> Also, if you are running a parallel benchmark but only on a single
> processor then you might expect that there would be a lot of cache thrashing
> as threads fight over their shared data as well as their own private data.
>
> Looks like you'll have to dig a bit into what is the expected output for
> Freqmine (or any benchmark) and then compare against what M5 is simulating
> to be sure.
>
>
> On Mon, Jan 24, 2011 at 8:51 PM, Stevenson Jian
> <[email protected]> wrote:
>
>> Thanks for replying, Steve. I only used a single processor in both
>> simulations. What is shown is not the output from individual processors, but
>> that of the same processor at the end of every 100,000 instructions (see
>> sim_insts increment by 100,000 each time).
>>
>>
>> On Mon, Jan 24, 2011 at 7:14 PM, Steve Reinhardt <[email protected]> wrote:
>>
>>> With a multiprocessor, seemingly small changes in configuration can have
>>> a significant impact if they change the order in which threads grab a lock,
>>> or something like that. So in particular, for the stats you have below, it
>>> seems likely that there's some serialized computation going on that happened
>>> on processor 3 in the first case and on processor 5 in the second case.
>>>
>>> Steve
>>>
>>> On Mon, Jan 24, 2011 at 1:30 PM, Stevenson Jian <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>> How does the Timing CPU count instructions? If it stalls on a
>>>> cache miss, do the NOPs count as instructions as well? The reason I ask
>>>> is that simply changing the size of the cache makes the total number of
>>>> instructions when the benchmark completes vary by about 0.01 - 0.1%.
>>>>
>>>> Another anomaly I am observing is that, again by simply changing
>>>> the size of the L2, the number of overall L2 accesses per, let's say, 100,000
>>>> instructions can vary by over 100%.
>>>>
>>>> The following are 2 runs that I did on M5 with the Freqmine benchmark.
>>>> The first simulation uses a 1MB 4-way L2 with a latency of 6ns, while the
>>>> second simulation uses a 2MB 8-way L2 with a latency of 4.5ns. The overall
>>>> accesses per 100,000 instructions are shown.
>>>>
>>>> ---------------------------------------------------------------------------------------------
>>>> 1MB 4Way L2:
>>>> 2:
>>>> sim_insts                                   100200001
>>>>     # Number of instructions simulated
>>>> sim_ticks                                   196940000
>>>>     # Number of ticks simulated
>>>> system.l2.overall_accesses                       3231
>>>>     # number of overall (read+write) accesses
>>>> system.l2.overall_hits                           2515
>>>>     # number of overall hits
>>>>
>>>> 3:
>>>> sim_insts                                   100300001
>>>>     # Number of instructions simulated
>>>> sim_ticks                                   227453000
>>>>     # Number of ticks simulated
>>>> system.l2.overall_accesses                       4656
>>>>     # number of overall (read+write) accesses
>>>> system.l2.overall_hits                           3434
>>>>     # number of overall hits
>>>>
>>>> 4:
>>>> sim_insts                                   100400001
>>>>     # Number of instructions simulated
>>>> sim_ticks                                   154064000
>>>>     # Number of ticks simulated
>>>> system.l2.overall_accesses                       1078
>>>>     # number of overall (read+write) accesses
>>>> system.l2.overall_hits                            722
>>>>     # number of overall hits
>>>>
>>>> 5:
>>>> sim_insts                                   100500001
>>>>     # Number of instructions simulated
>>>> sim_ticks                                   155779000
>>>>     # Number of ticks simulated
>>>> system.l2.overall_accesses                       1575
>>>>     # number of overall (read+write) accesses
>>>> system.l2.overall_hits                           1154
>>>>     # number of overall hits
>>>>
>>>> ....
>>>>
>>>> 2MB 8Way L2:
>>>> 2:
>>>> sim_insts                                   100200001
>>>>     # Number of instructions simulated
>>>> sim_ticks                                   234810000
>>>>     # Number of ticks simulated
>>>> system.l2.overall_accesses                       2936
>>>>     # number of overall (read+write) accesses
>>>> system.l2.overall_hits                           1163
>>>>     # number of overall hits
>>>>
>>>> 3:
>>>> sim_insts                                   100300000
>>>>     # Number of instructions simulated
>>>> sim_ticks                                   174173000
>>>>     # Number of ticks simulated
>>>> system.l2.overall_accesses                       1496
>>>>     # number of overall (read+write) accesses
>>>> system.l2.overall_hits                            803
>>>>     # number of overall hits
>>>>
>>>> 4:
>>>> sim_insts                                   100400000
>>>>     # Number of instructions simulated
>>>> sim_ticks                                   190135000
>>>>     # Number of ticks simulated
>>>> system.l2.overall_accesses                       2290
>>>>     # number of overall (read+write) accesses
>>>> system.l2.overall_hits                           1672
>>>>     # number of overall hits
>>>>
>>>> 5:
>>>> sim_insts                                   100500000
>>>>     # Number of instructions simulated
>>>> sim_ticks                                   213086000
>>>>     # Number of ticks simulated
>>>> system.l2.overall_accesses                       4554
>>>>     # number of overall (read+write) accesses
>>>> system.l2.overall_hits                           3871
>>>>     # number of overall hits
>>>> .....
>>>>
>>>> ----------------------------------------------------------------------------
>>>> Even if NOPs are counted as instructions, I don't see how that would
>>>> make overall accesses per 100,000 instructions vary by as much as 200%. How
>>>> does M5 count the number of instructions?
>>>> Thanks,
>>>> Steve
>>>>
>>>> _______________________________________________
>>>> m5-users mailing list
>>>> [email protected]
>>>> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
>>>>
>>>
>>>
>>
>>
>
>
>
> --
> - Korey
>
