Number of committed instructions (system.switch_cpus.committedInsts_total):
Case 1: 100,000,000
Case 2: cpu0 -> 100,045,354
        cpu1 -> 100,310,197
        cpu2 ->  99,884,333
        cpu3 ->  99,760,117

Number of cycles:
Case 1: 150,570,516
Case 2: 139,230,042

In both cases, the simulation switches CPUs to detailed mode at instruction
2 billion. All the reported data correspond to the next 100 million
instructions executed in detailed mode.
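
As a quick sanity check, dividing committed instructions by cycles reproduces
the IPCs reported earlier in the thread (this assumes the single case 2 cycle
count above applies to all four cores):

# IPC = committed instructions / cycles, using only the numbers above.
insts_case1, cycles_case1 = 100000000, 150570516
print("Case 1 IPC: %.6f" % (float(insts_case1) / cycles_case1))      # ~0.66414

insts_cpu0, cycles_case2 = 100045354, 139230042
print("Case 2 cpu0 IPC: %.6f" % (float(insts_cpu0) / cycles_case2))  # ~0.71856

The other three cores in case 2 work out the same way from their own
committed-instruction counts.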
From the config.ini files I can see that separate L2 caches are indeed defined
for Case 2. My CacheConfig.py, modified to give each core a private L2 cache,
looks like this:
# Module-level imports that this function relies on.
import sys

from m5.defines import buildEnv
from m5.objects import *
from Caches import *

def config_cache(options, system):
    if options.cpu_type == "arm_detailed":
        try:
            from O3_ARM_v7a import *
        except:
            print "arm_detailed is unavailable. Did you compile the O3 model?"
            sys.exit(1)

        dcache_class, icache_class, l2_cache_class = \
            O3_ARM_v7a_DCache, O3_ARM_v7a_ICache, O3_ARM_v7aL2
    else:
        dcache_class, icache_class, l2_cache_class = \
            L1Cache, L1Cache, L2Cache

    if options.l2cache:
        # Provide a clock for the L2 and the L1-to-L2 bus here as they
        # are not connected using addTwoLevelCacheHierarchy. Use the
        # same clock as the CPUs, and set the L1-to-L2 bus width to 32
        # bytes (256 bits). One private L2 and one L1-to-L2 bus per CPU.
        system.l2 = [l2_cache_class(clock=options.clock,
                                    size=options.l2_size,
                                    assoc=options.l2_assoc,
                                    block_size=options.cacheline_size)
                     for i in xrange(options.num_cpus)]

        system.tol2bus = [CoherentBus(clock = options.clock, width = 32)
                          for i in xrange(options.num_cpus)]

        # Old shared-L2 connections, now made per CPU in the loop below:
        #system.l2.cpu_side = system.tol2bus.master
        #system.l2.mem_side = system.membus.slave

    for i in xrange(options.num_cpus):
        if options.caches:
            icache = icache_class(size=options.l1i_size,
                                  assoc=options.l1i_assoc,
                                  block_size=options.cacheline_size)
            dcache = dcache_class(size=options.l1d_size,
                                  assoc=options.l1d_assoc,
                                  block_size=options.cacheline_size)

            # When connecting the caches, the clock is also inherited
            # from the CPU in question.
            if buildEnv['TARGET_ISA'] == 'x86':
                system.cpu[i].addPrivateSplitL1Caches(icache, dcache,
                                                      PageTableWalkerCache(),
                                                      PageTableWalkerCache())
            else:
                system.cpu[i].addPrivateSplitL1Caches(icache, dcache)

        system.cpu[i].createInterruptController()

        if options.l2cache:
            # Hook each private L2 up to its own L1-to-L2 bus and to the
            # shared memory bus.
            system.l2[i].cpu_side = system.tol2bus[i].master
            system.l2[i].mem_side = system.membus.slave
            system.cpu[i].connectAllPorts(system.tol2bus[i], system.membus)
        else:
            system.cpu[i].connectAllPorts(system.membus)

    return system
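
In case it is useful, below is roughly the script I use to pull the per-core
numbers out of stats.txt and compute IPC. It is only a sketch: the stat names
(committedInsts_total, numCycles), the switch_cpus naming, and the
m5out/stats.txt path are what my setup produces and may need adjusting.

import re

def per_core_ipc(stats_path):
    # Collect committed instructions and cycles per switch CPU from a gem5
    # stats.txt dump, then return IPC = insts / cycles for each core.
    insts, cycles = {}, {}
    stat_re = re.compile(
        r"system\.switch_cpus(\d*)\.(committedInsts_total|numCycles)\s+(\d+)")
    for line in open(stats_path):
        m = stat_re.match(line)
        if not m:
            continue
        cpu = int(m.group(1)) if m.group(1) else 0
        if m.group(2) == "committedInsts_total":
            insts[cpu] = int(m.group(3))
        else:
            cycles[cpu] = int(m.group(3))
    return dict((cpu, float(insts[cpu]) / cycles[cpu])
                for cpu in insts if cpu in cycles)

# e.g. per_core_ipc("m5out/stats.txt") -> {0: 0.718..., 1: 0.720..., ...}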
Best,
Fulya
On Mon, Nov 4, 2013 at 10:35 PM, biswabandan panda <[email protected]> wrote:
> Hi,
> Could you report the number of committedInsts for both cases?
>
>
> On Tue, Nov 5, 2013 at 7:04 AM, fulya <[email protected]> wrote:
>
>> In the single-core case, there is a 1 MB L2 cache. In the 4-core case, each
>> core has its own private 1 MB L2 cache. As they are not shared, I don't
>> understand the reason for the different cache miss rates.
>>
>> Best,
>> Fulya Kaplan
>>
>> On Nov 4, 2013, at 7:55 PM, "Tao Zhang" <[email protected]> wrote:
>>
>> Hi Fulya,
>>
>>
>>
>> What's the L2 cache size of the 1-core test? Is it equal to the total
>> capacity of the 4-core case? The stats indicate that the 4-core test has a
>> lower L2 cache miss rate, which may be the reason for the IPC improvement.
>>
>>
>>
>> -Tao
>>
>>
>>
>> *From:* [email protected]
>> [mailto:[email protected]]
>> *On Behalf Of *Fulya Kaplan
>> *Sent:* Monday, November 04, 2013 10:20 AM
>> *To:* gem5 users mailing list
>> *Subject:* [gem5-users] Weird IPC statistics for Spec2006 Multiprogram
>> mode
>>
>>
>>
>> Hi all,
>>
>> I am running Spec 2006 on X86 with the version gem5-stable-07352f119e48.
>> I am using multiprogram mode with syscall emulation. I am trying to compare
>> the IPC statistics for 2 cases:
>>
>> 1) Running benchmark A on a single core
>>
>> 2) Running 4 instances of benchmark A on a 4-core system with 1 MB private
>> L2 caches
>>
>> All parameters are the same for the 2 runs except the number of cores.
>>
>> I am expecting some IPC decrease in the 4-core case, as the cores share
>> the same system bus. However, for the CactusADM and Soplex benchmarks, I
>> see higher IPC for case 2 than for case 1.
>>
>> I look at the same phase of execution for both runs: I fast-forward for 2
>> billion instructions and grab the IPC of each core over the next 100
>> million instructions in detailed mode.
>>
>> I'll report some other statistics for CactusADM to give a better idea of
>> what is going on.
>>
>> Case 1: ipc=0.664141, L2_overall_accesses=573746, L2_miss_rate=0.616
>>
>> Case 2: cpu0_ipc=0.718562, cpu1_ipc= 0.720464, cpu2_ipc=0.717405,
>> cpu3_ipc= 0.716513
>>
>> L2_0_accesses=591607, L2_1_accesses=581846,
>> L2_2_accesses=568095, L2_3_accesses=561180, L2_0_missrate=0.452978,
>> L2_1_missrate=0.454510, L2_2_missrate=0.475646, L2_3_missrate=0.488171
>>
>>
>>
>> Case 1:Running Time for 100M insts = 0.0716 sec
>>
>> Case 2:Running Time for 100M insts = 0.066273 sec
>>
>>
>>
>> Do you have any idea what could be causing this? Is this actually a
>> problem, or is it expected for some benchmarks?
>>
>> Best,
>>
>> Fulya Kaplan
>>
>
>
>
> --
>
>
> *thanks & regards*
> *BISWABANDAN*
> http://www.cse.iitm.ac.in/~biswa/
>
> “We might fall down, but we will never lay down. We might not be the best,
> but we will beat the best! We might not be at the top, but we will rise.”
>
>
>
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users