Thanks a lot for pointing me to the right place for memory statistics. It was a
great help!
I have more questions about those memory statistics.
I could find the latency for dcache, icache, dtlb and itlb. But I could not
find any latency information for L2, L3 and main memory. I believe I enabled
them in the configuration, as shown in the config dump in the previous post. Did
I make a mistake in the configuration? Or do I have to change the code to enable
the statistics? I am also interested in MESI statistics. Could you please let me
know how I could enable all those statistics?
(1) L2 statistics from the generated statistics file: no latency information
......
L2 {
  lat_count { (zero) }
  mesi_stats { (zero) }
  snooprequest { (zero) }
  annul = 0; { (zero) }
  latency { (zero) }
  cpurequest {
    stall (total 684059) {
      [ 91.4% ] write (total 625174) {
        [ 0.5% ] cache_port = 3175;
        [ 0.0% ] buffer_full = 0; { (zero) }
        [ 90.9% ] dependency = 621999;
      }
      [ 8.6% ] read (total 58885) {
        [ 0.0% ] cache_port = 79;
        [ 0.0% ] buffer_full = 0; { (zero) }
        [ 8.6% ] dependency = 58806;
      }
    }
    redirects = 0; { (zero) }
    count (total 1377736) {
      [ 19.3% ] miss (total 266134) {
        [ 15.1% ] write = 208036;
        [ 4.2% ] read = 58098;
      }
      [ 80.7% ] hit {
        write (total 647462) {
          [ 100% ] hit (total 647462) {
            [ 0.0% ] forward = 0; { (zero) }
            [ 100% ] hit = 647462;
          }
        }
        read (total 464140) {
          [ 100% ] hit (total 464140) {
            [ 0.0% ] forward = 0; { (zero) }
            [ 100% ] hit = 464140;
          }
        }
      }
    }
  }
  queueFull = 0; { (zero) }
}
......
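As a sanity check on the hit/miss part of the dump above, the percentages can be recomputed from the raw counts. This is only a small sketch with the numbers copied by hand from the "count" node; the variable names are mine and are not marss identifiers.

```python
# Counts copied by hand from the L2 "count" node in the dump above.
write_miss, read_miss = 208036, 58098
write_hit, read_hit = 647462, 464140

misses = write_miss + read_miss
hits = write_hit + read_hit
total = hits + misses


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


miss_pct = pct(misses, total)
hit_pct = pct(hits, total)

print("total requests:", total)
print("miss %:", miss_pct, " hit %:", hit_pct)
print("write miss %:", pct(write_miss, total), " read miss %:", pct(read_miss, total))
```

The recomputed values agree with the percentages printed in the dump, so the hit/miss counters look consistent; only the latency, MESI and snoop nodes are reported as zero.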
(2) L3 statistics from the generated statistics file: no information
......
L3 { (zero) }
......
(3) No main memory statistics found in the file.
Thanks a lot!!!
HZ
________________________________
From: avadh patel <[email protected]>
To: Hz Xes <[email protected]>
Cc: marss86 <[email protected]>
Sent: Sat, July 31, 2010 8:44:21 AM
Subject: Re: [marss86-devel] Why can't I get cache/memory latecny statistics?
On Thu, Jul 29, 2010 at 5:12 PM, Hz Xes <[email protected]> wrote:
>Hi,
>
>Thank you very much for the reply; it helped me solve the problems I had
>running the simulator. Now that I have run it, I have a couple of questions
>about the collected statistics.
>
>Background:
>I built “qemu-system-x86_64” in a single-core configuration and ran it with the
>default “run_bench.py” script on the “blackscholes” checkpoint built by
>“create_checkpoint.py”.
>
>The running configuration I used is:
>'simconfig -stats ./%s.stats -logfile ./%s.log -run -stopinsns 20m -corefreq
>4000m -use-memory-hierarchy -use-shared-L3 -use-new-memory-system
>-use-memory-model -kill-after-run'
>
>Questions:
>
>(1)
>After the 20 million instructions completed (the simulation stopped and exited
>successfully), why couldn’t I get cache latency information, even though
>miss/hit ratio information is available?
>
>In the generated human-readable statistics file, I could not find latency
>information for any cache level or for main memory. I did not change any
>hard-coded configuration in “cacheConstant.h”. Did I make a mistake in the
>configuration or somewhere else when running the simulation?
It's not a mistake on your side; the memory stats printed in the log file are
invalid and should not be used. That is an old implementation which will be
removed in the future. To get the statistics you have to use 'ptlstats' on the
'stats' file generated by marss.
Just run './ptlstats stats_file' and it should dump a lot of data. In that
output, if you look at 'dcache_latency', it shows a histogram of dcache latency
and also the average latency. You can find some more interesting stats for the
L1 and L2 caches in there.
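Since the dump is long, it can help to filter it for the nodes of interest. Below is a minimal Python sketch, not part of marss: it assumes only the './ptlstats stats_file' invocation mentioned above and that the dump is plain text on standard output. The helper names dump_stats and find_lines are hypothetical.

```python
import subprocess


def dump_stats(stats_file, ptlstats="./ptlstats"):
    """Run ptlstats on a stats file and return its text output.

    Assumes ptlstats takes the stats file as its only argument and
    writes the dump to stdout.
    """
    result = subprocess.run(
        [ptlstats, stats_file],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def find_lines(dump_text, keyword="latency"):
    """Return (line_number, stripped_line) pairs for lines containing keyword."""
    return [
        (number, line.strip())
        for number, line in enumerate(dump_text.splitlines(), start=1)
        if keyword in line
    ]


# Example use (needs a real stats file and the ptlstats binary):
#   for number, line in find_lines(dump_stats("./blackscholes.stats")):
#       print(number, line)
```

Searching for 'dcache_latency' instead of 'latency' narrows the result to the histogram mentioned above. Nodes that are printed as '{ (zero) }' will still show up as a single line, which makes it easy to see which counters were never updated.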
>Here is what I get at the end of the log file:
>……
>Stopped after 18158219 cycles, 20000000 instructions and 209 seconds of sim time
>(cycle/sec: 86881 Hz, insns/sec: 95693, insns/cyc: 1.1014296060643392)
> kernel-insns 4.4962
> user-insns 95.5038
> kernel-cycles 13.2735
> user-cycles 86.7265
> total_uop 25808761
> total_load 0 <-- why?
> load_percentage 0 <-- why?
> total_store 0 <-- why?
> store_percentage 0 <-- why?
> total_branch 0 <-- why?
> branch_percentage 0 <-- why?
> branch-accuracy 95.2523
> elapse_seconds 209
> CPS 86881
> IPS 95693
> total_cycle 18158219
> per_vcpu_IPC 1.10143
> total_IPC 1.10143
> L1I_hit_rate 99.3908
> L1D_hit_rate 99.383
> L2_hit_rate 67.6867
> L1I_IF_latency -nan <-- why?
> L1D_load_latency -nan <-- why?
> L1_read_latency -nan <-- why?
> L1D_store_latency -nan <-- why?
> L2_IF_latency -nan <-- why?
> L2_load_latency -nan <-- why?
> L2_read_latency -nan <-- why?
> L2_store_latency -nan <-- why?
> L1_read_miss_latency 0 <-- why?
> L1_write_miss_latency 0 <-- why?
>
>
>Here is the configuration dumped at the beginning of the log file:
>……
>Configuration changed: Active parameters:
> -help disabled
> -domain 18446744073709551615
> -run enabled
> -stop disabled
> -native disabled
> -kill disabled
> -flush disabled
> -switch disabled
> -core ooo
> -quiet disabled
> -logfile ./blackscholes.log
> -loglevel 0
> -startlog 0
> -startlogrip 18446744073709551615
> -consolelog disabled
> -bootlog disabled
> -logbufsize 524288
> -logfilesize 67108864
> -dump-state-now disabled
> -abort-at-end disabled
> -mm-ptl_logfile
> -mm-logbuf-size 16384
> -mm-log-inline disabled
> -mm-validate disabled
> -screenshot
> -log-user-only disabled
> -ringbuf disabled
> -ringbuf-size 32768
> -flush-events disabled
> -ringbuf-trigger-rip 18446744073709551615
> -ringbuf-trigger-virt-start 0
> -ringbuf-trigger-virt-end 0
> -stats ./blackscholes.stats
> -snapshot-cycles infinity
> -snapshot-now
> -startrip 18446744073709551615
> -stopinsns 20 M
> -stopcycle infinity
> -stopiter infinity
> -stoprip 18446744073709551615
> -stop-at-marker infinity
> -stop-at-marker-hits infinity
> -stopinsns-rel infinity
> -bbinsns 65536
> -flushevery infinity
> -kill-after-run enabled
> -event-record
> -event-record-stop disabled
> -event-replay
> -corefreq 4 G
> -timerfreq 100
> -pseudo-rtc disabled
> -realtime disabled
> -maskints disabled
> -console-mfn 0
> -pause disabled
> -perfctr
> -force-native disabled
> -kill-after-finish disabled
> -exit-after-finish disabled
> -validate disabled
> -validate-start-cycle 0
> -perfect-cache disabled
> -dumpcode test.dat
> -dump-at-end disabled
> -overshoot-and-dump disabled
> -bbdump
> -L1-IP-based-prefetch disabled
> -L2-IP-based-prefetch disabled
> -L1-nextline-prefetch disabled
> -L2-nextline-prefetch disabled
> -use_GHB_prefetcher disabled
> -stride-prefetcher disabled
> -distance-prefetcher disabled
> -prefetch-degree 1
> -use-memory-model enabled
> -wait-all-finished disabled
> -perfect-L2 disabled
> -prefetch-own-line disabled
> -use-new-memory-system enabled
> -verify-cache disabled
> -comparing-cache disabled
> -trace-memory-updates disabled
> -trace-memory-updates-ptl_logfile ptlsim.mem.log
> -mem-ringbuf disabled
> -mem-ringbuf-size 262144
> -mem-flush-events disabled
> -atomic_bus disabled
> -use-memory-hierarchy enabled
> -number-of-cores 1
> -cores-per-L2 1
> -max-L1-req 16
> -cache-config-type private_L2
> -use-shared-L3 enabled
> -enable-checker disabled
> -checker-startrip 18446744073709551615
> -enable-mongo disabled
> -mongo-server 127.0.0.1
> -mongo-port 27017
> -bench-name unknown
> -db-tags
>
>(2) Second question:
>If I would like to configure new parameters of the memory system, including
>cache and main memory, is “cacheConstant.h” the only place where I have to make
>changes?
This is the correct place.
- Avadh
>Thanks a lot!!! Please let me know.
>
>---HZ
>
>_______________________________________________
>http://www.marss86.org
>Marss86-Devel mailing list
>[email protected]
>https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
>
>