[gem5-users] Branch predictor

2014-11-13 Thread Adrián Colaso Diego via gem5-users
Hi all,

I have observed that call and return instructions are predicted as if
they were conditional branches. These predictions even modify the
branch history register.
As a consequence, these instructions are often predicted not-taken,
and the squash function must be called after they execute.

Moreover, when a call instruction is predicted taken and no target
address is found in the BTB, the prediction is changed to not-taken and
the RAS entry associated with the call is removed. The squash function
is called after the call executes to update the BTB, but the RAS is not
updated, so the return instruction will get a wrong address.

Am I missing something?

Thanks,
Adrian




___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users


[gem5-users] Hit time simulation

2014-11-13 Thread Kunal Ray via gem5-users
Hi,
Does gem5 also simulate (estimate) cache hit times?

Re: [gem5-users] Fwd: Sharing L2 cache

2014-11-13 Thread Kumail Ahmed via gem5-users
Hello Yujie,

Here is the code I wrote for enabling private L2 caches. Unfortunately, I
couldn't find the CacheConfig.py in which I shared the L2 caches evenly,
but I did that in the same way.

I hope this helps you.

Best regards,
Kumail Ahmed
TU Kaiserslautern,
Germany

On Thu, Nov 13, 2014 at 2:19 AM, Yujie Chen cyjseag...@163.com wrote:

 Hello Kumail,
 I modified CacheConfig.py as you suggested. Unfortunately,
 when I run the command:
 build/X86/gem5.opt  \
 --stats-file=DRAM_state.log   \
 configs/example/updated_se.py  \
 --mem-type=LPDDR2_S4_1066_x32 \
 --caches \
 --cpu-type=detailed --num-cpus=4   \
 --l1i_size 32kB --l1d_size 32kB  \
 --l2cache --l2_size 4MB\
 --l3cache --l3_size 8MB  \
 -c cyj_test/bechmarks/hello
 I got these warnings:
 warn: add_child('l2'): child 'l2' already has parent
 warn: add_child('tol2bus'): child 'tol2bus' already has parent
 warn: add_child('l2'): child 'l2' already has parent
 warn: add_child('tol2bus'): child 'tol2bus' already has parent
 *and soon afterwards it exited unexpectedly*. I have no idea what happened.
 Can you explain in more detail, or show me your configuration file?
 My configuration file is attached. Thanks for your response ^_^.


 best wishes to you!

 --
 YujieChen
 School of Computer Science and Technology
 Cluster and Grid Computing Lab
 Services Computing Technology and System Lab
 Huazhong University of Science and Technology
 Wuhan, 430074, China
 Email:yujiechen_h...@163.com

 At 2014-11-12 22:36:24, Kumail Ahmed kumai...@gmail.com wrote:

 Hello Yujie,

 In the for-loop that allocates the CPUs, you can use the modulus operator
 to distribute the L2 caches evenly.

 You will also need to instantiate the private L1 caches in a way that lets
 you connect them to the system.tol2bus.
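
 A minimal sketch of the index arithmetic behind this suggestion, in plain
 Python with no gem5 imports. The function names are hypothetical and are not
 taken from the attached CacheConfig.py; the sketch only shows two even ways of
 mapping a CPU index to an L2 index, which a config script could then use to
 decide which L2 bus each CPU's L1 caches connect to.

```python
def l2_index_modulo(cpu_id, num_l2):
    """Round-robin mapping: with 2 L2s, CPUs 0 and 2 share L2 0."""
    return cpu_id % num_l2


def l2_index_blocked(cpu_id, num_cpus, num_l2):
    """Blocked mapping: with 4 CPUs and 2 L2s, CPUs 0 and 1 share L2 0."""
    if num_cpus % num_l2 != 0:
        raise ValueError("num_cpus must be a multiple of num_l2")
    return cpu_id // (num_cpus // num_l2)


if __name__ == "__main__":
    # Print both mappings for the 4-CPU, 2-L2 case discussed in the thread.
    for cpu in range(4):
        print(cpu, l2_index_modulo(cpu, 2), l2_index_blocked(cpu, 4, 2))
```

 Both mappings put two CPUs on each L2; they differ only in which CPUs are
 paired.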

 Thank you for your answer :)

 Kumail Ahmed
 TU Kaiserslautern,
 Germany

 On Wed, Nov 12, 2014 at 3:13 PM, Yujie Chen cyjseag...@163.com wrote:

  Hello Kumail,
 You can refer to the original cache configuration file, namely
 configs/common/CacheConfig.py. In the unmodified CacheConfig.py the L2
 cache is shared among all CPUs and the L1 caches are all private. You can
 create an L3 cache and an L3 cache bus based on BaseCache and
 CoherentBus, and connect the L3 cache's cpu_side and mem_side ports. You
 can find a detailed answer on the Internet.
 By the way, I am also unsure how to share two L2 caches between four
 cores, so that two CPUs share one L2 cache. You said you got it working;
 how did you do it? I would appreciate a reply. Thanks very much!




 --
 YujieChen
 School of Computer Science and Technology
 Cluster and Grid Computing Lab
 Services Computing Technology and System Lab
 Huazhong University of Science and Technology
 Wuhan, 430074, China
 Email:yujiechen_h...@163.com

 At 2014-11-11 21:01:32, Kumail Ahmed via gem5-users 
 gem5-users@gem5.org wrote:

 Thank you very much andreas :) I did it!

 Can you tell me how I can add an L3 cache in the classic memory model? Do I
 have to create a new L3 cache class and share it on the L2 bus?

 Thanks again,
 Kumail Ahmed
 Masters Student
 TU Kaiserslautern, Germany

 On Tue, Nov 11, 2014 at 1:56 PM, Andreas Hansson andreas.hans...@arm.com
  wrote:

  Hi Kumail,

  The crossbar in gem5 supports address striping, so you can create a
 “toL2Bus” that interleaves between two L2 caches. Have a look at
 config/common/MemConfig.py for how the interleaving is configured (for the
 memory channels). You should be able to do something similar.
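
 To illustrate what address striping means here, the following is a plain
 Python model of the arithmetic only. It does not use gem5's address-range
 classes, and the function name, the cache-line granularity, and the default
 values are assumptions made for the example: consecutive cache lines are
 spread across the L2 caches by the address bits just above the line offset.

```python
def l2_for_address(addr, num_l2=2, line_size=64):
    """Return the index of the L2 cache that would own this address.

    Consecutive cache lines alternate between the L2 caches, so the
    selection uses the address bits just above the line offset.
    """
    if num_l2 < 1 or num_l2 & (num_l2 - 1):
        raise ValueError("num_l2 must be a power of two")
    if line_size < 1 or line_size & (line_size - 1):
        raise ValueError("line_size must be a power of two")
    return (addr // line_size) % num_l2


if __name__ == "__main__":
    # Print the owning L2 for the first four cache lines.
    for addr in (0, 64, 128, 192):
        print(hex(addr), l2_for_address(addr))
```

 With this kind of striping every CPU can reach both L2 caches through one
 bus, and each cache only ever sees the addresses that map to it.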

  Andreas

   From: Kumail Ahmed via gem5-users gem5-users@gem5.org
 Reply-To: Kumail Ahmed kumai...@gmail.com, gem5 users mailing list 
 gem5-users@gem5.org
 Date: Tuesday, 11 November 2014 10:45
 To: gem5-users@gem5.org gem5-users@gem5.org
 Subject: [gem5-users] Sharing L2 cache

  Hello,

  How can I share two L2 caches between 4 CPU cores in gem5? I guess I
 have to change the code in CacheConfig.py.

  Can someone help me with this?

  Thanks,
 Kumail Ahmed











Re: [gem5-users] Branch predictor

2014-11-13 Thread Mitch Hayenga via gem5-users
Hi,

I suspect one of a few things might be behind what you are seeing.


1) "I have observed that call and return instructions are predicted as if
these instructions were conditional branches."

The ARMv7 ISA actually does have conditional calls and returns (a
consequence of letting almost any instruction be predicated).  The
disassembly generated by gem5 often lacks these additional conditional
specifiers, even though the instructions are executed properly.

If this is some other ISA (like x86), I'd verify that the
IsCondCtrl/IsUncondCtrl flags on the given instruction are properly set.
This could be an ISA-specific error.  The following code in the predictor
is supposed to guard against that case, but it's up to the ISA implementer
to do things right.

    if (inst->isUncondCtrl()) {
        DPRINTF(Branch, "[tid:%i]: Unconditional control.\n", tid);
        pred_taken = true;
        // Tell the BP there was an unconditional branch.
        uncondBranch(bp_history);
    } else {
        // Only conditional control instructions consult the direction
        // predictor (and are counted in condPredicted).
        ++condPredicted;
        pred_taken = lookup(pc.instAddr(), bp_history);

        DPRINTF(Branch, "[tid:%i]: [sn:%i] Branch predictor"
                " predicted %i for PC %s\n", tid, seqNum, pred_taken, pc);
    }




2) "Moreover, when a call instruction is predicted taken and no target
address is found in BTB the prediction is changed to not-taken and RAS
entry associated with this call is removed."

The RAS maintenance is a fun bit of code in the predictor.  Bugs have
been discovered and fixed in it recently.  It seems the fix is just to
push the proper new RAS entry in the squash function for the call.  This
bug has probably survived so long because it is unlikely to be exercised
often (since calls will normally hit in the BTB).
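
The scenario and the proposed fix can be illustrated with a toy model in plain
Python. This is not gem5's predictor code: the class, its methods, and the
`repush_ras` switch are hypothetical names invented for the example, and the
model only tracks a BTB dictionary and a RAS list.

```python
class ToyPredictor:
    """Tiny BTB + RAS model of the call/return case discussed above."""

    def __init__(self):
        self.btb = {}  # call PC -> predicted target
        self.ras = []  # stack of predicted return addresses

    def predict_call(self, pc, return_addr):
        """Predict a call; returns (pred_taken, target)."""
        self.ras.append(return_addr)
        target = self.btb.get(pc)
        if target is None:
            # BTB miss: fall back to not-taken and undo the RAS push.
            self.ras.pop()
            return False, None
        return True, target

    def squash_call(self, pc, actual_target, return_addr, repush_ras):
        """Repair predictor state once the call has resolved as taken."""
        self.btb[pc] = actual_target
        if repush_ras:
            # The fix suggested above: restore the RAS entry on squash.
            self.ras.append(return_addr)

    def predict_return(self):
        """Pop the predicted return address, or None if the RAS is empty."""
        return self.ras.pop() if self.ras else None


def return_prediction_after_btb_miss(repush_ras):
    """Run one call that misses in the BTB, then predict its return."""
    bp = ToyPredictor()
    bp.predict_call(pc=0x1000, return_addr=0x1004)
    bp.squash_call(0x1000, 0x2000, 0x1004, repush_ras)
    return bp.predict_return()
```

Without the re-push the return has no RAS entry to use; with it the return is
predicted to the instruction after the call.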




[gem5-users] Interrupting execution in GEM5

2014-11-13 Thread Kumail Ahmed via gem5-users
Hello,

How can I interrupt a running program partway through its execution in SE mode?

Best regards,
Kumail

Re: [gem5-users] How to compile gem5 to statically linked program?

2014-11-13 Thread Steve Reinhardt via gem5-users
You'll probably have to modify the linker flags toward the bottom of
src/SConscript.

Steve


On Wed, Nov 12, 2014 at 4:44 PM, ni...@outlook.com via gem5-users 
gem5-users@gem5.org wrote:

 Thanks. Do you mean adding this in the SConstruct file or just on the command line?

 By the way, I tried adding the -static option to CCFLAGS, but it did not
 work as I expected.

 --
 nifan


 *From:* Kumail Ahmed kumai...@gmail.com
 *Date:* 2014-11-13 00:19
 *To:* ni...@outlook.com; gem5 users mailing list gem5-users@gem5.org
 *Subject:* Re: [gem5-users] How to compile gem5 to statically linked
 program?
 Hello Nifan,

 Try using the -shared flag in SCons.


 I haven't tried it myself yet.

 On Wed, Nov 12, 2014 at 10:14 AM, ni...@outlook.com via gem5-users 
 gem5-users@gem5.org wrote:

 Hi, all:
 Does anyone know how to compile gem5 statically so it can run on other
 machines without re-compilation?

 Thanks in advance.

 --
 nifan


[gem5-users] Threading in Gem5

2014-11-13 Thread Thom Popovici via gem5-users
Hi! 

I am new to the gem5 world and I have some questions about simulating
multi-threaded applications.

1. How does one pin the application's threads to the cores of the
simulator? I do not want SMT; I want a multi-core system in which each
application thread runs on its own core.

2. I want the same thing with the threads, but now on an SMT core. I want
to measure contention.

3. Is there a way to remove the caches from the memory hierarchy? I want
the processor to go directly to main memory.

Thanks



[gem5-users] DRAMCTRL: Seeing an unusual behaviour with FR-FCFS scheduler

2014-11-13 Thread Prathap Kolakkampadath via gem5-users
Hi Users,

For the following scenario:


Read0 Read1 Read2 Read3 Read4 Read5 Read6 Read7 Read8 Read9 Read10 Read11

There are 12 reads in the read queue, numbered in order of arrival.
Read0 to Read3 access the same row of Bank1, Read4 accesses Bank0, Read5 to
Read8 access the same row of Bank2, and Read9 to Read11 access the same row
of Bank3.

According to the FR-FCFS scheduler, even though there is only a single
request (Read4) to Bank0, it should be scheduled right after Read0 to Read3.
Within the window of Read0-Read3, Read4 would have completed its precharge
and activate and be ready to schedule. Although Read5 and Read9 are also
ready, Read4 should be scheduled as the next row hit, according to FCFS.

However, I see different behaviour: Read4 is scheduled only after all the
other row hits to Bank2 and Bank3 have been scheduled. I also noticed from
the debug prints that Read4 never becomes a row_hit.

Are we failing to mark the read as a row hit after the precharge and
activate? I am trying to figure this out. Is my understanding correct?
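
For reference, the ordering I expect can be written down as a simplified
FR-FCFS model in plain Python. This is only a sketch of the policy as I
understand it, not the DRAMCtrl implementation: it ignores all timing
(precharge/activate latency, bus constraints), the function name is made up,
and the row number 0 used for every request is an assumption.

```python
def fr_fcfs_order(queue):
    """Return the service order for (name, bank, row) requests.

    First-ready: serve the oldest request whose row is already open in
    its bank. If there is no such row hit, fall back to FCFS and serve
    the oldest request in the queue.
    """
    pending = list(queue)
    open_row = {}  # bank -> row currently open in that bank
    order = []
    while pending:
        pick = next((r for r in pending if open_row.get(r[1]) == r[2]), None)
        if pick is None:
            pick = pending[0]
        pending.remove(pick)
        open_row[pick[1]] = pick[2]  # serving a request leaves its row open
        order.append(pick[0])
    return order


# The queue from this message: (name, bank, row), in arrival order.
reads = ([("Read%d" % i, 1, 0) for i in range(0, 4)]
         + [("Read4", 0, 0)]
         + [("Read%d" % i, 2, 0) for i in range(5, 9)]
         + [("Read%d" % i, 3, 0) for i in range(9, 12)])

if __name__ == "__main__":
    print(fr_fcfs_order(reads))
```

In this model Read4 is served immediately after Read3, because once the Bank1
row hits are exhausted no other request is a row hit and the oldest request
wins.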

Thanks,
Prathap

[gem5-users] Gem5 freezing with x86 timing cpu

2014-11-13 Thread George Michelogiannakis via gem5-users
Hi all,

I have this weird problem where gem5 starts executing a benchmark (some parsec 
and some others) or starts booting in FS mode but then just freezes. I was 
wondering how you would recommend getting to the bottom of this. Can I figure 
out what it's doing? I know I can attach a debugger but this happens after many 
hours so with the debugger it'll be even slower. Also, I'm not sure if gdb's 
backtrace function will really help me.

This happens with the latest development and stable versions.

Here is how I try to run gem5 for FS (similar for SE):

./build/X86_MESI_Two_Level/gem5.opt --remote-gdb-port=0 
--outdir=results/$1_none_x86_dir --dump-config=$1_config --redirect-stdout 
--redirect-stderr --stdout-file=m5sim.out --stderr-file=$1_err 
--stats-file=$1_stats configs/example/fs.py --script=./runscripts/$1 
--disk-image=all_benchmarks.img --kernel=x86_64-vmlinux-2.6.28.4-smp 
--cpu-type=timing --caches --prefetcher-type=none --l2cache --sys-clock=2GHz 
--cpu-clock=2GHz --num-dirs=8 --num-l2caches=8 --num-l3caches=4 --l2_size=4MB 
--l3_size=32MB --l1d_assoc=2 --l1i_assoc=2 --l2_assoc=8 --l3_assoc=16 
--mem-type=DDR3_1600_x64 --cacheline_size=64 --checkpoint-dir=checkpoints 
--num-cpus=64

I tried running with atomic CPUs and that seems to work, but with x86 I can't
restore with a different CPU type nor switch CPUs during execution, as I
understand the current limitations. Plus, I need timing for my SE runs.

The only modification I made is to move the prefetcher to the L2, by
editing CacheConfig.py as follows:

# Assumes sys, StridePrefetcher and TaggedPrefetcher are already in scope
# in CacheConfig.py.
if options.prefetcher_type == "stride":
    system.l2.prefetch_on_access = 'true'
    system.l2.prefetcher = StridePrefetcher(degree=8, latency=1)
elif options.prefetcher_type == "tagged":
    system.l2.prefetch_on_access = 'true'
    system.l2.prefetcher = TaggedPrefetcher(degree=8, latency=1)
elif options.prefetcher_type == "none":
    system.l2.prefetch_on_access = 'false'
else:
    print("Unknown Prefetcher Type")
    sys.exit(1)


Freezing happens with all prefetchers though. Notice that I don't use Ruby.

Thank you in advance,
  George M