Re: [gem5-users] Errors on compiling gem5.fast

2014-08-26 Thread Andreas Hansson via gem5-users
Hi Nishant,

That's unfortunate. You can always compile without LTO by adding --no-lto
to the scons command line.
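For example, assuming an x86 build (adjust the build directory to your
ISA):

  scons build/X86/gem5.fast --no-lto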

I suspect it is an issue with your compiler. Could you verify that it is
indeed built with LTO support (and linker-plugin support)?
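One quick sanity check, assuming a GNU toolchain: look for LTO in the
compiler's configure line, and for plugin support in the linker:

  g++ -v 2>&1 | grep -i lto
  ld --help | grep -i plugin

-fuse-linker-plugin needs a linker built with plugin support (GNU ld 2.21
or newer, or gold), so if the second command prints nothing, that would
explain the error.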

Andreas

On 8/26/14, 2:49 AM, Nishant Borse via gem5-users <gem5-users@gem5.org>
wrote:

Hi,

I am unable to build gem5.fast; it terminates with the following error:

g++: error: -fuse-linker-plugin is not supported in this configuration

I am using gcc/4.8.2. I ran into a similar error (for --plugin) with
gcc/4.6.3. Is anyone aware of why I would be running into this while
compiling gem5.fast? I was able to build gem5.opt without any such issues.



Thanks,
Nishant Borse


Re: [gem5-users] Gem5 on multiple cores

2014-08-26 Thread Hussain Asad via gem5-users
Thank you, Andreas
*moved to gem5-users :)


On Tue, Aug 26, 2014 at 8:39 AM, Andreas Hansson andreas.hans...@arm.com
wrote:

  Hi Hussain,

  I’d suggest asking on the gem5-users list for everyone’s benefit.

  Multi-threading invariably comes at a cost, and if you want to run, say,
 10 experiments, they are embarrassingly parallel. As one of the main
 purposes of gem5 is design-space exploration, most users will be running
 tens or hundreds of experiments. Thus, instead of making gem5
 multi-threaded and “throwing performance away”, it is kept efficient as a
 single-threaded simulator, and I suggest running your experiments in
 parallel to make use of your many cores/servers etc.
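 For instance, a batch of independent runs can simply be launched side by
 side from the shell (the paths and workloads here are just illustrative):

   ./build/ARM/gem5.opt configs/example/se.py -c bench_a &
   ./build/ARM/gem5.opt configs/example/se.py -c bench_b &
   wait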

  Andreas

   From: Hussain Asad x7xcloudstr...@gmail.com
 Date: Tuesday, 26 August 2014 04:13
 To: Andreas Hansson andreas.hans...@arm.com
 Subject: Gem5 on multiple cores

 Hi Andreas,

  I have a quick question. I am running a gem5 build on a Core i7 system,
 but gem5 uses just one of the available 8 cores (4 cores + 4 threads).

  Is this feature not yet implemented, or am I not compiling the system
 correctly? I would assume that if it were using all my CPU cores, the
 simulation would be much faster.

  I am running gem5 on Ubuntu 14 LTS on a Core i7 with 8GB of RAM at the
 moment. Should I move to the university servers? Would it be faster on a
 server system?

  Thanks
  Best Regards
  Hussain


Re: [gem5-users] Gem5 on multiple cores

2014-08-26 Thread Steve Reinhardt via gem5-users
I'll mention that gem5 does have the foundation for parallelizing a single
simulation across multiple cores; see for example
http://repo.gem5.org/gem5/rev/2cce74fe359e.  However, if you want to model
a non-trivial configuration (i.e., one where there is communication between
threads), then you have to insert synchronization, and that does limit your
speedup, as Andreas has mentioned.

Steve


On Tue, Aug 26, 2014 at 3:03 AM, Hussain Asad via gem5-users
<gem5-users@gem5.org> wrote:

 Thank you, Andreas
 *moved to gem5-users :)




[gem5-users] O3 fetch throughput when i-cache hit latency is more than 1 cycle

2014-08-26 Thread Amin Farmahini via gem5-users
Hi,

Looking at the code for the fetch unit in O3, I realized that the fetch
unit does not take advantage of non-blocking i-caches. The fetch unit does
not initiate a new i-cache request while it is waiting for an i-cache
response. Since the fetch unit in O3 does not pipeline i-cache requests,
fetch throughput drops significantly when the i-cache hit latency is more
than 1 cycle. I expected that the fetch unit would be able to initiate a
new i-cache request each cycle (based on the BTB target or the next
sequential fetch address) even while waiting for earlier i-cache responses.
Any thoughts on this?

I understand a large fetch buffer can mitigate this to some degree...
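If I am reading the O3 code right, the knob for that would be the fetch
buffer size parameter on the O3 CPU model (fetchBufferSize, in bytes, in
the 2014-era code), e.g. cpu.fetchBufferSize = 64; treat the exact name as
my assumption.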

Thanks,
Amin

Re: [gem5-users] O3 fetch throughput when i-cache hit latency is more than 1 cycle

2014-08-26 Thread Mitch Hayenga via gem5-users
Yep,

I've thought of the need for a fully pipelined fetch as well. However, my
current method is to fake longer instruction cache latencies by leaving the
cache delay at 1 cycle and making up for it by adding extra fetchToDecode
delay. This makes the front-end latency and the branch mispredict penalty
the same (for branches resolved at decode as well as at execute).

I haven't yet seen a case where adding this latency later, to make up for
the lack of real instruction cache latency, makes much of a difference.
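
A minimal sketch of that workaround in a config script, assuming the
2014-era DerivO3CPU parameter names (the value is illustrative):

  from m5.objects import DerivO3CPU

  cpu = DerivO3CPU()
  # keep the i-cache hit latency at 1 cycle in the cache config, and
  # charge the remaining fetch latency between fetch and decode instead
  cpu.fetchToDecodeDelay = 3  # emulates a 4-cycle i-cache hit (1 real + 3 faked)

Since the extra cycles sit on the fetch-to-decode path, branches resolved
at decode and at execute both see the same added penalty.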





[gem5-users] Kernel version vs Gem5 version

2014-08-26 Thread Fulya via gem5-users
Hi all,
It seems like using the kernel version x86_64-vmlinux-2.6.22.9.smp may have
solved my problem that was posted in this thread:
http://www.mail-archive.com/gem5-users@gem5.org/msg10387.html
However, I am using the latest gem5 version, gem5-stable-aaf017eaad7d, and I
only tested the atomic CPU, without any checkpointing or fast-forwarding.
Are there any problems related to using an older kernel version (such as
x86_64-vmlinux-2.6.22.9.smp) with gem5?
Best,
Fulya


[gem5-users] How to add shared nonblocking L3 cache in gem5?

2014-08-26 Thread Prathap Kolakkampadath via gem5-users
Hi Users,


I am new to gem5, and I want to add a non-blocking shared last-level cache
(L3). I can see L3 cache options in Options.py with default values set;
however, there is no entry for L3 in Caches.py or CacheConfig.py.

So would extending Caches.py and CacheConfig.py be enough to create an L3
cache?
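
This is roughly what I have in mind, modeled on the existing L1/L2 classes
(parameter names follow the 2014-era classic cache model, and all names and
values below are illustrative, not a tested recipe):

  from m5.objects import BaseCache, CoherentBus

  class L3Cache(BaseCache):
      size = '4MB'
      assoc = 16
      hit_latency = 20
      response_latency = 20
      mshrs = 32          # outstanding misses; a non-zero MSHR count is
                          # what makes the cache non-blocking
      tgts_per_mshr = 12

  # and in CacheConfig.py, by analogy with the L2 hookup:
  # system.l3 = L3Cache()
  # system.tol3bus = CoherentBus()
  # system.l3.cpu_side = system.tol3bus.master
  # system.l3.mem_side = system.membus.slave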


Thanks,
Prathap

Re: [gem5-users] O3 fetch throughput when i-cache hit latency is more than 1 cycle

2014-08-26 Thread Amin Farmahini via gem5-users
Thanks for the response, Mitch. It seems like a nice way to fake a
pipelined fetch.

Amin



