Re: [gem5-users] Increasing TLB size not working for X86 with O3CPU

2018-05-28 Thread Da Zhang
Hi, Jason

Sorry for my unclear description before. For our workload,
the switch_cpus.dtb miss rate for 64 TLB entries is 154654 / 1589214 =
9.74%; the miss rate for 1048576 TLB entries is 154360 / 1583757 = 9.73%.
Both runs use a 20ms warm-up in atomic mode and 2.5ms of real simulation
with the O3CPU. The rates are practically identical and very high, especially
for 1048576 entries with only a 1MB heap size.
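
In case it helps, this is roughly how I pull those numbers out of stats.txt
(just a sketch; it assumes the dtb counters end in "Accesses"/"Misses" the way
system.switch_cpus.dtb.rdAccesses does, so adjust the pattern if your stat
names differ):

    import re
    import sys

    # Sum all switch_cpus.dtb *Accesses and *Misses counters from stats.txt.
    accesses = 0
    misses = 0
    stats_file = sys.argv[1] if len(sys.argv) > 1 else "m5out/stats.txt"
    with open(stats_file) as f:
        for line in f:
            m = re.match(r"(\S+)\s+(\d+)", line)
            if not m or "switch_cpus.dtb" not in m.group(1):
                continue
            name, value = m.group(1), int(m.group(2))
            if name.endswith("Accesses"):
                accesses += value
            elif name.endswith("Misses"):
                misses += value

    if accesses:
        print("dtb miss rate: %d / %d = %.2f%%"
              % (misses, accesses, 100.0 * misses / accesses))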

Any ideas or suggestions? Please let me know if other statistics or config
information would be helpful.

best,
Da

On Mon, May 28, 2018 at 12:03 PM, Jason Lowe-Power 
wrote:

> Hi Da,
>
> "For size > 512, the whole stats.txt is identical."
>
> This isn't surprising. 512 * 4KB = 2MB. So, if your workload is only 1MB,
> then once you have at least 512 entries you are only seeing compulsory (cold)
> misses. Try running larger workloads and/or workloads with more reuse.
>
> Cheers,
> Jason
>
> On Thu, May 24, 2018 at 9:11 AM Da Zhang  wrote:
>
>> I am using FS mode.
>>
>> On Thu, May 24, 2018 at 12:00 PM, Jason Lowe-Power 
>> wrote:
>>
>>> Hi Da,
>>>
>>> Are you using SE mode or FS mode? IIRC, the TLB size does nothing in SE
>>> mode (it doesn't use a TLB). The TLB is only used in FS mode.
>>>
>>> Jason
>>>
>>> On Thu, May 24, 2018 at 8:45 AM Da Zhang  wrote:
>>>
 Hey guys,

 I tried to increase the dtb size (i.e., the number of TLB entries) for our
 research. However, the stats.txt files for the different dtb sizes
 (64, 128, 256, 512, 1024, 2048, 1048576) are practically identical or fully
 identical. For size < 512, the system.switch_cpus.dtb.rdAccesses difference
 is only several hundred. For size > 512, the whole stats.txt is identical.
 I am working with the X86 architecture. I changed the size in X86TLB.py to
 increase the dtb size. By checking the config.ini file, I see the size is
 set as expected (under system.cpu.dtb). Any clue?

 Thanks in advance.

 Best,
 Da


___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

Re: [gem5-users] x86 floating point instruction

2018-05-28 Thread Tariq Azmy
Thanks Jason. I also came across the same document earlier but I just
wanted to ask about this in general.

On Mon, May 28, 2018 at 11:09 AM, Jason Lowe-Power 
wrote:

> Hi Tariq,
>
> It's up to you what you want the latency for SSE instructions to be. It
> depends on what architecture you're simulating. Unfortunately, we currently
> don't have any "known good" configurations for x86 cores so you'll have to
> come up with your own :). Here are some examples of numbers you could use:
> http://www.agner.org/optimize/instruction_tables.pdf
>
> Cheers,
> Jason
>
> On Fri, May 25, 2018 at 2:12 PM Tariq Azmy 
> wrote:
>
>> Hi Gabe, Jason,
>>
>> Do those x86 SIMD SSE arithmetic instructions take only one cycle of
>> latency? I looked into FuncUnitConfig.py and it seems the op latencies for
>> the SIMD functional units are not defined, so I assumed they take a value of 1
>> by default.
>>
>> I am not really familiar with the x86 SIMD extensions, so maybe this question
>> is more related to the x86 ISA in general.
>>
>> Thanks.
>>
>> On Thu, May 24, 2018 at 9:52 AM, Jason Lowe-Power 
>> wrote:
>>
>>> Hi Tariq,
>>>
>>> It would be great if you could review Gabe's patch on Gerrit. Since it
>>> works for you, giving it a +1 or a +2 would be appropriate.
>>>
>>> Cheers,
>>> Jason
>>>
>>> On Wed, May 23, 2018 at 5:56 PM Tariq Azmy 
>>> wrote:
>>>
 Thanks Gabe. Yeah, it does not impact the program; it's just that the
 statistic is incorrect.

 By the way, I applied the patch and the stats now show the correct micro-op
 entries.

 Appreciate your help. Thanks again

 On Wed, May 23, 2018 at 6:51 PM, Gabe Black 
 wrote:

> Yep, those microops aren't given an operand class, so the ISA parser is
> guessing and making them FloatAddOp. I haven't really tested this beyond
> making sure it compiles, but here's a patch that might get this working
> for you.
>
> https://gem5-review.googlesource.com/c/public/gem5/+/10541
>
> Gabe
>
> On Wed, May 23, 2018 at 4:13 PM, Gabe Black 
> wrote:
>
>> I'm confident they aren't implemented with floating point add. It's
>> likely either that the microops are misclassified, or they're unimplemented
>> and printing a warning, but the fact that they don't actually do any math
>> isn't impacting your program for whatever reason. I'll take a quick look.
>>
>> Gabe
>>
>> On Wed, May 23, 2018 at 2:07 PM, Tariq Azmy 
>> wrote:
>>
>>> Hi,
>>>
>>> I wrote simple code that does floating point multiplication and division
>>> operations, and from the assembly I can see there are MULSS and DIVSS
>>> instructions. But after I ran the simulation on gem5 and looked at
>>> stats.txt, I can only see entries in system.cpu.iq.FU_type_0::FloatAdd,
>>> whereas the entries in FloatMul and FloatDiv remain 0.
>>>
>>> If I understand correctly, these stats refer to the micro-ops. Does that
>>> mean the MULSS and DIVSS instructions are broken down and executed with
>>> floating point Add?
>>>
>>> Thanks
>>>
___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

Re: [gem5-users] How to run a dynamically linked executable syscall emulation mode se.py in gem5?

2018-05-28 Thread Ciro Santilli
Thanks, it does work in non-native QEMU user mode with the -L option:
https://github.com/cirosantilli/linux-kernel-module-cheat/tree/b60c6f1b9c8bb7c34cf2c7fbce6f035d11483d4c#qemu-user-mode

On Mon, May 28, 2018 at 5:01 PM, Jason Lowe-Power  wrote:
> Hi Ciro,
>
> As you seem to have figured out, running dynamically linked executables
> has only been tested for x86_64 native platforms. It *is supported* if your
> binary is x86 and your native machine is x86. I'm not sure what it would
> take to get this working for native ARM machines (e.g., simulating ARM and
> running on an ARM native machine) or if you could use QEMU user mode to get
> dynamically linked executables to work on a non-native machine. Both of
> these cases would likely require some re-working of the code.
>
> Cheers,
> Jason
>
> On Sat, May 26, 2018 at 11:56 AM Ciro Santilli 
> wrote:
>>
>>
>> https://stackoverflow.com/questions/5054/how-to-run-a-dynamically-linked-executable-syscall-emulation-mode-se-py-in-gem5
>>
>> After
>> https://stackoverflow.com/questions/48959349/how-to-solve-fatal-kernel-too-old-when-running-gem5-in-syscall-emulation-se-m
>> I managed to run a statically linked hello world under certain
>> conditions.
>>
>> But if I try to run an ARM dynamically linked one against the stdlib with:
>>
>> ./out/common/gem5/build/ARM/gem5.opt
>> ./gem5/gem5/configs/example/se.py -c ./a.out
>>
>> it fails with:
>>
>> fatal: Unable to open dynamic executable's interpreter.
>>
>> How can I make it find the interpreter? Hopefully without copying my
>> cross toolchain's interpreter into my host's root.
>>
>> For x86_64 it works if I use my native compiler, and as expected
>> `strace` says that it is using the native interpreter, but it does not
>> work if I use a cross compiler.
>>
>> The current FAQ says it is not possible to use dynamic executables:
>> http://gem5.org/Frequently_Asked_Questions but I don't trust it, and
>> then these presentations mention it:
>>
>> * http://www.gem5.org/wiki/images/0/0c/2015_ws_08_dynamic-linker.pdf
>> * http://research.cs.wisc.edu/multifacet/papers/learning_gem5_tutorial.pdf
>>
>> but not how to actually use it.
>>
>> QEMU user mode has the `-L` option for that.
>>
>> Tested in gem5 49f96e7b77925837aa5bc84d4c3453ab5f07408e
___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

Re: [gem5-users] x86 floating point instruction

2018-05-28 Thread Jason Lowe-Power
Hi Tariq,

It's up to you what you want the latency for SSE instructions to be. It
depends on what architecture you're simulating. Unfortunately, we currently
don't have any "known good" configurations for x86 cores so you'll have to
come up with your own :). Here are some examples of numbers you could use:
http://www.agner.org/optimize/instruction_tables.pdf
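
For concreteness, a functional-unit description with explicit SIMD latencies
might look something like the sketch below. The opLat values are only
illustrative numbers of the kind you would pull from those tables (not a
validated core model), and it assumes gem5's Simd* op classes and the
FUDesc/OpDesc parameters from FuncUnit.py:

    from m5.objects import FUDesc, OpDesc

    class MySIMDUnit(FUDesc):
        # Illustrative latencies only; tune them for the core you are modeling.
        opList = [
            OpDesc(opClass='SimdFloatAdd',  opLat=3),
            OpDesc(opClass='SimdFloatMult', opLat=5),
            OpDesc(opClass='SimdFloatDiv',  opLat=12),
        ]
        count = 2

You would then reference a unit like this from your FUPool's list of
functional units in place of (or alongside) the default SIMD_Unit.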

Cheers,
Jason

On Fri, May 25, 2018 at 2:12 PM Tariq Azmy  wrote:

> Hi Gabe, Jason,
>
> Do those x86 SIMD SSE arithmetic instructions take only one cycle of
> latency? I looked into FuncUnitConfig.py and it seems the op latencies for
> the SIMD functional units are not defined, so I assumed they take a value of 1
> by default.
>
> I am not really familiar with the x86 SIMD extensions, so maybe this question
> is more related to the x86 ISA in general.
>
> Thanks.
>
> On Thu, May 24, 2018 at 9:52 AM, Jason Lowe-Power 
> wrote:
>
>> Hi Tariq,
>>
>> It would be great if you could review Gabe's patch on Gerrit. Since it
>> works for you, giving it a +1 or a +2 would be appropriate.
>>
>> Cheers,
>> Jason
>>
>> On Wed, May 23, 2018 at 5:56 PM Tariq Azmy 
>> wrote:
>>
>>> Thanks Gabe. Yeah, it does not impact the program; it's just that the
>>> statistic is incorrect.
>>>
>>> By the way, I applied the patch and the stats now show the correct micro-op
>>> entries.
>>>
>>> Appreciate your help. Thanks again
>>>
>>> On Wed, May 23, 2018 at 6:51 PM, Gabe Black 
>>> wrote:
>>>
 Yep, those microops aren't given an operand class, so the ISA parser is
 guessing and making them FloatAddOp. I haven't really tested this beyond
 making sure it compiles, but here's a patch that might get this working for
 you.

 https://gem5-review.googlesource.com/c/public/gem5/+/10541

 Gabe

 On Wed, May 23, 2018 at 4:13 PM, Gabe Black 
 wrote:

> I'm confident they aren't implemented with floating point add. It's
> likely either that the microops are misclassified, or they're unimplemented
> and printing a warning, but the fact that they don't actually do any math
> isn't impacting your program for whatever reason. I'll take a quick look.
>
> Gabe
>
> On Wed, May 23, 2018 at 2:07 PM, Tariq Azmy 
> wrote:
>
>> Hi,
>>
>> I wrote simple code that does floating point multiplication and division
>> operations, and from the assembly I can see there are MULSS and DIVSS
>> instructions. But after I ran the simulation on gem5 and looked at
>> stats.txt, I can only see entries in system.cpu.iq.FU_type_0::FloatAdd,
>> whereas the entries in FloatMul and FloatDiv remain 0.
>>
>> If I understand correctly, these stats refer to the micro-ops. Does that
>> mean the MULSS and DIVSS instructions are broken down and executed with
>> floating point Add?
>>
>> Thanks
>>
>>
___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

Re: [gem5-users] Response for WritebackDirty packets (learning.gem5)

2018-05-28 Thread Jason Lowe-Power
Hi Muhammad,

Generally, if sendTimingReq fails, you have to save the packet so you can
resend it. In my Learning gem5 code, I *try* to simplify the retry logic so
that this is hidden. Instead of saving the packet in the cache code, the
packet is saved in the port code. Also, the code in Learning gem5 was
significantly simplified because it was a blocking cache with only a single
request outstanding at a time.
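
If it's useful, here is a toy sketch of that save-and-retry pattern in Python
(this is not gem5 code; the real logic lives in the Learning gem5 C++ port
classes). It just shows where the packet gets stashed and when it is resent:

    class MemSidePortModel:
        # Toy model of a request port with a single retry slot. The peer
        # object only needs a try_accept(pkt) -> bool method, which stands
        # in for sendTimingReq() succeeding or failing.

        def __init__(self, peer):
            self.peer = peer
            self.blocked_packet = None  # at most one packet can be waiting

        def send_packet(self, pkt):
            # Mirrors the idea of sendPacket() in the Learning gem5 port code:
            # if the peer rejects the request, save the packet for later.
            assert self.blocked_packet is None, "should never send while blocked"
            if not self.peer.try_accept(pkt):
                self.blocked_packet = pkt

        def recv_req_retry(self):
            # The peer signals it can accept again: resend the saved packet.
            assert self.blocked_packet is not None
            pkt, self.blocked_packet = self.blocked_packet, None
            self.send_packet(pkt)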

Jason

On Fri, May 25, 2018 at 11:50 AM Muhammad Ali Akhtar <
muhammadali...@gmail.com> wrote:

> Dear Jason,
>
> Thanks for the response. Just another quick question.
>
> What if memory was busy when you call sendTimingReq for a WritebackDirty
> packet? In the insert() function, when you call memPort.sendTimingReq for
> WritebackDirty blocks, you don't save them in blockedPacket, in case memory
> is blocked and calls sendReqRetry() later.
>
>
>
> Muhammad Ali Akhtar
> Principal Design Engineer
> http://www.linkedin.com/in/muhammadakhtar
>
> On Tue, May 22, 2018 at 3:40 AM, Jason Lowe-Power 
> wrote:
>
>> Hello,
>>
>> No. You should not have a response for WritebackDirty. In fact, most
>> (all?) writes do not have responses. See src/mem/packet.cc. (
>> https://gem5.googlesource.com/public/gem5/+/master/src/mem/packet.cc#80)
>> Some commands have the "NeedsResponse" flag set. If so, this request will
>> be turned into a response by whatever memory object fulfills the request
>> (by calling pkt.makeResponse()).
>>
>> I hope this answers your question.
>>
>> Jason
>>
>> On Sat, May 19, 2018 at 11:38 PM Muhammad Ali Akhtar <
>> muhammadali...@gmail.com> wrote:
>>
>>> Hello All,
>>>
>>> Following Jason's website, I created my own cache.
>>>
>>> On a cache miss, I send the TimingReq to memory and get the response,
>>> which I handle in "handleResponse".
>>>
>>> During handleResponse, in case the insertion causes an eviction (the cache
>>> was full), the insert function generates another memPort.sendTimingReq().
>>> This time, the pkt is WritebackDirty. However, for this TimingReq() to
>>> memory (WritebackDirty), we don't get any response from memory for the write?
>>>
>>> My question is:
>>>
>>> Do we ever get a response from memory for packets of type
>>> "WritebackDirty"? When I examine the simulator output, it seems that it
>>> moves on to the next instructions without waiting for a response from
>>> memory for this particular request.
>>>
>>>
>>> Muhammad Ali Akhtar
>>> Principal Design Engineer
>>> http://www.linkedin.com/in/muhammadakhtar
___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

Re: [gem5-users] Increasing TLB size not working for X86 with O3CPU

2018-05-28 Thread Jason Lowe-Power
Hi Da,

"For size > 512, the whole stats.txt is identical."

This isn't surprising. 512 * 4KB = 2MB. So, if your workload is only 1MB, then
once you have at least 512 entries you are only seeing compulsory (cold) misses.
Try running larger workloads and/or workloads with more reuse.
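
As an aside: rather than editing X86TLB.py, you can usually override the
parameter from the config script. A sketch, assuming the X86 TLB exposes a
'size' parameter as in X86TLB.py and that your script keeps the CPUs in lists
such as system.cpu / system.switch_cpus:

    def set_tlb_sizes(cpus, dtb_entries, itb_entries=64):
        # Override TLB sizes on a list of gem5 CPU objects (sketch only).
        for cpu in cpus:
            cpu.dtb.size = dtb_entries
            cpu.itb.size = itb_entries

    # e.g., after the CPUs are created in the config script:
    # set_tlb_sizes(system.cpu, dtb_entries=1024)
    # set_tlb_sizes(system.switch_cpus, dtb_entries=1024)

That way you don't have to change the SimObject defaults for every run.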

Cheers,
Jason

On Thu, May 24, 2018 at 9:11 AM Da Zhang  wrote:

> I am using FS mode.
>
> On Thu, May 24, 2018 at 12:00 PM, Jason Lowe-Power 
> wrote:
>
>> Hi Da,
>>
>> Are you using SE mode or FS mode? IIRC, the TLB size does nothing in SE
>> mode (it doesn't use a TLB). The TLB is only used in FS mode.
>>
>> Jason
>>
>> On Thu, May 24, 2018 at 8:45 AM Da Zhang  wrote:
>>
>>> Hey guys,
>>>
>>> I tried to increase the dtb size (i.e., the number of TLB entries) for our
>>> research. However, the stats.txt files for the different dtb sizes
>>> (64, 128, 256, 512, 1024, 2048, 1048576) are practically identical or fully
>>> identical. For size < 512, the system.switch_cpus.dtb.rdAccesses difference
>>> is only several hundred. For size > 512, the whole stats.txt is identical.
>>> I am working with the X86 architecture. I changed the size in X86TLB.py to
>>> increase the dtb size. By checking the config.ini file, I see the size is
>>> set as expected (under system.cpu.dtb). Any clue?
>>>
>>> Thanks in advance.
>>>
>>> Best,
>>> Da
>>>
>>>
___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

Re: [gem5-users] How to run a dynamically linked executable syscall emulation mode se.py in gem5?

2018-05-28 Thread Jason Lowe-Power
Hi Ciro,

As you seem to have figured out, running dynamically linked executables
has only been tested for x86_64 native platforms. It *is supported* if your
binary is x86 and your native machine is x86. I'm not sure what it would
take to get this working for native ARM machines (e.g., simulating ARM and
running on an ARM native machine) or if you could use QEMU user mode to get
dynamically linked executables to work on a non-native machine. Both of
these cases would likely require some re-working of the code.

Cheers,
Jason

On Sat, May 26, 2018 at 11:56 AM Ciro Santilli 
wrote:

>
> https://stackoverflow.com/questions/5054/how-to-run-a-dynamically-linked-executable-syscall-emulation-mode-se-py-in-gem5
>
> After
> https://stackoverflow.com/questions/48959349/how-to-solve-fatal-kernel-too-old-when-running-gem5-in-syscall-emulation-se-m
> I managed to run a statically linked hello world under certain
> conditions.
>
> But if I try to run an ARM dynamically linked one against the stdlib with:
>
> ./out/common/gem5/build/ARM/gem5.opt
> ./gem5/gem5/configs/example/se.py -c ./a.out
>
> it fails with:
>
> fatal: Unable to open dynamic executable's interpreter.
>
> How can I make it find the interpreter? Hopefully without copying my
> cross toolchain's interpreter into my host's root.
>
> For x86_64 it works if I use my native compiler, and as expected
> `strace` says that it is using the native interpreter, but it does not
> work if I use a cross compiler.
>
> The current FAQ says it is not possible to use dynamic executables:
> http://gem5.org/Frequently_Asked_Questions but I don't trust it, and
> then these presentations mention it:
>
> * http://www.gem5.org/wiki/images/0/0c/2015_ws_08_dynamic-linker.pdf
> * http://research.cs.wisc.edu/multifacet/papers/learning_gem5_tutorial.pdf
>
> but not how to actually use it.
>
> QEMU user mode has the `-L` option for that.
>
> Tested in gem5 49f96e7b77925837aa5bc84d4c3453ab5f07408e
___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

Re: [gem5-users] RISCV ISA : "C" (compressed) extension supported?

2018-05-28 Thread Jason Lowe-Power
Hi Marcelo,

For future reference, if someone else has this issue... Another possibility
is that the branch predictor is the problem. It looks like it could be
predicting that the instruction is a branch. I'm not sure if it's specifically
because of the compressed format or not, though. It's another place for the
next person to start digging.

Cheers,
Jason

On Fri, May 25, 2018 at 8:20 AM Marcelo Brandalero 
wrote:

> Hi Jason, Alec,
>
> Just to provide some feedback on this issue, it seems that the processor
> is mistakenly identifying (add reg, reg, reg) in compressed format as a
> branch instruction.
>
> I'm running a kernel that looks like this (result from 
> *riscv64-unknown-elf-objdump
> -D*)
>
> 0001019a :
>   1019a:   06400793    li      a5,100
>   1019e:   4701        li      a4,0
>   101a0:   4681        li      a3,0
>   101a2:   4601        li      a2,0
>   101a4:   0c800513    li      a0,200
>   101a8:   952a        add     a0,a0,a0
>   101aa:   9632        add     a2,a2,a2
>   101ac:   96b6        add     a3,a3,a3
>   101ae:   973a        add     a4,a4,a4
>   101b0:   952a        add     a0,a0,a0
>   101b2:   9632        add     a2,a2,a2
>   101b4:   96b6        add     a3,a3,a3
>   101b6:   973a        add     a4,a4,a4
> (repeat the four instructions above until this:)
>   104b8:   952a        add     a0,a0,a0
>   104ba:   9632        add     a2,a2,a2
>   104bc:   96b6        add     a3,a3,a3
>   104be:   973a        add     a4,a4,a4
>   104c0:   952a        add     a0,a0,a0
>   104c2:   2501        sext.w  a0,a0
>   104c4:   9632        add     a2,a2,a2
>   104c6:   2601        sext.w  a2,a2
>   104c8:   96b6        add     a3,a3,a3
>   104ca:   2681        sext.w  a3,a3
>   104cc:   973a        add     a4,a4,a4
>   104ce:   2701        sext.w  a4,a4
>   104d0:   37fd        addiw   a5,a5,-1
>   104d2:   cc079be3    bnez    a5,101a8
>
> And what the Fetch stage looks like when fetching this code block is this:
>
> 4048968: system.cpu.fetch: [tid:0] Waking up from cache miss.
> 4048968: system.cpu.fetch: Running stage.
> 4048968: system.cpu.fetch: Attempting to fetch from [tid:0]
> 4048968: system.cpu.fetch: [tid:0]: Icache miss is complete.
> 4048968: system.cpu.fetch: [tid:0]: Adding instructions to queue to decode.
> 4048968: system.cpu.fetch: [tid:0]: Instruction PC 0x101a8 (0) created
> [sn:8124].
> 4048968: system.cpu.fetch: [tid:0]: Instruction is: c_add a0, a0, a0
> 4048968: system.cpu.fetch: [tid:0]: Fetch queue entry created (1/256).
> *4048968: system.cpu.fetch: Branch detected with PC =
> (0x101a8=>0x101aa).(0=>1)*
> 4048968: system.cpu.fetch: [tid:0]: Done fetching, predicted branch
> instruction encountered.
> 4048968: system.cpu.fetch: [tid:0][sn:8124]: Sending instruction to decode
> from fetch queue. Fetch queue size: 1.
> 4049281: system.cpu.fetch: Running stage.
> 4049281: system.cpu.fetch: Attempting to fetch from [tid:0]
> 4049281: system.cpu.fetch: [tid:0]: Adding instructions to queue to decode.
> 4049281: system.cpu.fetch: [tid:0]: Instruction PC 0x101aa (0) created
> [sn:8125].
> 4049281: system.cpu.fetch: [tid:0]: Instruction is: c_add a2, a2, a2
> 4049281: system.cpu.fetch: [tid:0]: Fetch queue entry created (1/256).
> *4049281: system.cpu.fetch: Branch detected with PC =
> (0x101aa=>0x101ac).(0=>1)*
> 4049281: system.cpu.fetch: [tid:0]: Done fetching, predicted branch
> instruction encountered.
> 4049281: system.cpu.fetch: [tid:0][sn:8125]: Sending instruction to decode
> from fetch queue. Fetch queue size: 1.
> 4049594: system.cpu.fetch: Running stage.
> 4049594: system.cpu.fetch: Attempting to fetch from [tid:0]
> 4049594: system.cpu.fetch: [tid:0]: Adding instructions to queue to decode.
> 4049594: system.cpu.fetch: [tid:0]: Instruction PC 0x101ac (0) created
> [sn:8126].
> 4049594: system.cpu.fetch: [tid:0]: Instruction is: c_add a3, a3, a3
> 4049594: system.cpu.fetch: [tid:0]: Fetch queue entry created (1/256).
> *4049594: system.cpu.fetch: Branch detected with PC =
> (0x101ac=>0x101ae).(0=>1)*
> 4049594: system.cpu.fetch: [tid:0]: Done fetching, predicted branch
> instruction encountered.
> 4049594: system.cpu.fetch: [tid:0][sn:8126]: Sending instruction to decode
> from fetch queue. Fetch queue size: 1.
> 4049907: system.cpu.fetch: Running stage.
> 4049907: system.cpu.fetch: Attempting to fetch from [tid:0]
> 4049907: system.cpu.fetch: [tid:0]: Adding instructions to queue to decode.
> 4049907: system.cpu.fetch: [tid:0]: Instruction PC 0x101ae (0) created
> [sn:8127].
>