On Wed, Dec 12, 2012 at 7:02 PM, hanhwi jang <[email protected]> wrote:
> I have question on forwarding logic.
>
> This is my test code.
>
> for (i = 0; i < 10000; i++){
> asm volatile(
> "nop;" // signature for identifying the test code region
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "nop;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> "addq %rbx, %rax;"
> );
> }
> printf("TEST DONE");
>
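
A side note on the test itself: as written, the asm block modifies %rax
without telling the compiler, and neither register is initialized, so the
compiler is free to keep live values there. A minimal sketch of the same
dependent-add chain using GCC/Clang extended asm operands follows. The helper
name dependent_add_chain and its parameters are hypothetical, the nop marker
is shortened to four nops for brevity, and the non-x86 fallback exists only
so the sketch compiles elsewhere.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the original test): runs `iters`
// iterations of an 18-long chain of dependent adds and returns the result.
static std::uint64_t dependent_add_chain(std::uint64_t acc,
                                         std::uint64_t step,
                                         int iters) {
    assert(iters >= 0);
    for (int i = 0; i < iters; i++) {
#if defined(__x86_64__) && defined(__GNUC__)
        asm volatile(
            // Short marker for locating the region (the original uses 28 nops).
            "nop; nop; nop; nop;"
            // 18 adds; each one reads the result of the previous one.
            "addq %1, %0;" "addq %1, %0;" "addq %1, %0;"
            "addq %1, %0;" "addq %1, %0;" "addq %1, %0;"
            "addq %1, %0;" "addq %1, %0;" "addq %1, %0;"
            "addq %1, %0;" "addq %1, %0;" "addq %1, %0;"
            "addq %1, %0;" "addq %1, %0;" "addq %1, %0;"
            "addq %1, %0;" "addq %1, %0;" "addq %1, %0;"
            : "+r"(acc)   // accumulator: read and written by every add
            : "r"(step)   // addend: read only
            : "cc");      // addq modifies the flags
#else
        // Portable fallback; the compiler may not keep this as a serial chain.
        for (int k = 0; k < 18; k++) acc += step;
#endif
    }
    return acc;
}
```

Letting the compiler pick the registers keeps the dependency chain intact
while making the result checkable: starting from 0 with a step of 3, 10000
iterations should return 18 * 10000 * 3 = 540000.
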
> In most CPU designs, if I run consecutive add instructions that depend on
> each other, I expect the IPC to be 1. But Marss86 shows a different
> result: it commits one add instruction every two cycles (IPC is 1/2).
>
> Also, there are two different forwarding latencies in this scenario.
> 1) add i : [Dispatch] | [I & C] | [transfer] |
> add i + 1 : [Dispatch] | [Bubble] | [Bubble] | [Issue]
> (I & C is issue and complete)
> This case occurs when two dependent add instructions enter the issue queue
> at the same time. Here, add i + 1 must wait for the forwarding signal,
> because add i's result is not yet in the bypass state when add i + 1 is
> dispatched (see the dispatch function). Thus, it can only be issued two
> cycles after add i is issued.
>
> 2) add i : [Dispatch] | [I & C] | [transfer] |
> add i + 1 : [Rename] | [Dispatch] | [Issue] |
> In this case, add i + 1 can be issued right after add i is issued.
>
> I think dependent add instructions should be able to issue one per cycle.
>
> Is this the intended design of Marss?
>

This is a bug inherited from PTLsim's ooo core model. You should 'clock' the
issue queue before the 'issue'. I currently don't have access to the patch
we made to fix this; for some reason it's not on our test servers (which is
probably why it's not yet in the public version :).

In ooo.cpp, move line 716: foreach_issueq(clock());
to before line 637: for_each_cluster(i) { issue(i); }
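
To see why the ordering matters, here is a toy cycle model. It is only a
sketch under stated assumptions and is not the MARSS/PTLsim code: it assumes
an op's result tag is forwarded in the cycle after it issues, and that the
queue's ready bits only change when the queue is clocked. The function name
cycles_to_issue_chain and all variables are made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Toy model (assumption, not the simulator's logic): n dependent 1-cycle
// adds are already in the issue queue. Returns the number of cycles needed
// until the last add has issued.
static int cycles_to_issue_chain(int n, bool clock_before_issue) {
    assert(n > 0);
    std::vector<bool> ready(n, false);   // latched ready bits seen by issue
    std::vector<bool> issued(n, false);
    ready[0] = true;                     // first add has no pending producer
    int broadcast = -1;                  // tag on the toy forwarding bus
    int issued_last_cycle = -1;          // op that issued in the previous cycle
    int issued_count = 0;
    int cycle = 0;

    // Clock the queue: latch the broadcast tag and wake up its consumer.
    auto clock_queue = [&]() {
        if (broadcast >= 0 && broadcast + 1 < n) ready[broadcast + 1] = true;
        broadcast = -1;
    };
    // Issue the oldest op that is ready and not yet issued (one per cycle).
    auto issue_one = [&]() -> int {
        for (int i = 0; i < n; i++) {
            if (!issued[i] && ready[i]) {
                issued[i] = true;
                issued_count++;
                return i;
            }
        }
        return -1;
    };

    while (issued_count < n) {
        cycle++;
        // "Transfer" stage: the op issued last cycle forwards its tag now.
        broadcast = issued_last_cycle;
        int issued_now;
        if (clock_before_issue) {
            clock_queue();               // consumer becomes ready this cycle
            issued_now = issue_one();
        } else {
            issued_now = issue_one();    // consumer still looks not ready
            clock_queue();               // ready bit is set one cycle too late
        }
        issued_last_cycle = issued_now;
    }
    return cycle;
}
```

In this toy, cycles_to_issue_chain(18, false) takes 35 cycles (one add every
two cycles) and cycles_to_issue_chain(18, true) takes 18 cycles (one add per
cycle), which is the same 1/2 vs. 1 IPC difference described above.
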
Let me know if this doesn't work.
- Avadh
> Thanks,
> - Hanhwi
>
> _______________________________________________
> http://www.marss86.org
> Marss86-Devel mailing list
> [email protected]
> https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
>
>