Hi DRAM Ninjas, I agree with your explanation; that makes sense. But one thing I cannot understand: when we change the cache latency to a very big number, e.g. 200, in single-core mode (by editing cacheController.cpp) and run mcf, the IPC doesn't change much. Have you tried this kind of simulation, changing the latency to see the impact on IPC?

Best,
Zhenyu

Date: Thu, 5 May 2011 12:13:24 -0400
Subject: Re: [marss86-devel] How the latency of cache access is reflected on the ipc?
From: [email protected]
To: [email protected]; [email protected]

Also, you're right that the lines of code you should be looking for are the add_event() lines; those calls make up the call chains. This isn't always the case, though: sometimes the code will have an add_event() with delay=0, and sometimes it will just call the function directly, which makes it a little harder to read. When I was debugging the cache code I thought about making it uniform (i.e., always calling add_event() with delay 0 instead of calling the function directly), but I didn't get around to it. If there's interest, I can do it, but I feel like not enough people read this code for it to matter.
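To show the two styles concretely, here is a minimal, hypothetical sketch. It is plain C++, not the MARSS source: this add_event is a toy stand-in for MemoryHierarchy::add_event with only the delay-0 path filled in, and the this_cycle queue is invented for the sketch:

    #include <cstdio>
    #include <deque>
    #include <functional>
    #include <utility>

    // Delay-0 events collected for execution later in the same cycle.
    static std::deque<std::function<void()>> this_cycle;

    void add_event(std::function<void()> fn, int delay) {
        if (delay == 0) this_cycle.push_back(std::move(fn));
        // nonzero delays would go into the cycle-ordered event queue (omitted)
    }

    void cache_hit_cb() { std::printf("hit callback ran\n"); }

    int main() {
        cache_hit_cb();              // style 1: direct call, runs immediately
        add_event(cache_hit_cb, 0);  // style 2: uniform delay-0 event

        // Drain this cycle's pending events; style 2 runs here instead.
        while (!this_cycle.empty()) {
            auto fn = std::move(this_cycle.front());
            this_cycle.pop_front();
            fn();
        }
        return 0;
    }

Both styles finish in the same simulated cycle; the uniform style just makes every hand-off visible in one place, which is why it would be easier to trace.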
On Thu, May 5, 2011 at 12:05 PM, DRAM Ninjas <[email protected]> wrote:

On Thu, May 5, 2011 at 11:48 AM, zhenyu sun <[email protected]> wrote:

Hi DRAM Ninjas, thank you very much for your explanation; I understand your points. The problem I have now is figuring out where the "long chain of events" is in the code. Is it in the event_queue? I saw this line in the code, in cache_access_cb:

    memoryHierarchy_->add_event(signal, delay, (void*)queueEntry);

You should read this line as 'delay cycles from now, execute the function pointed to by signal, and pass it queueEntry as an argument'. signal and delay get set based on whether this is a cache miss or a hit. So let's say the access time for this cache is 5 cycles. After 5 cycles, the cache_hit_cb() function will be executed. That function will probably schedule the event for cache_access_completed_cb(), then that function will generate an event that calls wait_interconnect_cb(), which will then generate a call to P2PInterconnect::handle_interconnect_cb(), etc. (note: these might not be the actual call chains; the chains tend to be pretty long, so I don't remember them exactly). Each of these events might have some delay associated with it. In other words, cycles are ticking by while the pipeline is waiting for the cache request to resolve. I think the final terminating point for a cache request is in cpuController::finalize_request(), at which point the cache request is signaled as done in the pipeline and execution continues. I don't look at the ooocore/pipeline code very often, so I can't tell you where the actual IPC statistics are computed (probably somewhere in the writeback phase), but that's the overall scheme by which IPC is implicitly determined. I hope that was clear enough ...
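If it helps, here is a stripped-down sketch of that mechanism. The names (Signal, Event, add_event) echo the ones above, but this is a toy model, not the MARSS classes, and a real run also clocks the pipeline every cycle instead of jumping straight between events:

    #include <cstdio>
    #include <functional>
    #include <queue>
    #include <vector>

    // Toy stand-in for MARSS's Signal: a callback taking an opaque argument.
    using Signal = std::function<void(void*)>;

    struct Event {
        unsigned long when;  // absolute cycle at which the event fires
        Signal fn;
        void* arg;
    };
    struct Later {  // orders the heap so the earliest cycle pops first
        bool operator()(const Event& a, const Event& b) const { return a.when > b.when; }
    };

    static unsigned long sim_cycle = 0;   // the global cycle counter
    static unsigned long committed = 0;   // retired instructions
    static std::priority_queue<Event, std::vector<Event>, Later> event_queue;

    // 'delay cycles from now, execute fn and pass it arg'
    void add_event(Signal fn, unsigned long delay, void* arg) {
        event_queue.push(Event{sim_cycle + delay, std::move(fn), arg});
    }

    // A toy three-step chain standing in for hit -> completed -> finalize.
    void finalize_request(void*) { committed++; }  // request done, instruction retires
    void access_completed_cb(void* a) { add_event(finalize_request, 1, a); }
    void cache_hit_cb(void* a) { add_event(access_completed_cb, 2, a); }

    int main() {
        add_event(cache_hit_cb, 5, nullptr);  // cache access latency: 5 cycles

        while (!event_queue.empty()) {
            Event e = event_queue.top();  // earliest pending event
            event_queue.pop();
            sim_cycle = e.when;           // cycles tick by while the request resolves
            e.fn(e.arg);
        }
        std::printf("retired %lu instruction(s) in %lu cycles\n", committed, sim_cycle);
        return 0;
    }

Notice that no event ever writes to sim_cycle directly; scheduling something 'delay' cycles out just means the clock has moved that far forward by the time the instruction can retire, and that is how the latency ends up in instructions/sim_cycle.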
Once this function is called, the definition of add_event does

    event->setup(signal, sim_cycle + delay, arg);

so does that mean the delay is finally added onto the sim_cycle used in the IPC calculation (number of instructions / sim_cycle)? In other words, does the add_event function, which is a member of class Event, influence the global variable? Thank you very much; I urgently need to know the answer.

Best,
Zhenyu

Date: Thu, 5 May 2011 11:16:18 -0400
Subject: Re: [marss86-devel] How the latency of cache access is reflected on the ipc?
From: [email protected]
To: [email protected]
CC: [email protected]

I think the answer to this question is sort of complicated. If you look down below, the delay variable is basically used to schedule either the cache-hit event delay cycles in the future or the cache-miss event delay cycles in the future. The code you pointed out is just a small step in the life of a cache request. Essentially, at each point in the simulation, the simulator decides which function to call and how far in the future. So a cache hit will execute in x cycles, but a cache miss might go out to memory and take x+100 cycles to complete. This *total* latency to service a request is the key. The pipeline will send a cache request out, wait for it to complete, and then retire it. That is really what determines the IPC. What you've pointed out in the cacheController is one step along a long chain of events that ends in the request being 'completed' and the instruction moving forward in the pipeline. Please correct me if I misunderstood your question.
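As a back-of-the-envelope illustration of why the total service latency is what shows up in IPC (made-up latencies, not MARSS code, and it ignores the overlap between in-flight requests that a real out-of-order core gets):

    #include <cstdio>

    const unsigned long kCacheAccessLatency = 5;    // 'x' cycles on a hit
    const unsigned long kMemoryRoundTrip    = 100;  // extra cycles on a miss

    // Total latency to service one request: what the pipeline ends up waiting on.
    unsigned long service_latency(bool hit) {
        return hit ? kCacheAccessLatency
                   : kCacheAccessLatency + kMemoryRoundTrip;
    }

    int main() {
        // Four dependent loads, fully serialized: hit, hit, miss, hit.
        const bool outcomes[] = { true, true, false, true };
        unsigned long sim_cycle = 0;
        unsigned long retired = 0;
        for (bool hit : outcomes) {
            sim_cycle += service_latency(hit);  // the clock absorbs the delay
            retired++;
        }
        std::printf("retired=%lu cycles=%lu IPC=%.3f\n",
                    retired, sim_cycle, (double)retired / (double)sim_cycle);
        return 0;
    }

Here a single miss among four accesses drops IPC from 4/15 to 4/120, which is also one reason a change to the hit latency alone can move IPC less than expected when a run is dominated by misses out to memory.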
On Fri, Apr 29, 2011 at 10:36 PM, zhenyu sun <[email protected]> wrote:

Hi everyone, I am working on a study of CPU performance under different cache hit latencies. In cacheController.cpp:

    if(hit) {
        delay = cacheAccessLatency_;

If the CPU encounters a stall caused by a write access (a dependency, or a full buffer), how does this 'delay' influence the IPC? In other words, in which part of the code is the delay added onto sim_cycle? Thanks a lot.

Zhenyu Sun

_______________________________________________
http://www.marss86.org
Marss86-Devel mailing list
[email protected]
https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
