Brad, my comments are inline.

On Tue, 11 Jan 2011, Beckmann, Brad wrote:

Sure, using a local variable to further reduce the calls to
getCacheEntry is a great idea.  I think that is orthogonal to the
suggestion I was making.  I just want the ability to directly set the
cache_entry and tbe_entry variables in the trigger function.  That way
the address, cache_entry, and tbe_entry variables are dealt with
consistently and it avoids adding the separate calls to
set_cache_entry() and set_tbe () in the inports.

Firstly, we have to set cache and transaction buffer entry variables whenever
we do allocation or deallocation of entries. This means these calls cannot be
completely avoided. Secondly, while processing events from the mandatory
queue (as it is called in the current implementations), if these variables are
not set, we will have to revert to the earlier approach. This would double the
number of times cache entry lookups are performed as the trigger function
will perform the lookup again. This would also mean that both the
approaches for looking up cache entry in the cache will have to exist
simultaneously.


Absolutely, we still need the ability to allocate or deallocate entries within actions. I'm not advocating to completely eliminate the set/unset cache and tbe entry functions. I just want to avoid including those calls in the inports. I'm confused why the mandatory queue is different than other queues. They all trigger events in the same way.

if (L1IcacheMemory.isTagPresent(in_msg.LineAddress)) {
   // The tag matches for the L1, so the L1 asks the L2 for it.
  trigger(mandatory_request_type_to_event(in_msg.Type), in_msg.LineAddress);
}

Brad, mandatory queue is just an example where an inport may perform tag lookup before cache and transaction buffer entries has been set. Above is an excerpt from the file MOESI_CMP_directory-L1cache.sm. Before the trigger() is called, isTagPresent() is called. This means tag look up is being performed before cache or transaction buffer entries have been set. Suppose the tag was present in L1Icache, then in the trigger() call, we will again perform lookup.

Similarly, there is an inport in the Hammer's protocol implementation where getCacheEntry() is called before a call to trigger(). Now, why should we use getCacheEntry() in the inport and cache entry in the action?

In MESI_CMP_directory, getState() is called from an inport. This means we cannot have an implementation of getState() that makes use of cache entry variable since it would not have been set. Now, different implementations for setState() and getState() simply does not make sense to me, so in my opinion setState() will also not use cache entry. These two function (I just went through the profile output for MOESI hammer), account for about half of the calls to isTagPresent(), the function that we are trying to get rid of.

Maybe I should point out that I'm assuming that getCacheEntry can return a NULL pointer and thus that can be passed into the trigger call when no cache or tbe entry exists.

You are correct that getCacheEntry() would have to return a NULL, another thing which I believe we earlier preferred avoiding. As an aside, we cannot use the keyword NULL since it is already in use. So, we will have to rename NULL as some thing else.

On second thought, I think it may not be necessary for getCacheEntry() to use the keyword.



Another concern is in implementation of getCacheEntry(). If this function has
to return a pointer to a cache entry, we would have to provide support for
local variables which internally SLICC would assume to be pointer variables.


Within SLICC understanding that certain variables are actually pointers is a little bit of a nuisance, but there already exists examples where we make that distinction. For instance, look at the "if para.pointer" conditionals in StateMachine.py. We just have to treat cache and tbe entries in the same fashion.

I know it is possible to declare pointers in SLICC, CacheMemory being the foremost example. But we only allow declaration of pointers. We tacitly assume that when ever they will be used, they will only be used after being dereferenced. Now, in case of getCacheEntry(), I do not see this happening. Below is the current implementation of getCacheEntry() from MOESI_CMP_directory-L1cache.sm.

Entry getCacheEntry(Address addr), return_by_ref="yes" {
  if (L1DcacheMemory.isTagPresent(addr)) {
    return static_cast(Entry, L1DcacheMemory[addr]);
  } else {
    return static_cast(Entry, L1IcacheMemory[addr]);
  }
}

I would probably convert it to some thing like this.

AbstractCacheEntry getCacheEntry(Address addr) {
  AbstractCacheEntry * cache_entry = L1DcacheMemory.lookup(addr);
  if(is_valid(cache_entry)) return cache_entry;
  return L1IcacheMemory.lookup(addr);
}

Now, if we assume that cache_entry will always be used in its dereferenced form, then we will face problem in returning cache entry. We can introduce a 'return_by_pointer' annotation for the function to handle this situtation. But you have pointed out in your previous emails that it is possible for getCacheEntry() to do more things, which may cause problems for the compiler.



In my opinion, we should maintain one method for looking up cache entries.
My own experience informs me that it is not difficult to incorporate calls to
set/unset_cache_entry () in already existing protocol implementations.
For implementing new protocols, I think the greater obstacle will be in
implementing the protocol correctly and not in using entry variables
correctly. If we document this change lucidly, there is no reason to believe a
SLICC programmer will be exceptionally pushed because of this change.

Assuming that this change does introduce some complexity in progamming
with SLICC, does that complexity out weigh the performance improvements?


My position is we can leverage SLICC as an intermediate language and achieve the performance benefits of your change without significantly impacting the programmability. I agree that we need the set/unset_cache_entry calls in the allocate and deallocate actions. I see no problem with that. I just want to treat these new implicit cache and tbe entry variables like the existing implicit variable address. Therefore I want to pass them into the trigger operation like the address variable. I also want just one method for looking up cache entries. I believe the only difference is that I would like to set the cache and tbe entries in the trigger function, as well as allowing them to be set in the actions.


I still don't think that we can avoid those calls in inports. By avoiding those calls, the job of the compiler and the programmer will only become more difficult. Currently I am of the opinion that if we avoid those calls, we should avoid this change all together.

I hope that clarifies at least what I'm envisioning. I appreciate your feedback on this and I want to reiterate that I think your change is really close to being done. If you still feel like I'm missing something, I would be happy to chat with you over-the-phone.

Brad




_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Reply via email to