Brad, my comments are inline.
On Tue, 11 Jan 2011, Beckmann, Brad wrote:
Sure, using a local variable to further reduce the calls to
getCacheEntry is a great idea. I think that is orthogonal to the
suggestion I was making. I just want the ability to directly set the
cache_entry and tbe_entry variables in the trigger function. That way
the address, cache_entry, and tbe_entry variables are dealt with
consistently and it avoids adding the separate calls to
set_cache_entry() and set_tbe () in the inports.
Firstly, we have to set cache and transaction buffer entry variables whenever
we do allocation or deallocation of entries. This means these calls cannot be
completely avoided. Secondly, while processing events from the mandatory
queue (as it is called in the current implementations), if these variables are
not set, we will have to revert to the earlier approach. This would double the
number of times cache entry lookups are performed as the trigger function
will perform the lookup again. This would also mean that both the
approaches for looking up cache entry in the cache will have to exist
simultaneously.
Absolutely, we still need the ability to allocate or deallocate entries
within actions. I'm not advocating to completely eliminate the
set/unset cache and tbe entry functions. I just want to avoid including
those calls in the inports. I'm confused why the mandatory queue is
different than other queues. They all trigger events in the same way.
if (L1IcacheMemory.isTagPresent(in_msg.LineAddress)) {
// The tag matches for the L1, so the L1 asks the L2 for it.
trigger(mandatory_request_type_to_event(in_msg.Type), in_msg.LineAddress);
}
Brad, mandatory queue is just an example where an inport may perform tag
lookup before cache and transaction buffer entries has been set. Above is
an excerpt from the file MOESI_CMP_directory-L1cache.sm. Before the
trigger() is called, isTagPresent() is called. This means tag look up is
being performed before cache or transaction buffer entries have been set.
Suppose the tag was present in L1Icache, then in the trigger() call, we
will again perform lookup.
Similarly, there is an inport in the Hammer's protocol implementation
where getCacheEntry() is called before a call to trigger(). Now, why
should we use getCacheEntry() in the inport and cache entry in the action?
In MESI_CMP_directory, getState() is called from an inport. This means we
cannot have an implementation of getState() that makes use of cache entry
variable since it would not have been set. Now, different implementations
for setState() and getState() simply does not make sense to me, so in my
opinion setState() will also not use cache entry. These two function (I
just went through the profile output for MOESI hammer), account for about
half of the calls to isTagPresent(), the function that we are trying to
get rid of.
Maybe I should point out that I'm assuming that getCacheEntry can return
a NULL pointer and thus that can be passed into the trigger call when no
cache or tbe entry exists.
You are correct that getCacheEntry() would have to return a NULL, another
thing which I believe we earlier preferred avoiding. As an aside, we
cannot use the keyword NULL since it is already in use. So, we will have
to rename NULL as some thing else.
On second thought, I think it may not be necessary for getCacheEntry() to
use the keyword.
Another concern is in implementation of getCacheEntry(). If this function has
to return a pointer to a cache entry, we would have to provide support for
local variables which internally SLICC would assume to be pointer variables.
Within SLICC understanding that certain variables are actually pointers
is a little bit of a nuisance, but there already exists examples where
we make that distinction. For instance, look at the "if para.pointer"
conditionals in StateMachine.py. We just have to treat cache and tbe
entries in the same fashion.
I know it is possible to declare pointers in SLICC, CacheMemory being the
foremost example. But we only allow declaration of pointers. We tacitly
assume that when ever they will be used, they will only be used after
being dereferenced. Now, in case of getCacheEntry(), I do not see this
happening. Below is the current implementation of getCacheEntry() from
MOESI_CMP_directory-L1cache.sm.
Entry getCacheEntry(Address addr), return_by_ref="yes" {
if (L1DcacheMemory.isTagPresent(addr)) {
return static_cast(Entry, L1DcacheMemory[addr]);
} else {
return static_cast(Entry, L1IcacheMemory[addr]);
}
}
I would probably convert it to some thing like this.
AbstractCacheEntry getCacheEntry(Address addr) {
AbstractCacheEntry * cache_entry = L1DcacheMemory.lookup(addr);
if(is_valid(cache_entry)) return cache_entry;
return L1IcacheMemory.lookup(addr);
}
Now, if we assume that cache_entry will always be used in its dereferenced
form, then we will face problem in returning cache entry. We can introduce
a 'return_by_pointer' annotation for the function to handle this
situtation. But you have pointed out in your previous emails that it is
possible for getCacheEntry() to do more things, which may cause problems
for the compiler.
In my opinion, we should maintain one method for looking up cache entries.
My own experience informs me that it is not difficult to incorporate calls to
set/unset_cache_entry () in already existing protocol implementations.
For implementing new protocols, I think the greater obstacle will be in
implementing the protocol correctly and not in using entry variables
correctly. If we document this change lucidly, there is no reason to believe a
SLICC programmer will be exceptionally pushed because of this change.
Assuming that this change does introduce some complexity in progamming
with SLICC, does that complexity out weigh the performance improvements?
My position is we can leverage SLICC as an intermediate language and
achieve the performance benefits of your change without significantly
impacting the programmability. I agree that we need the
set/unset_cache_entry calls in the allocate and deallocate actions. I
see no problem with that. I just want to treat these new implicit cache
and tbe entry variables like the existing implicit variable address.
Therefore I want to pass them into the trigger operation like the
address variable. I also want just one method for looking up cache
entries. I believe the only difference is that I would like to set the
cache and tbe entries in the trigger function, as well as allowing them
to be set in the actions.
I still don't think that we can avoid those calls in inports. By avoiding
those calls, the job of the compiler and the programmer will only become
more difficult. Currently I am of the opinion that if we avoid those
calls, we should avoid this change all together.
I hope that clarifies at least what I'm envisioning. I appreciate your
feedback on this and I want to reiterate that I think your change is
really close to being done. If you still feel like I'm missing
something, I would be happy to chat with you over-the-phone.
Brad
_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev