[m5-dev] Possible bug in cache_impl.hh functionalAccess

2009-04-30 Thread Rick Strong
Hi all, While testing directory coherence, I realized that Cache::functionalAccess(PacketPtr pkt, CachePort *incomingPort, CachePort *otherSidePort) has a problem. Let's say we have a 3-level cache hierarchy in which each core has its own L1 and L2, and all cores share an L3. L1A propagates a functional request,

Re: [m5-dev] Exposing Width of FP instructions before Execute

2009-04-30 Thread Steve Reinhardt
On Thu, Apr 30, 2009 at 2:54 PM, Korey Sewell wrote: > > Actually it does because -all- floating point registers are then 32 bits. > > The existence of 64 bit registers is just an illusion the instructions > > provide by gluing two 32 bit registers together internally. > I understand that for MIP

Re: [m5-dev] Exposing Width of FP instructions before Execute

2009-04-30 Thread Korey Sewell
> Actually it does because -all- floating point registers are then 32 bits. > The existence of 64 bit registers is just an illusion the instructions > provide by gluing two 32 bit registers together internally. I understand that for MIPS/SPARC, but does the same hold true for Alpha? I'm encounteri

Re: [m5-dev] Exposing Width of FP instructions before Execute

2009-04-30 Thread Gabriel Michael Black
Quoting Korey Sewell : > IMO, > I think there are two issues in play here: > 1) For each operand, 2 source registers are needed for FP double > precision access > 2) If you want to read these registers *before* execute, how do you > access them separately (in terms of size of access)? > > For (1),

Re: [m5-dev] Exposing Width of FP instructions before Execute

2009-04-30 Thread Korey Sewell
IMO, I think there are two issues in play here:
1) For each operand, 2 source registers are needed for FP double precision access
2) If you want to read these registers *before* execute, how do you access them separately (in terms of size of access)?
For (1), I agree that Gabe handles it the right

Re: [m5-dev] Exposing Width of FP instructions before Execute

2009-04-30 Thread Steve Reinhardt
I don't think I can respond to every point in this discussion, but here are my general thoughts: - Though they look superficially similar, the SPARC/MIPS single/double FP accesses shouldn't necessarily be handled the same way as x86 partial-register writes. They're both awkward and ugly, but ther

Re: [m5-dev] Exposing Width of FP instructions before Execute

2009-04-30 Thread Gabriel Michael Black
Quoting Korey Sewell : >> Unless I'm misunderstanding your question, this does work in O3 with >> SPARC because it tracks each single precision floating point register >> separately. Each FP instruction is considered to have two sources for >> each double precision register it uses, and only when

Re: [m5-dev] Exposing Width of FP instructions before Execute

2009-04-30 Thread Korey Sewell
So in general, the problem seems to be: should calculating the number of source registers for an instruction be an ISA or a CPU issue? If both (which it seems to be), what interface should we create so that the dependencies between source registers (e.g. double precision FPs) are maintained such that the

Re: [m5-dev] Exposing Width of FP instructions before Execute

2009-04-30 Thread Korey Sewell
> Unless I'm misunderstanding your question, this does work in O3 with > SPARC because it tracks each single precision floating point register > separately. Each FP instruction is considered to have two sources for > each double precision register it uses, and only when both are marked > ready is t
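The readiness rule described in the quoted text (two sources per double precision register, issue only when both are ready) can be sketched roughly as follows. This is a minimal illustration, not the O3 code: `Scoreboard`, `Inst`, `addDoubleSource`, and `readyToIssue` are hypothetical names, and pairing register `idx` with `idx + 1` is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scoreboard: one ready bit per 32-bit (single precision)
// FP register, since each one is tracked separately.
struct Scoreboard
{
    std::vector<bool> ready;

    explicit Scoreboard(std::size_t numRegs) : ready(numRegs, false) {}

    void setReady(std::size_t idx)
    {
        assert(idx < ready.size());
        ready[idx] = true;
    }

    bool isReady(std::size_t idx) const
    {
        assert(idx < ready.size());
        return ready[idx];
    }
};

// Hypothetical instruction record holding its source register indices.
struct Inst
{
    std::vector<std::size_t> srcRegs;

    // A double precision operand contributes two 32-bit sources.
    // Assumption: the pair is (idx, idx + 1).
    void addDoubleSource(std::size_t idx)
    {
        srcRegs.push_back(idx);
        srcRegs.push_back(idx + 1);
    }

    // The instruction may issue only when every source is marked ready.
    bool readyToIssue(const Scoreboard &sb) const
    {
        for (std::size_t reg : srcRegs) {
            if (!sb.isReady(reg))
                return false;
        }
        return true;
    }
};
```

With this layout, marking only one half of a pair ready leaves the instruction waiting, which is the behavior the quoted paragraph describes.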

Re: [m5-dev] Exposing Width of FP instructions before Execute

2009-04-30 Thread Gabriel Michael Black
Quoting Korey Sewell : >> This is a problem I ran into with SPARC in O3 because it overlaps its >> single and double precision FP registers. The way I worked around it was >> to treat all registers as the smallest size, in that case 32 bits, and >> then have the instruction glue the bits together

Re: [m5-dev] Exposing Width of FP instructions before Execute

2009-04-30 Thread Korey Sewell
> This is a problem I ran into with SPARC in O3 because it overlaps its > single and double precision FP registers. The way I worked around it was > to treat all registers as the smallest size, in that case 32 bits, and > then have the instruction glue the bits together into a, for instance, > 64 b
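The workaround in the quoted text (store every FP register as 32 bits and let the instruction glue two of them into a 64-bit value) might look roughly like the sketch below. It is only an illustration under stated assumptions: `floatRegs`, `NumFloatRegs`, `readDouble`, and `writeDouble` are hypothetical names, not M5 interfaces, and the word order (lower-numbered register holds the high word) is assumed for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical register file: every FP register is stored as 32 raw bits.
const int NumFloatRegs = 64;
uint32_t floatRegs[NumFloatRegs];

// Glue registers idx and idx + 1 into one 64-bit double.
// Assumption: floatRegs[idx] holds the most significant word.
double readDouble(int idx)
{
    assert(idx >= 0 && idx + 1 < NumFloatRegs);
    uint64_t bits = (uint64_t(floatRegs[idx]) << 32) | floatRegs[idx + 1];
    double val;
    std::memcpy(&val, &bits, sizeof(val));  // reinterpret bits, no conversion
    return val;
}

// Split a double back into its two 32-bit halves.
void writeDouble(int idx, double val)
{
    assert(idx >= 0 && idx + 1 < NumFloatRegs);
    uint64_t bits;
    std::memcpy(&bits, &val, sizeof(bits));
    floatRegs[idx] = uint32_t(bits >> 32);
    floatRegs[idx + 1] = uint32_t(bits & 0xffffffffu);
}
```

Because the register file only ever sees 32-bit entries, a CPU model can read each half separately before execute, and the 64-bit view exists only inside the instruction.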

[m5-dev] Cron /z/m5/regression/do-regression quick

2009-04-30 Thread Cron Daemon
* do-regression: qsub timed out, retrying locally
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-atomic passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/o3-timing passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing passed.
*