[m5-dev] [PATCH] CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc

2009-04-26 Thread gblack
# HG changeset patch
# User Gabe Black gbl...@eecs.umich.edu
# Date 1240729422 25200
# Node ID 284ff3c27bbddacf5763e7b5a54e99e4dcd1912a
# Parent  8652636856b3d24fb0088fb1af5f5dca5008d9c8
CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc.

diff --git a/src/cpu/simple/timing.cc b/src/cpu/simple/timing.cc
--- a/src/cpu/simple/timing.cc
+++ b/src/cpu/simple/timing.cc
@@ -115,6 +115,8 @@
 ifetch_pkt = dcache_pkt = NULL;
 drainEvent = NULL;
 previousTick = 0;
+inInitiateAcc = false;
+completionPkt = NULL;
 changeState(SimObject::Running);
 }
 
@@ -758,7 +760,13 @@
 if (curStaticInst 
 curStaticInst-isMemRef()  !curStaticInst-isDataPrefetch()) {
 // load or store: just send to dcache
+inInitiateAcc = true;
 Fault fault = curStaticInst-initiateAcc(this, traceData);
+inInitiateAcc = false;
+if (completionPkt) {
+completeDataAccess(completionPkt);
+completionPkt = NULL;
+}
 if (_status != Running) {
 // instruction will complete in dcache response callback
 assert(_status == DcacheWaitResponse ||
@@ -856,6 +864,10 @@
 void
 TimingSimpleCPU::completeDataAccess(PacketPtr pkt)
 {
+if (inInitiateAcc) {
+completionPkt = pkt;
+return;
+}
 // received a response from the dcache: complete the load or store
 // instruction
 assert(!pkt-isError());
diff --git a/src/cpu/simple/timing.hh b/src/cpu/simple/timing.hh
--- a/src/cpu/simple/timing.hh
+++ b/src/cpu/simple/timing.hh
@@ -317,6 +317,9 @@
 
 Tick previousTick;
 
+bool inInitiateAcc;
+PacketPtr completionPkt;
+
   public:
 
 virtual Port *getPort(const std::string if_name, int idx = -1);
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] [PATCH] CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc

2009-04-26 Thread Gabe Black
This patch should fix the issue Geoff was having where tracing would
cause a segfault. If you could please give it a try, Geoff, let me know
if it actually solves the problem. This will add a small overhead for
every memory instruction even though the old behavior is wrong only in
certain circumstances. I considered trying to avoid that overhead in
cases where the problem can't happen, but then tracing would change the
behavior of the CPU and that seems like a bad idea. Assuming this
actually fixes the bug and nobody objects I'll commit this soon.

Gabe

gbl...@eecs.umich.edu wrote:
 # HG changeset patch
 # User Gabe Black gbl...@eecs.umich.edu
 # Date 1240729422 25200
 # Node ID 284ff3c27bbddacf5763e7b5a54e99e4dcd1912a
 # Parent  8652636856b3d24fb0088fb1af5f5dca5008d9c8
 CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc.

 diff --git a/src/cpu/simple/timing.cc b/src/cpu/simple/timing.cc
 --- a/src/cpu/simple/timing.cc
 +++ b/src/cpu/simple/timing.cc
 @@ -115,6 +115,8 @@
  ifetch_pkt = dcache_pkt = NULL;
  drainEvent = NULL;
  previousTick = 0;
 +inInitiateAcc = false;
 +completionPkt = NULL;
  changeState(SimObject::Running);
  }
  
 @@ -758,7 +760,13 @@
  if (curStaticInst 
  curStaticInst-isMemRef()  !curStaticInst-isDataPrefetch()) {
  // load or store: just send to dcache
 +inInitiateAcc = true;
  Fault fault = curStaticInst-initiateAcc(this, traceData);
 +inInitiateAcc = false;
 +if (completionPkt) {
 +completeDataAccess(completionPkt);
 +completionPkt = NULL;
 +}
  if (_status != Running) {
  // instruction will complete in dcache response callback
  assert(_status == DcacheWaitResponse ||
 @@ -856,6 +864,10 @@
  void
  TimingSimpleCPU::completeDataAccess(PacketPtr pkt)
  {
 +if (inInitiateAcc) {
 +completionPkt = pkt;
 +return;
 +}
  // received a response from the dcache: complete the load or store
  // instruction
  assert(!pkt-isError());
 diff --git a/src/cpu/simple/timing.hh b/src/cpu/simple/timing.hh
 --- a/src/cpu/simple/timing.hh
 +++ b/src/cpu/simple/timing.hh
 @@ -317,6 +317,9 @@
  
  Tick previousTick;
  
 +bool inInitiateAcc;
 +PacketPtr completionPkt;
 +
public:
  
  virtual Port *getPort(const std::string if_name, int idx = -1);
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev
   

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Configuration files/tables for/in the simulated guest.

2009-04-26 Thread Gabe Black
Before I get rolling on this too seriously, should the builder function
go within the python compiled into M5, or should it go into the post
compile config files? Do we have a rule of thumb for what goes where?
This seems important enough to be built in and not repeated in every
user config, but it's also not a SimObject or anything like that.

Gabe

Steve Reinhardt wrote:
 I agree with Nate... I think deducing the logical system configuration
 directly from an arbitrary M5 SimObject tree is going to be next to
 impossible, given that the tree doesn't reflect the memory hierarchy
 interconnection topology, and since the M5 tree could very easily
 correspond to an undescribable config from the BIOS perspective.

 I wouldn't worry too much in the short term about making your system
 builder flexible; is it even possible to describe a non-traditional
 configuration like a mesh in the BIOS tables?  I think having a basic
 tool that supports a handful of straightforward configs is an
 appropriate first step; people that feel the need to do more than that
 can be responsible for generalizing your solution as needed.  I'm sure
 you've heard my spiel about spending too much time designing in
 unnecessary generality up front...

 Steve

 On Tue, Apr 21, 2009 at 12:36 AM, Gabe Black gbl...@eecs.umich.edu
 mailto:gbl...@eecs.umich.edu wrote:

 It seems reasonable, although I'm sure it will be easier said than
 done
 one you get down to the details. The challenge will be putting it
 together so that it doesn't make so many assumptions that it becomes
 useless outside of traditional sorts of configurations like, for
 instance, meshes. I agree that trying to guess what the user really
 wanted your at best going to be close most of the time.

 Gabe

 nathan binkert wrote:
  My initial reaction is that it's probably better to build some
 sort of
  semi-configurable system builder function that creates a system with
  the right number of CPUs and the right topology of caches.  If
 you do
  try to deduce what the user had in mind, I just think you're
 probably
  in for a very fragile system.  Does this sort of thing seem
  reasonable?
 
Nate
 
 
  I reordered my patches so that I could push this, and now all
 of the
  remaining ones are dependent on or are hacks that force there
 to be two
  CPUs. In order to push these last ~5 patches, I need to know
 how many
  CPUs there are when I'm constructing the BIOS tables on the
 python side
  of things.
 
  First, I think it would be useful to have a mechanism that will
 allow
  enumerating a simobjects children that are also simobjects.
 That way you
  could, for instance, go through and gather up all the CPUs to count
  them, assign an ISA level ID to them, etc. There are potential
  complications where you might leak into some other system,
 not be able
  to follow certain configurations correctly, not make the jump over
  ports, etc that might make this more complicated. There may
 also be CPUs
  you don't want to count that you'll, for instance, switch over
 to later.
 
  Besides CPUs, these tables also tend to have information about
 caches,
  and at least on the x86 side they expect to be identified by depth.
  There would need to be some way of identifying how far a particular
  cache is from a particular CPU in order to label it the, for
 instance,
  L2. I think you may also need to know what CPUs are connected to a
  particular cache so you can call it shared, etc. These sorts of
 names
  may not make sense in whatever crazy configuration somebody
 sets up.
 
  Another issue is determining bus topology and interrupt
 assignment. In
  x86 each bus needs to be identified with a particular
 technology, and
  all interrupts need to be described as far as who generates
 them, what
  bus they originate on, and where they ultimately end up. Linux gets
  confused when at least the interrupt information is wrong and
 things get
  lost.
 
  I think it's important for this to be automated as much as possible
  because I will attest it can be tricky to get everything right.
 Having a
  system in place which automatically builds up a semi-qualitative
  description of whatever whacky configuration people want to
 throw at it
  is going to be very challenging, though, because it might not
 always be
  clear what the right thing to do is when setting things up
 manually, and
  fixing a generated configuration may actually be harder than
 building it
  yourself.
 
  How this works has the potential to introduce a lot of
 problems, and if
  we get it wrong it could take a major effort to fix an entrenched
  system. I see this primarily as an 

[m5-dev] Cron m5t...@zizzer /z/m5/regression/do-regression --scratch all

2009-04-26 Thread Cron Daemon

See /z/m5/regression/regress-2009-04-26-03:00:01 for details.

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Configuration files/tables for/in the simulated guest.

2009-04-26 Thread Gabe Black
I don't think a builder function that does everything I want will fit
into the layout of the config stuff as is. I'm just going to augment
what I've got to consider the number of CPUs.

Gabe

Gabe Black wrote:
 Before I get rolling on this too seriously, should the builder function
 go within the python compiled into M5, or should it go into the post
 compile config files? Do we have a rule of thumb for what goes where?
 This seems important enough to be built in and not repeated in every
 user config, but it's also not a SimObject or anything like that.

 Gabe

 Steve Reinhardt wrote:
   
 I agree with Nate... I think deducing the logical system configuration
 directly from an arbitrary M5 SimObject tree is going to be next to
 impossible, given that the tree doesn't reflect the memory hierarchy
 interconnection topology, and since the M5 tree could very easily
 correspond to an undescribable config from the BIOS perspective.

 I wouldn't worry too much in the short term about making your system
 builder flexible; is it even possible to describe a non-traditional
 configuration like a mesh in the BIOS tables?  I think having a basic
 tool that supports a handful of straightforward configs is an
 appropriate first step; people that feel the need to do more than that
 can be responsible for generalizing your solution as needed.  I'm sure
 you've heard my spiel about spending too much time designing in
 unnecessary generality up front...

 Steve

 On Tue, Apr 21, 2009 at 12:36 AM, Gabe Black gbl...@eecs.umich.edu
 mailto:gbl...@eecs.umich.edu wrote:

 It seems reasonable, although I'm sure it will be easier said than
 done
 one you get down to the details. The challenge will be putting it
 together so that it doesn't make so many assumptions that it becomes
 useless outside of traditional sorts of configurations like, for
 instance, meshes. I agree that trying to guess what the user really
 wanted your at best going to be close most of the time.

 Gabe

 nathan binkert wrote:
  My initial reaction is that it's probably better to build some
 sort of
  semi-configurable system builder function that creates a system with
  the right number of CPUs and the right topology of caches.  If
 you do
  try to deduce what the user had in mind, I just think you're
 probably
  in for a very fragile system.  Does this sort of thing seem
  reasonable?
 
Nate
 
 
  I reordered my patches so that I could push this, and now all
 of the
  remaining ones are dependent on or are hacks that force there
 to be two
  CPUs. In order to push these last ~5 patches, I need to know
 how many
  CPUs there are when I'm constructing the BIOS tables on the
 python side
  of things.
 
  First, I think it would be useful to have a mechanism that will
 allow
  enumerating a simobjects children that are also simobjects.
 That way you
  could, for instance, go through and gather up all the CPUs to count
  them, assign an ISA level ID to them, etc. There are potential
  complications where you might leak into some other system,
 not be able
  to follow certain configurations correctly, not make the jump over
  ports, etc that might make this more complicated. There may
 also be CPUs
  you don't want to count that you'll, for instance, switch over
 to later.
 
  Besides CPUs, these tables also tend to have information about
 caches,
  and at least on the x86 side they expect to be identified by depth.
  There would need to be some way of identifying how far a particular
  cache is from a particular CPU in order to label it the, for
 instance,
  L2. I think you may also need to know what CPUs are connected to a
  particular cache so you can call it shared, etc. These sorts of
 names
  may not make sense in whatever crazy configuration somebody
 sets up.
 
  Another issue is determining bus topology and interrupt
 assignment. In
  x86 each bus needs to be identified with a particular
 technology, and
  all interrupts need to be described as far as who generates
 them, what
  bus they originate on, and where they ultimately end up. Linux gets
  confused when at least the interrupt information is wrong and
 things get
  lost.
 
  I think it's important for this to be automated as much as possible
  because I will attest it can be tricky to get everything right.
 Having a
  system in place which automatically builds up a semi-qualitative
  description of whatever whacky configuration people want to
 throw at it
  is going to be very challenging, though, because it might not
 always be
  clear what the right thing to do is when setting things up
 manually, and
  fixing a generated configuration may actually be harder than
 building 

[m5-dev] changeset in m5: X86: Record the initial APIC ID which identifie...

2009-04-26 Thread Gabe Black
changeset 4f8af2f3185f in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=4f8af2f3185f
description:
X86: Record the initial APIC ID which identifies an APIC in M5.
The ID as exposed to software can be changed. Tracking those changes in 
M5
would be cumbersome, especially since there's no guarantee the IDs will 
remain
unique.

diffstat:

4 files changed, 23 insertions(+), 23 deletions(-)
src/arch/x86/interrupts.cc |   36 
src/arch/x86/interrupts.hh |2 ++
src/dev/x86/i82094aa.cc|2 +-
src/dev/x86/i82094aa.hh|6 --

diffs (139 lines):

diff -r 9327451a8e7a -r 4f8af2f3185f src/arch/x86/interrupts.cc
--- a/src/arch/x86/interrupts.ccSun Apr 26 02:04:32 2009 -0700
+++ b/src/arch/x86/interrupts.ccSun Apr 26 02:06:21 2009 -0700
@@ -295,17 +295,21 @@
 void
 X86ISA::Interrupts::setCPU(BaseCPU * newCPU)
 {
+assert(newCPU);
+if (cpu != NULL  cpu-cpuId() != newCPU-cpuId()) {
+panic(Local APICs can't be moved between CPUs
+ with different IDs.\n);
+}
 cpu = newCPU;
-assert(cpu);
-regs[APIC_ID] = (cpu-cpuId()  24);
+initialApicId = cpu-cpuId();
+regs[APIC_ID] = (initialApicId  24);
 }
 
 
 Tick
 X86ISA::Interrupts::recvMessage(PacketPtr pkt)
 {
-uint8_t id = (regs[APIC_ID]  24);
-Addr offset = pkt-getAddr() - x86InterruptAddress(id, 0);
+Addr offset = pkt-getAddr() - x86InterruptAddress(initialApicId, 0);
 assert(pkt-cmd == MemCmd::MessageReq);
 switch(offset)
 {
@@ -315,9 +319,6 @@
 DPRINTF(LocalApic,
 Got Trigger Interrupt message with vector %#x.\n,
 message.vector);
-// Make sure we're really supposed to get this.
-assert((message.destMode == 0  message.destination == id) ||
-   (bits((int)message.destination, id)));
 
 requestInterrupt(message.vector,
 message.deliveryMode, message.trigger);
@@ -354,10 +355,10 @@
 void
 X86ISA::Interrupts::addressRanges(AddrRangeList range_list)
 {
-uint8_t id = (regs[APIC_ID]  24);
 range_list.clear();
-RangeAddr range = RangeEx(x86LocalAPICAddress(id, 0),
-x86LocalAPICAddress(id, 0) + PageBytes);
+RangeAddr range = RangeEx(x86LocalAPICAddress(initialApicId, 0),
+x86LocalAPICAddress(initialApicId, 0) + 
+PageBytes);
 range_list.push_back(range);
 pioAddr = range.start;
 }
@@ -366,10 +367,10 @@
 void
 X86ISA::Interrupts::getIntAddrRange(AddrRangeList range_list)
 {
-uint8_t id = (regs[APIC_ID]  24);
 range_list.clear();
-range_list.push_back(RangeEx(x86InterruptAddress(id, 0),
-x86InterruptAddress(id, 0) + PhysAddrAPICRangeSize));
+range_list.push_back(RangeEx(x86InterruptAddress(initialApicId, 0),
+x86InterruptAddress(initialApicId, 0) +
+PhysAddrAPICRangeSize));
 }
 
 
@@ -515,14 +516,9 @@
 {
 int numContexts = sys-numContexts();
 pendingIPIs += (numContexts - 1);
-// We have no way to get at the thread context we're part
-// of, so we'll just have to go with the CPU for now.
-hack_once(Broadcast IPIs can't handle more than 
-one context per CPU.\n);
-int myId = cpu-getContext(0)-contextId();
 for (int i = 0; i  numContexts; i++) {
 int thisId = sys-getThreadContext(i)-contextId();
-if (thisId != myId) {
+if (thisId != initialApicId) {
 PacketPtr pkt = buildIntRequest(thisId, message);
 if (timing)
 intPort-sendMessageTiming(pkt, latency);
@@ -589,7 +585,7 @@
 pendingInit(false), initVector(0),
 pendingStartup(false), startupVector(0),
 startedUp(false), pendingUnmaskableInt(false),
-pendingIPIs(0)
+pendingIPIs(0), cpu(NULL)
 {
 pioSize = PageBytes;
 memset(regs, 0, sizeof(regs));
diff -r 9327451a8e7a -r 4f8af2f3185f src/arch/x86/interrupts.hh
--- a/src/arch/x86/interrupts.hhSun Apr 26 02:04:32 2009 -0700
+++ b/src/arch/x86/interrupts.hhSun Apr 26 02:06:21 2009 -0700
@@ -191,6 +191,8 @@
 
 BaseCPU *cpu;
 
+int initialApicId;
+
   public:
 /*
  * Params stuff.
diff -r 9327451a8e7a -r 4f8af2f3185f src/dev/x86/i82094aa.cc
--- a/src/dev/x86/i82094aa.cc   Sun Apr 26 02:04:32 2009 -0700
+++ b/src/dev/x86/i82094aa.cc   Sun Apr 26 02:06:21 2009 -0700
@@ -40,7 +40,7 @@
 extIntPic(p-external_int_pic)
 {
 // This assumes there's only one I/O APIC in the system
-id = p-apic_id;
+initialApicId = id = p-apic_id;
 assert(id = 0xf);
 arbId = id;
 regSel = 0;
diff -r 

[m5-dev] changeset in m5: X86: Make the local APICs register themselves w...

2009-04-26 Thread Gabe Black
changeset d3ee4e0d690c in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=d3ee4e0d690c
description:
X86: Make the local APICs register themselves with the IO APIC.
This is a hack so that the IO APIC can figure out information about the 
local
APICs. The local APICs still have no way to find out about each other.
Ideally, when the local APICs update state that's relevant to somebody 
else,
they'd send an update to everyone. Without being able to do a 
broadcast, that
would still require knowing who else there is to notify. Other 
broadcasts are
implemented using assumptions that may not always be true.

diffstat:

4 files changed, 31 insertions(+)
src/arch/x86/interrupts.cc |   13 +
src/arch/x86/interrupts.hh |5 +
src/dev/x86/i82094aa.cc|7 +++
src/dev/x86/i82094aa.hh|6 ++

diffs (96 lines):

diff -r 4f8af2f3185f -r d3ee4e0d690c src/arch/x86/interrupts.cc
--- a/src/arch/x86/interrupts.ccSun Apr 26 02:06:21 2009 -0700
+++ b/src/arch/x86/interrupts.ccSun Apr 26 02:09:13 2009 -0700
@@ -59,6 +59,9 @@
 #include arch/x86/interrupts.hh
 #include arch/x86/intmessage.hh
 #include cpu/base.hh
+#include dev/x86/i82094aa.hh
+#include dev/x86/pc.hh
+#include dev/x86/south_bridge.hh
 #include mem/packet_access.hh
 #include sim/system.hh
 
@@ -306,6 +309,16 @@
 }
 
 
+void
+X86ISA::Interrupts::init()
+{
+BasicPioDevice::init();
+Pc * pc = dynamic_castPc *(platform);
+assert(pc);
+pc-southBridge-ioApic-registerLocalApic(initialApicId, this);
+}
+
+
 Tick
 X86ISA::Interrupts::recvMessage(PacketPtr pkt)
 {
diff -r 4f8af2f3185f -r d3ee4e0d690c src/arch/x86/interrupts.hh
--- a/src/arch/x86/interrupts.hhSun Apr 26 02:06:21 2009 -0700
+++ b/src/arch/x86/interrupts.hhSun Apr 26 02:09:13 2009 -0700
@@ -214,6 +214,11 @@
 }
 
 /*
+ * Initialize this object by registering it with the IO APIC.
+ */
+void init();
+
+/*
  * Functions to interact with the interrupt port from IntDev.
  */
 Tick read(PacketPtr pkt);
diff -r 4f8af2f3185f -r d3ee4e0d690c src/dev/x86/i82094aa.cc
--- a/src/dev/x86/i82094aa.cc   Sun Apr 26 02:06:21 2009 -0700
+++ b/src/dev/x86/i82094aa.cc   Sun Apr 26 02:09:13 2009 -0700
@@ -182,6 +182,13 @@
 pinStates[number] = false;
 }
 
+void
+X86ISA::I82094AA::registerLocalApic(int initialId, Interrupts *localApic)
+{
+assert(localApic);
+localApics[initialId] = localApic;
+}
+
 X86ISA::I82094AA *
 I82094AAParams::create()
 {
diff -r 4f8af2f3185f -r d3ee4e0d690c src/dev/x86/i82094aa.hh
--- a/src/dev/x86/i82094aa.hh   Sun Apr 26 02:06:21 2009 -0700
+++ b/src/dev/x86/i82094aa.hh   Sun Apr 26 02:09:13 2009 -0700
@@ -37,10 +37,13 @@
 #include dev/x86/intdev.hh
 #include params/I82094AA.hh
 
+#include map
+
 namespace X86ISA
 {
 
 class I8259;
+class Interrupts;
 
 class I82094AA : public PioDevice, public IntDev
 {
@@ -67,6 +70,8 @@
 
 I8259 * extIntPic;
 
+std::mapint, Interrupts * localApics;
+
 uint8_t regSel;
 uint8_t initialApicId;
 uint8_t id;
@@ -122,6 +127,7 @@
 void signalInterrupt(int line);
 void raiseInterruptPin(int number);
 void lowerInterruptPin(int number);
+void registerLocalApic(int id, Interrupts *localApic);
 };
 
 }; // namespace X86ISA
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


[m5-dev] changeset in m5: X86: Tell the function that sends int messages ...

2009-04-26 Thread Gabe Black
changeset 6cbdd76b93db in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=6cbdd76b93db
description:
X86: Tell the function that sends int messages who to send to instead 
of figuring it out itself.

diffstat:

4 files changed, 84 insertions(+), 56 deletions(-)
src/arch/x86/interrupts.cc |   51 
src/dev/x86/i82094aa.cc|   34 -
src/dev/x86/intdev.cc  |   48 ++---
src/dev/x86/intdev.hh  |7 +-

diffs (214 lines):

diff -r d3ee4e0d690c -r 6cbdd76b93db src/arch/x86/interrupts.cc
--- a/src/arch/x86/interrupts.ccSun Apr 26 02:09:13 2009 -0700
+++ b/src/arch/x86/interrupts.ccSun Apr 26 02:09:27 2009 -0700
@@ -510,11 +510,41 @@
 bool timing = sys-getMemoryMode() == Enums::timing;
 // Be careful no updates of the delivery status bit get lost.
 regs[APIC_INTERRUPT_COMMAND_LOW] = low;
+ApicList apics;
+int numContexts = sys-numContexts();
 switch (low.destShorthand) {
   case 0:
-pendingIPIs++;
-intPort-sendMessage(message, timing);
-newVal = regs[APIC_INTERRUPT_COMMAND_LOW];
+if (message.deliveryMode == DeliveryMode::LowestPriority) {
+panic(Lowest priority delivery mode 
+IPIs aren't implemented.\n);
+}
+if (message.destMode == 1) {
+int dest = message.destination;
+hack_once(Assuming logical destinations are 1  id.\n);
+for (int i = 0; i  numContexts; i++) {
+if (dest  0x1)
+apics.push_back(i);
+dest = dest  1;
+}
+} else {
+if (message.destination == 0xFF) {
+for (int i = 0; i  numContexts; i++) {
+if (i == initialApicId) {
+requestInterrupt(message.vector,
+message.deliveryMode, message.trigger);
+} else {
+apics.push_back(i);
+}
+}
+} else {
+if (message.destination == initialApicId) {
+requestInterrupt(message.vector,
+message.deliveryMode, message.trigger);
+} else {
+apics.push_back(message.destination);
+}
+}
+}
 break;
   case 1:
 newVal = val;
@@ -527,22 +557,17 @@
 // Fall through
   case 3:
 {
-int numContexts = sys-numContexts();
-pendingIPIs += (numContexts - 1);
 for (int i = 0; i  numContexts; i++) {
-int thisId = sys-getThreadContext(i)-contextId();
-if (thisId != initialApicId) {
-PacketPtr pkt = buildIntRequest(thisId, message);
-if (timing)
-intPort-sendMessageTiming(pkt, latency);
-else
-intPort-sendMessageAtomic(pkt);
+if (i != initialApicId) {
+apics.push_back(i);
 }
 }
 }
-newVal = regs[APIC_INTERRUPT_COMMAND_LOW];
 break;
 }
+pendingIPIs += apics.size();
+intPort-sendMessage(apics, message, timing);
+newVal = regs[APIC_INTERRUPT_COMMAND_LOW];
 }
 break;
   case APIC_LVT_TIMER:
diff -r d3ee4e0d690c -r 6cbdd76b93db src/dev/x86/i82094aa.cc
--- a/src/dev/x86/i82094aa.cc   Sun Apr 26 02:09:13 2009 -0700
+++ b/src/dev/x86/i82094aa.cc   Sun Apr 26 02:09:27 2009 -0700
@@ -28,6 +28,7 @@
  * Authors: Gabe Black
  */
 
+#include arch/x86/interrupts.hh
 #include arch/x86/intmessage.hh
 #include dev/x86/i82094aa.hh
 #include dev/x86/i8259.hh
@@ -162,7 +163,38 @@
 message.destMode = entry.destMode;
 message.level = entry.polarity;
 message.trigger = entry.trigger;
-intPort-sendMessage(message, sys-getMemoryMode() == Enums::timing);
+ApicList apics;
+int numContexts = sys-numContexts();
+if (message.destMode == 0) {
+if (message.deliveryMode == DeliveryMode::LowestPriority) {
+panic(Lowest priority delivery mode from the 
+IO APIC aren't supported in physical 
+destination mode.\n);
+}
+  

[m5-dev] changeset in m5: X86: Implement lowest priority interrupts more ...

2009-04-26 Thread Gabe Black
changeset 2bfd792b1cc0 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=2bfd792b1cc0
description:
X86: Implement lowest priority interrupts more correctly.
Lowest priority interrupts are now delivered based on a rotating offset 
into
the list of potential recipients. There could be parasitic cases were a
processor gets picked on and ends up at that rotating offset all the 
time, but
it's much more likely that the group will stay consistent and the pain 
will be
distributed evenly.

diffstat:

2 files changed, 19 insertions(+), 3 deletions(-)
src/dev/x86/i82094aa.cc |   20 +---
src/dev/x86/i82094aa.hh |2 ++

diffs (49 lines):

diff -r 6cbdd76b93db -r 2bfd792b1cc0 src/dev/x86/i82094aa.cc
--- a/src/dev/x86/i82094aa.cc   Sun Apr 26 02:09:27 2009 -0700
+++ b/src/dev/x86/i82094aa.cc   Sun Apr 26 02:09:54 2009 -0700
@@ -38,7 +38,7 @@
 
 X86ISA::I82094AA::I82094AA(Params *p) : PioDevice(p), IntDev(this),
 latency(p-pio_latency), pioAddr(p-pio_addr),
-extIntPic(p-external_int_pic)
+extIntPic(p-external_int_pic), lowestPriorityOffset(0)
 {
 // This assumes there's only one I/O APIC in the system
 initialApicId = id = p-apic_id;
@@ -189,8 +189,22 @@
 apics.push_back(localApicIt-first);
 }
 }
-if (message.deliveryMode == DeliveryMode::LowestPriority) {
-panic(Lowest priority delivery mode is not implemented.\n);
+if (message.deliveryMode == DeliveryMode::LowestPriority 
+apics.size()) {
+// The manual seems to suggest that the chipset just does
+// something reasonable for these instead of actually using
+// state from the local APIC. We'll just rotate an offset
+// through the set of APICs selected above.
+uint64_t modOffset = lowestPriorityOffset % apics.size();
+lowestPriorityOffset++;
+ApicList::iterator apicIt = apics.begin();
+while (modOffset--) {
+apicIt++;
+assert(apicIt != apics.end());
+}
+int selected = *apicIt;
+apics.clear();
+apics.push_back(selected);
 }
 }
 intPort-sendMessage(apics, message,
diff -r 6cbdd76b93db -r 2bfd792b1cc0 src/dev/x86/i82094aa.hh
--- a/src/dev/x86/i82094aa.hh   Sun Apr 26 02:09:27 2009 -0700
+++ b/src/dev/x86/i82094aa.hh   Sun Apr 26 02:09:54 2009 -0700
@@ -77,6 +77,8 @@
 uint8_t id;
 uint8_t arbId;
 
+uint64_t lowestPriorityOffset;
+
 static const uint8_t TableSize = 24;
 // This implementation is based on version 0x11, but 0x14 avoids having
 // to deal with the arbitration and APIC bus guck.
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] changeset in m5: X86: Implement lowest priority interrupts more ...

2009-04-26 Thread Gabe Black
X86 can now be considered to have nominal SMP support. An SMP version of
the 2.6.22.9 linux kernel boots and fails in init in the same way as the
UP kernel. I've uploaded the kernel I used to zizzer if you want to give
it a try yourself.

Gabe

Gabe Black wrote:
 changeset 2bfd792b1cc0 in /z/repo/m5
 details: http://repo.m5sim.org/m5?cmd=changeset;node=2bfd792b1cc0
 description:
   X86: Implement lowest priority interrupts more correctly.
   Lowest priority interrupts are now delivered based on a rotating offset 
 into
   the list of potential recipients. There could be parasitic cases were a
   processor gets picked on and ends up at that rotating offset all the 
 time, but
   it's much more likely that the group will stay consistent and the pain 
 will be
   distributed evenly.

 diffstat:

 2 files changed, 19 insertions(+), 3 deletions(-)
 src/dev/x86/i82094aa.cc |   20 +---
 src/dev/x86/i82094aa.hh |2 ++

 diffs (49 lines):

 diff -r 6cbdd76b93db -r 2bfd792b1cc0 src/dev/x86/i82094aa.cc
 --- a/src/dev/x86/i82094aa.cc Sun Apr 26 02:09:27 2009 -0700
 +++ b/src/dev/x86/i82094aa.cc Sun Apr 26 02:09:54 2009 -0700
 @@ -38,7 +38,7 @@
  
  X86ISA::I82094AA::I82094AA(Params *p) : PioDevice(p), IntDev(this),
  latency(p-pio_latency), pioAddr(p-pio_addr),
 -extIntPic(p-external_int_pic)
 +extIntPic(p-external_int_pic), lowestPriorityOffset(0)
  {
  // This assumes there's only one I/O APIC in the system
  initialApicId = id = p-apic_id;
 @@ -189,8 +189,22 @@
  apics.push_back(localApicIt-first);
  }
  }
 -if (message.deliveryMode == DeliveryMode::LowestPriority) {
 -panic(Lowest priority delivery mode is not implemented.\n);
 +if (message.deliveryMode == DeliveryMode::LowestPriority 
 +apics.size()) {
 +// The manual seems to suggest that the chipset just does
 +// something reasonable for these instead of actually using
 +// state from the local APIC. We'll just rotate an offset
 +// through the set of APICs selected above.
 +uint64_t modOffset = lowestPriorityOffset % apics.size();
 +lowestPriorityOffset++;
 +ApicList::iterator apicIt = apics.begin();
 +while (modOffset--) {
 +apicIt++;
 +assert(apicIt != apics.end());
 +}
 +int selected = *apicIt;
 +apics.clear();
 +apics.push_back(selected);
  }
  }
  intPort-sendMessage(apics, message,
 diff -r 6cbdd76b93db -r 2bfd792b1cc0 src/dev/x86/i82094aa.hh
 --- a/src/dev/x86/i82094aa.hh Sun Apr 26 02:09:27 2009 -0700
 +++ b/src/dev/x86/i82094aa.hh Sun Apr 26 02:09:54 2009 -0700
 @@ -77,6 +77,8 @@
  uint8_t id;
  uint8_t arbId;
  
 +uint64_t lowestPriorityOffset;
 +
  static const uint8_t TableSize = 24;
  // This implementation is based on version 0x11, but 0x14 avoids having
  // to deal with the arbitration and APIC bus guck.
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev
   

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] [PATCH] CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc

2009-04-26 Thread Korey Sewell
Can you explain how the completeAcc ever gets called before the initiatieAcc
finishes in a SimpleCPU?

I'm thinking, why isnt the SimpleCPU just stalling if there is a delay
waiting for a memory access.If that's the right behavior, then isnt the
solution to just stall the CPU (or even sleep it) and wait for initiateAcc
to finish rather than try to work around the initiateAcc/completeAcc
simultaneous call? I guess I dont understand fully the problem but it seems
workable without a delay in overhead for all other memory accesses.

Also, would this be a issue for any other CPU models (O3 or InOrder)?

On Sun, Apr 26, 2009 at 3:22 AM, Gabe Black gbl...@eecs.umich.edu wrote:

 This patch should fix the issue Geoff was having where tracing would
 cause a segfault. If you could please give it a try, Geoff, let me know
 if it actually solves the problem. This will add a small overhead for
 every memory instruction even though the old behavior is wrong only in
 certain circumstances. I considered trying to avoid that overhead in
 cases where the problem can't happen, but then tracing would change the
 behavior of the CPU and that seems like a bad idea. Assuming this
 actually fixes the bug and nobody objects I'll commit this soon.

 Gabe

 gbl...@eecs.umich.edu wrote:
  # HG changeset patch
  # User Gabe Black gbl...@eecs.umich.edu
  # Date 1240729422 25200
  # Node ID 284ff3c27bbddacf5763e7b5a54e99e4dcd1912a
  # Parent  8652636856b3d24fb0088fb1af5f5dca5008d9c8
  CPU: Make the CPU wait until initiateAcc finishes before calling
 completeAcc.
 
  diff --git a/src/cpu/simple/timing.cc b/src/cpu/simple/timing.cc
  --- a/src/cpu/simple/timing.cc
  +++ b/src/cpu/simple/timing.cc
  @@ -115,6 +115,8 @@
   ifetch_pkt = dcache_pkt = NULL;
   drainEvent = NULL;
   previousTick = 0;
  +inInitiateAcc = false;
  +completionPkt = NULL;
   changeState(SimObject::Running);
   }
 
  @@ -758,7 +760,13 @@
   if (curStaticInst 
   curStaticInst-isMemRef() 
 !curStaticInst-isDataPrefetch()) {
   // load or store: just send to dcache
  +inInitiateAcc = true;
   Fault fault = curStaticInst-initiateAcc(this, traceData);
  +inInitiateAcc = false;
  +if (completionPkt) {
  +completeDataAccess(completionPkt);
  +completionPkt = NULL;
  +}
   if (_status != Running) {
   // instruction will complete in dcache response callback
   assert(_status == DcacheWaitResponse ||
  @@ -856,6 +864,10 @@
   void
   TimingSimpleCPU::completeDataAccess(PacketPtr pkt)
   {
  +if (inInitiateAcc) {
  +completionPkt = pkt;
  +return;
  +}
   // received a response from the dcache: complete the load or store
   // instruction
   assert(!pkt-isError());
  diff --git a/src/cpu/simple/timing.hh b/src/cpu/simple/timing.hh
  --- a/src/cpu/simple/timing.hh
  +++ b/src/cpu/simple/timing.hh
  @@ -317,6 +317,9 @@
 
   Tick previousTick;
 
  +bool inInitiateAcc;
  +PacketPtr completionPkt;
  +
 public:
 
   virtual Port *getPort(const std::string if_name, int idx = -1);
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 

 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev




-- 
--
Korey L Sewell
Graduate Student - PhD Candidate
Computer Science  Engineering
University of Michigan
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Configuration files/tables for/in the simulated guest.

2009-04-26 Thread nathan binkert
 Before I get rolling on this too seriously, should the builder function
 go within the python compiled into M5, or should it go into the post
 compile config files? Do we have a rule of thumb for what goes where?
 This seems important enough to be built in and not repeated in every
 user config, but it's also not a SimObject or anything like that.

It's hard to say.  I guess I'd lean towards saying no.  The downside
of it being compiled in is that a change requires a recompile.  The
problem is that the stuff that's not compiled in is a total mess.

  Nate
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] [PATCH] CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc

2009-04-26 Thread Gabe Black
There's an explanation already in the thread with the subject  Memory
corruption inm5devrepositorywhenusing   
--trace-flags=ExecEnable. Stalling or sleeping doesn't make sense in
this situation since it's not that sort of problem.

Gabe

Korey Sewell wrote:
 Can you explain how the completeAcc ever gets called before the
 initiatieAcc finishes in a SimpleCPU?

 I'm thinking, why isnt the SimpleCPU just stalling if there is a delay
 waiting for a memory access.If that's the right behavior, then isnt
 the solution to just stall the CPU (or even sleep it) and wait for
 initiateAcc to finish rather than try to work around the
 initiateAcc/completeAcc simultaneous call? I guess I dont understand
 fully the problem but it seems workable without a delay in overhead
 for all other memory accesses.

 Also, would this be a issue for any other CPU models (O3 or InOrder)?

 On Sun, Apr 26, 2009 at 3:22 AM, Gabe Black gbl...@eecs.umich.edu
 mailto:gbl...@eecs.umich.edu wrote:

 This patch should fix the issue Geoff was having where tracing would
 cause a segfault. If you could please give it a try, Geoff, let me
 know
 if it actually solves the problem. This will add a small overhead for
 every memory instruction even though the old behavior is wrong only in
 certain circumstances. I considered trying to avoid that overhead in
 cases where the problem can't happen, but then tracing would
 change the
 behavior of the CPU and that seems like a bad idea. Assuming this
 actually fixes the bug and nobody objects I'll commit this soon.

 Gabe

 gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu wrote:
  # HG changeset patch
  # User Gabe Black gbl...@eecs.umich.edu
 mailto:gbl...@eecs.umich.edu
  # Date 1240729422 25200
  # Node ID 284ff3c27bbddacf5763e7b5a54e99e4dcd1912a
  # Parent  8652636856b3d24fb0088fb1af5f5dca5008d9c8
  CPU: Make the CPU wait until initiateAcc finishes before calling
 completeAcc.
 
  diff --git a/src/cpu/simple/timing.cc b/src/cpu/simple/timing.cc
  --- a/src/cpu/simple/timing.cc
  +++ b/src/cpu/simple/timing.cc
  @@ -115,6 +115,8 @@
   ifetch_pkt = dcache_pkt = NULL;
   drainEvent = NULL;
   previousTick = 0;
  +inInitiateAcc = false;
  +completionPkt = NULL;
   changeState(SimObject::Running);
   }
 
  @@ -758,7 +760,13 @@
   if (curStaticInst 
   curStaticInst-isMemRef() 
 !curStaticInst-isDataPrefetch()) {
   // load or store: just send to dcache
  +inInitiateAcc = true;
   Fault fault = curStaticInst-initiateAcc(this, traceData);
  +inInitiateAcc = false;
  +if (completionPkt) {
  +completeDataAccess(completionPkt);
  +completionPkt = NULL;
  +}
   if (_status != Running) {
   // instruction will complete in dcache response
 callback
   assert(_status == DcacheWaitResponse ||
  @@ -856,6 +864,10 @@
   void
   TimingSimpleCPU::completeDataAccess(PacketPtr pkt)
   {
  +if (inInitiateAcc) {
  +completionPkt = pkt;
  +return;
  +}
   // received a response from the dcache: complete the load
 or store
   // instruction
   assert(!pkt-isError());
  diff --git a/src/cpu/simple/timing.hh b/src/cpu/simple/timing.hh
  --- a/src/cpu/simple/timing.hh
  +++ b/src/cpu/simple/timing.hh
  @@ -317,6 +317,9 @@
 
   Tick previousTick;
 
  +bool inInitiateAcc;
  +PacketPtr completionPkt;
  +
 public:
 
   virtual Port *getPort(const std::string if_name, int idx =
 -1);
  ___
  m5-dev mailing list
  m5-dev@m5sim.org mailto:m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 

 ___
 m5-dev mailing list
 m5-dev@m5sim.org mailto:m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev




 -- 
 --
 Korey L Sewell
 Graduate Student - PhD Candidate
 Computer Science  Engineering
 University of Michigan
 

 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev
   

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] [PATCH] CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc

2009-04-26 Thread Korey Sewell
Yea, I was on that thread. Your explanation wasnt particularly in great
detail there (files?, line #s?) so its hard to follow without getting your
hand dirty on where this code is you are referring to.

 I'll comment and you can clarify my assumptions...

Oooh. I see what's broken. This is a result of my changes to
allow delaying translation. What happens is that Stl_c goes into
initiateAcc. That function calls write on the CPU which calls into the
TLB which calls the translation callback which recognizes a failed store
conditional which completes the instruction execution with completeAcc
and cleans up. The call stack then collapses back to the initiateAcc
which is still waiting to finish and which then tries to call a member
function on traceData which was deleted during the cleanup. The problem
here is not fundamentally complicated, but the mechanisms involved are.
One solution would be to record the fact that we're still in
initiateAcc, and if we are wait for the call stack to collapse back down
to initiateAcc's caller before calling into completeAcc. That matches
the semantics an instruction would expect more, I think, where the
initiateAcc/completeAcc pair are called sequentially.

A couple of thoughts come to mind:
1. If the problem is with traceData and specifially just with store
conditionals, what prevents you from a solution that handles that is a small
overhead on that case instead of on every memory access? Particularly,  why
not add a StoreCondInitiateAcc template in mem.isa that handles this
special case? In there after the write to the CPU, you can maybe check to
see if you want to mess with traceData stuff or not depending on if the
conditional is already failed or whatever.

2. If the traceData is deleted, then why is it available to be accessed in
initiateAcc anyway? Isnt it enclosed by a if (traceData)?  Shouldnt that
return false if that is a NULL pointer?

3. Lastly, it sounds like this could be a slightly good candidate case for
having the ability to functionally split TLB access from the timing memory
access since in this case you want to do the access depending on something
that happens with the TLB...

On Sun, Apr 26, 2009 at 2:43 PM, Gabe Black gbl...@eecs.umich.edu wrote:

 There's an explanation already in the thread with the subject  Memory
 corruption inm5devrepositorywhenusing
 --trace-flags=ExecEnable. Stalling or sleeping doesn't make sense in
 this situation since it's not that sort of problem.

 Gabe

 Korey Sewell wrote:
  Can you explain how the completeAcc ever gets called before the
  initiatieAcc finishes in a SimpleCPU?
 
  I'm thinking, why isnt the SimpleCPU just stalling if there is a delay
  waiting for a memory access.If that's the right behavior, then isnt
  the solution to just stall the CPU (or even sleep it) and wait for
  initiateAcc to finish rather than try to work around the
  initiateAcc/completeAcc simultaneous call? I guess I dont understand
  fully the problem but it seems workable without a delay in overhead
  for all other memory accesses.
 
  Also, would this be a issue for any other CPU models (O3 or InOrder)?
 
  On Sun, Apr 26, 2009 at 3:22 AM, Gabe Black gbl...@eecs.umich.edu
  mailto:gbl...@eecs.umich.edu wrote:
 
  This patch should fix the issue Geoff was having where tracing would
  cause a segfault. If you could please give it a try, Geoff, let me
  know
  if it actually solves the problem. This will add a small overhead for
  every memory instruction even though the old behavior is wrong only
 in
  certain circumstances. I considered trying to avoid that overhead in
  cases where the problem can't happen, but then tracing would
  change the
  behavior of the CPU and that seems like a bad idea. Assuming this
  actually fixes the bug and nobody objects I'll commit this soon.
 
  Gabe
 
  gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu wrote:
   # HG changeset patch
   # User Gabe Black gbl...@eecs.umich.edu
  mailto:gbl...@eecs.umich.edu
   # Date 1240729422 25200
   # Node ID 284ff3c27bbddacf5763e7b5a54e99e4dcd1912a
   # Parent  8652636856b3d24fb0088fb1af5f5dca5008d9c8
   CPU: Make the CPU wait until initiateAcc finishes before calling
  completeAcc.
  
   diff --git a/src/cpu/simple/timing.cc b/src/cpu/simple/timing.cc
   --- a/src/cpu/simple/timing.cc
   +++ b/src/cpu/simple/timing.cc
   @@ -115,6 +115,8 @@
ifetch_pkt = dcache_pkt = NULL;
drainEvent = NULL;
previousTick = 0;
   +inInitiateAcc = false;
   +completionPkt = NULL;
changeState(SimObject::Running);
}
  
   @@ -758,7 +760,13 @@
if (curStaticInst 
curStaticInst-isMemRef() 
  !curStaticInst-isDataPrefetch()) {
// load or store: just send to dcache
   +inInitiateAcc = true;
Fault fault = 

Re: [m5-dev] [PATCH] CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc

2009-04-26 Thread Gabe Black
You should read my description, read the code, and then once you
understand the problem comment on my solution.

Gabe

Korey Sewell wrote:
 Yea, I was on that thread. Your explanation wasnt particularly in
 great detail there (files?, line #s?) so its hard to follow without
 getting your hand dirty on where this code is you are referring to.

  I'll comment and you can clarify my assumptions...

 Oooh. I see what's broken. This is a result of my changes to
 allow delaying translation. What happens is that Stl_c goes into
 initiateAcc. That function calls write on the CPU which calls into the
 TLB which calls the translation callback which recognizes a failed store
 conditional which completes the instruction execution with completeAcc
 and cleans up. The call stack then collapses back to the initiateAcc
 which is still waiting to finish and which then tries to call a member
 function on traceData which was deleted during the cleanup. The problem
 here is not fundamentally complicated, but the mechanisms involved are.
 One solution would be to record the fact that we're still in
 initiateAcc, and if we are wait for the call stack to collapse back down
 to initiateAcc's caller before calling into completeAcc. That matches
 the semantics an instruction would expect more, I think, where the
 initiateAcc/completeAcc pair are called sequentially.

 A couple of thoughts come to mind:
 1. If the problem is with traceData and specifially just with store
 conditionals, what prevents you from a solution that handles that is a
 small overhead on that case instead of on every memory access?
 Particularly,  why not add a StoreCondInitiateAcc template in
 mem.isa that handles this special case? In there after the write to
 the CPU, you can maybe check to see if you want to mess with traceData
 stuff or not depending on if the conditional is already failed or
 whatever.

 2. If the traceData is deleted, then why is it available to be
 accessed in initiateAcc anyway? Isnt it enclosed by a if
 (traceData)?  Shouldnt that return false if that is a NULL pointer?

 3. Lastly, it sounds like this could be a slightly good candidate case
 for having the ability to functionally split TLB access from the
 timing memory access since in this case you want to do the access
 depending on something that happens with the TLB...

 On Sun, Apr 26, 2009 at 2:43 PM, Gabe Black gbl...@eecs.umich.edu
 mailto:gbl...@eecs.umich.edu wrote:

 There's an explanation already in the thread with the subject  Memory
 corruption inm5devrepositorywhenusing
 --trace-flags=ExecEnable. Stalling or sleeping doesn't make
 sense in
 this situation since it's not that sort of problem.

 Gabe

 Korey Sewell wrote:
  Can you explain how the completeAcc ever gets called before the
  initiatieAcc finishes in a SimpleCPU?
 
  I'm thinking, why isnt the SimpleCPU just stalling if there is a
 delay
  waiting for a memory access.If that's the right behavior, then isnt
  the solution to just stall the CPU (or even sleep it) and wait for
  initiateAcc to finish rather than try to work around the
  initiateAcc/completeAcc simultaneous call? I guess I dont understand
  fully the problem but it seems workable without a delay in overhead
  for all other memory accesses.
 
  Also, would this be a issue for any other CPU models (O3 or
 InOrder)?
 
  On Sun, Apr 26, 2009 at 3:22 AM, Gabe Black
 gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu
  mailto:gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu
 wrote:
 
  This patch should fix the issue Geoff was having where
 tracing would
  cause a segfault. If you could please give it a try, Geoff,
 let me
  know
  if it actually solves the problem. This will add a small
 overhead for
  every memory instruction even though the old behavior is
 wrong only in
  certain circumstances. I considered trying to avoid that
 overhead in
  cases where the problem can't happen, but then tracing would
  change the
  behavior of the CPU and that seems like a bad idea. Assuming
 this
  actually fixes the bug and nobody objects I'll commit this soon.
 
  Gabe
 
  gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu
 mailto:gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu wrote:
   # HG changeset patch
   # User Gabe Black gbl...@eecs.umich.edu
 mailto:gbl...@eecs.umich.edu
  mailto:gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu
   # Date 1240729422 25200
   # Node ID 284ff3c27bbddacf5763e7b5a54e99e4dcd1912a
   # Parent  8652636856b3d24fb0088fb1af5f5dca5008d9c8
   CPU: Make the CPU wait until initiateAcc finishes before
 calling
  completeAcc.
  
   diff --git a/src/cpu/simple/timing.cc
 

Re: [m5-dev] [PATCH] CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc

2009-04-26 Thread Gabe Black
I should add, I realize it's not very clear what's going on, but it
would take a lot more effort to explain everything in enough detail to
get the problem across than for you (or anyone else) to look at the
code. If you do look into the code, it's not a functional problem as far
as how the timing CPU works or organizes events per se. It's a C++
problem where things may or may not end up being called from the middle
of each other.

Gabe

Gabe Black wrote:
 You should read my description, read the code, and then once you
 understand the problem comment on my solution.

 Gabe

 Korey Sewell wrote:
   
 Yea, I was on that thread. Your explanation wasnt particularly in
 great detail there (files?, line #s?) so its hard to follow without
 getting your hand dirty on where this code is you are referring to.

  I'll comment and you can clarify my assumptions...

 Oooh. I see what's broken. This is a result of my changes to
 allow delaying translation. What happens is that Stl_c goes into
 initiateAcc. That function calls write on the CPU which calls into the
 TLB which calls the translation callback which recognizes a failed store
 conditional which completes the instruction execution with completeAcc
 and cleans up. The call stack then collapses back to the initiateAcc
 which is still waiting to finish and which then tries to call a member
 function on traceData which was deleted during the cleanup. The problem
 here is not fundamentally complicated, but the mechanisms involved are.
 One solution would be to record the fact that we're still in
 initiateAcc, and if we are wait for the call stack to collapse back down
 to initiateAcc's caller before calling into completeAcc. That matches
 the semantics an instruction would expect more, I think, where the
 initiateAcc/completeAcc pair are called sequentially.

 A couple of thoughts come to mind:
 1. If the problem is with traceData and specifially just with store
 conditionals, what prevents you from a solution that handles that is a
 small overhead on that case instead of on every memory access?
 Particularly,  why not add a StoreCondInitiateAcc template in
 mem.isa that handles this special case? In there after the write to
 the CPU, you can maybe check to see if you want to mess with traceData
 stuff or not depending on if the conditional is already failed or
 whatever.

 2. If the traceData is deleted, then why is it available to be
 accessed in initiateAcc anyway? Isnt it enclosed by a if
 (traceData)?  Shouldnt that return false if that is a NULL pointer?

 3. Lastly, it sounds like this could be a slightly good candidate case
 for having the ability to functionally split TLB access from the
 timing memory access since in this case you want to do the access
 depending on something that happens with the TLB...

 On Sun, Apr 26, 2009 at 2:43 PM, Gabe Black gbl...@eecs.umich.edu
 mailto:gbl...@eecs.umich.edu wrote:

 There's an explanation already in the thread with the subject  Memory
 corruption inm5devrepositorywhenusing
 --trace-flags=ExecEnable. Stalling or sleeping doesn't make
 sense in
 this situation since it's not that sort of problem.

 Gabe

 Korey Sewell wrote:
  Can you explain how the completeAcc ever gets called before the
  initiatieAcc finishes in a SimpleCPU?
 
  I'm thinking, why isnt the SimpleCPU just stalling if there is a
 delay
  waiting for a memory access.If that's the right behavior, then isnt
  the solution to just stall the CPU (or even sleep it) and wait for
  initiateAcc to finish rather than try to work around the
  initiateAcc/completeAcc simultaneous call? I guess I dont understand
  fully the problem but it seems workable without a delay in overhead
  for all other memory accesses.
 
  Also, would this be a issue for any other CPU models (O3 or
 InOrder)?
 
  On Sun, Apr 26, 2009 at 3:22 AM, Gabe Black
 gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu
  mailto:gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu
 wrote:
 
  This patch should fix the issue Geoff was having where
 tracing would
  cause a segfault. If you could please give it a try, Geoff,
 let me
  know
  if it actually solves the problem. This will add a small
 overhead for
  every memory instruction even though the old behavior is
 wrong only in
  certain circumstances. I considered trying to avoid that
 overhead in
  cases where the problem can't happen, but then tracing would
  change the
  behavior of the CPU and that seems like a bad idea. Assuming
 this
  actually fixes the bug and nobody objects I'll commit this soon.
 
  Gabe
 
  gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu
 mailto:gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu wrote:
   # HG changeset patch
   # 

[m5-dev] changeset in m5: X86: Centralize updates to the handy M5 reg.

2009-04-26 Thread Gabe Black
changeset 7a2dc7d41ee1 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=7a2dc7d41ee1
description:
X86: Centralize updates to the handy M5 reg.

diffstat:

4 files changed, 65 insertions(+), 54 deletions(-)
src/arch/x86/faults.cc  |   10 +-
src/arch/x86/miscregfile.cc |   71 +--
src/arch/x86/miscregfile.hh |2 +
src/arch/x86/process.cc |   36 ++---

diffs (239 lines):

diff -r 2bfd792b1cc0 -r 7a2dc7d41ee1 src/arch/x86/faults.cc
--- a/src/arch/x86/faults.ccSun Apr 26 02:09:54 2009 -0700
+++ b/src/arch/x86/faults.ccSun Apr 26 16:47:48 2009 -0700
@@ -271,12 +271,8 @@
 tc-setMiscReg(MISCREG_DR6, 0x0ff0ULL);
 tc-setMiscReg(MISCREG_DR7, 0x0400ULL);
 
-// We're now in real mode, effectively at CPL 0
-HandyM5Reg m5Reg = 0;
-m5Reg.mode = LegacyMode;
-m5Reg.submode = RealMode;
-m5Reg.cpl = 0;
-tc-setMiscReg(MISCREG_M5_REG, m5Reg);
+// Update the handy M5 Reg.
+tc-setMiscReg(MISCREG_M5_REG, 0);
 MicroPC entry = X86ISAInst::RomLabels::extern_label_initIntHalt;
 tc-setMicroPC(romMicroPC(entry));
 tc-setNextMicroPC(romMicroPC(entry) + 1);
@@ -289,7 +285,7 @@
 HandyM5Reg m5Reg = tc-readMiscReg(MISCREG_M5_REG);
 if (m5Reg.mode != LegacyMode || m5Reg.submode != RealMode) {
 panic(Startup IPI recived outside of real mode. 
-Don't know what to do.);
+Don't know what to do. %d, %d, m5Reg.mode, 
m5Reg.submode);
 }
 
 tc-setMiscReg(MISCREG_CS, vector  8);
diff -r 2bfd792b1cc0 -r 7a2dc7d41ee1 src/arch/x86/miscregfile.cc
--- a/src/arch/x86/miscregfile.cc   Sun Apr 26 02:09:54 2009 -0700
+++ b/src/arch/x86/miscregfile.cc   Sun Apr 26 16:47:48 2009 -0700
@@ -96,6 +96,31 @@
 
 class Checkpoint;
 
+void MiscRegFile::updateHandyM5Reg(Efer efer, CR0 cr0,
+SegAttr csAttr, RFLAGS rflags)
+{
+HandyM5Reg m5reg;
+if (efer.lma) {
+m5reg.mode = LongMode;
+if (csAttr.longMode)
+m5reg.submode = SixtyFourBitMode;
+else
+m5reg.submode = CompatabilityMode;
+} else {
+m5reg.mode = LegacyMode;
+if (cr0.pe) {
+if (rflags.vm)
+m5reg.submode = Virtual8086Mode;
+else
+m5reg.submode = ProtectedMode;
+} else {
+m5reg.submode = RealMode;
+}
+}
+m5reg.cpl = csAttr.dpl;
+regVal[MISCREG_M5_REG] = m5reg;
+}
+
 void MiscRegFile::clear()
 {
 // Blank everything. 0 might not be an appropriate value for some things,
@@ -151,39 +176,17 @@
 CR0 toggled = regVal[miscReg] ^ val;
 CR0 newCR0 = val;
 Efer efer = regVal[MISCREG_EFER];
-HandyM5Reg m5reg = regVal[MISCREG_M5_REG];
 if (toggled.pg  efer.lme) {
 if (newCR0.pg) {
 //Turning on long mode
 efer.lma = 1;
-m5reg.mode = LongMode;
 regVal[MISCREG_EFER] = efer;
 } else {
 //Turning off long mode
 efer.lma = 0;
-m5reg.mode = LegacyMode;
 regVal[MISCREG_EFER] = efer;
 }
 }
-// Figure out what submode we're in.
-if (m5reg.mode == LongMode) {
-SegAttr csAttr = regVal[MISCREG_CS_ATTR];
-if (csAttr.longMode)
-m5reg.submode = SixtyFourBitMode;
-else
-m5reg.submode = CompatabilityMode;
-} else {
-if (newCR0.pe) {
-RFLAGS rflags = regVal[MISCREG_RFLAGS];
-if (rflags.vm)
-m5reg.submode = Virtual8086Mode;
-else
-m5reg.submode = ProtectedMode;
-} else {
-m5reg.submode = RealMode;
-}
-}
-regVal[MISCREG_M5_REG] = m5reg;
 if (toggled.pg) {
 tc-getITBPtr()-invalidateAll();
 tc-getDTBPtr()-invalidateAll();
@@ -191,6 +194,10 @@
 //This must always be 1.
 newCR0.et = 1;
 newVal = newCR0;
+updateHandyM5Reg(regVal[MISCREG_EFER],
+ newCR0,
+ regVal[MISCREG_CS_ATTR],
+ regVal[MISCREG_RFLAGS]);
 }
 break;
   case MISCREG_CR2:
@@ -214,26 +221,23 @@
 {
 SegAttr toggled = regVal[miscReg] ^ val;
 SegAttr newCSAttr = val;
-HandyM5Reg m5reg = regVal[MISCREG_M5_REG];
 if (toggled.longMode) {
 if (newCSAttr.longMode) {
-if (m5reg.mode == LongMode)
-

[m5-dev] changeset in m5: X86: Split out the internal memory space from t...

2009-04-26 Thread Gabe Black
changeset 5babc3f3d8c8 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=5babc3f3d8c8
description:
X86: Split out the internal memory space from the regular translate() 
and precompute mode.

diffstat:

4 files changed, 382 insertions(+), 373 deletions(-)
src/arch/x86/miscregfile.cc |2 
src/arch/x86/miscregs.hh|2 
src/arch/x86/tlb.cc |  749 +--
src/arch/x86/tlb.hh |2 

diffs (truncated from 830 to 300 lines):

diff -r 7a2dc7d41ee1 -r 5babc3f3d8c8 src/arch/x86/miscregfile.cc
--- a/src/arch/x86/miscregfile.cc   Sun Apr 26 16:47:48 2009 -0700
+++ b/src/arch/x86/miscregfile.cc   Sun Apr 26 16:48:44 2009 -0700
@@ -118,6 +118,8 @@
 }
 }
 m5reg.cpl = csAttr.dpl;
+m5reg.paging = cr0.pg;
+m5reg.prot = cr0.pe;
 regVal[MISCREG_M5_REG] = m5reg;
 }
 
diff -r 7a2dc7d41ee1 -r 5babc3f3d8c8 src/arch/x86/miscregs.hh
--- a/src/arch/x86/miscregs.hh  Sun Apr 26 16:47:48 2009 -0700
+++ b/src/arch/x86/miscregs.hh  Sun Apr 26 16:48:44 2009 -0700
@@ -518,6 +518,8 @@
 Bitfield0 mode;
 Bitfield3, 1 submode;
 Bitfield5, 4 cpl;
+Bitfield6 paging;
+Bitfield7 prot;
 EndBitUnion(HandyM5Reg)
 
 /**
diff -r 7a2dc7d41ee1 -r 5babc3f3d8c8 src/arch/x86/tlb.cc
--- a/src/arch/x86/tlb.cc   Sun Apr 26 16:47:48 2009 -0700
+++ b/src/arch/x86/tlb.cc   Sun Apr 26 16:48:44 2009 -0700
@@ -186,391 +186,393 @@
 }
 
 Fault
+TLB::translateInt(RequestPtr req, ThreadContext *tc)
+{
+DPRINTF(TLB, Addresses references internal memory.\n);
+Addr vaddr = req-getVaddr();
+Addr prefix = (vaddr  3)  IntAddrPrefixMask;
+if (prefix == IntAddrPrefixCPUID) {
+panic(CPUID memory space not yet implemented!\n);
+} else if (prefix == IntAddrPrefixMSR) {
+vaddr = vaddr  3;
+req-setMmapedIpr(true);
+Addr regNum = 0;
+switch (vaddr  ~IntAddrPrefixMask) {
+  case 0x10:
+regNum = MISCREG_TSC;
+break;
+  case 0x1B:
+regNum = MISCREG_APIC_BASE;
+break;
+  case 0xFE:
+regNum = MISCREG_MTRRCAP;
+break;
+  case 0x174:
+regNum = MISCREG_SYSENTER_CS;
+break;
+  case 0x175:
+regNum = MISCREG_SYSENTER_ESP;
+break;
+  case 0x176:
+regNum = MISCREG_SYSENTER_EIP;
+break;
+  case 0x179:
+regNum = MISCREG_MCG_CAP;
+break;
+  case 0x17A:
+regNum = MISCREG_MCG_STATUS;
+break;
+  case 0x17B:
+regNum = MISCREG_MCG_CTL;
+break;
+  case 0x1D9:
+regNum = MISCREG_DEBUG_CTL_MSR;
+break;
+  case 0x1DB:
+regNum = MISCREG_LAST_BRANCH_FROM_IP;
+break;
+  case 0x1DC:
+regNum = MISCREG_LAST_BRANCH_TO_IP;
+break;
+  case 0x1DD:
+regNum = MISCREG_LAST_EXCEPTION_FROM_IP;
+break;
+  case 0x1DE:
+regNum = MISCREG_LAST_EXCEPTION_TO_IP;
+break;
+  case 0x200:
+regNum = MISCREG_MTRR_PHYS_BASE_0;
+break;
+  case 0x201:
+regNum = MISCREG_MTRR_PHYS_MASK_0;
+break;
+  case 0x202:
+regNum = MISCREG_MTRR_PHYS_BASE_1;
+break;
+  case 0x203:
+regNum = MISCREG_MTRR_PHYS_MASK_1;
+break;
+  case 0x204:
+regNum = MISCREG_MTRR_PHYS_BASE_2;
+break;
+  case 0x205:
+regNum = MISCREG_MTRR_PHYS_MASK_2;
+break;
+  case 0x206:
+regNum = MISCREG_MTRR_PHYS_BASE_3;
+break;
+  case 0x207:
+regNum = MISCREG_MTRR_PHYS_MASK_3;
+break;
+  case 0x208:
+regNum = MISCREG_MTRR_PHYS_BASE_4;
+break;
+  case 0x209:
+regNum = MISCREG_MTRR_PHYS_MASK_4;
+break;
+  case 0x20A:
+regNum = MISCREG_MTRR_PHYS_BASE_5;
+break;
+  case 0x20B:
+regNum = MISCREG_MTRR_PHYS_MASK_5;
+break;
+  case 0x20C:
+regNum = MISCREG_MTRR_PHYS_BASE_6;
+break;
+  case 0x20D:
+regNum = MISCREG_MTRR_PHYS_MASK_6;
+break;
+  case 0x20E:
+regNum = MISCREG_MTRR_PHYS_BASE_7;
+break;
+  case 0x20F:
+regNum = MISCREG_MTRR_PHYS_MASK_7;
+break;
+  case 0x250:
+regNum = MISCREG_MTRR_FIX_64K_0;
+break;
+  case 0x258:
+regNum = MISCREG_MTRR_FIX_16K_8;
+break;
+  case 0x259:
+regNum = MISCREG_MTRR_FIX_16K_A;
+break;
+  case 0x268:
+regNum = MISCREG_MTRR_FIX_4K_C;
+

[m5-dev] changeset in m5: X86: Precompute the default and alternate addre...

2009-04-26 Thread Gabe Black
changeset af13ed3bea48 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=af13ed3bea48
description:
X86: Precompute the default and alternate address and operand size and 
the stack size.

diffstat:

5 files changed, 69 insertions(+), 64 deletions(-)
src/arch/x86/miscregfile.cc |   43 +++
src/arch/x86/miscregfile.hh |2 -
src/arch/x86/miscregs.hh|5 ++
src/arch/x86/predecoder.cc  |   78 +--
src/arch/x86/predecoder.hh  |5 ++

diffs (260 lines):

diff -r 5babc3f3d8c8 -r af13ed3bea48 src/arch/x86/miscregfile.cc
--- a/src/arch/x86/miscregfile.cc   Sun Apr 26 16:48:44 2009 -0700
+++ b/src/arch/x86/miscregfile.cc   Sun Apr 26 16:49:24 2009 -0700
@@ -97,7 +97,7 @@
 class Checkpoint;
 
 void MiscRegFile::updateHandyM5Reg(Efer efer, CR0 cr0,
-SegAttr csAttr, RFLAGS rflags)
+SegAttr csAttr, SegAttr ssAttr, RFLAGS rflags)
 {
 HandyM5Reg m5reg;
 if (efer.lma) {
@@ -120,6 +120,37 @@
 m5reg.cpl = csAttr.dpl;
 m5reg.paging = cr0.pg;
 m5reg.prot = cr0.pe;
+
+// Compute the default and alternate operand size.
+if (m5reg.submode == SixtyFourBitMode || csAttr.defaultSize) {
+m5reg.defOp = 2;
+m5reg.altOp = 1;
+} else {
+m5reg.defOp = 1;
+m5reg.altOp = 2;
+}
+
+// Compute the default and alternate address size.
+if (m5reg.submode == SixtyFourBitMode) {
+m5reg.defAddr = 3;
+m5reg.altAddr = 2;
+} else if (csAttr.defaultSize) {
+m5reg.defAddr = 2;
+m5reg.altAddr = 1;
+} else {
+m5reg.defAddr = 1;
+m5reg.altAddr = 2;
+}
+
+// Compute the stack size
+if (m5reg.submode == SixtyFourBitMode) {
+m5reg.stack = 3;
+} else if (ssAttr.defaultSize) {
+m5reg.stack = 2;
+} else {
+m5reg.stack = 1;
+}
+
 regVal[MISCREG_M5_REG] = m5reg;
 }
 
@@ -199,6 +230,7 @@
 updateHandyM5Reg(regVal[MISCREG_EFER],
  newCR0,
  regVal[MISCREG_CS_ATTR],
+ regVal[MISCREG_SS_ATTR],
  regVal[MISCREG_RFLAGS]);
 }
 break;
@@ -239,9 +271,17 @@
 updateHandyM5Reg(regVal[MISCREG_EFER],
  regVal[MISCREG_CR0],
  newCSAttr,
+ regVal[MISCREG_SS_ATTR],
  regVal[MISCREG_RFLAGS]);
 }
 break;
+  case MISCREG_SS_ATTR:
+updateHandyM5Reg(regVal[MISCREG_EFER],
+ regVal[MISCREG_CR0],
+ regVal[MISCREG_CS_ATTR],
+ val,
+ regVal[MISCREG_RFLAGS]);
+break;
   // These segments always actually use their bases, or in other words
   // their effective bases must stay equal to their actual bases.
   case MISCREG_FS_BASE:
@@ -346,6 +386,7 @@
 updateHandyM5Reg(regVal[MISCREG_EFER],
  regVal[MISCREG_CR0],
  regVal[MISCREG_CS_ATTR],
+ regVal[MISCREG_SS_ATTR],
  regVal[MISCREG_RFLAGS]);
 return;
   default:
diff -r 5babc3f3d8c8 -r af13ed3bea48 src/arch/x86/miscregfile.hh
--- a/src/arch/x86/miscregfile.hh   Sun Apr 26 16:48:44 2009 -0700
+++ b/src/arch/x86/miscregfile.hh   Sun Apr 26 16:49:24 2009 -0700
@@ -108,7 +108,7 @@
   protected:
 MiscReg regVal[NumMiscRegs];
 void updateHandyM5Reg(Efer efer, CR0 cr0,
-SegAttr csAttr, RFLAGS rflags);
+SegAttr csAttr, SegAttr ssAttr, RFLAGS rflags);
 
   public:
 void clear();
diff -r 5babc3f3d8c8 -r af13ed3bea48 src/arch/x86/miscregs.hh
--- a/src/arch/x86/miscregs.hh  Sun Apr 26 16:48:44 2009 -0700
+++ b/src/arch/x86/miscregs.hh  Sun Apr 26 16:49:24 2009 -0700
@@ -520,6 +520,11 @@
 Bitfield5, 4 cpl;
 Bitfield6 paging;
 Bitfield7 prot;
+Bitfield9, 8 defOp;
+Bitfield11, 10 altOp;
+Bitfield13, 12 defAddr;
+Bitfield15, 14 altAddr;
+Bitfield17, 16 stack;
 EndBitUnion(HandyM5Reg)
 
 /**
diff -r 5babc3f3d8c8 -r af13ed3bea48 src/arch/x86/predecoder.cc
--- a/src/arch/x86/predecoder.ccSun Apr 26 16:48:44 2009 -0700
+++ b/src/arch/x86/predecoder.ccSun Apr 26 16:49:24 2009 -0700
@@ -80,9 +80,9 @@
 
 emi.modRM = 0;
 emi.sib = 0;
-HandyM5Reg m5reg = tc-readMiscRegNoEffect(MISCREG_M5_REG);
-emi.mode.mode = m5reg.mode;
-emi.mode.submode = m5reg.submode;
+m5Reg = tc-readMiscRegNoEffect(MISCREG_M5_REG);
+emi.mode.mode = m5Reg.mode;
+emi.mode.submode = m5Reg.submode;
 }
 
 void Predecoder::process()
@@ -216,34 +216,15 @@
 DPRINTF(Predecoder, Found opcode %#x.\n, nextByte);
 emi.opcode.op = nextByte;
 
-