Thanks Mitch for the quick reply.

While assuming stores are only sent after commit is true for the current O3 
model, aggressive out-of-order processors send store addresses to the memory 
system as soon as they are available (i.e. speculatively).  We actually have a 
patch that provides such a capability, but I'm having a tough time figuring out 
how to merge it with your change.  Any suggestions you may have would be very 
much appreciated.

Thanks,

Brad



-----Original Message-----
From: gem5-dev [mailto:gem5-dev-boun...@gem5.org] On Behalf Of Mitch Hayenga 
via gem5-dev
Sent: Friday, January 30, 2015 9:34 AM
To: gem5 Developer List
Cc: gem5-...@m5sim.org
Subject: Re: [gem5-dev] changeset in gem5: cpu: Fix cache blocked load behavior 
in o3 cpu

Hi,

Stores should be fine since they are only sent to the memory system after
commit.   The relevant functions to look at are
sendStore, recvRetry, and writebackStores in lsq_unit_impl.hh.

Basically, if a store gets blocked the core just waits until it gets a retry.  
Since stores are sent in-order from the SQ to the memory system, that queue 
just waits.  The stores are never removed from the SQ unless they succeed.

Loads were special in that they were effectively removed from the scheduler, 
even if they might fail.  Stores however always maintain their entries/order 
until they succeed.





On Thu, Jan 29, 2015 at 6:01 PM, Beckmann, Brad via gem5-dev < 
gem5-dev@gem5.org> wrote:

> Hi Mitch,
>
> Quick question regarding this patch.  Does this patch also handle 
> replaying stores once the cache becomes unblocked?  The changes and 
> comments appear to only handle loads, but it seems like stores could 
> have the same problem.
>
> Thanks,
>
> Brad
>
>
>
> -----Original Message-----
> From: gem5-dev [mailto:gem5-dev-boun...@gem5.org] On Behalf Of Mitch 
> Hayenga via gem5-dev
> Sent: Wednesday, September 03, 2014 4:38 AM
> To: gem5-...@m5sim.org
> Subject: [gem5-dev] changeset in gem5: cpu: Fix cache blocked load 
> behavior in o3 cpu
>
> changeset 6be8945d226b in /z/repo/gem5
> details: http://repo.gem5.org/gem5?cmd=changeset;node=6be8945d226b
> description:
>         cpu: Fix cache blocked load behavior in o3 cpu
>
>         This patch fixes the load blocked/replay mechanism in the o3 cpu.
> Rather than
>         flushing the entire pipeline, this patch replays loads once 
> the cache becomes
>         unblocked.
>
>         Additionally, deferred memory instructions (loads which had 
> conflicting stores),
>         when replayed would not respect the number of functional units 
> (only respected
>         issue width).  This patch also corrects that.
>
>         Improvements over 20% have been observed on a microbenchmark 
> designed to
>         exercise this behavior.
>
> diffstat:
>
>  src/cpu/o3/iew.hh               |   13 +-
>  src/cpu/o3/iew_impl.hh          |   57 ++--------
>  src/cpu/o3/inst_queue.hh        |   25 ++++-
>  src/cpu/o3/inst_queue_impl.hh   |   68 ++++++++++---
>  src/cpu/o3/lsq.hh               |   27 +-----
>  src/cpu/o3/lsq_impl.hh          |   23 +---
>  src/cpu/o3/lsq_unit.hh          |  198
> ++++++++++++++++-----------------------
>  src/cpu/o3/lsq_unit_impl.hh     |   40 ++-----
>  src/cpu/o3/mem_dep_unit.hh      |    4 +-
>  src/cpu/o3/mem_dep_unit_impl.hh |    4 +-
>  10 files changed, 203 insertions(+), 256 deletions(-)
>
> diffs (truncated from 846 to 300 lines):
>
> diff -r 1ba825974ee6 -r 6be8945d226b src/cpu/o3/iew.hh
> --- a/src/cpu/o3/iew.hh Wed Sep 03 07:42:38 2014 -0400
> +++ b/src/cpu/o3/iew.hh Wed Sep 03 07:42:39 2014 -0400
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright (c) 2010-2012 ARM Limited
> + * Copyright (c) 2010-2012, 2014 ARM Limited
>   * All rights reserved
>   *
>   * The license below extends only to copyright in the software and 
> shall @@ -181,6 +181,12 @@
>      /** Re-executes all rescheduled memory instructions. */
>      void replayMemInst(DynInstPtr &inst);
>
> +    /** Moves memory instruction onto the list of cache blocked
> instructions */
> +    void blockMemInst(DynInstPtr &inst);
> +
> +    /** Notifies that the cache has become unblocked */
> +    void cacheUnblocked();
> +
>      /** Sends an instruction to commit through the time buffer. */
>      void instToCommit(DynInstPtr &inst);
>
> @@ -233,11 +239,6 @@
>       */
>      void squashDueToMemOrder(DynInstPtr &inst, ThreadID tid);
>
> -    /** Sends commit proper information for a squash due to memory
> becoming
> -     * blocked (younger issued instructions must be retried).
> -     */
> -    void squashDueToMemBlocked(DynInstPtr &inst, ThreadID tid);
> -
>      /** Sets Dispatch to blocked, and signals back to other stages to 
> block. */
>      void block(ThreadID tid);
>
> diff -r 1ba825974ee6 -r 6be8945d226b src/cpu/o3/iew_impl.hh
> --- a/src/cpu/o3/iew_impl.hh    Wed Sep 03 07:42:38 2014 -0400
> +++ b/src/cpu/o3/iew_impl.hh    Wed Sep 03 07:42:39 2014 -0400
> @@ -530,29 +530,6 @@
>
>  template<class Impl>
>  void
> -DefaultIEW<Impl>::squashDueToMemBlocked(DynInstPtr &inst, ThreadID tid) -{
> -    DPRINTF(IEW, "[tid:%i]: Memory blocked, squashing load and younger
> insts, "
> -            "PC: %s [sn:%i].\n", tid, inst->pcState(), inst->seqNum);
> -    if (!toCommit->squash[tid] ||
> -            inst->seqNum < toCommit->squashedSeqNum[tid]) {
> -        toCommit->squash[tid] = true;
> -
> -        toCommit->squashedSeqNum[tid] = inst->seqNum;
> -        toCommit->pc[tid] = inst->pcState();
> -        toCommit->mispredictInst[tid] = NULL;
> -
> -        // Must include the broadcasted SN in the squash.
> -        toCommit->includeSquashInst[tid] = true;
> -
> -        ldstQueue.setLoadBlockedHandled(tid);
> -
> -        wroteToTimeBuffer = true;
> -    }
> -}
> -
> -template<class Impl>
> -void
>  DefaultIEW<Impl>::block(ThreadID tid)  {
>      DPRINTF(IEW, "[tid:%u]: Blocking.\n", tid); @@ -610,6 +587,20 @@
>
>  template<class Impl>
>  void
> +DefaultIEW<Impl>::blockMemInst(DynInstPtr& inst) {
> +    instQueue.blockMemInst(inst);
> +}
> +
> +template<class Impl>
> +void
> +DefaultIEW<Impl>::cacheUnblocked()
> +{
> +    instQueue.cacheUnblocked();
> +}
> +
> +template<class Impl>
> +void
>  DefaultIEW<Impl>::instToCommit(DynInstPtr &inst)  {
>      // This function should not be called after writebackInsts in a 
> @@
> -1376,15 +1367,6 @@
>                  squashDueToMemOrder(violator, tid);
>
>                  ++memOrderViolationEvents;
> -            } else if (ldstQueue.loadBlocked(tid) &&
> -                       !ldstQueue.isLoadBlockedHandled(tid)) {
> -                fetchRedirect[tid] = true;
> -
> -                DPRINTF(IEW, "Load operation couldn't execute because the
> "
> -                        "memory system is blocked.  PC: %s [sn:%lli]\n",
> -                        inst->pcState(), inst->seqNum);
> -
> -                squashDueToMemBlocked(inst, tid);
>              }
>          } else {
>              // Reset any state associated with redirects that will 
> not @@
> -1403,17 +1385,6 @@
>
>                  ++memOrderViolationEvents;
>              }
> -            if (ldstQueue.loadBlocked(tid) &&
> -                !ldstQueue.isLoadBlockedHandled(tid)) {
> -                DPRINTF(IEW, "Load operation couldn't execute because the
> "
> -                        "memory system is blocked.  PC: %s [sn:%lli]\n",
> -                        inst->pcState(), inst->seqNum);
> -                DPRINTF(IEW, "Blocked load will not be handled because "
> -                        "already squashing\n");
> -
> -                ldstQueue.setLoadBlockedHandled(tid);
> -            }
> -
>          }
>      }
>
> diff -r 1ba825974ee6 -r 6be8945d226b src/cpu/o3/inst_queue.hh
> --- a/src/cpu/o3/inst_queue.hh  Wed Sep 03 07:42:38 2014 -0400
> +++ b/src/cpu/o3/inst_queue.hh  Wed Sep 03 07:42:39 2014 -0400
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright (c) 2011-2012 ARM Limited
> + * Copyright (c) 2011-2012, 2014 ARM Limited
>   * Copyright (c) 2013 Advanced Micro Devices, Inc.
>   * All rights reserved.
>   *
> @@ -188,11 +188,16 @@
>       */
>      DynInstPtr getInstToExecute();
>
> -    /** Returns a memory instruction that was referred due to a delayed
> DTB
> -     *  translation if it is now ready to execute.
> +    /** Gets a memory instruction that was referred due to a delayed DTB
> +     *  translation if it is now ready to execute.  NULL if none
> available.
>       */
>      DynInstPtr getDeferredMemInstToExecute();
>
> +    /** Gets a memory instruction that was blocked on the cache. NULL 
> + if
> none
> +     *  available.
> +     */
> +    DynInstPtr getBlockedMemInstToExecute();
> +
>      /**
>       * Records the instruction as the producer of a register without
>       * adding it to the rest of the IQ.
> @@ -242,6 +247,12 @@
>       */
>      void deferMemInst(DynInstPtr &deferred_inst);
>
> +    /**  Defers a memory instruction when it is cache blocked. */
> +    void blockMemInst(DynInstPtr &blocked_inst);
> +
> +    /**  Notify instruction queue that a previous blockage has resolved */
> +    void cacheUnblocked();
> +
>      /** Indicates an ordering violation between a store and a load. */
>      void violation(DynInstPtr &store, DynInstPtr &faulting_load);
>
> @@ -308,6 +319,14 @@
>       */
>      std::list<DynInstPtr> deferredMemInsts;
>
> +    /** List of instructions that have been cache blocked. */
> +    std::list<DynInstPtr> blockedMemInsts;
> +
> +    /** List of instructions that were cache blocked, but a retry has
> been seen
> +     * since, so they can now be retried. May fail again go on the
> blocked list.
> +     */
> +    std::list<DynInstPtr> retryMemInsts;
> +
>      /**
>       * Struct for comparing entries to be added to the priority queue.
>       * This gives reverse ordering to the instructions in terms of 
> diff -r 1ba825974ee6 -r 6be8945d226b src/cpu/o3/inst_queue_impl.hh
> --- a/src/cpu/o3/inst_queue_impl.hh     Wed Sep 03 07:42:38 2014 -0400
> +++ b/src/cpu/o3/inst_queue_impl.hh     Wed Sep 03 07:42:39 2014 -0400
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright (c) 2011-2013 ARM Limited
> + * Copyright (c) 2011-2014 ARM Limited
>   * Copyright (c) 2013 Advanced Micro Devices, Inc.
>   * All rights reserved.
>   *
> @@ -413,6 +413,8 @@
>      nonSpecInsts.clear();
>      listOrder.clear();
>      deferredMemInsts.clear();
> +    blockedMemInsts.clear();
> +    retryMemInsts.clear();
>  }
>
>  template <class Impl>
> @@ -734,13 +736,14 @@
>
>      IssueStruct *i2e_info = issueToExecuteQueue->access(0);
>
> -    DynInstPtr deferred_mem_inst;
> -    int total_deferred_mem_issued = 0;
> -    while (total_deferred_mem_issued < totalWidth &&
> -           (deferred_mem_inst = getDeferredMemInstToExecute()) != 0) {
> -        issueToExecuteQueue->access(0)->size++;
> -        instsToExecute.push_back(deferred_mem_inst);
> -        total_deferred_mem_issued++;
> +    DynInstPtr mem_inst;
> +    while (mem_inst = getDeferredMemInstToExecute()) {
> +        addReadyMemInst(mem_inst);
> +    }
> +
> +    // See if any cache blocked instructions are able to be executed
> +    while (mem_inst = getBlockedMemInstToExecute()) {
> +        addReadyMemInst(mem_inst);
>      }
>
>      // Have iterator to head of the list @@ -751,12 +754,11 @@
>      // Increment the iterator.
>      // This will avoid trying to schedule a certain op class if there 
> are no
>      // FUs that handle it.
> +    int total_issued = 0;
>      ListOrderIt order_it = listOrder.begin();
>      ListOrderIt order_end_it = listOrder.end();
> -    int total_issued = 0;
>
> -    while (total_issued < (totalWidth - total_deferred_mem_issued) &&
> -           order_it != order_end_it) {
> +    while (total_issued < totalWidth && order_it != order_end_it) {
>          OpClass op_class = (*order_it).queueType;
>
>          assert(!readyInsts[op_class].empty());
> @@ -874,7 +876,7 @@
>      // @todo If the way deferred memory instructions are handeled due to
>      // translation changes then the deferredMemInsts condition should 
> be removed
>      // from the code below.
> -    if (total_issued || total_deferred_mem_issued ||
> deferredMemInsts.size()) {
> +    if (total_issued || !retryMemInsts.empty() ||
> + !deferredMemInsts.empty()) {
>          cpu->activityThisCycle();
>      } else {
>          DPRINTF(IQ, "Not able to schedule any instructions.\n"); @@
> -1050,7 +1052,7 @@  void  
> InstructionQueue<Impl>::replayMemInst(DynInstPtr
> &replay_inst)  {
> -    memDepUnit[replay_inst->threadNumber].replay(replay_inst);
> +    memDepUnit[replay_inst->threadNumber].replay();
>  }
>
>  template <class Impl>
> @@ -1078,18 +1080,52 @@
>  }
>
>  template <class Impl>
> +void
> +InstructionQueue<Impl>::blockMemInst(DynInstPtr &blocked_inst) {
> +    blocked_inst->translationStarted(false);
> +    blocked_inst->translationCompleted(false);
> +
> +    blocked_inst->clearIssued();
> +    blocked_inst->clearCanIssue();
> +    blockedMemInsts.push_back(blocked_inst);
> +}
> +
> +template <class Impl>
> +void
> +InstructionQueue<Impl>::cacheUnblocked()
> +{
> +    retryMemInsts.splice(retryMemInsts.end(), blockedMemInsts);
> +    // Get the CPU ticking again
> +    cpu->wakeCPU();
> +}
> +
> +template <class Impl>
>  typename Impl::DynInstPtr
>  InstructionQueue<Impl>::getDeferredMemInstToExecute()
>  {
>      for (ListIt it = deferredMemInsts.begin(); it != 
> deferredMemInsts.end();
>           ++it) {
>          if ((*it)->translationCompleted() || (*it)->isSquashed()) {
> -            DynInstPtr ret = *it;
> +            DynInstPtr mem_inst = *it;
>              deferredMemInsts.erase(it);
> -            return ret;
> +            return mem_inst;
>          }
>      }
> -    return NULL;
> +    return nullptr;
> +}
> +
> +template <class Impl>
> +typename Impl::DynInstPtr
> +InstructionQueue<Impl>::getBlockedMemInstToExecute()
> +{
> _______________________________________________
> gem5-dev mailing list
> gem5-dev@gem5.org
> http://m5sim.org/mailman/listinfo/gem5-dev
> _______________________________________________
> gem5-dev mailing list
> gem5-dev@gem5.org
> http://m5sim.org/mailman/listinfo/gem5-dev
>
_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev
_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev

Reply via email to