The "not fill data" path is really intended only for accesses to uncacheable memory. It's not designed to be used for cache bypass operations for coherent cacheable data.
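
To make the address mismatch discussed in this thread concrete, here is a minimal, self-contained C++ sketch. It is not gem5 code: blockAlign, blockOffset and extractTargetData below are hypothetical stand-ins, and the 64-byte block size used in the note is an assumption. It shows why a response to a block-sized request carries the block-aligned address while the original target keeps its unaligned one. It also suggests that relaxing only the assert would still leave the data copy starting at the first byte of the block rather than at the target's offset, which may be what later surfaces as the unmapped-address panic in the backtrace.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Addr = std::uint64_t;

// Round an address down to the start of its cache block.
// Hypothetical stand-in for a blockAlign() helper; blk_size must be a
// power of two.
inline Addr blockAlign(Addr addr, Addr blk_size)
{
    assert(blk_size != 0 && (blk_size & (blk_size - 1)) == 0);
    return addr & ~(blk_size - 1);
}

// Byte offset of an address within its cache block.
inline Addr blockOffset(Addr addr, Addr blk_size)
{
    return addr - blockAlign(addr, blk_size);
}

// Copy the bytes a target asked for out of a block-sized response.
// Assumption: resp_data holds the whole block that contains tgt_addr,
// so the wanted bytes start at the target's offset, not at byte 0.
inline std::vector<std::uint8_t>
extractTargetData(const std::vector<std::uint8_t> &resp_data,
                  Addr tgt_addr, std::size_t tgt_size)
{
    const Addr blk_size = resp_data.size();
    const Addr offset = blockOffset(tgt_addr, blk_size);
    assert(offset + tgt_size <= blk_size);

    const auto first =
        resp_data.begin() + static_cast<std::ptrdiff_t>(offset);
    return std::vector<std::uint8_t>(
        first, first + static_cast<std::ptrdiff_t>(tgt_size));
}
```

With 64-byte blocks, a target at 0x1234 belongs to the block at 0x1200 and sits at offset 0x34 within it, so an equality check between the two addresses fails for any target that is not itself block aligned.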

Steve

On Mon, Feb 8, 2016 at 11:40 AM Gongjin Sun <gongj...@uci.edu> wrote:

> I have to supplement an important thing:
> After I changed "assert(pkt->getAddr() == tgt_pkt->getAddr()); " into
> "assert(pkt->getAddr() == blockAlign(tgt_pkt->getAddr()));" to pass this
> assert check, I got a 'SIGABRT' signal (after a while of run) and my
> program exits:
>
> Program received signal SIGABRT, Aborted.
>
> I analysed this question but still feel it should not be related to my
> change because logically I didn't make some mistakes about my
> above-mentioned the address analysis of pkt and tgt_pkt. So I believe it is
> my change that exposes some other problem. So please help check that. Thank
> you so much.
>
> ------------------------------------------------------------------------------------------------
>
> Some of the back trace are:
>
> 0x00007ffff5e7d657 in __GI_raise (sig=sig@entry=6) at
>     ../sysdeps/unix/sysv/linux/raise.c:55
> 0x00007ffff5e7ea2a in __GI_abort () at abort.c:89
> 0x00000000010a9eb6 in __exit_epilogue (code=-1,
>     func=0x1da9bbe <X86ISA::PageFault::invoke(ThreadContext*,
>     RefCountingPtr<StaticInst> const&)::__FUNCTION__> "invoke",
>     file=0x1da99ad "build/X86/arch/x86/faults.cc", line=160, format=0x1da9a40
>     "Tried to %s unmapped address %#x.\n")
>     at build/X86/base/misc.cc:94
> 0x0000000000d42d8f in __exit_message<char const*, unsigned long>
>     (prefix=0x1da99ca "panic", code=-1,
>     func=0x1da9bbe <X86ISA::PageFault::invoke(ThreadContext*,
>     RefCountingPtr<StaticInst> const&)::__FUNCTION__> "invoke",
>     file=0x1da99ad "build/X86/arch/x86/faults.cc", line=160, format=0x1da9a40
>     "Tried to %s unmapped address %#x.\n")
>     at build/X86/base/misc.hh:81
> 0x0000000000d414fb in X86ISA::PageFault::invoke (this=0x3c29d70,
>     tc=0x31ab0e0, inst=...) at build/X86/arch/x86/faults.cc:160
> 0x0000000000dcd3f3 in BaseSimpleCPU::advancePC (this=0x3196340, fault=...)
>     at build/X86/cpu/simple/base.cc:532
> 0x0000000000dc5774 in TimingSimpleCPU::advanceInst (this=0x3196340,
>     fault=...) at build/X86/cpu/simple/timing.cc:578
> 0x0000000000dc40e7 in TimingSimpleCPU::translationFault (this=0x3196340,
>     fault=...) at build/X86/cpu/simple/timing.cc:331
> 0x0000000000dc4e90 in TimingSimpleCPU::finishTranslation (this=0x3196340,
>     state=0x3c2a040) at build/X86/cpu/simple/timing.cc:497
> 0x0000000000dc8f39 in DataTranslation<TimingSimpleCPU*>::finish
>     (this=0x3c29e80, fault=..., req=0x3c29cf0, tc=0x31ab0e0,
>     mode=BaseTLB::Read) at build/X86/cpu/translation.hh:244
> 0x0000000000d806ab in X86ISA::TLB::translateTiming (this=0x312f380,
>     req=0x3c29cf0, tc=0x31ab0e0, translation=0x3c29e80,
>     mode=BaseTLB::Read) at build/X86/arch/x86/tlb.cc:429
> 0x0000000000dc477d in TimingSimpleCPU::readMem (this=0x3196340, addr=16,
>     data=0x7fffffffd2c0 "", size=8, flags=4)
>     at build/X86/cpu/simple/timing.cc:409
> 0x0000000001d47c5c in X86ISA::readMemTiming<ExecContext> (xc=0x31964a0,
>     traceData=0x0, addr=16, mem=@0x7fffffffd2c0: 0, dataSize=8,
>     flags=4) at build/X86/arch/x86/memhelpers.hh:46
> 0x0000000001d387a5 in X86ISAInst::LdBig::initiateAcc (this=0x3c29f70,
>     xc=0x31964a0, traceData=0x0)
>     at build/X86/arch/x86/generated/exec-ns.cc.inc:19231
> 0x0000000000dc5a86 in TimingSimpleCPU::completeIfetch (this=0x3196340,
>     pkt=0x3c29df0) at build/X86/cpu/simple/timing.cc:619
> 0x0000000000dc5e89 in TimingSimpleCPU::IcachePort::ITickEvent::process
>     (this=0x3196708) at build/X86/cpu/simple/timing.cc:666
> 0x0000000000ebf808 in EventQueue::serviceOne (this=0x2fc1cf0) at
>     build/X86/sim/eventq.cc:221
>
> ...
>
> Best
>
> On Mon, Feb 8, 2016 at 11:29 AM, Gongjin Sun <gongj...@uci.edu> wrote:
>
>> Really thank you Steve, next I'll read the comment and related code
>> again, and hope can understand more about the working mechanism of
>> multi-level coherence.
>>
>> By the way, I found a possible bug again, please help verify it.
>> (I use se.py)
>>
>> ----------------------------------Cache::recvTimingResp----------------------------
>>
>>     } else {
>>         // not a cache fill, just forwarding response
>>         // responseLatency is the latency of the return path
>>         // from lower level cahces/memory to the core.
>>         completion_time += clockEdge(responseLatency) + pkt->payloadDe
>>         if (pkt->isRead() && !is_error) {
>>             // sanity check
>>             assert(pkt->getAddr() == tgt_pkt->getAddr());
>>             assert(pkt->getSize() >= tgt_pkt->getSize());
>>
>>             tgt_pkt->setData(pkt->getConstPtr<uint8_t>());
>>
>> ------------------------------------------------------------------------------------
>>
>> The problematic line is :
>>     assert(pkt->getAddr() == tgt_pkt->getAddr());
>>
>> In the beginning I didn't get this assert failure because all my
>> application can't enter this "else" code. Usually gem5's default behavior
>> is "fill" (that is, isFill is True).
>> But when I modified some code and asked gem5 to not fill data in some
>> cache level(for example, when a ReadReq misses in L1 but hits in L2, the
>> returned data will be sent to CPU directly and not filled in L1), it will
>> enter this code branch. Now I got this assert failure. After I debugged the
>> process of req and resp, I found the following fact(I have a three cache
>> level hierarchy):
>>
>> When cpu's ReadReq arrives at L1 but misses, it will be allocated in L1's
>> MSHR. In mshr entry, BLOCK address will be used. However, mshr's
>> target's member 'pkt' will NOT be a block aligned address necessarily.
>> The key thing is that then when we call getBusPacket(), we generate a new
>> request packet with a block aligned address:
>>
>> ------------------------------getBusPacket()------------------------------
>>     PacketPtr pkt = new Packet(cpu_pkt->req, cmd, blkSize);
>>
>>     // the packet should be block aligned
>>     assert(pkt->getAddr() == blockAlign(pkt->getAddr()));
>>
>> --------------------------------------------------------------------------------
>>
>> See, that's why when the returned resp packet arrives at L1, its
>> address(pkt->getAddr()) can't be equal to the target's packet's address
>> ('tgt_pkt->getAddr()').
>>
>> The is just my observed things. So please help verify it. Thank you.
>>
>> Another thing, I noticed that gem5's access to cache use a physical
>> address. Why doesn't it use a virtual one? As I remember Virtual Index
>> Physical Tagged (VIPT) seems to be a common implementation. If I want to
>> observer the cache's access behavior by virtual address, how can I change
>> the configuration? I didn't find the method. (I noticed the class 'Request'
>> has four constructors, but only one is related to virtual address. I didn't
>> see too much use for this constructor with a virtual address.)
>>
>> Best regards
>> gjins
>>
>> On Mon, Feb 8, 2016 at 10:53 AM, Steve Reinhardt <ste...@gmail.com>
>> wrote:
>>
>>> The O state normally is not a writable state (that's what differentiates
>>> it from M). The description in the wikipedia article is not very good; I
>>> suggest reading about MOESI from a textbook or some other source you may
>>> have access to.
>>>
>>> The gem5 protocol is a little unusual in how it handles states across
>>> different levels in a multi-level hierarchy, but that's covered in the
>>> comment I pointed you at previously.
>>>
>>> Steve
>>>
>>> On Sun, Feb 7, 2016 at 11:14 PM Gongjin Sun <gongj...@uci.edu> wrote:
>>>
>>>> Sorry, there is a typo: "on onwer exists" should be "no owner exists".
>>>>
>>>> I think more, and still can't understand why 'O' state has a "dirty"
>>>> set but can't be "writable". This owner has made changes to this line, but
>>>> is not "writable". That sounds like a contradiction. Or did I miss
>>>> something?
>>>>
>>>> Thanks
>>>>
>>>> On Sun, Feb 7, 2016 at 11:03 PM, Gongjin Sun <gongj...@uci.edu> wrote:
>>>>
>>>>> Thank you, Steve. But I'm still a little confused.
>>>>>
>>>>> For the "A write hit implies that a cache has an exclusive copy". If
>>>>> a miss happens at all cache levels, gem5 will bring this data line from
>>>>> memory to L3 to L2 to L1, level by level. Now this line has three copies
>>>>> and its state should be shared (clean). Next if a demand write request
>>>>> arrives at L1, it will hit. So now how can we handle the copies in L2 and
>>>>> L3? We can invalidate them, or propagate this line from L1 to l2 and l3
>>>>> and make its state become shared(dirty) ??
>>>>>
>>>>> Also after I read the comments in CacheBlk::print(), I think gem5's
>>>>> MOESI looks like not a standard one compared with the MOESI from
>>>>> wikipedia:
>>>>> https://en.wikipedia.org/wiki/MOESI_protocol
>>>>>
>>>>> gem5's MOESI is:
>>>>>
>>>>> state   writable   dirty   valid
>>>>> M       1          1       1
>>>>> O       0          1       1
>>>>> E       1          0       1
>>>>> S       0          0       1
>>>>> I       0          0       0
>>>>>
>>>>> For a shared block, according to the explanation of wikipedia, they
>>>>> can be "dirty" (Here the 'dirty" is with respect to memory), We
>>>>> probably have several modified copies. But gem5 think they are all
>>>>> clean and can't be written. Does this mean on onwer exists for shared
>>>>> blocks? . In addition, why can't a Owned block be "writable"? It's a
>>>>> owner, right?
>>>>>
>>>>> I'm so confused. Hope you can help me more. Thank you so much.
>>>>>
>>>>> gjins
>>>>>
>>>>> On Sun, Feb 7, 2016 at 10:28 PM, Steve Reinhardt <ste...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Upgrade requests are used on a write to a shared copy, to upgrade
>>>>>> that copy's state from shared (read-only) to writable. They're generally
>>>>>> treated as invalidations.
>>>>>>
>>>>>> A write hit implies that a cache has an exclusive copy, so it knows
>>>>>> that there's no need to send invalidations to lower levels. There are
>>>>>> some relevant comments on the block states in the CacheBlk::print()
>>>>>> method definition in src/mem/cache/blk.hh.
>>>>>>
>>>>>> Steve
>>>>>>
>>>>>> On Sun, Feb 7, 2016 at 4:04 PM Gongjin Sun <gongj...@uci.edu> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Does any know the function of the request called "UpgradeReq"?
>>>>>>> Under what circumstance will this request be generated? After this
>>>>>>> request is sent to other cache levels, what will happen to that level?
>>>>>>> There are so few comments about it. Accord to its use, I guess it is
>>>>>>> related to write miss. But I'm not sure about the specific functions.
>>>>>>>
>>>>>>> In addition, I noticed that when a "write hit" happens in a cache
>>>>>>> level, this cache will NOT send an invalidate message to its lower
>>>>>>> levels (closer to mem) to invalidate this line's other copies. Is that
>>>>>>> correct?
>>>>>>> (Note: now this cache's upper level (closer to cpu) definitely doesn't
>>>>>>> contain this line, otherwise there must a write hit in that upper level
>>>>>>> rather than this cache level.)
>>>>>>>
>>>>>>> Thank you in advance
>>>>>>>
>>>>>>> Best
>>>>>>> gjins
>>>>>>> _______________________________________________
>>>>>>> gem5-users mailing list
>>>>>>> gem5-users@gem5.org
>>>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
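
The writable/dirty/valid table quoted earlier in this thread, together with the explanation that an upgrade request is used on a write to a shared (read-only) copy, can be sketched in a few lines of C++. This is only an illustration under stated assumptions: moesiState and writeNeedsUpgrade are hypothetical names, not gem5 functions, and the rule that an invalid block has no other bits set is an assumption taken from the single I row of the table.

```cpp
#include <cassert>

// Map the three status bits from the quoted table to a MOESI letter.
// Hypothetical helper; the bit names follow the table's column headers.
inline char moesiState(bool writable, bool dirty, bool valid)
{
    if (!valid) {
        // Assumption: the table lists only 0/0/0 for I, so an invalid
        // block is not expected to carry the other bits.
        assert(!writable && !dirty);
        return 'I';
    }
    if (writable)
        return dirty ? 'M' : 'E';   // exclusive copy, modified or clean
    return dirty ? 'O' : 'S';       // read-only copy, owned or shared
}

// A write to a valid block that is not writable (O or S) cannot complete
// locally; per the thread, that is the case an upgrade request covers.
inline bool writeNeedsUpgrade(bool writable, bool valid)
{
    return valid && !writable;
}
```

Read this way, O differs from M only in the writable bit: the block holds modified data but the cache must first obtain write permission before changing it again.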