[gem5-users] Re: [gem5-dev] Cache Compressor Architecture
(Moving this discussion to the users mailing list, as it is better suited there)

Hello, Patrick,

DictionaryCompressor::CompData contains the patterns (in general, any compressor's "CompData" structure contains the compressed data, not the original data; there are some exceptions). Due to the deterministic nature of cache compressors, the patterns must always be able to recreate the original cache line through a call to the compressor's decompress(const CompressionData* comp_data, uint64_t* data). We do not actively use the decompression step during regular simulation because it would be extremely costly, but it should be fairly straightforward to generate the original cache line from the compressed data. You can see how to call the decompress function by following an approach similar to the one used in Compressor::Base::compress(const uint64_t* data, Cycles& comp_lat, Cycles& decomp_lat), in src/mem/cache/compressors/base.cc.

Regards,
Daniel

___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org
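[Editor's note] The deterministic compress/decompress roundtrip described above can be pictured with a toy dictionary compressor. All names here are simplified stand-ins, not gem5's actual classes: the "CompData" holds only patterns, and decompression rebuilds the original line from them.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Toy "CompData": each pattern is {isMatch, payload}. A match stores a
// dictionary index; a miss stores the literal 64-bit word.
struct ToyCompData {
    std::vector<std::pair<bool, uint64_t>> patterns;
};

ToyCompData toyCompress(const std::vector<uint64_t> &line)
{
    ToyCompData comp;
    std::vector<uint64_t> dict;
    for (uint64_t word : line) {
        bool matched = false;
        for (uint64_t i = 0; i < dict.size(); ++i) {
            if (dict[i] == word) {
                comp.patterns.emplace_back(true, i); // store index only
                matched = true;
                break;
            }
        }
        if (!matched) {
            comp.patterns.emplace_back(false, word); // store literal
            dict.push_back(word);
        }
    }
    return comp;
}

std::vector<uint64_t> toyDecompress(const ToyCompData &comp)
{
    std::vector<uint64_t> line;
    std::vector<uint64_t> dict;
    for (const auto &p : comp.patterns) {
        uint64_t word = p.first ? dict[p.second] : p.second;
        if (!p.first)
            dict.push_back(word); // rebuild the dictionary identically
        line.push_back(word);
    }
    return line;
}
```

Because both sides rebuild the dictionary in the same order, decompression always recovers the original line exactly, which is the property the reply relies on.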
[gem5-users] Re: Prefetch Accuracy
Hello, Deepika,

Prefetches are currently only considered useful when they are not late. To count late prefetches as useful too, you will have to modify serviceMSHRTargets() so that, in the MSHR::Target::FromCPU case, the prefetcher is notified when the MSHR of a blk->wasPrefetched() block is serviced.

By default, hits are not notified to the prefetcher; therefore, usefulPrefetches is not updated. To update the number of useful prefetches you have to set prefetch_on_access=True in the cache. This does not seem correct, so I've uploaded a proposed fix for review: https://gem5-review.googlesource.com/c/public/gem5/+/38177/1

Minor note: if you are using the develop branch, or have applied commit https://gem5-review.googlesource.com/c/public/gem5/+/35699 locally, it must be fixed with the following patch: https://gem5-review.googlesource.com/c/public/gem5/+/38176

Regards,
Daniel

I want to calculate the prefetch accuracy for the stride prefetcher. I saw that there are two variables, usefulPrefetches and issuedPrefetches, in the base prefetcher. These are used in the queued prefetcher class to calculate prefetch accuracy. I tried using them for the stride prefetcher as well, but the value of usefulPrefetches is always 0. Is this a bug, or am I misunderstanding something?

Best Regards,
Deepika
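[Editor's note] The accuracy metric the question refers to is simply the ratio of the two counters. A minimal sketch of that arithmetic (not gem5 code; the guard for zero issued prefetches is an assumption for robustness):

```cpp
#include <cassert>
#include <cstdint>

// accuracy = usefulPrefetches / issuedPrefetches
double prefetchAccuracy(uint64_t usefulPrefetches, uint64_t issuedPrefetches)
{
    if (issuedPrefetches == 0)
        return 0.0; // no prefetches issued yet, avoid division by zero
    return static_cast<double>(usefulPrefetches) / issuedPrefetches;
}
```

If hits on prefetched blocks are never reported (as discussed above), usefulPrefetches stays 0 and this ratio is always 0, matching the observed behavior.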
[gem5-users] Re: Adding new replacement policy error
Hello, John,

A few questions:
- Did you add the respective SHIPRP class in ReplacementPolicies.py?
- Are you making sure the namespace is properly applied in the Python declaration (something like cxx_class='ReplacementPolicy::SHiP')?
- When you downloaded the patches for DRRIP, did you cherry-pick? If not, then you have probably also applied the patch https://gem5-review.googlesource.com/c/public/gem5/+/35938 , in which case you need to declare the constructor as "const Params &p".

Bonus: long ago I also implemented SHiP-PC and SHiP-Mem. I *think* I was confident in their implementation, but I will have to double-check later. It would be great if you could take a look and give some feedback: https://gem5-review.googlesource.com/c/public/gem5/+/38118

Regards,
Daniel

On Saturday, November 28, 2020, 00:11:28 GMT+1, John H via gem5-users wrote:

Hello,

I tried creating separate .cc and .hh files implementing a new replacement policy and added them to the SCons file within replacement_policies. However, when I build gem5, I get the following error message:

build/X86/mem/cache/replacement_policies/ship_rp.cc:13:22: error: 'Params' does not name a type
 SHIPRP::SHIPRP(const Params *p)
                      ^
build/X86/mem/cache/replacement_policies/ship_rp.cc:13:30: error: ISO C++ forbids declaration of 'p' with no type [-fpermissive]
 SHIPRP::SHIPRP(const Params *p)
                              ^
build/X86/mem/cache/replacement_policies/ship_rp.cc:13:31: error: invalid use of incomplete type 'class SHIPRP'
 SHIPRP::SHIPRP(const Params *p)
                               ^
In file included from build/X86/mem/cache/replacement_policies/ship_rp.cc:8:0:
build/X86/params/SHIPRP.hh:4:7: error: forward declaration of 'class SHIPRP'
 class SHIPRP;
       ^
build/X86/mem/cache/replacement_policies/ship_rp.cc:16:1: error: expected unqualified-id before '{' token
 {
 ^
scons: *** [build/X86/mem/cache/replacement_policies/ship_rp.o] Error 1
scons: building terminated because of errors.

Can you please guide me on how to fix this?
Thanks,
John
[gem5-users] Re: DRRIP implementation
Hello, John,

I sent you the link to the last patch in the series. On the right there is a list (called "Relation Chain") containing the links to the 4 patches. I will copy and paste the links here, just in case:

1st: https://gem5-review.googlesource.com/c/public/gem5/+/37895/1
2nd: https://gem5-review.googlesource.com/c/public/gem5/+/37896/1
3rd: https://gem5-review.googlesource.com/c/public/gem5/+/37897/1
4th: https://gem5-review.googlesource.com/c/public/gem5/+/37898/1

To download them, go to the three vertical dots in the top right corner. Click on "Download Patch" (NOT "Cherry Pick"), and a new window will pop up that lets you select how to download. If downloading individually, copy the cherry-pick command and run it locally. If you want to download the whole patch chain at once, go to the 4th patch in the series and select "Checkout". Ideally you'd want to cherry-pick, because then you'd keep using the stable gem5 version; however, when cherry-picking you will need to cherry-pick other patches for this to work (I think picking https://gem5-review.googlesource.com/c/public/gem5/+/37135 would be enough).

Regarding SRRIP, it is implemented as RRIPRP().

Regards,
Daniel

On Saturday, November 21, 2020, 22:40:48 GMT+1, John H wrote:

Hi,

Thanks for the reply. I don't know how to access the entire code you mentioned in the email; I can only see a 2-line change to ReplacementPolicies.py in the patch. Can you please provide me with the entire patch file (all the changes required for DRRIP in your implementation), if that's ok with you? That would really help me understand it. Also, is SRRIP implemented in the new gem5?

Thanks,
John

On Sat, Nov 21, 2020 at 10:27 AM Daniel Carvalho wrote:

Hello, John,

I have uploaded for review some patches to make DRRIP work: https://gem5-review.googlesource.com/c/public/gem5/+/37898
I believe the code is well documented enough to help you understand how it works.
To use DRRIPRP you must set the constituency size and the number of entries per team per constituency (team size) on instantiation. As an example, if:
- table size = 32KB
- associativity = 4

and we want:
- 32 dedicated sets per team

then we'd instantiate DRRIPRP(constituency_size=1024, team_size=4).

Since, IIRC, there is no analysis in the paper describing why complements specifically are picked, instead of implementing DIP's complement-select policy I implemented a "consecutive-select" policy (i.e., pick the first team_size entries of the set as samples for team 0, and the next team_size entries of the set as samples for team 1). If you really want the complement-select policy, you can implement it easily by modifying DuelingMonitor::initEntry.

Regards,
Daniel

On Saturday, November 21, 2020, 03:46:27 GMT+1, John H via gem5-users wrote:

Hello,

I am new to gem5, just getting started. I wanted to implement the DRRIP cache replacement policy. Has any one of you tried implementing it? Any pointers would be helpful.

Thanks,
John
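[Editor's note] The arithmetic behind the example above can be checked with a short sketch. Two assumptions are made here that the example only implies: "table size = 32KB" means 32 * 1024 entries, and a team's sample entries fill whole sets (team_size equals the associativity). The helper name is hypothetical, not a gem5 function:

```cpp
#include <cassert>

// Derives how many dedicated sets each team gets from DRRIPRP-style
// parameters (toy helper, not part of gem5).
unsigned dedicatedSetsPerTeam(unsigned numEntries, unsigned assoc,
                              unsigned constituencySize, unsigned teamSize)
{
    // One constituency per constituencySize entries.
    unsigned numConstituencies = numEntries / constituencySize;
    // Each constituency dedicates teamSize entries per team, i.e.
    // teamSize / assoc full sets per team.
    return numConstituencies * teamSize / assoc;
}
```

With 32 * 1024 entries, associativity 4, constituency_size=1024 and team_size=4, this yields the 32 dedicated sets per team quoted above.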
[gem5-users] Re: DRRIP implementation
Hello, John,

I have uploaded for review some patches to make DRRIP work: https://gem5-review.googlesource.com/c/public/gem5/+/37898
I believe the code is well documented enough to help you understand how it works.

To use DRRIPRP you must set the constituency size and the number of entries per team per constituency (team size) on instantiation. As an example, if:
- table size = 32KB
- associativity = 4

and we want:
- 32 dedicated sets per team

then we'd instantiate DRRIPRP(constituency_size=1024, team_size=4).

Since, IIRC, there is no analysis in the paper describing why complements specifically are picked, instead of implementing DIP's complement-select policy I implemented a "consecutive-select" policy (i.e., pick the first team_size entries of the set as samples for team 0, and the next team_size entries of the set as samples for team 1). If you really want the complement-select policy, you can implement it easily by modifying DuelingMonitor::initEntry.

Regards,
Daniel

On Saturday, November 21, 2020, 03:46:27 GMT+1, John H via gem5-users wrote:

Hello,

I am new to gem5, just getting started. I wanted to implement the DRRIP cache replacement policy. Has any one of you tried implementing it? Any pointers would be helpful.

Thanks,
John
[gem5-users] Re: Implementing Cache Replacement Policies
Hello,

A few years ago I implemented a few PC-reliant RPs for fun, but did not merge them upstream because I did not have time to fully test them. One day they shall see the light of day, though :)

I don't remember what is required for the PC change in particular, but here are the changes that may be needed (I will add small sections of the commits for clarity). Remember that this implementation was created before Ruby started using the RPs too, so it may not work today:

=> Make RP::invalidate() not const (usually these PC-reliant RPs use predictors, which must be updated on invalidations)

=> Use PacketPtr in *Tags::accessBlock():

- virtual CacheBlk* accessBlock(Addr addr, bool is_secure, Cycles &lat) = 0;
+ virtual CacheBlk* accessBlock(const PacketPtr pkt, Cycles &lat) = 0;

=> Make touch() and reset() use packets:

  virtual void touch(const std::shared_ptr<ReplacementData>&
-     replacement_data) const = 0;
+     replacement_data, const PacketPtr pkt)
+ {
+     touch(replacement_data);
+ }
+ virtual void touch(const std::shared_ptr<ReplacementData>&
+     replacement_data) const = 0;

  virtual void reset(const std::shared_ptr<ReplacementData>&
-     replacement_data) const = 0;
+     replacement_data, const PacketPtr pkt)
+ {
+     reset(replacement_data);
+ }
+ virtual void reset(const std::shared_ptr<ReplacementData>&
+     replacement_data) const = 0;

=> Create your RP overriding the newly created touch() and reset() to use the packets' info

Regarding your questions:
1) In src/mem/packet.hh
2) The Classic cache is inside src/mem/cache/

Regards,
Daniel

On Wednesday, October 28, 2020, 17:30:49 GMT+1, Abhishek Singh via gem5-users wrote:

Hi,

I think you should check the already implemented policies in src/mem/cache/replacement_policies and then design yours taking them as a template/example. To get the information you mentioned, you might have to change/add arguments to the accessBlock, findBlock, insertBlock, etc. functions in base_set_assoc. The information you are looking for can be found in the pkt class.
For simplicity you can also use fa_lru as a template and change things in it to implement replacement policies.

On Wed, Oct 28, 2020 at 11:46 AM Chongzhi Zhao via gem5-users wrote:

Hi,

I'm trying to evaluate a cache replacement policy with classic memory in SE mode. A few questions:

- The policy requires the PC, address, and access type (demand read/writeback/prefetch) to be made visible. However, I don't see these exposed to the replacement policies. Where may I find them?
- The member functions are referenced in the classes CacheMemory, SectorTags, and BaseSetAssoc. Which of them would be relevant to classic memory?

Cheers,
Chongzhi Zhao

--
Best Regards,
Abhishek
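[Editor's note] The touch()/reset() change in the diff above follows a forward-to-base overload pattern: the new packet-taking overload defaults to calling the original pure-virtual one, so existing policies keep working while PC-aware policies override the packet-taking version. A self-contained sketch of that pattern, with toy names rather than gem5's real classes:

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

struct ToyPacket { uint64_t pc; };
struct ToyReplData { int uses = 0; uint64_t lastPC = 0; };

struct ToyBaseRP {
    virtual ~ToyBaseRP() = default;
    // New overload: by default, ignore the packet and forward to the
    // original interface, so legacy policies need no changes.
    virtual void touch(const std::shared_ptr<ToyReplData> &d,
                       const ToyPacket &) { touch(d); }
    // Original interface, still pure virtual.
    virtual void touch(const std::shared_ptr<ToyReplData> &d) = 0;
};

// Legacy policy: only implements the packet-less interface.
struct ToyLRU : ToyBaseRP {
    void touch(const std::shared_ptr<ToyReplData> &d) override
    { d->uses++; }
};

// PC-aware policy: overrides the packet-taking overload as well.
struct ToyPCAwareRP : ToyBaseRP {
    void touch(const std::shared_ptr<ToyReplData> &d) override
    { d->uses++; }
    void touch(const std::shared_ptr<ToyReplData> &d,
               const ToyPacket &pkt) override
    { d->uses++; d->lastPC = pkt.pc; }
};
```

Calling touch(data, pkt) through the base pointer works for both policies: the legacy one silently drops the packet, the PC-aware one records it.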
[gem5-users] Re: Handling a cache miss in gem5
Hello Aritra,

It seems that the tag lookup latency is indeed disregarded on misses (except for SW prefetches). The cache behaves as if a miss is always assumed to happen and is "pre-prepared" in parallel with the tag lookup. I am not sure if this was a design decision or an implementation consequence, but my guess is the latter: there is no explicit definition of the cache model pursued by the classic cache.

Regards,
Daniel

On Friday, September 25, 2020, 11:00:39 GMT+2, Aritra Bagchi via gem5-users wrote:

Just a humble reminder. Any comment would be highly appreciated.

Thanks,
Aritra

On Thu, 24 Sep, 2020, 12:22 PM Aritra Bagchi wrote:

Hi all,

While experimenting with the gem5 classic cache, I tried to find out how an access miss is handled and with what latency. Even though in cache/tags/base_set_assoc.hh the access (here a miss) handling latency "lat" gets assigned the "lookupLatency", the actual latency used to handle a miss (in the handleTimingReqMiss() method in cache/base.cc) is the "forwardLatency". This is my observation. Both "lookupLatency" and "forwardLatency" are assigned the cache "tag_latency", which is okay. But I experimented with different values for them and observed that it is the value of "forwardLatency" that actually gets reflected (in terms of the clock-cycle delay from the cpu_side port to the mem_side port) when handling a cache miss. Could someone please confirm whether my observation and understanding are correct?

Regards,
Aritra
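[Editor's note] The observation above can be reduced to a toy timing model (illustrative only, not gem5 code): a hit pays the lookup latency, while a miss is forwarded downstream after the forward latency, with the tag lookup effectively hidden rather than added on top.

```cpp
#include <cassert>

// Toy sketch: hits pay lookupLatency; misses pay only forwardLatency.
struct ToyCacheTiming {
    unsigned lookupLatency;
    unsigned forwardLatency;

    unsigned completionCycle(unsigned now, bool hit) const
    {
        return hit ? now + lookupLatency
                   : now + forwardLatency; // lookup not added on a miss
    }
};
```

If lookupLatency and forwardLatency are set to different values, only the latter shows up on the miss path, matching the experiment described in the question.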
[gem5-users] Namespace creation on develop branch
Hello,

This message only concerns those who use the *develop* branch.

We have recently merged another patch creating a namespace (https://gem5-review.googlesource.com/c/public/gem5/+/33294). Due to a small issue with the SCons configuration, it does not trigger automatic recompilation of the params file of the BaseCache class, so recompilation must be forced manually; otherwise, the old name, without the namespace, will be used, and a compilation error will pop up. More details can be found in this thread: https://www.mail-archive.com/gem5-dev@gem5.org/msg35502.html

If more patches creating namespaces are merged in the future, similar situations may happen; nonetheless, the solution is analogous.

Regards,
Daniel
[gem5-users] Re: question about cache organization
Hello Sourjya,

First of all, welcome! gem5 is very versatile, and there is a multitude of things you can do with it.

The first thing you will need to decide is whether you are going to use the Classic cache (https://www.gem5.org/documentation/general_docs/memory_system/classic_caches/) or the Ruby cache (https://www.gem5.org/documentation/general_docs/ruby/). Each has its own advantages and disadvantages. My answer will focus on the Classic model. If you'd like a tutorial on how to use the Ruby cache, you may take a look at the Learning gem5 book (https://www.gem5.org/documentation/learning_gem5/introduction/).

There are multiple cache policies you can change right off the bat, including indexing (https://www.gem5.org/documentation/general_docs/memory_system/indexing_policies/), replacement (https://www.gem5.org/documentation/general_docs/memory_system/replacement_policies/), and the cache organization itself (src/mem/cache/tags). Modifying the latter will allow you to achieve your goal of changing the data mapping, but you will likely need to pair it with a new indexing policy too. Here is an example of a target cache design and the possible changes needed to achieve it: https://stackoverflow.com/questions/62784675/are-cache-ways-in-gem5-explicit-or-are-they-implied-derived-from-the-number-of-c/62790543#62790543

A similar thought process applies to most changes targeting the cache organization. If you prefer to see practical code, you can check the implementation of previous tags classes, such as the Sector Cache (https://gem5-review.googlesource.com/c/public/gem5/+/9741).

Regards,
Daniel

On Wednesday, July 8, 2020, 18:30:22 GMT+2, Sourjya Roy via gem5-users wrote:

Hi,

I am very new to gem5. I wanted to know if there is a tutorial on changing the cache data mapping or the cache organization (e.g., if I want to replace SRAM with another device technology for an L2 cache).
I also wanted to know if I can change the cache policies and data mapping inside caches.

Regards,
Sourjya
Electrical and Computer Engineering
Purdue University
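[Editor's note] As a concrete starting point for the data-mapping question, here is a sketch of the default-style set-index computation that an indexing policy implements (the constants and function name are illustrative; gem5's actual logic lives under src/mem/cache/tags/indexing_policies/). Changing the data mapping typically means replacing this extraction with a different function:

```cpp
#include <cassert>
#include <cstdint>

// Default-style set indexing: the set is taken from the address bits
// just above the block offset. Assumes numSets is a power of two.
uint64_t setIndex(uint64_t addr, unsigned blkSize, unsigned numSets)
{
    unsigned offsetBits = 0;
    for (unsigned s = blkSize; s > 1; s >>= 1)
        offsetBits++; // log2(blkSize)
    return (addr >> offsetBits) & (numSets - 1);
}
```

For example, a 32KB 4-way cache with 64B blocks has 32768 / (64 * 4) = 128 sets, so the set index is bits [12:6] of the address.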
Re: [gem5-users] How to flush all dirty blk from Cache to Memory
Hello,

doWritebacks() will only populate the write queue, which is then emptied when possible. Since you are evicting a multitude of blocks simultaneously, the queue will become full and the assertion will trigger. You will either have to implement a specialized version of doWritebacks() for your use case, or simply assume that these writebacks are done atomically and bypass the write queue with a call to doWritebacksAtomic(). Still, the second solution would probably not be that simple, because you will likely run into the same multi-eviction issue described in the last messages of https://gem5-review.googlesource.com/c/public/gem5/+/18209

Regards,
Daniel

On Thursday, April 16, 2020, 16:43:12 GMT+2, 周泰宇 <645505...@qq.com> wrote:

Hi, Daniel,

Thanks for your reply. I've tried to perform doWritebacks(...) before access(...), but it still doesn't work for me. The cache gets blocked because writeBuffer.isFull(), and finally gem5 reports that the freeList has no space.

Error report:

gem5.opt: build/X86/mem/cache/write_queue.cc:64: WriteQueueEntry* WriteQueue::allocate(Addr, unsigned int, PacketPtr, Tick, Counter): Assertion `!freeList.empty()' failed.

My new code is below:

BaseCache::recvTimingReq(pkt)
{
    if (trigger) { // some instr will trigger this code section
        wb_pkts.clear();
        dirty_blk_count = my_memWriteback(wb_pkts);
+       doWritebacks(wb_pkts, clockEdge(lat + forwardLatency));
    }
    bool satisfied = false;
    {
        PacketList writebacks;
        satisfied = access(pkt, blk, lat, writebacks);
        ...
}

int
BaseCache::my_memWriteback(PacketList &wb_pkts)
{
    int count = 0;
    tags->forEachBlk([this, &count, &wb_pkts](CacheBlk &blk) mutable {
        if (blk.isDirty()) {
            if (blk.isValid()) {
                count++;
            }
        }
        my_writebackVisitor(blk, wb_pkts);
    });
    return count;
}

BaseCache::my_writebackVisitor(CacheBlk &blk, PacketList &writebacks)
{
    if (blk.isDirty()) {
        assert(blk.isValid());

        RequestPtr request = std::make_shared<Request>(
            regenerateBlkAddr(&blk), blkSize, 0, Request::funcMasterId);
        request->taskId(blk.task_id);
        if (blk.isSecure()) {
            request->setFlags(Request::SECURE);
        }

-       //PacketPtr packet = new Packet(request, MemCmd::WriteReq);
+       PacketPtr packet = new Packet(request, MemCmd::WriteReq);
        // should only see writes or clean evicts in allocateWriteBuffer
        packet->allocate();
        std::memcpy(packet->getPtr<uint8_t>(), blk.data, blkSize);
        // packet->dataStatic(blk.data);
        writebacks.push_back(packet);
        blk.status &= ~BlkDirty;
    }
}

___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
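[Editor's note] The failure mode described in the reply (writeBuffer.isFull() followed by the freeList assertion) can be pictured with a toy bounded queue: flushing every dirty block of a whole cache at once inevitably exceeds the buffer capacity, which is why a specialized flush path or the atomic bypass is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Toy stand-in for the cache's bounded write buffer (not gem5 code).
struct ToyWriteQueue {
    std::size_t capacity;
    std::queue<int> entries;

    // Returns false where the real WriteQueue::allocate() would hit
    // the `!freeList.empty()' assertion instead.
    bool allocate(int pkt)
    {
        if (entries.size() >= capacity)
            return false;
        entries.push(pkt);
        return true;
    }
};
```

A real write buffer has on the order of a handful of entries, while a cache flush can generate thousands of writebacks, so the overflow is guaranteed unless the queue is drained in between.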
Re: [gem5-users] How to flush all dirty blk from Cache to Memory
Hello,

I believe what might be missing is a call to perform the writebacks (doWritebacks(wb_pkts, clockEdge(lat + forwardLatency))) before any other cache operations (e.g., access()). This will make sure that coherence is kept, and you will not mistakenly use stale data.

Regards,
Daniel

On Wednesday, April 15, 2020, 19:13:16 GMT+2, 周泰宇 <645505...@qq.com> wrote:

I don't know why the format got so garbled, so I am posting my code again.

BaseCache::recvTimingReq(pkt)
{
    wb_pkts.clear();
    dirty_blk_count = my_memWriteback(wb_pkts);
    bool satisfied = false;
    {
        PacketList writebacks;
        satisfied = access(pkt, blk, lat, writebacks);
        ...
    }

int
BaseCache::my_memWriteback(PacketList &wb_pkts)
{
    int count = 0;
    tags->forEachBlk([this, &count, &wb_pkts](CacheBlk &blk) mutable {
        if (blk.isDirty()) {
            if (blk.isValid()) {
                count++;
            }
        }
        my_writebackVisitor(blk, wb_pkts);
    });
    return count;
}

BaseCache::my_writebackVisitor(CacheBlk &blk, PacketList &writebacks)
{
    if (blk.isDirty()) {
        assert(blk.isValid());
        RequestPtr request = std::make_shared<Request>(
            regenerateBlkAddr(&blk), blkSize, 0, Request::funcMasterId);
        request->taskId(blk.task_id);
        if (blk.isSecure()) {
            request->setFlags(Request::SECURE);
        }
        PacketPtr packet = new Packet(request, MemCmd::WriteReq);
        packet->allocate();
        std::memcpy(packet->getPtr<uint8_t>(), blk.data, blkSize);
        // packet->dataStatic(blk.data);
        writebacks.push_back(packet);
        blk.status &= ~BlkDirty;
    }
}
Re: [gem5-users] Impact of using NonCachingSimpleCPU for profiling and creating checkpoints
In particular, the following description seems relevant to your questions:

"Simple CPU model based on the atomic CPU. Unlike the atomic CPU, this model causes the memory system to bypass caches and is therefore slightly faster in some cases. However, its main purpose is as a substitute for hardware virtualized CPUs when stress-testing the memory system."

You can find further details in the commit that introduced NonCachingSimpleCPU: https://gem5-review.googlesource.com/c/public/gem5/+/12419

When checkpoints/simpoints are taken, the cache contents are not stored, so it does not matter whether caches were present when you created them. When you restore the checkpoints, you must provide the real desired configuration, which would include the cache hierarchy and your non-cache-bypassing CPU type. Since the caches will be empty when the checkpoints are restored, you should provide a warmup period to fill them.

Regards,
Daniel

On Saturday, March 14, 2020, 23:04:40 GMT+1, Abhishek Singh wrote:

Hi,

I do not know the reason for that. But if you want to create simpoints that will be used by the O3CPU, you should use the AtomicSimpleCPU with the "--caches" option, and also add "--l2cache" if your O3CPU uses an L2 cache.

Best regards,
Abhishek

On Sat, Mar 14, 2020 at 5:53 PM Ali Hajiabadi wrote:

Thanks for your reply. But the se.py script checks that the CPU type is non-caching. Is there a reason for that? Can I ignore those checks?

On Sun, Mar 15, 2020 at 5:41 AM Abhishek Singh wrote:

Hi,

I would advise using the AtomicSimpleCPU with the "--caches" option to create simpoints.

On Sat, Mar 14, 2020 at 5:35 PM Ali Hajiabadi wrote:

Hi everyone,

What is the difference between using NonCachingSimpleCPU and AtomicSimpleCPU to profile and take simpoints and checkpoints? I want to use checkpoints to simulate and evaluate my own modified version of the O3 core model. Which CPU type is best for profiling and taking checkpoints? I don't want to bypass caches in my O3 model.
Also, I am using the RISCV implementation of gem5.

Thanks,
Ali
Re: [gem5-users] Computing stat of type Value
Hello Victor,

I've never used functor(), I can't compile right now, and there are very few examples in gem5 of how to use it (check src/sim/stat_control.cc and src/unittest/stattest.cc). However, if you are just interested in calculating the number of zero bytes to extract the percentage, as in Figure 1 of "A robust main-memory compression scheme", by Ekman et al., a simple Formula should suffice. Something along the lines of:

// in the .hh
Stats::Scalar numZeroBytes;
Stats::Scalar numByteSamples;
Stats::Formula zeroBytePercentage;

// in regStats()
zeroBytePercentage = numZeroBytes / numByteSamples;

// on writes, numZeroBytes is increased for every zero byte seen,
// and numByteSamples for every byte

This would save you the space of that huge vector, and give you the base values, which will make it easier to apply a geometric mean to the results.

Regards,
Daniel

On Wednesday, February 12, 2020, 03:52:43 GMT+1, Victor Kariofillis wrote:

Hi,

I have created a stat of type Value called zeroBytesPercentage. It is the percentage of zero bytes in all the lines that are written to the LLC. I want it to get its value from a function (calcZeroBytesPercentage), so I'm using functor():

zeroBytesPercentage
    .functor(calcZeroBytesPercentage)
    .flags(total)
    .precision(3)
    ;

Throughout the program execution I'm populating a vector called totalLines, which I've declared as public in BaseCache. This vector stores all the cache lines that are written to the LLC during the execution of the program. My problem is that the function is not static and that it uses non-static variables (the totalLines vector). I've tried having the function as a member of either BaseCache or BaseCache::CacheStats, but either way I'm getting errors about the use of non-static data members. How can I do this calculation at the end of execution and be able to access non-static variables?
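[Editor's note] The two-counter approach suggested in the reply can be sketched in plain C++, independent of the Stats framework, to show the arithmetic the Formula performs (toy struct, not gem5 code):

```cpp
#include <cassert>
#include <cstdint>

// Accumulate two scalars on every written line, then derive the
// percentage at the end: zeroBytePercentage = numZeroBytes / numByteSamples.
struct ZeroByteStats {
    uint64_t numZeroBytes = 0;
    uint64_t numByteSamples = 0;

    void sampleLine(const uint8_t *line, unsigned blkSize)
    {
        for (unsigned i = 0; i < blkSize; ++i) {
            if (line[i] == 0)
                numZeroBytes++;
            numByteSamples++;
        }
    }

    double percentage() const
    {
        return numByteSamples == 0
            ? 0.0
            : 100.0 * numZeroBytes / numByteSamples;
    }
};
```

Unlike storing every written line in a vector, this keeps only two counters regardless of how long the program runs.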
Re: [gem5-users] [gem5-dev] gem5 stable release proposal [PLEASE VOTE!]
- I think master should be stable
- I think gem5 should be released three times per year

Regards,
Daniel

On Monday, December 16, 2019, 22:33:14 GMT+1, Bobby Bruce wrote:

* I think master should be stable
* I think gem5 should be released three times per year

--
Dr. Bobby R. Bruce
Room 2235, Kemper Hall, UC Davis
Davis, CA, 95616

On Mon, Dec 16, 2019 at 11:50 AM Jason Lowe-Power wrote:

> Hi all,
>
> As many of you have seen on gem5-dev, we are going to be adding a "stable" version of gem5. Below is the current proposal. There are a couple of points below where general consensus has not been reached. We would appreciate feedback *from everyone in the community* on the points where a decision hasn't been made. gem5 is a community-driven project, and we need feedback to make sure we're making community-focused decisions.
>
> We will be introducing a new "stable" branch type to gem5. We are doing this for the following reasons:
> - Provide a way for developers to communicate major changes to the code. We will be providing detailed release notes for each stable release.
> - Increase our test coverage. At each stable release, we will test a large number of "common" benchmarks and configurations and publicize the current state of gem5.
> - Provide a way for researchers to communicate to the rest of the community information about their simulation infrastructure (e.g., in a paper you can say which version of gem5 you used).
>
> On the stable version of gem5, we will provide bugfixes until the next release, but we will not make any API changes or add new features.
>
> We would like your feedback on the following two questions:
>
> **Which branch should be default?**
>
> We can either have the master branch in git be the "stable" or the "development" branch. If master is the stable branch, then it's easier for users to get the most recent stable branch.
> If master is the development branch, it's more familiar and easier for most developers. Either way, we will be updating all of the documentation to make it clear.
>
> Please let us know which you prefer by replying "I think master should be stable" or "I think master should be development".
>
> **How often should we create a new gem5 release?**
>
> We can have a gem5 release once per year (likely in April) or three times per year (April, August, and December). Once per year means that if you use the stable branch you will get updates less frequently. Three times per year will mean there are more releases to choose from (but a newer release should always be better). On the development side, I don't think one will be more work than the other. Once per year means more backporting, and three times per year means more testing and time spent on releases.
>
> Please let us know which you prefer by replying "I think gem5 should be released once per year" or "I think gem5 should be released three times per year."
>
> A couple of notes for everyone who's been following the discussion on the gem5-dev mailing list:
> - We have dropped the proposal for major vs. minor releases. Note that there was some pushback on having only major releases when this was proposed on the gem5 roadmap, but it sounded like the consensus was to drop minor releases for now.
> - We will still allow feature branches *in rare circumstances*. This will be by request only (send mail to gem5-dev if you would like to discuss adding a new branch), and the goal will be integration within a few months. All code review will still happen in the open on Gerrit.
> The benefits will be:
> 1) rebases won't be required, as you can just make changes to the head of the branch
> 2) many features take more than a few months to implement, so if a feature is not ready by a release it can be pushed to the next
> 3) large changes won't be hidden in AMD- or Arm-specific repositories, and *anyone* will be able to request a branch.
>
> Thanks everyone for the discussions so far! It would be most useful to hear back by the end of the week. However, I don't expect any concrete actions will be taken until after the holidays.
>
> Cheers,
> Jason
>
> ___
> gem5-dev mailing list
> gem5-...@gem5.org
> http://m5sim.org/mailman/listinfo/gem5-dev
Re: [gem5-users] some errors when build gem5.fast
Hello, I am unable to test or fix it until next week, but it seems that the implementations of both fromDictionaryEntry() and toDictionaryEntry() must be moved from the _impl.hh to the main header file, since the patterns that have been added to the latter use them. Kindly let me know if that fixes the issue. Regards, Daniel ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Re: [gem5-users] Understanding the CacheRepl debug flag output
Hello, These blocks are invalid (valid: 0), which means CacheBlk's invalidate() has been called on them at some point, setting the tag address to MaxAddr (see src/mem/cache/cache_blk.hh). Since set and way belong to ReplaceableEntry (src/mem/cache/replacement_policies/replaceable_entry.hh), from which CacheBlk inherits, and those fields are never invalidated, it is assumed that there will be an isValid() check before they are used; therefore there is no issue with their contents being garbage when dealing with invalid blocks. Initially, all blocks are invalid, so it makes sense that all victims are invalid, but after running for a while you'll see that valid blocks start being replaced. In any case, those are the replacement victims, not the blocks that were inserted. If you want to print the block that is replacing the victim, I am not sure there's a DPRINTF for that, but you could easily add one, likely to src/mem/cache/tags/base.cc::insertBlock() or src/mem/cache/base.cc. Regards, Daniel On Monday, November 18, 2019, at 18:15:37 BRT, Charitha Saumya wrote: Hi, I am a newbie to gem5 and I have been testing a simple array traversal using a gem5 x86 build. The system I am testing has an L1 icache, dcache, and a shared L2 cache. I used this command for running gem5: ./build/X86/gem5.opt --debug-flags=CacheRepl configs/tutorial/two_level.py --l2_size='1MB' --l1d_size='256kB' --benchmark=tests/test-progs/simple/simple32 In the debug messages I get, I see a lot of tag: 0xfff. Can someone explain why the cache tag is fixed but only the set is changing? In my array traversal program the array is large (2048000 elements of uint32_t) and I write to every element.
58158000: system.cpu.dcache: Replacement victim: state: 0 (I) valid: 0 writable: 0 readable: 0 dirty: 0 | tag: 0x set: 0x668 way: 0
58308000: system.l2cache: Replacement victim: state: 0 (I) valid: 0 writable: 0 readable: 0 dirty: 0 | tag: 0x set: 0x33f way: 0
58329000: system.cpu.dcache: Replacement victim: state: 0 (I) valid: 0 writable: 0 readable: 0 dirty: 0 | tag: 0x set: 0x33f way: 0
58657000: system.l2cache: Replacement victim: state: 0 (I) valid: 0 writable: 0 readable: 0 dirty: 0 | tag: 0x set: 0x664 way: 0
58678000: system.cpu.dcache: Replacement victim: state: 0 (I) valid: 0 writable: 0 readable: 0 dirty: 0 | tag: 0x set: 0x664 way: 0
58828000: system.l2cache: Replacement victim: state: 0 (I) valid: 0 writable: 0 readable: 0 dirty: 0 | tag: 0x set: 0x335 way: 0
58849000: system.cpu.dcache: Replacement victim: state: 0 (I) valid: 0 writable: 0 readable: 0 dirty: 0 | tag: 0x set: 0x335 way: 0
Thanks, Charitha ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
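[Editor's note] The invalidate()/isValid() contract discussed in this thread can be modeled with a tiny sketch. This is plain Python, not gem5 code; MAX_ADDR and the class layout are illustrative stand-ins for gem5's MaxAddr and CacheBlk/ReplaceableEntry.

```python
MAX_ADDR = (1 << 64) - 1  # stand-in for gem5's MaxAddr sentinel

class Blk:
    """Toy cache block: the tag is reset on invalidation, set/way never are."""

    def __init__(self, set_idx, way):
        self.valid = False
        self.tag = MAX_ADDR
        # set/way mimic ReplaceableEntry fields: they are never reset, so
        # their values are only meaningful after an isValid() check.
        self.set, self.way = set_idx, way

    def insert(self, tag):
        self.tag, self.valid = tag, True

    def invalidate(self):
        self.tag, self.valid = MAX_ADDR, False

    def isValid(self):
        return self.valid
```

This reproduces the trace above: freshly constructed (invalid) blocks report a sentinel tag while still carrying whatever set/way they were built with.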
Re: [gem5-users] Adding latencies in cache accesses
Hello Victor, pkt->headerDelay is the time until the metadata of the packet arrives, so that the access can start (e.g., tag lookup). pkt->payloadDelay is the time until the packet's data arrives. The max() formula means that the tag lookup is done while the payload arrives: if the data is present but the tag lookup is not done yet, wait for it to end, and vice-versa. fillLatency is the time to fill/write to a block. Regards, Daniel On Saturday, October 19, 2019, at 00:13:37 GMT+2, Victor Kariofillis wrote: Hi everyone, I have one more question, about the setWhenReady() function. I see that it is called like this: blk->setWhenReady(clockEdge(fillLatency) + pkt->headerDelay + std::max(cyclesToTicks(tag_latency), (uint64_t)pkt->payloadDelay)); I am trying to understand the different components that comprise the total time after which the block will be ready to be accessed. If my implementation means that there is stuff going on before the data is actually written, does that mean that I should add directly to the above formula? What is the significance of this part of the formula: std::max(cyclesToTicks(tag_latency), (uint64_t)pkt->payloadDelay) Thanks, Victor ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
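[Editor's note] The components of the setWhenReady() expression can be made concrete with a small stand-alone model. This is plain Python, not gem5 code, and the tick values in the usage note are made up for illustration.

```python
def block_ready_tick(fill_edge, header_delay, tag_latency_ticks, payload_delay):
    """Mirror of blk->setWhenReady(clockEdge(fillLatency) + pkt->headerDelay
    + std::max(cyclesToTicks(tag_latency), pkt->payloadDelay)).

    The tag lookup and the payload transfer overlap, so only the slower of
    the two is paid on top of the header arrival and the fill latency edge.
    """
    return fill_edge + header_delay + max(tag_latency_ticks, payload_delay)
```

For example, with fill_edge=1000, header_delay=100, a 50-tick tag lookup and a 200-tick payload, the block is ready at tick 1300 (the payload dominates); raising the tag lookup to 300 ticks moves the ready time to 1400.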
Re: [gem5-users] Adding latencies in cache accesses
Victor, It depends on how you want the latency to be added. recvTimingResp() will receive the packet at tick X and start the filling process, which is done off the critical path, and thus we only need to care about this latency to schedule the evictions caused by the fill. In any case, it is assumed that the filling itself takes some time, and in the meantime the block will not be accessible (check setWhenReady in handleFill). Therefore, any reads to the block in the meantime will take this extra ready time into account for their latency calculations. If I understand your situation correctly, you likely want to add your delay to that call to setWhenReady, in handleFill. I have never tried to multiply Cycles, but I'd assume it is straightforward; if not, you may have to use some casts. Regards, Daniel On Wednesday, October 16, 2019, at 23:38:46 GMT+2, Victor Kariofillis wrote: Hi Daniel, First of all, thanks for answering. I have some more questions. In my case, latencies are added every time data is written to the cache. So, for example, latency should theoretically be added in handleFill() as well. I see that handleFill() doesn't have any latency computation in it. It is also absent from recvTimingResp(), which calls it. Is this because it is off the critical path? Also, is there any way to multiply a Cycles type variable? What I want to do is indicate that, because some things happen serially, a particular latency happens n times. Thanks, Victor On Sun, 13 Oct 2019 at 22:40, Victor Kariofillis wrote: Hi, I am interested in adding additional latencies during a cache access. I have implemented some extra functionality that happens in the cache and I am wondering about how to model the extra time it will take for that to happen. Where would I add the extra latency? For example, inside the access() function there is this line of code: // Calculate access latency on top of when the packet arrives. This // takes into account the bus delay.
lat = calculateTagOnlyLatency(pkt->headerDelay, tag_latency); Right below that, there is a "return false;" line. How is that latency being used? Also, how can I tell whether the execution stalls until something else has finished, or whether things can happen concurrently? Thank you, Victor ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
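[Editor's note] On multiplying a Cycles value n times for serial operations: a minimal sketch of the idea, using a plain-Python stand-in for gem5's C++ Cycles wrapper (in C++ the analogous move is to cast to the underlying integer, multiply, and construct a new Cycles).

```python
class Cycles(int):
    """Stand-in for gem5's Cycles, a thin wrapper around an integer count."""

def serial_latency(per_op, n):
    # n operations performed strictly one after another:
    # unwrap the count, scale it, and re-wrap the result.
    return Cycles(int(per_op) * n)
```

So a 5-cycle step repeated 3 times serially costs 15 cycles, which can then be fed into the ready-time calculation as usual.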
Re: [gem5-users] Adding latencies in cache accesses
Hello Victor, Everything depends on your design and on when the extra latency should be applied. If it is within the tag-data access, you should likely put it inside calculateXLatency. The compressor, for example, adds latency after the data has been accessed, so the decompression latency is added after the calculateXLatency function has been called (search for getDecompressionLatency in src/mem/cache/base.cc). I don't know which gem5 version you have, nor which line you are showing, but in general the latency will be used by the function that calls access(), recvTimingReq(). This latency is used to define when blocks will be sent for eviction and when the response will be sent. It is an event-based approach, so the execution will only continue at the specified cycle; if there are multiple things scheduled for a specific cycle, they will all happen at that same cycle. Regards, Daniel On Monday, October 14, 2019, at 04:41:16 GMT+2, Victor Kariofillis wrote: Hi, I am interested in adding additional latencies during a cache access. I have implemented some extra functionality that happens in the cache and I am wondering about how to model the extra time it will take for that to happen. Where would I add the extra latency? For example, inside the access() function there is this line of code: // Calculate access latency on top of when the packet arrives. This // takes into account the bus delay. lat = calculateTagOnlyLatency(pkt->headerDelay, tag_latency); Right below that, there is a "return false;" line. How is that latency being used? Also, how can I tell whether the execution stalls until something else has finished, or whether things can happen concurrently? Thank you, Victor ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Re: [gem5-users] Zero Compressed Data with Spec2017
Hello Debiprasanna, Since you did not specify which level you are talking about, how you are measuring zero data, your system config, etc., I cannot give you a precise answer, but these results seem incorrect to me. Have you tried using BDI and CPack, which are present in your version, to verify whether the amount of Zero (BDI) or Pattern (CPack) matches your results? My runs of SPEC 2017 with FPCD indicate an average of 36% of the data passing through L3 is composed of zeros (90%+ of these were , and the rest a mixture of XXZZ, XZZZ, ZXZX, ZZXX, ZZZX). Notice that this is a very rough/quick estimate using the geometric mean of non-normalized data. On leela, the arithmetic mean of the checkpoints indicates that 48% of the compressed patterns are zeros (the patterns given above; is 55% of that). On deepsjeng, the arithmetic mean of the checkpoints indicates that 83% of the compressed patterns are zeros (the patterns given above; is 89% of that). Regards, Daniel On Tuesday, September 24, 2019, at 18:25:56 GMT+2, Jason Lowe-Power wrote: Hi Debiprasanna, I can't answer your question about compression. However, if you'd like to contribute, please check out the CONTRIBUTING.md document. I'm not sure exactly what your script does, so it's hard to say where the best location would be. Possibly in configs/ if it is a configuration or a runscript, or in util/ if it's just scripts to run gem5. Cheers, Jason On Tue, Sep 24, 2019 at 3:34 AM Debiprasanna Sahoo wrote: Hi All, I am trying to evaluate the percentage of zero-compressed data with SPEC 2017 on gem5 revision 603f137. I observe that most of these benchmarks have an extremely high percentage of zeros. Even some benchmarks like leela and deepsjeng show that 100% of the data is zero. The simulation was done at multiple regions of interest. Is the model accurate, or is there some issue? I would also like to contribute the Python script I have written to run SPEC 2017 on gem5. Please let me know how to do it.
Regards, Debiprasanna Sahoo ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
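[Editor's note] The "geometric mean of non-normalized data" aggregation mentioned in Daniel's estimate above can be sketched as follows; the fraction values in the example are hypothetical and are not the SPEC results quoted in the thread.

```python
import math

def geo_mean(values):
    """Geometric mean computed via log-space averaging.

    Requires strictly positive inputs; used here to aggregate
    per-benchmark zero-data fractions into a single estimate.
    """
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical per-benchmark zero-data fractions (NOT the figures above):
example_fractions = [0.48, 0.83, 0.20]
overall = geo_mean(example_fractions)
```

The geometric mean damps the influence of outlier benchmarks compared with the arithmetic mean, which is why it is commonly used for cross-benchmark summaries.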
Re: [gem5-users] About the BDI compression file
No idea; numCycles (and most other stats) should have been different if you are using something other than an extremely simple (e.g., hello world) workload. Check your configs to make sure you are running the ones you desire. Regards, Daniel On Thursday, May 23, 2019, at 17:33:00 GMT+2, Pooneh Safayenikoo wrote: Hi Daniel, Thank you so much for your help. But, for all the benchmarks, the performance (based on "system.cpu.numCycles" and "system.cpu.committedInsts" in the stats file) of both BDI and uncompressed is the same for me. Actually, all the CPU metrics in the stats file are the same for both of them. Do you know why this happens for me, despite the different miss rate for the L2 cache? Many thanks again! Best, Pooneh On Thu, May 23, 2019 at 6:58 AM Daniel Carvalho wrote: Hello Pooneh, You can check papers that discuss turning compression on and off (among others) for common explanations of the negative influence of compression in some workloads. Here is an extract of one of my simulation results, both for mcf and the geo mean of all SPEC 2017 benchmarks:

BDI on L3
system.switch_cpus.ipc 0.309029 # IPC: Instructions Per Cycle - 505.mcf_r
system.switch_cpus.ipc 0.829107 # IPC: Instructions Per Cycle - Geo mean

Uncompressed
system.switch_cpus.ipc 0.310797 # IPC: Instructions Per Cycle - 505.mcf_r
system.switch_cpus.ipc 0.823940 # IPC: Instructions Per Cycle - Geo mean

As you can see, even though compression has a negative impact on the IPC in mcf, overall it can generate improvements (similar results are seen for the miss rate). Regards, Daniel On Wednesday, May 22, 2019, at 05:50:22 GMT+2, Pooneh Safayenikoo wrote: Hi, I want to apply BDI compression on the L2 cache.
So, I changed the config file for the caches (gem5/configs/common/Caches.py) as follows:

class L1Cache(Cache):
    tags = BaseSetAssoc()
    compressor = NULL

class L2Cache(Cache):
    tags = CompressedTags()
    compressor = BDI()

After that, I got the results for some SPEC benchmarks (I used a configuration like the BDI paper's) to compare the L2 miss rate between this compression and the baseline (without applying BDI and CompressedTags). But the miss rate increases a little for some benchmarks (like mcf and bzip). Why does BDI have a higher L2 miss rate? I cannot make sense of it. Many thanks for any help! Best, Pooneh ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Re: [gem5-users] About the BDI compression file
Hello Pooneh, You can check papers that discuss turning compression on and off (among others) for common explanations of the negative influence of compression in some workloads. Here is an extract of one of my simulation results, both for mcf and the geo mean of all SPEC 2017 benchmarks:

BDI on L3
system.switch_cpus.ipc 0.309029 # IPC: Instructions Per Cycle - 505.mcf_r
system.switch_cpus.ipc 0.829107 # IPC: Instructions Per Cycle - Geo mean

Uncompressed
system.switch_cpus.ipc 0.310797 # IPC: Instructions Per Cycle - 505.mcf_r
system.switch_cpus.ipc 0.823940 # IPC: Instructions Per Cycle - Geo mean

As you can see, even though compression has a negative impact on the IPC in mcf, overall it can generate improvements (similar results are seen for the miss rate). Regards, Daniel On Wednesday, May 22, 2019, at 05:50:22 GMT+2, Pooneh Safayenikoo wrote: Hi, I want to apply BDI compression on the L2 cache. So, I changed the config file for the caches (gem5/configs/common/Caches.py) as follows:

class L1Cache(Cache):
    tags = BaseSetAssoc()
    compressor = NULL

class L2Cache(Cache):
    tags = CompressedTags()
    compressor = BDI()

After that, I got the results for some SPEC benchmarks (I used a configuration like the BDI paper's) to compare the L2 miss rate between this compression and the baseline (without applying BDI and CompressedTags). But the miss rate increases a little for some benchmarks (like mcf and bzip). Why does BDI have a higher L2 miss rate? I cannot make sense of it. Many thanks for any help! Best, Pooneh ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Re: [gem5-users] Compressor
Hello Pooneh, There is currently no support for compressed L1 caches (and there is no plan to add it, since it would require big modifications to the caches); therefore, if you set up the configuration in src/mem/cache/Cache.py it is going to break (it sets it for all caches, including the L1). What you want is to create your own config file (Jason's website can help you with that) that sets the L2 (L3, L4...; as many as you want) as compressed. A quick and VERY dirty fix just to check if it works would be to add the compression parameters (tags = CompressedTags() \n compressor = BDI()) to L2Cache in configs/common/Caches.py (again, you'd better create your own config file; it takes a bit of time, but it is worth it). By the way, you might want to check out the other existing debug flags for caches too (src/mem/cache/SConscript). Regards, Daniel On Thursday, May 16, 2019, at 21:11:03 GMT+2, Pooneh Safayenikoo wrote: Hi, I installed a new version of gem5 that has the compression patches. So, I changed NULL to BDI() for the compressor in src/mem/cache/Cache.py, but I get a segmentation fault when I pass --caches or --l2cache on the command line. I added a DPRINTF to print the data in the L2 and enabled the cache debug flag, but there is nothing in the trace file. Could you please tell me how I can fix it? Many thanks! Best, Pooneh ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
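[Editor's note] A hedged sketch of what such a user config class could look like. The geometry and latency values are illustrative, the file name is hypothetical, and this assumes a gem5 version where CompressedTags and BDI exist as SimObjects; it is not a file shipped with gem5.

```python
# my_caches.py -- hypothetical user config fragment (requires gem5's m5 package)
from m5.objects import Cache, CompressedTags, BDI

class CompressedL2Cache(Cache):
    # Illustrative geometry/latency values; tune these for your experiments.
    size = '1MB'
    assoc = 8
    tag_latency = 20
    data_latency = 20
    response_latency = 20
    mshrs = 20
    tgts_per_mshr = 12
    # The two lines that actually enable compression:
    tags = CompressedTags()
    compressor = BDI()
```

L1 caches would keep the default (uncompressed) tags, since, as noted above, compressed L1s are not supported.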
Re: [gem5-users] [gem5-dev] GCC Compatibility Issue
Dear Liang, I recommend using the latest version, so you can be up to date with bug fixes and new features (and even patch some). You should find "documentation" all over the internet (Jason's tutorial: http://learning.gem5.org/book/index.html, Stack Overflow: https://stackoverflow.com/questions/tagged/gem5, gem5 papers, gem5.org, the mailing list, tracking back the commits that introduced the feature you want to know about...), but indeed there's no easy way, and you will end up resorting to trial and error a lot. Regards, Daniel On Monday, November 12, 2018, at 13:36:35 GMT+1, 梁政 wrote: Thanks for your advice. I can build 403.gcc, but it seems it won't stop for input set 'ref', even on my physical machine (not gem5). Currently, I have moved my work to Ubuntu 16.04 (it seems the latest glibc causes compilation errors in SPEC 2006 too...). The newest release version of gem5 is from 2015-09. I wonder whether it is suitable for beginners like me to use the current development version of gem5 (actually, I did, because of the compilation error...). It seems the documents on gem5.org are out of date and not well maintained. If I have to use the development version, where can I find new tutorials? Best Regards Zheng Liang EECS, Peking University -Original Messages- From: "Daniel Carvalho" Sent Time: 2018-11-12 13:52:06 (Monday) To: gem5-users@gem5.org, gem5-...@gem5.org Cc: Subject: Re: [gem5-users] [gem5-dev] GCC Compatibility Issue Hello Liang, Regarding gem5 and GCC, you can pull more recent versions of gem5 which support newer versions of GCC: - Up to GCC 7 support: https://gem5-review.googlesource.com/c/public/gem5/+/9101 - Up to GCC 8 support: https://gem5-review.googlesource.com/c/public/gem5/+/11949 - Up to GCC 8.1 support: https://gem5-review.googlesource.com/c/public/gem5/+/12685 - Current gem5: https://github.com/gem5/gem5 Regarding SPEC 2006, I can't help you, but I know SPEC 2017 works fine (x86 and arm, multiple Ubuntu versions).
If I recall correctly, I only had to fix one compilation flag for the gcc benchmarks to compile, and there was a seg fault in another benchmark. Regards, Daniel On Sunday, November 11, 2018, at 22:17:33 GMT+1, 梁政 wrote: Hi all, It seems that GCC compatibility is a big problem. I have found several issues when trying to run SPEC 2000 and SPEC 2006 in gem5. (1) SPEC 2006 cannot be built with GCC 7. I chose to downgrade to GCC 4.8, but it doesn't work. (2) The stable release gem5-stable-2015-09 cannot be built with GCC 7 or with GCC 4.8. It seems the compiler check is too strict. (i) The undefined macro PROTOBUF_INLINE_NOT_IN_HEADERS caused a compile error. (ii) '~' on an expression of type bool caused an error [-Werror=bool-operation] (error from /dev/copy_engine/cc). When building the SPEC 2006 benchmarks... there are so many bugs... My system is Ubuntu 18.04 LTS. Has anyone run SPEC successfully on Ubuntu 18.04? Or should I reinstall my system? Best Regards Zheng Liang Peking University ___ gem5-dev mailing list gem5-...@gem5.org http://m5sim.org/mailman/listinfo/gem5-dev ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Re: [gem5-users] [gem5-dev] GCC Compatibility Issue
Hello Liang, Regarding gem5 and GCC, you can pull more recent versions of gem5 which support newer versions of GCC: - Up to GCC 7 support: https://gem5-review.googlesource.com/c/public/gem5/+/9101 - Up to GCC 8 support: https://gem5-review.googlesource.com/c/public/gem5/+/11949 - Up to GCC 8.1 support: https://gem5-review.googlesource.com/c/public/gem5/+/12685 - Current gem5: https://github.com/gem5/gem5 Regarding SPEC 2006, I can't help you, but I know SPEC 2017 works fine (x86 and arm, multiple Ubuntu versions). If I recall correctly, I only had to fix one compilation flag for the gcc benchmarks to compile, and there was a seg fault in another benchmark. Regards, Daniel On Sunday, November 11, 2018, at 22:17:33 GMT+1, 梁政 wrote: Hi all, It seems that GCC compatibility is a big problem. I have found several issues when trying to run SPEC 2000 and SPEC 2006 in gem5. (1) SPEC 2006 cannot be built with GCC 7. I chose to downgrade to GCC 4.8, but it doesn't work. (2) The stable release gem5-stable-2015-09 cannot be built with GCC 7 or with GCC 4.8. It seems the compiler check is too strict. (i) The undefined macro PROTOBUF_INLINE_NOT_IN_HEADERS caused a compile error. (ii) '~' on an expression of type bool caused an error [-Werror=bool-operation] (error from /dev/copy_engine/cc). When building the SPEC 2006 benchmarks... there are so many bugs... My system is Ubuntu 18.04 LTS. Has anyone run SPEC successfully on Ubuntu 18.04? Or should I reinstall my system? Best Regards Zheng Liang Peking University ___ gem5-dev mailing list gem5-...@gem5.org http://m5sim.org/mailman/listinfo/gem5-dev ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Re: [gem5-users] How to modify the Cache Timing Model ?
Hello Liang, The cache timing model is something that Jason, Nikos, and I have recently been discussing. You can follow part of the discussion at the following links: https://gem5-review.googlesource.com/c/public/gem5/+/13697 and https://gem5-review.googlesource.com/c/public/gem5/+/13835. Reworking the timing model to be more accurate and flexible is a hard task, and you will likely just want to modify code related to timing, as the latency in atomic mode is not well defined (in the sense of how correct it should be; there is no clear answer). I'd suggest you look at these patches and their evolution to get an idea of the decisions that you may or may not want to take (I am not saying they are the right way to do it, though). Regarding the tags, we currently have three sub-classes: BaseSetAssoc, FALRU and SectorTags. Although we can think of FALRU as a subset of BaseSetAssoc (and you can definitely create a fully-associative tag structure using BaseSetAssoc), FALRU has its own implementation, which leverages hashes and the awareness of a single replacement policy (LRU), because otherwise the cost of checking every block would be too high for larger cache sizes. Regards, Daniel On Friday, November 9, 2018, at 22:00:59 GMT+1, 梁政 wrote: Hi, I am reading the latest code of gem5 and trying to make the cache model more flexible (e.g., allowing non-constant access latency). So I will change the timing behavior of the Cache class. Currently, I am reading the code in /mem/cache. I found that two major classes have their own timing model: the Cache/NoncoherentCache/BaseCache family and the Tag family. So what I need to do is to change the related code with a device timing model, right? Or are there other points I missed? Thanks for your advice. BTW, what are the FALRU tags for? It seems all configurations use the SetAssoc tags. I found a paper from UC Berkeley related to sector caches; maybe someone will use that model in the future.
But why should a fully-associative LRU cache be considered separately? Regards Zheng Liang EECS, Peking University ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
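[Editor's note] To see why a dedicated fully-associative LRU structure pays off, here is a sketch (plain Python, not gem5's FALRU code). A hash keyed by tag gives O(1) lookups, and the map's insertion order doubles as the LRU recency stack, instead of linearly scanning every block on each access as a generic set-associative lookup would for a single huge set.

```python
from collections import OrderedDict

class FALRUSketch:
    """Toy fully-associative LRU cache: tag hash + ordered recency stack."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        # tag -> data; iteration order is recency order (front = LRU victim)
        self.blocks = OrderedDict()

    def access(self, tag, data=None):
        """Return True on a hit; on a miss, evict the LRU block if full, then fill."""
        if tag in self.blocks:            # O(1) hash lookup, no block scan
            self.blocks.move_to_end(tag)  # promote to MRU
            return True
        if len(self.blocks) >= self.num_blocks:
            self.blocks.popitem(last=False)  # evict the LRU block
        self.blocks[tag] = data
        return False
```

With a generic design, finding a block in an N-way fully-associative cache means comparing against all N tags; the hash avoids that, which is exactly the cost concern Daniel mentions for larger cache sizes.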
[gem5-users] Understanding the SnoopFilter
Hello! First I will give some background of what is happening. This will be used to formulate some questions. I've got an error with the SnoopFilter after trying to implement cache compression:

panic: panic condition !(sf_item.holder & req_port) occurred: requester 1 is not a holder :( SF value 0.0
Memory Usage: 659104 KBytes
Program aborted at tick 6762500
--- BEGIN LIBC BACKTRACE ---
./build/X86/gem5.opt(_Z15print_backtracev+0x28)[0x13c1948]
./build/X86/gem5.opt(_Z12abortHandleri+0x46)[0x13d44c6]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f61720c3390]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7f61708a6428]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7f61708a802a]
./build/X86/gem5.opt[0x7061bf]
./build/X86/gem5.opt(_ZN11SnoopFilter13lookupRequestEPK6PacketRK9SlavePort+0xc7d)[0x77ec6d]
./build/X86/gem5.opt(_ZN12CoherentXBar10recvAtomicEP6Packets+0x107)[0x74b0d7]
./build/X86/gem5.opt(_ZN5Cache18doWritebacksAtomicERNSt7__cxx114listIP6PacketSaIS3_EEE+0xc3)[0x14aa163]
./build/X86/gem5.opt(_ZN9BaseCache10recvAtomicEP6Packet+0x392)[0x14a55a2]
./build/X86/gem5.opt(_ZN12CoherentXBar10recvAtomicEP6Packets+0x686)[0x74b656]
./build/X86/gem5.opt(_ZN5Cache19handleAtomicReqMissEP6PacketP8CacheBlkRNSt7__cxx114listIS1_SaIS1_EEE+0x149)[0x14ab569]
./build/X86/gem5.opt(_ZN9BaseCache10recvAtomicEP6Packet+0x41d)[0x14a562d]
./build/X86/gem5.opt(_ZN15AtomicSimpleCPU4tickEv+0x8cf)[0x147c02f]
./build/X86/gem5.opt(_ZN10EventQueue10serviceOneEv+0xc5)[0x13c8135]
./build/X86/gem5.opt(_Z9doSimLoopP10EventQueue+0x50)[0x13e01c0]
./build/X86/gem5.opt(_Z8simulatem+0xd1b)[0x13e12ab]

I am using the following line to run gem5:

./build/X86/gem5.opt --debug-flags=SnoopFilter,CacheAll ./configs/example/se.py -c --caches --l2cache --l2_size=4096

A superblock is a set of blocks that share a single tag. In this execution I allow a maximum of 8 blocks per superblock, although it can happen with other values too.
The error only happens if there can be multiple instances of the same superblock coexisting in the cache (e.g., two blocks of the same superblock that are not co-allocatable would generate two superblock entries, each with one of the blocks, but both with the same tag); however, it is not certain to happen just because they coexist. This is likely caused by an insertion that is not being informed correctly. Setting the debug flags to CacheAll,SnoopFilter generates the following output (isolated for the address that causes the error; "===+===" was inserted by me to indicate that there are lines relative to other addresses in between):

4559500: system.cpu.dcache: createMissPacket: created ReadExReq [2f500:2f53f] from WriteReq [2f530:2f537]
4559500: system.cpu.dcache: handleAtomicReqMiss: Sending an atomic ReadExReq [2f500:2f53f]
4559500: system.tol2bus.snoop_filter: lookupRequest: src system.tol2bus.slave[1] packet ReadExReq [2f500:2f53f]
4559500: system.tol2bus.snoop_filter: lookupRequest: SF value 0.0
4559500: system.tol2bus.snoop_filter: lookupRequest: new SF value 2.0
4559500: system.l2: access for ReadExReq [2f500:2f53f] miss
4559500: system.l2: createMissPacket: created ReadExReq [2f500:2f53f] from ReadExReq [2f500:2f53f]
4559500: system.l2: handleAtomicReqMiss: Sending an atomic ReadExReq [2f500:2f53f]
4559500: system.membus.snoop_filter: lookupRequest: src system.membus.slave[1] packet ReadExReq [2f500:2f53f]
4559500: system.membus.snoop_filter: lookupRequest: SF value 0.0
4559500: system.membus.snoop_filter: lookupRequest: new SF value 1.0
4559500: system.membus.snoop_filter: updateResponse: src system.membus.slave[1] packet ReadExResp [2f500:2f53f]
4559500: system.membus.snoop_filter: updateResponse: old SF value 1.0
4559500: system.membus.snoop_filter: updateResponse: new SF value 0.1
4559500: system.l2: handleAtomicReqMiss: Receive response: ReadExResp [2f500:2f53f] in state 0
4559500: system.l2.compressor: Compressed cache line from 512 to 32 bits.
Compression latency: 13, decompression latency: 8
4559500: system.l2.tags: set 2, way 5, superblock offset 4: selecting blk for replacement
4559500: system.l2: Block addr 0x2f500 (ns) moving from state 0 to state: 87 (E) valid: 1 writable: 1 readable: 1 dirty: 0 tag: 2f
===+===
4559500: system.tol2bus.snoop_filter: updateResponse: src system.tol2bus.slave[1] packet ReadExResp [2f500:2f53f]
4559500: system.tol2bus.snoop_filter: updateResponse: old SF value 2.0
4559500: system.tol2bus.snoop_filter: updateResponse: new SF value 0.2
4559500: system.cpu.dcache: handleAtomicReqMiss: Receive response: ReadExResp [2f500:2f53f] in state 0
4559500: system.cpu.dcache.compressor: Compressed cache line from 512 to 32 bits. Compression latency: 13, decompression latency: 8
4559500: system.cpu.dcache.tags: set 17a, way 0, superblock offset 4: selecting blk for replacement
4559500: system.cpu.dcache: Block addr 0x2f500 (ns) moving from state 0 to state: 87 (E) valid: 1 writable: 1 readable: 1 dirty: 0 tag: 0
===+===
6682500: system.cpu