[m5-dev] Linux Kernel/Boot Time for X86_FS
Hi everyone, I am interested in helping develop X86_FS boot up and testing. Under X86_FS, I have been able to boot a couple different versions of the Linux kernel (v2.6.22.9 and v2.6.28.4), but the bring up requires more than 12 hours of simulation time. I am hoping to reduce the boot time to make it more usable. I recall that the M5 patches for alpha-linux play some tricks to speed bootup, so I tried building an x86 kernel v2.6.27 with the patches. It looks like many of the patches are specific to ALPHA, so (maybe unsurprisingly) I encountered errors quickly in the build. I am wondering if anyone is currently working on this, or if I could get some pointers on where to dig in. Thank you, Joel -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Linux Kernel/Boot Time for X86_FS
Hi Gabe and Ali, Thanks for the leads! I'd love to get my hands on the x86-specific patches if you can find them. I have been booting to a shell with m5term so far, and you're right, I need to disable the init services. Is it safe to disable all of them, or only those under certain runlevels? Thanks, Joel
On Wed, Jun 9, 2010 at 10:00 PM, Ali Saidi sa...@umich.edu wrote:
> Hi Joel, The patches do two things to improve the simulation speed. First, they calculate what loops_per_jiffy would be given the processor frequency and write that value into the global variable. You can get around this by just passing the lpj=XX boot argument to the kernel, so this change isn't particularly needed anymore. The other thing they do is rewrite __delay to use a pseudo instruction (a made-up opcode that does simulator-specific functionality) that encodes how long the processor should sleep for. Thus when udelay() and ndelay() are used in the kernel, the CPU model can just jump to the right time (either the end of the delay or an interrupt). The various other patches provide additional pseudo instructions, but none of them relate to performance. Ali
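The two speedups Ali describes can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names (`loopsPerJiffy`, `wakeTick`) are hypothetical and do not come from the M5 patches, and the lpj estimate assumes one delay-loop iteration per CPU cycle.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using Tick = std::uint64_t;

// Hypothetical helper: estimate the value one might pass as lpj=XX,
// assuming one delay-loop iteration per CPU cycle (an assumption, not
// something the thread states).
inline std::uint64_t loopsPerJiffy(std::uint64_t cpu_hz, std::uint64_t timer_hz)
{
    assert(timer_hz > 0);
    return cpu_hz / timer_hz;
}

// Hypothetical model of the delay pseudo instruction: instead of spinning,
// the CPU model skips ahead to the end of the delay or to the next pending
// interrupt, whichever comes first.
inline Tick wakeTick(Tick now, Tick delay, Tick next_interrupt)
{
    const Tick delay_end = now + delay;
    // An interrupt that is already due wakes the CPU immediately.
    const Tick irq = std::max(now, next_interrupt);
    return std::min(delay_end, irq);
}
```

For a 2 GHz core with a 250 Hz timer tick, `loopsPerJiffy` gives 8,000,000 under this assumption, which is the kind of number that would go into the lpj boot argument.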
[m5-dev] Configuration file for building Linux x86
Hi, This might be a question for Gabe: Steve Reinhardt pointed me to a Linux binary that I have been able to boot with X86_FS. I have built a couple different binaries from the Linux source, including the M5 specific patches, but it appears that M5 hangs when trying to boot them. I am wondering if there are any critical options that I need to look for in the .config file, or if anyone has a .config specifically for building X86_FS kernel binaries. Also, any tips for debugging M5 Linux boot? Thank you, Joel
[m5-dev] M5 X86_FS pseudo instruction: readfile
Hi, This is probably a question for Nate, Gabe or Ali: I have built the m5 util application for x86 and I have been testing it under X86_FS simulation. It looks like /sbin/m5 readfile is failing to print the script to the console of the simulated system. I have been able to verify that the pseudo instruction executes correctly, and the appropriate function (PseudoInst::readfile) in the simulator is called with the correct parameters. There, the file is read into M5's memory, but it isn't ever printed to the terminal in the simulated system.
Call graph: PseudoInst::readfile -> VirtualPort::CopyIn -> VirtualPort::writeBlob -> Port::writeBlob -> Port::blobHelper
At that point, blobHelper calls sendFunctional to transfer the contents of the file into the simulated system, but I'm having trouble tracing where the packets end up. Any ideas on what's going on or how I can debug? Thanks, Joel
[m5-dev] X86_FS vtophys implementation
Hi, It turns out that the readfile bug I posted previously (see below) is a result of an unimplemented vtophys function: CopyIn reads the file in, but the virtual address where it should be placed is not translated to a physical address before sendFunctional is called. This results in a BadAddressError and the packets are dropped. So, I've started looking at the vtophys function. It looks like it will be trickier to implement than it was for prior architectures because of the page table hardware organization and walker. I think vtophys should be implemented by making a functional access to the page table walker. The only problem is that the state machine controlling the walker is updated in each of the access functions. I see a couple possible solutions:
1. vtophys uses a separate walker to look up the entry. The walker could be dynamically instantiated when needed, or it could be saved as a system object specifically for functional accesses. This option seems pretty hacky.
2. vtophys uses the ITB or DTB walker to look up the entry. This would require functional access to the walker so as to not upset its current state. Walker::start would need to take the desired memory mode, and in the case of a functional access, it would need to make sure that it doesn't perturb the current state. This looks like a much better solution to me.
I am wondering if anyone has feedback on a choice here, or if there is maybe a better solution. I'd be willing to take a stab at the updates. Thanks, Joel
On Mon, Jun 28, 2010 at 4:19 PM, nathan binkert n...@binkert.org wrote:
> There is a mechanism for tracing packets through the system. Basically, if you attach PrintReqState to the object, the system will print out info about the object moving through the memory system. (Search old e-mails and perhaps the history to find out how it really works. I've never used it, Steve wrote it.) As for checking things. Did you try firing this up in the debugger and then stepping over the CopyIn call to find out if it succeeded? I'd try that to find out if there is something else wrong before you dig through the memory system. Nate
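As a rough illustration of what a side-effect-free vtophys could look like, here is a minimal sketch of a read-only walk over a 4-level x86-64 page table. It is not M5 code: `PhysMem` and `vtophysSketch` are hypothetical names, physical memory is modeled as a simple map of 8-byte entries, and large pages and permission bits are ignored for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Toy physical memory: physical address of an 8-byte entry -> its value.
using PhysMem = std::unordered_map<std::uint64_t, std::uint64_t>;

// Read-only walk of a 4-level, 4 KiB-page x86-64 page table.
// Returns false if any level is not present. The walk never writes to
// memory, so accessed/dirty bits and any walker state stay untouched.
inline bool vtophysSketch(const PhysMem& mem, std::uint64_t cr3,
                          std::uint64_t vaddr, std::uint64_t& paddr)
{
    const std::uint64_t present = 0x1;
    const std::uint64_t frame_mask = 0x000ffffffffff000ULL;

    std::uint64_t table = cr3 & frame_mask;
    for (int level = 3; level >= 0; --level) {
        // Each level consumes 9 bits of the virtual address.
        const std::uint64_t index = (vaddr >> (12 + 9 * level)) & 0x1ff;
        const auto it = mem.find(table + index * 8);
        const std::uint64_t entry = (it == mem.end()) ? 0 : it->second;
        if (!(entry & present))
            return false;
        table = entry & frame_mask;
    }
    paddr = table | (vaddr & 0xfff);
    return true;
}
```

In a real simulator the map lookups would be functional reads through a memory port, and a fuller version would also handle 2 MiB and 1 GiB pages.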
Re: [m5-dev] X86_FS vtophys implementation
> So wouldn't a functional table walker basically be the same as an atomic-mode one?
I guess I was envisioning a single walker that can handle each type of access. Walker::start handles both timing and atomic accesses currently, but the way it updates state could be trouble for atomic (and functional) accesses that are interleaved with timing accesses: For example, the state assertion at the entrance would fail in the (extremely unlikely) corner case where the MemoryMode was switched from timing to atomic (or functional) while a timing access was in flight (i.e. state != Ready). The ability to interleave timing and functional accesses is going to be necessary eventually, so sorting it out here would make sense.
> Also, note that in functional mode you don't want to change visible system state, so you don't want to update the access bits.
For the readfile problem, I hacked vtophys to just do a lookup in the TLBs, and it solved the address translation problem (quick solution so I can continue with further tests). The TLB lookup function supports accesses that don't update LRU bits, so I was thinking the same principle could be applied to the walker state. (I haven't dug into a lot of the code, so I'm not sure if that's a common or agreed-upon convention.)
On Thu, Jul 1, 2010 at 11:43 AM, Steve Reinhardt ste...@gmail.com wrote:
> So wouldn't a functional table walker basically be the same as an atomic-mode one? I'd think it's only the timing-mode version that really needs all the explicit state. That is, if you were going to have two versions, I'd think you'd have a functional/atomic one and a timing one, not a functional one and an atomic/timing one (which is I believe what you're advocating, since the current one seems to already handle both atomic and timing modes). Also, note that in functional mode you don't want to change visible system state, so you don't want to update the access bits. I believe that also means it's OK to bypass the TLB as well, right? (You still might want to check the TLB if you think there's a good chance you'll get a hit there, but the question is whether it's necessary for correctness.) Steve
On Thu, Jul 1, 2010 at 11:19 AM, Gabe Black gbl...@eecs.umich.edu wrote:
> Yeah, I skipped implementing that so far. The reason the table walker is the way it is is that it needs to actually cooperate with the memory system and do real loads/stores, honor timing, etc. For functional accesses you should be able to write a simpler implementation that just uses its own functional accesses to read from the page tables in memory (and write to update the access bits, etc.). You should be careful, though, since the TLB acts like a cache and you'll need to check there first and not just always go straight to the in-memory tables. There'll be a little duplication (which might be factored out into utility functions) but the page table walker sim object isn't really the right tool for this job. Gabe
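The idea of a lookup that can skip replacement-state updates, which the thread mentions for the TLB, can be sketched with a toy structure like the one below. `ToyTlb` and its `update_lru` parameter are hypothetical stand-ins and not the M5 TLB interface; capacity and eviction are left out to keep the sketch short.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <unordered_map>

// Toy TLB with an explicit recency list, most recently used entry first.
class ToyTlb
{
  public:
    void insert(std::uint64_t vpn, std::uint64_t pfn)
    {
        order.remove(vpn);
        order.push_front(vpn);
        entries[vpn] = pfn;
    }

    // Returns true on a hit. When update_lru is false the lookup is a pure
    // read: the recency order is left exactly as it was, which is what a
    // functional (non-perturbing) access would want.
    bool lookup(std::uint64_t vpn, std::uint64_t& pfn, bool update_lru)
    {
        const auto it = entries.find(vpn);
        if (it == entries.end())
            return false;
        pfn = it->second;
        if (update_lru) {
            order.remove(vpn);
            order.push_front(vpn);
        }
        return true;
    }

    std::uint64_t mostRecent() const
    {
        assert(!order.empty());
        return order.front();
    }

  private:
    std::list<std::uint64_t> order;
    std::unordered_map<std::uint64_t, std::uint64_t> entries;
};
```

A functional vtophys along the lines Gabe suggests would try such a non-updating lookup first and fall back to a read-only walk of the in-memory tables on a miss.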
[m5-dev] Booting Linux, X86_FS Timing CPU
Hi, I am currently experimenting with the timing CPU in X86_FS, and I have encountered an assertion failure while booting Linux (using Linux boot as a test):
m5.debug: build/X86_FS/cpu/simple/timing.cc:900: void TimingSimpleCPU::completeDataAccess(Packet*): Assertion `_status == DcacheWaitResponse || _status == DTBWaitResponse' failed.
I have attached a stack trace (note that completeDataAccess is called twice in the trace). The current macro-instruction is a POP_M, and the current uop is the Cda. In timing mode, since the Cda doesn't access memory (the Request::NO_ACCESS flag is set by Cda), it doesn't wait on a memory access or TLB, so the status of the CPU before the assertion is _status = Running. I've tried adding || _status == Running to the conditional in the assertion, and the simulation gets past that point, but crashes later. I'm not sure if this is a sound fix, or if there is a better way to handle this. While browsing the code, I noticed that further up in the call stack, TimingSimpleCPU::write is called, and when executing this same test using the atomic CPU, AtomicSimpleCPU::write is called. In the AtomicSimpleCPU::write code, there is a special case test for when the Request::NO_ACCESS flag is set. I wonder if the same should occur in TimingSimpleCPU::write? Thanks, Joel
Starting program: /home/jhestnes/work/m5/build/X86_FS/m5.debug --outdir=$OUTDIR ./configs/example/fs.py --timing
[Thread debugging using libthread_db enabled]
Program received signal SIGABRT, Aborted.
0x765f2a75 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory. in ../nptl/sysdeps/unix/sysv/linux/raise.c
#0 0x765f2a75 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x765f65c0 in *__GI_abort () at abort.c:92
#2 0x765eb941 in *__GI___assert_fail (assertion=0xd5ef38 "_status == DcacheWaitResponse || _status == DTBWaitResponse", file=<value optimized out>, line=900, function=0xd5fca0 "void TimingSimpleCPU::completeDataAccess(Packet*)") at assert.c:81
#3 0x00486796 in TimingSimpleCPU::completeDataAccess (this=0x1d13830, pkt=0x1d1fa90) at build/X86_FS/cpu/simple/timing.cc:900
#4 0x00483f49 in TimingSimpleCPU::sendData (this=0x1d13830, req=0x1d1f5b0, data=0x2882b70 , res=0x0, read=false) at build/X86_FS/cpu/simple/timing.cc:280
#5 0x00484c57 in TimingSimpleCPU::finishTranslation (this=0x1d13830, state=0x2878270) at build/X86_FS/cpu/simple/timing.cc:659
#6 0x0048caf9 in DataTranslation<TimingSimpleCPU>::finish (this=0x2882f00, fault=..., req=0x1d1f5b0, tc=0x1d154a0, mode=BaseTLB::Write) at build/X86_FS/cpu/translation.hh:233
#7 0x0063be1d in X86ISA::TLB::translateTiming (this=0x1d10080, req=0x1d1f5b0, tc=0x1d154a0, translation=0x2882f00, mode=BaseTLB::Write) at build/X86_FS/arch/x86/tlb.cc:721
#8 0x0048badd in TimingSimpleCPU::write<unsigned long> (this=0x1d13830, data=0, addr=18446744071571259048, flags=524291, res=0x0) at build/X86_FS/cpu/simple/timing.cc:580
#9 0x00ab143d in X86ISA::LdStOp::write<TimingSimpleCPU, unsigned long> (this=0x2882cf0, xc=0x1d13830, m...@0x7fffc198, EA=18446744071571259048, flags=524291) at build/X86_FS/arch/x86/insts/microldstop.hh:141
#10 0x00aa3bfc in X86ISAInst::Cda::initiateAcc (this=0x2882cf0, xc=0x1d13830, traceData=0x0) at build/X86_FS/arch/x86/timing_simple_cpu_exec.cc:9199
#11 0x00485e32 in TimingSimpleCPU::completeIfetch (this=0x1d13830, pkt=0x0) at build/X86_FS/cpu/simple/timing.cc:770
#12 0x004853ac in TimingSimpleCPU::fetch (this=0x1d13830) at build/X86_FS/cpu/simple/timing.cc:690
#13 0x00485696 in TimingSimpleCPU::advanceInst (this=0x1d13830, fault=...) at build/X86_FS/cpu/simple/timing.cc:735
#14 0x0048699e in TimingSimpleCPU::completeDataAccess (this=0x1d13830, pkt=0x1d1fa90) at build/X86_FS/cpu/simple/timing.cc:932
#15 0x00487097 in TimingSimpleCPU::DcachePort::recvTiming (this=0x1d13b10, pkt=0x1d1fa90) at build/X86_FS/cpu/simple/timing.cc:964
#16 0x00487f5e in Port::sendTiming (this=0x1d16760, pkt=0x1d1fa90) at build/X86_FS/mem/port.hh:186
#17 0x00507f28 in Bus::recvTiming (this=0x1a4ce00, pkt=0x1d1fa90) at build/X86_FS/mem/bus.cc:243
#18 0x00510777 in Bus::BusPort::recvTiming (this=0x1d11650, pkt=0x1d1fa90) at build/X86_FS/mem/bus.hh:89
#19 0x00487f5e in Port::sendTiming (this=0x1d16170, pkt=0x1d1fa90) at build/X86_FS/mem/port.hh:186
#20 0x0053f7da in SimpleTimingPort::sendDeferredPacket (this=0x1d16170) at build/X86_FS/mem/tport.cc:150
#21 0x00540431 in SimpleTimingPort::processSendEvent (this=0x1d16170) at build/X86_FS
Re: [m5-dev] Review Request: util/m5/m5.c: in readfile(), added memset to touch all pages - ensure they are in the page table
Hey Gabe, Comments are in-lined below. If you'd like me to resubmit another review of all or part, just let me know. Thanks, Joel
util/m5/Makefile.x86 http://reviews.m5sim.org/r/64/#comment248
> Why is this necessary? Is this so it runs under SE mode? In that case I think we should make it run like before as the default since 99% of the time this will run in FS, and provide a way to inject -static for the 1% of the time it runs in SE. Compiling it as static all the time wouldn't be the end of the world, but it seems like we'd be making universal changes for a very uncommon case.
Building the m5 binary without -static allows it to dynamically link a few libraries:
j...@capillary:~/research/m5-new/util/m5$ ldd m5
linux-vdso.so.1 => (0x7fff3f9ff000)
libc.so.6 => /lib/libc.so.6 (0x7fb05131f000)
/lib64/ld-linux-x86-64.so.2 (0x7fb05168f000)
When I was putting together a disk image using busybox, it had issues with library versions. In general, since the m5 utility isn't performance critical and just implements simulator magic, I think it would be easiest if it was always built statically whether for FS or SE. On the other hand, I would imagine that it's built very infrequently and only for initial disk image creation, so perhaps it's not worth changing.
util/m5/m5ops.h http://reviews.m5sim.org/r/64/#comment249
> It looks like Ali commandeered that value on line 61. It might have been better to use 0x5A for that, but it also might not be safe to change it now since there may be binaries out there that use it (probably not too many). It would be a little strange, but you could actually use 0x5A for reserved1_func. I don't know what restrictions there are in the various ISAs for function numbers, but in x86 it's a 16 bit value.
Ah, I didn't see that originally! The only real trouble right now is that if you try to build the m5 utility for x86_64 with the current version in the repo, it will fail with an undefined reference to reserved1_func:
gcc -O2 -o m5op_x86.o -c m5op_x86.S
gcc -o m5 m5.o m5op_x86.o
m5op_x86.o: In function `m5_reserved1_func': (.text+0x5c): undefined reference to `reserved1_func'
collect2: ld returned 1 exit status
make: *** [m5] Error 1
It looks like neither m5op_alpha.S nor m5op_sparc.S uses reserved1_func, so another solution would be to remove it from m5op_x86.S (eliminate it completely from the m5 utility codebase).
> - Gabe
[m5-dev] Review Request: SIMPLE TIMING: when a request is NO_ACCESS (x86 CDA microinstruction), TimingSimpleCPU::completeDataAccess must still complete
This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/65/
Review request for Default.
Summary
-------
SIMPLE TIMING: when a request is NO_ACCESS (x86 CDA microinstruction), TimingSimpleCPU::completeDataAccess must still complete
./cpu/simple/timing.cc: fix for x86 CDA microop - since CDA doesn't read or update memory, completeDataAccess needs to handle the case where the current status of the CPU is _status = Running caused by a request NO_ACCESS
This change is RE: Booting Linux, X86_FS Timing CPU (http://www.mail-archive.com/m5-dev@m5sim.org/msg07290.html)
Gabe Black:
> The assert is, as you said, from NO_ACCESS skipping the call out to the memory system and going right to the code that finishes off execution of that instruction, surprising that code by never having left the Running state. Under any other circumstance, though, the CPU shouldn't be in the Running state, and if we just added that to the assert we wouldn't catch those bugs. What I think would be a better fix is to move the assert (but not the assignment to _status) up above the code that aggregates the components of a split packet and add pkt->req->getFlags().isSet(Request::NO_ACCESS) or something similar to the or. This isn't perfect because it asserts every time the function is called and not just once all the fragments (should be only two) are gathered, but it's safer and the overhead should be minimal.
This change seems to have fixed the problem for X86_FS. Since no other architectures use the request NO_ACCESS flag, it is unlikely they will be impacted, though they still need to be tested.
Diffs
-----
src/cpu/simple/timing.cc a75564db03c3
Diff: http://reviews.m5sim.org/r/65/diff
Testing
-------
Thanks, Joel
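The shape of the relaxed assertion Gabe suggests can be sketched in isolation as below. This is a toy model, not the actual timing.cc change: the `Status` enum is trimmed down, `NO_ACCESS_FLAG` uses a made-up bit value, and `completionStateIsValid` is a hypothetical helper standing in for the assert condition.

```cpp
#include <cassert>
#include <cstdint>

// Trimmed-down stand-in for the CPU status values named in the thread.
enum class Status { Running, DcacheWaitResponse, DTBWaitResponse };

// Hypothetical bit value; the real Request::NO_ACCESS constant may differ.
const std::uint32_t NO_ACCESS_FLAG = 0x1;

// Condition one might assert at the top of completeDataAccess: the CPU is
// waiting on the data cache or the DTB, or the request never went out to
// memory at all (NO_ACCESS), in which case the CPU is still Running.
inline bool completionStateIsValid(Status status, std::uint32_t req_flags)
{
    return status == Status::DcacheWaitResponse ||
           status == Status::DTBWaitResponse ||
           (req_flags & NO_ACCESS_FLAG) != 0;
}
```

Compared with simply adding `_status == Running` to the assert, this keeps catching the case where the CPU is Running for a request that really should have gone to memory.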
[m5-dev] Review Request: TimingCPU: REPOST: Request::NO_ACCESS bypass in completeDataAccess
This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/66/
Review request for Default.
Summary
-------
TimingCPU: REPOST: Request::NO_ACCESS bypass in completeDataAccess
./cpu/simple/timing.cc: fix for x86 CDA microop - since CDA doesn't read or update memory, completeDataAccess needs to handle the case where the current status of the CPU is _status = Running caused by a request NO_ACCESS
Discarded previous review request (SIMPLE TIMING: when a request is NO_ACCESS (x86 CDA microinstruction), TimingSimpleCPU::completeDataAccess must still complete)
Diffs
-----
src/cpu/simple/timing.cc a75564db03c3
Diff: http://reviews.m5sim.org/r/66/diff
Testing
-------
Thanks, Joel
Re: [m5-dev] Review Request: util/m5/m5.c: in readfile(), added memset to touch all pages - ensure they are in the page table
So, it appears that the only change that we agree on for now is the change to m5.c. Should I submit that change as its own patch and withdraw this one? Thanks, Joel
On Fri, Jul 23, 2010 at 3:45 PM, Gabriel Michael Black gbl...@eecs.umich.edu wrote:
> Quoting Ali Saidi sa...@umich.edu:
>> On Fri, 23 Jul 2010 16:59:08 -0400, Gabriel Michael Black gbl...@eecs.umich.edu wrote:
>>> Hmm, maybe we should be building these regularly too... What do you think, Ali? Would it be possible to return reserved1_func and use a different code?
>> It was reserved for me while I was doing the bottleneck analysis work and didn't want anyone to grab that ID. Once I pushed all of the bottleneck analysis changes, I changed reserved into the actual cp_annotate operations. So, everything worked as intended. reserved1_func shouldn't be used anywhere and shouldn't be added back to the file. Ali
> I don't understand how that made it reserved. Wouldn't anyone else be able to do the same thing you did but with some conflicting use? The comment next to those says "Reserved for user", but it's not if it ends up being assigned an official use. Why would we want to have reserved2_func but not reserved1_func? Gabe
[m5-dev] Checkpointing x86
Hi, This question is probably for Gabe: I'm currently implementing checkpointing for x86, and I have run into a question about inheritance with a couple x86-specific devices. src/dev/x86/i8042.hh defines a PS2Device, which doesn't inherit from anything, but it looks like the PS2Keyboard and PS2Mouse have state that might need to be checkpointed (e.g. mouse status in the case that Linux enables/disables it). Should PS2Device descend from SimObject? (if so, through a particular subclass of SimObject?) Thanks, Joel
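To make the question concrete, here is a minimal sketch of what checkpointing a small piece of device state involves, whatever base class ends up providing the hooks. Everything here is hypothetical: `Checkpoint`, `ToyPs2Mouse`, and its fields are illustrative stand-ins and not the M5 serialization interface or the real PS2Mouse members.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy checkpoint: flat "section.key" -> value strings.
using Checkpoint = std::map<std::string, std::string>;

// Hypothetical device with state that the guest OS can change at run time.
struct ToyPs2Mouse
{
    bool enabled = false;   // e.g. toggled when Linux enables the mouse
    int sampleRate = 100;

    // Write every piece of mutable state into the checkpoint.
    void serialize(const std::string& section, Checkpoint& cp) const
    {
        cp[section + ".enabled"] = enabled ? "1" : "0";
        cp[section + ".sampleRate"] = std::to_string(sampleRate);
    }

    // Restore the same state; both keys must be present.
    void unserialize(const std::string& section, const Checkpoint& cp)
    {
        assert(cp.count(section + ".enabled") == 1);
        assert(cp.count(section + ".sampleRate") == 1);
        enabled = cp.at(section + ".enabled") == "1";
        sampleRate = std::stoi(cp.at(section + ".sampleRate"));
    }
};
```

If state like this is not saved, a restored simulation would come up with the device in its reset configuration while the guest driver believes it is already configured.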
Re: [m5-dev] Review Request: TimingCPU: REPOST: Request::NO_ACCESS bypass in completeDataAccess
Is there a way for me to ship this, or does someone else need to push it to the repo? Thanks, Joel
On Thu, Jul 29, 2010 at 8:21 AM, Steve Reinhardt ste...@gmail.com wrote:
> This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/66/#review111
> Ship it!
> - Steve
> On 2010-07-28 16:05:00, Joel Hestness wrote:
>> Review request http://reviews.m5sim.org/r/66/ (Updated 2010-07-28 16:05:00): TimingCPU: REPOST: Request::NO_ACCESS bypass in completeDataAccess
[m5-dev] Review Request: M5 utility: remove reserved1_func to build for x86
This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/120/
Review request for Default.
Summary
-------
./util/m5/m5op_x86.S: To get the m5 utility to build for x86, remove the reserved1_func link.
Diffs
-----
util/m5/m5op_x86.S a75564db03c3
Diff: http://reviews.m5sim.org/r/120/diff
Testing
-------
The M5 utility currently does not build for x86 because reserved1_func was previously removed from m5ops.h by Ali Saidi. This patch fixes the build problem by removing the reference to reserved1_func in m5op_x86.S.
Thanks, Joel
Re: [m5-dev] Review Request: M5 utility: Touch all pages in readfile buffer
I have also tested using this loop. For a reasonably large file being read, using the loop executes 300k fewer simulated instructions (~6% of the readfile execution) than using memset. On the other hand, the simulation time of the readfile call using memset was actually 0.4s quicker (~5%) than using the loop. The simulated system still needs to do a page table walk and mapping for each page regardless of the implementation. So, my conclusion was that it doesn't make a perceptible difference in performance, and the use of memset abstracts away from possible problems with strided accesses using a static page size. Joel
On Mon, Aug 9, 2010 at 12:50 PM, Ali Saidi sa...@umich.edu wrote:
> On 2010-08-09 12:13:59, Nathan Binkert wrote: How is this not a possible issue for every isa? We're talking about touching 256kB of data. That shouldn't take very long. We've beaten this to death, it's time to just call it done and move on. If we're that concerned just do: for (int x = 0; x < sizeof(buf); x += 512) buf[x] = 0; It improves the speed by three orders of magnitude, doesn't require an ifdef and will work on everything that doesn't have an unbelievably small page size. - Ali
> This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/121/#review154
> On 2010-08-09 10:35:49, Joel Hestness wrote:
>> This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/121/ (Updated 2010-08-09 10:35:49)
>> Review request for Default.
>> Summary
>> -------
>> util/m5/m5.c: in readfile(), added memset to touch all pages - ensure they are in the page table
>> This problem is caused by Linux demand paging. If the pages are not yet mapped in the page table, the M5 utility does not know the physical memory address in the simulated system to which it is sending the file read from the host machine.
>> Diffs
>> -----
>> util/m5/m5.c a75564db03c3
>> Diff: http://reviews.m5sim.org/r/121/diff
>> Testing
>> -------
>> This fixes the functionality for x86, where the problem was first encountered. I have also tested the utility for Alpha. The simulated system executes approximately 10% more instructions during the readfile operation due to the memset, but the simulation time required for this is still marginal. Using memset provides an ISA-independent solution compared to buffer accesses that use a page-sized stride.
>> Thanks, Joel
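The two ways of pre-faulting the readfile buffer that are compared in this thread can be put side by side as follows. This is a sketch rather than the actual m5.c code: the function names are hypothetical, and the 512-byte stride is taken from Ali's loop on the assumption that no supported page size is smaller than that.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Touch every byte of the buffer. Independent of the page size, at the
// cost of writing the whole buffer.
inline void touchWithMemset(char* buf, std::size_t len)
{
    std::memset(buf, 0, len);
}

// Touch one byte every `stride` bytes. Every page is touched as long as
// `stride` does not exceed the page size. Returns the number of writes.
inline std::size_t touchWithStride(char* buf, std::size_t len, std::size_t stride)
{
    assert(stride > 0);
    std::size_t writes = 0;
    for (std::size_t i = 0; i < len; i += stride) {
        buf[i] = 0;
        ++writes;
    }
    return writes;
}
```

For a 256 kB buffer the strided version performs 512 single-byte writes instead of clearing the whole buffer, but either way the guest kernel still has to fault in and map every page, which is consistent with the small difference Joel measured.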
[m5-dev] IntDev and intmessage question
Hi, I'm looking at the interrupt device interface (dev/x86/IntDev.hh) and intmessage (arch/x86/intmessage.hh) code, and I have a question about scoping. Currently, the methods defined in intmessage.hh are only used by methods in the IntDev interface class. I also notice that the only places where MemCmd::MessageReq is referenced elsewhere in the code are in MemPort, from which the IntDev::IntDevPort descends, and in arch/x86/interrupts.hh, which descends from IntDev. Is there any reason why these intmessage methods are scoped to X86ISA, or could they be moved under the IntDev class? Thanks, Joel
Re: [m5-dev] IntDev and intmessage question
Correction: that should be MessagePort (mem/mport.hh), not MemPort. Sorry for any confusion. Joel On Tue, Aug 10, 2010 at 3:46 PM, Joel Hestness hestn...@cs.utexas.edu wrote: Hi, I'm looking at the interrupt device interface (dev/x86/IntDev.hh) and intmessage (arch/x86/intmessage.hh) code, and I have a question about scoping. Currently, the methods defined in intmessage.hh are only used by methods in the IntDev interface class. I also notice that the only places where MemCmd::MessageReq is referenced elsewhere in the code are in MemPort, from which the IntDev::IntDevPort descends, and in arch/x86/interrupts.hh, which descends from IntDev. Is there any reason why these intmessage methods are scoped to X86ISA, or could they be moved under the IntDev class? Thanks, Joel -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Regression tests for X86
Hi Dibakar, I recently built a Linux kernel for M5 X86_FS that was based on v2.6.28.4, but modified and compiled specifically for M5. I have patches for my changes, but we're still in the debugging phases of bringing up X86_FS with multicore support, so I'm not confident enough to send them around just yet. Aside from building a Linux kernel, you will need to build and configure a disk image as well, which is also a fair amount of work. Unfortunately, because Linux boot takes so long to simulate, the iteration time for debugging the X86_FS bootup is quite long. I think it would be useful if you could describe what you would like to do with X86_FS, and I can maybe give you some direction on how to get there or whether it makes sense to wait for updates to M5. Thanks, Joel On Sat, Aug 7, 2010 at 12:30 PM, dibakar gope dibakar...@gmail.com wrote: Hi All, I have a few queries regarding the regression tests for X86. (1) I could build the x86 in FS mode for AtomicSimpleCPU, O3CPU and SimpleTimingCPU mode (I am using a bunch of x86-specific patches from http://www.csl.cornell.edu/~vince/projects/m5/m5_x86_64_se_status.html). I guess that the pre-compiled linux kernel (that can be downloaded from M5 site) was compiled for alpha arch only. So I actually downloaded the linux-dist tarball from M5 site for x86 build. This tarball has a .config.M5 that can be used for compiling the kernel, but that .config.m5 is ALPHA-specific. So in order to compile the linux for x86, I used the config of my native linux machine kernel as a basis for our x86 config kernel and got the vmlinux for x86. Following commands are used for that:-
cp /boot/config ./.config
make menuconfig
make-kpkg clean
fakeroot make-kpkg --initrd --append-to-version=-v2.6.27 kernel_image kernel_headers
I used that vmlinux for X86_FS build and did not get any error during the build process. 
So my query is, are there any x86-specific patches (configurations) that I should have considered for compiling the linux kernel for x86? (2) Then I tried to test that X86_FS m5.opt using regression tests. All the several test programs present for the regression tests have config.ini files only for alpha in m5-dev tarball, but they don't have the same for X86. But using the following command, I can generate those x86-specific config.ini for the test-programs used in FS mode regression. build/X86_FS/m5.opt -re configs/example/fs.py --cmd=tests/test-progs/<test program name>/bin/x86/<test program binary> But the problem is that the m5-dev tarball (m5/tests/test-progs/*) does not have the test program binaries (m5/tests/quick/*) (for example, 10.linux-boot, 80.netperf-stream, 50.memtest etc) except hello (which is not used for FS mode regression). So I could not generate the config.ini for x86 in order to run the regression tests. So my query is, has anyone worked on the X86 regression tests / faced the same problem? Before I use the x86_FS.opt for SPEC2000/2006 benchmarks, I want that to pass the regression tests first. Thanks and Regards, Dibakar Gope Texas A&M University ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] changeset in m5: TimingSimpleCPU: fix NO_ACCESS memory op handling
changeset cfbbc9178e7a in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=cfbbc9178e7a
description:
    TimingSimpleCPU: fix NO_ACCESS memory op handling

    When a request is NO_ACCESS (x86 CDA microinstruction), the memory op
    doesn't go to the cache, so TimingSimpleCPU::completeDataAccess needs
    to handle the case where the current status of the CPU is Running
    and not DcacheWaitResponse or DTBWaitResponse

diffstat:
 src/cpu/simple/timing.cc | 3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diffs (20 lines):

diff -r 82453f1b46c5 -r cfbbc9178e7a src/cpu/simple/timing.cc
--- a/src/cpu/simple/timing.cc Sun Aug 08 22:57:16 2010 -0700
+++ b/src/cpu/simple/timing.cc Thu Aug 12 17:16:02 2010 -0700
@@ -868,6 +868,8 @@
     // received a response from the dcache: complete the load or store
     // instruction
     assert(!pkt->isError());
+    assert(_status == DcacheWaitResponse || _status == DTBWaitResponse ||
+           pkt->req->getFlags().isSet(Request::NO_ACCESS));
 
     numCycles += tickToCycles(curTick - previousTick);
     previousTick = curTick;
@@ -897,7 +899,6 @@
         }
     }
 
-    assert(_status == DcacheWaitResponse || _status == DTBWaitResponse);
     _status = Running;
 
     Fault fault = curStaticInst->completeAcc(pkt, this, traceData);
___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] changeset in m5: util/m5/m5.c: ensure readfile() buffer pages ar...
changeset b69cc0fd934d in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=b69cc0fd934d
description:
    util/m5/m5.c: ensure readfile() buffer pages are in page table
    (and marked dirty, in case that matters) by touching them beforehand

diffstat:
 util/m5/m5.c | 5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diffs (15 lines):

diff -r cfbbc9178e7a -r b69cc0fd934d util/m5/m5.c
--- a/util/m5/m5.c Thu Aug 12 17:16:02 2010 -0700
+++ b/util/m5/m5.c Thu Aug 12 17:16:04 2010 -0700
@@ -65,6 +65,11 @@
     int offset = 0;
     int len;
 
+    // Touch all buffer pages to ensure they are mapped in the
+    // page table. This is required in the case of X86_FS, where
+    // Linux does demand paging.
+    memset(buf, 0, sizeof(buf));
+
     while ((len = m5_readfile(buf, sizeof(buf), offset)) > 0) {
         write(dest_fid, buf, len);
         offset += len;
___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] TimingSimpleCPU, x86: sendSplitData packet sender states
Hi, I am currently looking at the sendSplitData function in TimingSimpleCPU (cpu/simple/timing.cc:~307), and I'm encountering a problem with the packet sender states when running with Ruby. After the call to buildSplitPacket, pkt1 and pkt2 have senderState type SplitFragmentSenderState. However, with Ruby enabled, the call to handleReadPacket sends the packet to a RubyPort, and in RubyPort::M5Port::recvTiming (mem/ruby/system/RubyPort.cc:~173), a new senderState is pushed into the packet that has type SenderState (note that the old senderState is saved in the new senderState; after the packet transfer, Ruby restores the old senderState). When the stack unwinds back to sendSplitData, the dynamic_cast after handleReadPacket fails because of the type difference. It looks like the senderState variable is used elsewhere as a stack to store data while the packet traverses from source to destination and on the way back as a response, which makes sense. I'm wondering why the clearFromParent call needs to happen in sendSplitData, since it seems like it should happen in completeDataAccess when cleaning up the packets. Thanks, Joel PS. In sendSplitData after handleReadPacket(pkt2), it looks like there is a bug with the dynamic_cast and clearFromParent since the cast is called on pkt1->senderState. This doesn't affect correctness, but it does leave references that affect deletion of the packets. Is that correct? -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] TimingSimpleCPU, x86: sendSplitData packet sender states
I just realized that the clearFromParent call is used for tracking which of the packets have been sent successfully, so that if the send port is busy, it can retry them when a recvRetry is received later. It appears that a better solution may be to hold a pointer on the stack in sendSplitData to the senderState that may eventually call clearFromParent, rather than trying to get the senderState back out after the call to handleReadPacket. Does this sound reasonable? Thanks, Joel On Tue, Aug 17, 2010 at 3:11 PM, Joel Hestness hestn...@cs.utexas.edu wrote: Hi, I am currently looking at the sendSplitData function in TimingSimpleCPU (cpu/simple/timing.cc:~307), and I'm encountering a problem with the packet sender states when running with Ruby. After the call to buildSplitPacket, pkt1 and pkt2 have senderState type SplitFragmentSenderState. However, with Ruby enabled, the call to handleReadPacket sends the packet to a RubyPort, and in RubyPort::M5Port::recvTiming (mem/ruby/system/RubyPort.cc:~173), a new senderState is pushed into the packet that has type SenderState (note that the old senderState is saved in the new senderState; after the packet transfer, Ruby restores the old senderState). When the stack unwinds back to sendSplitData, the dynamic_cast after handleReadPacket fails because of the type difference. It looks like the senderState variable is used elsewhere as a stack to store data while the packet traverses from source to destination and on the way back as a response, which makes sense. I'm wondering why the clearFromParent call needs to happen in sendSplitData, since it seems like it should happen in completeDataAccess when cleaning up the packets. Thanks, Joel PS. In sendSplitData after handleReadPacket(pkt2), it looks like there is a bug with the dynamic_cast and clearFromParent since the cast is called on pkt1->senderState. This doesn't affect correctness, but it does leave references that affect deletion of the packets. Is that correct? 
-- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
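To make the proposal in this thread concrete, here is a minimal sketch of holding the fragment state pointer before sending, so that a downstream port pushing its own sender state cannot break the later lookup. All types below (SenderState, SplitFragmentState, WrapperState, Packet) are simplified stand-ins written for this illustration, not the real M5 or Ruby classes, and sendThroughWrappingPort only mimics the push/restore behaviour described above.

```cpp
#include <cassert>

// Simplified stand-ins for illustration only; not the M5 classes.
struct SenderState { virtual ~SenderState() {} };

struct SplitFragmentState : SenderState {
    bool cleared = false;
    void clearFromParent() { cleared = true; }
};

// State pushed by a downstream port; it saves the previous state.
struct WrapperState : SenderState {
    SenderState *saved;
    explicit WrapperState(SenderState *s) : saved(s) {}
};

struct Packet { SenderState *senderState = nullptr; };

// Mimics a port that pushes its own sender state onto the packet.
inline void sendThroughWrappingPort(Packet &pkt)
{
    pkt.senderState = new WrapperState(pkt.senderState);
}

// Mimics the port restoring the old state when the response returns.
inline void restoreOnResponse(Packet &pkt)
{
    WrapperState *w = dynamic_cast<WrapperState *>(pkt.senderState);
    assert(w);
    pkt.senderState = w->saved;
    delete w;
}

// Take the fragment pointer BEFORE sending; casting pkt.senderState
// after the send would yield nullptr because the wrapper is on top.
inline bool sendFragment(Packet &pkt)
{
    SplitFragmentState *frag =
        dynamic_cast<SplitFragmentState *>(pkt.senderState);
    assert(frag);
    sendThroughWrappingPort(pkt);
    frag->clearFromParent();
    return frag->cleared;
}
```

The design choice is simply to capture the typed pointer while the packet is still in the state the sender created, rather than relying on the packet's state stack being unchanged after the call.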
[m5-dev] TimingSimpleCPU, x86: sendSplitData + TLB miss
Hi, I am currently running a benchmark in X86_FS timing mode (single or multicore) that crashes due to the page table walker. On a data write (or read) instruction that causes TimingSimpleCPU::write to split the TLB access into two accesses (cpu/simple/timing.cc:~560), if the first TLB access misses, it causes the page table walker to start a walk and its state = Waiting. Since the second access happens immediately in TimingSimpleCPU::write, if the second request also misses, it causes another walk that fails the (state == Ready) assertion in X86ISA::Walker::start (arch/x86/pagetable_walker.cc:~316). Seems this is a corner case of a corner case, namely, an unaligned (split) data access, whose split TLB accesses both miss. It doesn't look like there is any code to handle the situation yet, and I'm hoping to get some guidance on how to address it. It seems to me that since this only happens on a TLB miss, that the TLB or walker should be able to handle the multiple requests. I see that in the ARM code, the page table walker has a queue of walks that are currently in flight (I'm having trouble convincing myself that the queues can't conflict when multiple walks are in flight :\). Would it make sense to have similar state queuing in the x86 page table walker? Thanks, Joel -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
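As a rough illustration of the queuing idea raised here, the sketch below shows a walker that accepts a second request while the first is still outstanding and services them in arrival order. QueuedWalker and its methods are hypothetical names for this example; it does not model the x86 or ARM walkers in M5, only the queue discipline.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical sketch: instead of asserting that the walker is idle,
// start() enqueues the request so two back-to-back TLB misses (as from
// a split access) do not trip a state assertion.
class QueuedWalker
{
  public:
    // Accept a walk request even while another walk is in flight.
    void start(std::uint64_t vaddr) { pending.push_back(vaddr); }

    // Number of walks that have been requested but not completed.
    std::size_t outstanding() const { return pending.size(); }

    // Complete the oldest walk and return its virtual address.
    std::uint64_t completeOne()
    {
        assert(!pending.empty());
        const std::uint64_t vaddr = pending.front();
        pending.pop_front();
        return vaddr;
    }

  private:
    std::deque<std::uint64_t> pending;  // walks in arrival order
};
```

Servicing strictly in arrival order sidesteps the question of walks passing each other; allowing concurrent walks would need per-walk state rather than a plain address queue.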
[m5-dev] Unable to checkpoint restore into detailed/timing CPU
Hi, I'm working on something else right now, and I might not have a chance to dig into this for a while, so I figured I would post to the list: I just updated to the most recent repo, and ALPHA checkpoint restore into timing-enabled CPUs doesn't appear to be working: 1) Checkpoint by running this and use m5 utility to checkpoint from command line: % ./build/ALPHA_FS/m5.debug configs/example/fs.py --num-cpus=4 2) Try to restore from checkpoint: % ./build/ALPHA_FS/m5.debug --outdir=$OUTDIR configs/example/fs.py --timing --caches --l2cache --num-cpus=4 -r 1 OR: % ./build/ALPHA_FS/m5.debug --outdir=$OUTDIR configs/example/fs.py --detailed --caches --l2cache --num-cpus=4 -r 1 ---OUTPUT M5 Simulator System Copyright (c) 2001-2008 The Regents of The University of Michigan All Rights Reserved M5 compiled Sep 14 2010 16:11:04 M5 revision 37c56be05af0 7682 default tip M5 started Sep 14 2010 17:12:12 M5 executing on RADLAB-0002 command line: ./build/ALPHA_FS/m5.debug --outdir=/home/jhestnes/work/m5/m5out/2010-09-14_mcpat_o3_test-0 configs/example/fs.py --timing --caches --l2cache --num-cpus=4 -r 1 Script to execute: Global frequency set at 1 ticks per second info: kernel located at: /home/jhestnes/work/disk-images/binaries/vmlinux_2.6.27-gcc_4.3.4_test64 Listening for system connection on port 3456 0: system.tsunami.io.rtc: Real-time clock set to Thu Jan 1 00:00:00 2009 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003 Switch at curTick count:1 info: Entering event queue @ 5473006271000. Starting simulation... 
Switched CPUS @ cycle = 5473006281000
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/jhestnes/work/public-m5/src/python/m5/main.py", line 359, in main
    exec filecode in scope
  File "configs/example/fs.py", line 192, in <module>
    Simulation.run(options, root, test_sys, FutureClass)
  File "/home/jhestnes/work/public-m5/configs/common/Simulation.py", line 257, in run
    m5.changeToTiming(testsys)
  File "/home/jhestnes/work/public-m5/src/python/m5/simulate.py", line 188, in changeToTiming
    if system.getMemoryMode() != objects.params.timing:
AttributeError: 'module' object has no attribute 'timing'
---OUTPUT
Thanks, Joel -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Unable to checkpoint restore into detailed/timing CPU
Hi guys, I am using swig 1.3.40, so per Nate's note in the last changeset, I don't think that should be an issue. @Steve: Thanks for the pointer to 'bisect'. Also, in debugging this problem, I ran into the same MC146818 checkpointing/drain bug from before. Previously, I had written a patch to fix the problem. However, I just talked to Brad about it, and he mentioned that you were thinking about backing out that changeset (7559). Based on what I see in the code, it's unclear whether my patch does the right thing and I think the previous code is certainly more correct. I backed out that changeset locally, and it fixed the MC146818 problem, so my recommendation would be to back out that changeset in the repo. If we're in agreement, should I submit that patch for review? So, back to the bug at hand: After a fair amount of testing, it looks like the checkpoint restore problem was introduced somewhere between changeset 7674 and 7678. (it would probably be another 30-40 minutes of testing for me to identify exactly) @Nate: I tried changing the module to 'internal', but it gave the same error. Is it just about getting the correct imports? Thanks, Joel On Wed, Sep 15, 2010 at 5:55 PM, nathan binkert n...@binkert.org wrote: if system.getMemoryMode() != objects.params.timing: AttributeError: 'module' object has no attribute 'timing' I'm pretty sure that this is my fault. Can you change it to internal.params.timing and let me know if it works? Thanks, Nate ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Unable to checkpoint restore into detailed/timing CPU
are related? Thanks, Joel -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Unable to checkpoint restore into detailed/timing CPU
I would guess that the issues aren't related, but it's always difficult to be certain. Have you tried running it with valgrind? Valgrind shows an initialization error and what appears to be an interesting memcpy bug in packet handling in the cache tags, but it doesn't look like either are the cause of the seg fault (see attached). Joel -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness valgrind.out Description: Binary data ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] Statistics Output Conventions
Hi, I'm currently trying to leverage a Python script and McPAT to consume M5 statistics (stats.txt) and calculate power estimates for a simulated system. Stats variables in the SimObjects are lowerCamelCase according to the coding style, but it looks like the names output to the stats.txt file are mixed, either lowerCamelCase or lower_case_with_underscores. I'm wondering if there is a convention for statistic names that are output to the stats.txt that I (we) can be aiming for. Thanks, Joel -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
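Until a convention is settled, one workaround for a consumer of stats.txt is to normalize names before matching them. The sketch below converts lower_case_with_underscores components to lowerCamelCase while leaving dotted object paths alone. The function name toLowerCamel and the stat names used with it are illustrative assumptions, not names taken from M5.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical normalizer: drop each '_' and capitalize the character
// that follows it, except at the start of a name or right after a '.'.
inline std::string toLowerCamel(const std::string &name)
{
    std::string out;
    bool upperNext = false;
    for (char c : name) {
        if (c == '_') {
            upperNext = !out.empty() && out.back() != '.';
            continue;
        }
        if (upperNext)
            out += static_cast<char>(
                std::toupper(static_cast<unsigned char>(c)));
        else
            out += c;
        upperNext = false;
    }
    // Removing underscores can only shorten the name.
    assert(out.size() <= name.size());
    return out;
}
```

Names that are already lowerCamelCase pass through unchanged, so the same function can be applied to every key in the file.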
Re: [m5-dev] Review Request: IntDev: latency fix
On 2011-01-07 04:34:28, Gabe Black wrote: See review of the earlier IntDev patch. Basically this is displacing the latency value from the base class that uses it into the subclass that gets it from the config. I don't think it's necessary as described previously, but also that decentralizes a value that's always used in the same place for the same purpose. **Note that this patch removes the latency member from IntPort.** This patch doesn't indicate where the latency member should end up (I'll comment on that in the other review request). Regardless of where the latency is handled, the rest of the codebase indicates that a port should not be responsible for assessing latency (see mem/port.*, mem/tport.* and mem/mport.*), so this is why I removed latency from the IntPort definition. - Joel --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/384/#review641 --- On 2011-01-06 15:57:01, Brad Beckmann wrote: --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/384/ --- (Updated 2011-01-06 15:57:01) Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert. Summary --- IntDev: latency fix Since the device should be responsible for latency of packets, remove the latency field of the IntPort completely. Diffs - src/dev/x86/intdev.hh 9f9e10967912 Diff: http://reviews.m5sim.org/r/384/diff Testing --- Thanks, Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Review Request: MessagePort: implemented virtual recvTiming avoiding double delete
On 2011-01-07 04:21:05, Gabe Black wrote: I think there are two problems with this patch. First, if at all possible we should avoid the code duplication we'd now have for the recvTiming function. Second, while this probably does fix the legitimate problem of deleting packets twice, I think it creates a memory leak in the process. I suspect if you leave your other changes in place but get rid of your custom recvTiming function, things will still work. The packet won't be deleted by the device, won't be deleted after being received as a request in either atomic or timing mode, but will be deleted in both modes after being received as a response. The virtual you added in tport.hh could almost certainly go away then too. Brad Beckmann wrote: Joel is the one who actually wrote this patch, so hopefully he can elaborate on the possible memory leak. I'll hold off on this patch until he can respond. Actually, the double delete problem still exists if we removed the (almost) replicated recvTiming code. This is because pkt->needsResponse() returns false when the message type is MemCmd::MessageResp, which causes execution of the needsResponse else clause in SimpleTimingPort::recvTiming. It would be freed there, as well as in recvAtomic. I think when I tested this with Valgrind, I didn't see the memory leak (doesn't mean it doesn't exist). However, I don't think I was able to justify to myself why it didn't occur. I remember that I spent a while trying to figure out how to make this work nicely, but the inheritance SimpleTimingPort -> MessagePort -> IntPort, and the overloading that that implies makes this quite difficult to analyze. For instance, I'm still not clear why the new MemCmd, MessageReq/Resp, needed to be defined for this. On 2011-01-07 04:21:05, Gabe Black wrote: src/mem/tport.hh, line 145 http://reviews.m5sim.org/r/382/diff/1/?file=9048#file9048line145 Marking this as explicitly virtual shouldn't really be necessary. Is there a reason you want to? 
I think I had trouble compiling since MessagePort overloads recvTiming. In this patch, MessagePort would become the first (only) descendant class of SimpleTimingPort that overloads recvTiming. - Joel --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/382/#review639 --- On 2011-01-06 15:56:19, Brad Beckmann wrote: --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/382/ --- (Updated 2011-01-06 15:56:19) Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert. Summary --- MessagePort: implemented virtual recvTiming avoiding double delete Double packet delete problem is due to an interrupt device deleting a packet that the SimpleTimingPort also deletes. Since MessagePort descends from SimpleTimingPort, simply reimplement the failing code from SimpleTimingPort: recvTiming. Diffs - src/arch/x86/interrupts.cc 9f9e10967912 src/dev/x86/intdev.hh 9f9e10967912 src/mem/mport.hh 9f9e10967912 src/mem/mport.cc 9f9e10967912 src/mem/tport.hh 9f9e10967912 Diff: http://reviews.m5sim.org/r/382/diff Testing --- Thanks, Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Review Request: x86: page table walker functional support
On 2011-01-07 04:45:16, Gabe Black wrote: src/arch/x86/vtophys.cc, line 58 http://reviews.m5sim.org/r/385/diff/1/?file=9054#file9054line58 Better wording might be Need access to page tables. I like that change On 2011-01-07 04:45:16, Gabe Black wrote: src/arch/x86/vtophys.cc, line 70 http://reviews.m5sim.org/r/385/diff/1/?file=9054#file9054line70 Having a temporary variable here seems unnecessary unless it's to prevent having to wrap the next line. It's not a big deal, though. As far as I can tell, convention in ALL other code is to store the fault as a temporary variable, even if it could simply be pushed into the if-clause. On 2011-01-07 04:45:16, Gabe Black wrote: src/arch/x86/vtophys.cc, line 73 http://reviews.m5sim.org/r/385/diff/1/?file=9054#file9054line73 This is very suspicious. The request size was set to 0 when you constructed the request object, so this is anding the original address with -1. That doesn't do anything, so you're really just oring the addresses together. The TLB will already have taken care of any page offset/page number munging that you need. Actually, this whole function is suspect (not because of your code) since there's no guarantee code/data and/or different forms of data will be translated the same, or that flags aren't important. Brad Beckmann wrote: I agree, something seems off here. However, I'll let Joel respond before changing it. At least there needs to be a comment explaining why this calculation is necessary. The size field of the request is set in the functional portion of Walker::WalkerState::startWalk in my other patch for review. The physical address that is returned from vtophys needs to include the offset into the page, which in x86 can have multiple different sizes. The page table contains the information about the page size, so it needs to be passed in the request object through startFunctional(). - Joel --- This is an automatically generated e-mail. 
To reply, visit: http://reviews.m5sim.org/r/385/#review642 --- On 2011-01-06 15:59:24, Brad Beckmann wrote: --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/385/ --- (Updated 2011-01-06 15:59:24) Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert. Summary --- x86: page table walker functional support src/arch/x86/pagetable_walker.hh: Added method to functionally walk page table src/arch/x86/pagetable_walker.cc: Added method to functionally walk page table src/arch/x86/tlb.cc: Added method to return pointer to walker src/arch/x86/tlb.hh: Added method to return pointer to walker src/arch/x86/vtophys.cc: Calls walker to look up virt. to phys. page mapping Diffs - src/arch/x86/vtophys.cc 9f9e10967912 Diff: http://reviews.m5sim.org/r/385/diff Testing --- Thanks, Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
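The address calculation debated in this review can be sketched as follows, assuming the walk reports the page size in bytes and that it is a power of two. The function composePaddr and its parameter names are hypothetical and are not taken from vtophys.cc; the point is only that the offset mask must come from the actual page size, since a size of 0 would make the mask all ones.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: combine the physical page base found by the
// walk with the in-page offset of the virtual address. `pageSize` is
// the size reported by the page table (for example 4 KiB or 2 MiB).
inline std::uint64_t composePaddr(std::uint64_t pageBase,
                                  std::uint64_t vaddr,
                                  std::uint64_t pageSize)
{
    // The mask trick below only works for power-of-two page sizes.
    assert(pageSize != 0 && (pageSize & (pageSize - 1)) == 0);
    const std::uint64_t offsetMask = pageSize - 1;
    return (pageBase & ~offsetMask) | (vaddr & offsetMask);
}
```

With a 4 KiB page the low 12 bits of the virtual address are kept; with a 2 MiB page the low 21 bits are kept, which is why the size has to travel back from the walker with the translation.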
Re: [m5-dev] Review Request: x86: Timing support for pagetable walker
On 2011-01-07 05:51:30, Gabe Black wrote: The code seems ok, but why do we need to have multiple outstanding page walks in timing mode again? Gabe Black wrote: Actually, I wrote the above before I'd read it carefully. My question still stands, but there are some areas that need to be fixed up. Also, since translation is very much on the critical path, make sure you measure how much this change affects performance. I expect with the addition indirection at least there will be some slow down, and we should know what that is before we commit anything. In timing mode x86, if a memory address translation misses in the TLB AND happens to be an unaligned access (one that straddles a page boundary), the TLB promptly fires both of the requests to the page table walker. The old implementation of the walker doesn't support multiple outstanding requests, so it immediately crashes simulation with a state assertion failure (I asked a few questions about this in June and July, back when I made the changes to the walker). The implementation in this patch can queue the requests and service them sequentially. It should be a simple future extension to service them concurrently. I modeled this implementation after the ARM implementation in arch/arm/table_walker.*. Concerning the slowdown, the frequency of unaligned accesses that miss in the TLB is extremely rare (10 in seconds of simulated system time). Since timing mode doesn't work without this fix, there isn't a way to compare performance against a baseline. On 2011-01-07 05:51:30, Gabe Black wrote: src/arch/x86/pagetable_walker.hh, line 187 http://reviews.m5sim.org/r/396/diff/1/?file=9102#file9102line187 Why call this reqType instead of leaving it as mode? requests have types which are orthogonal to this, and it's called mode everywhere else. Good point. I'm not sure why I named it that. mode would be better. 
On 2011-01-07 05:51:30, Gabe Black wrote: src/arch/x86/pagetable_walker.cc, line 77 http://reviews.m5sim.org/r/396/diff/1/?file=9103#file9103line77 These should use FastAlloc if at all possible since they're on a critical path and the heap is slow. Is this as simple as having WalkerState inherit from FastAlloc? On 2011-01-07 05:51:30, Gabe Black wrote: src/arch/x86/pagetable_walker.cc, line 89 http://reviews.m5sim.org/r/396/diff/1/?file=9103#file9103line89 Memory leak. Well, that's embarrassing :P On 2011-01-07 05:51:30, Gabe Black wrote: src/arch/x86/pagetable_walker.cc, line 179 http://reviews.m5sim.org/r/396/diff/1/?file=9103#file9103line179 Is letting translations pass each other realistic? I worry we're making our walker artificially powerful. These loops will also slow things down potentially. This is an abstract implementation just to get the walker to work. It can be easily molded to order the requests appropriately. On the topic of slowdown, having more than one request in the queue is extremely rare, so any slowdown should be trivial. On 2011-01-07 05:51:30, Gabe Black wrote: src/arch/x86/pagetable_walker.cc, line 541 http://reviews.m5sim.org/r/396/diff/1/?file=9103#file9103line541 Declare this where it's used. I think I had other plans for this variable, but it doesn't look like I followed through. Moving it to line 566 should be a simple fix. On 2011-01-07 05:51:30, Gabe Black wrote: src/arch/x86/pagetable_walker.cc, line 574 http://reviews.m5sim.org/r/396/diff/1/?file=9103#file9103line574 Why is this pulled out into its own switch statement? That will slow down the code and makes things more complicated. As I recall, this was part of my intermediate solution to the unaligned access problem. These lines can be moved back to the previous locations. - Joel --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/396/#review649 --- On 2011-01-06 16:12:34, Brad Beckmann wrote: --- This is an automatically generated e-mail. 
To reply, visit: http://reviews.m5sim.org/r/396/ --- (Updated 2011-01-06 16:12:34) Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert. Summary --- x86: Timing support for pagetable walker Move page table walker state to its own object type, and make the walker instantiate state for each outstanding walk. By storing the states in a queue, the walker is able to handle multiple outstanding timing requests. Note that functional walks use separate state elements. Diffs - src/arch/x86/pagetable_walker.hh 9f9e10967912 src/arch/x86/pagetable_walker.cc 9f9e10967912 src/arch/x86/tlb.hh 9f9e10967912 src/arch/x86/tlb.cc 9f9e10967912
Re: [m5-dev] Error in Simulating Mesh Network
Hi Nilay, I believe that this error is fixed in one of the patches that I worked on while at AMD. Brad has pushed it up for review: http://reviews.m5sim.org/r/381/. It's a one line fix. Hope this helps, Joel On Thu, Jan 20, 2011 at 8:15 AM, Nilay Vaish ni...@cs.wisc.edu wrote: Brad, I tried simulating a mesh network with four processors. ./build/ALPHA_FS_MOESI_hammer/m5.prof ./configs/example/ruby_fs.py --maxtick 2000 -n 4 --topology Mesh --mesh-rows 2 --num-l2cache 4 --num-dir 4 I receive the following error: panic: FIFO ordering violated: [MessageBuffer: consumer-yes [ [71227521, 870, 1; ] ]] [Version 1, L1Cache, triggerQueue_in] name: [Version 1, L1Cache, triggerQueue_in] current time: 71227512 delta: 1 arrival_time: 71227513 last arrival_time: 71227521 @ cycle 35613756000 [enqueue:build/ALPHA_FS_MOESI_hammer/mem/ruby/buffers/MessageBuffer.cc, line 198] Do you think that the options I have specified should work correctly? Thanks Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] changeset in m5: checkpointing: fix bug from curTick accessor co...
I like this idea a lot. Not only would it solve the SERIALIZE_* vs. paramOut usage problem, but it would also decouple the variable name in the code from the name written to the checkpoint. If used intelligently, this could alleviate some of the pain of fixing old checkpoints when the code changes.

Joel

On Fri, Jan 21, 2011 at 12:57 AM, Gabe Black gbl...@eecs.umich.edu wrote:
> From time to time it seems to me that we need to serialize something but call it something other than its variable name. Would it make sense to add SERIALIZE_*_AS macros that take a name argument as well? It's not that hard to create a temporary variable or use those param functions directly, but it would at least make things look more consistent to always (or almost always) use SERIALIZE_FOO.
>
> Gabe
>
> On 01/20/11 22:11, Steve Reinhardt wrote:
>> changeset 494b5426e70d in /z/repo/m5
>> details: http://repo.m5sim.org/m5?cmd=changeset;node=494b5426e70d
>> description:
>>     checkpointing: fix bug from curTick accessor conversion.
>>
>>     Regex replacement of curTick with curTick() accidentally changed
>>     checkpoint key string for serialization but not for unserialization.
>>
>> diffstat:
>>
>>  src/sim/serialize.cc | 2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diffs (12 lines):
>>
>> diff -r f84bfd45d607 -r 494b5426e70d src/sim/serialize.cc
>> --- a/src/sim/serialize.cc Wed Jan 19 16:22:23 2011 -0800
>> +++ b/src/sim/serialize.cc Thu Jan 20 22:13:33 2011 -0800
>> @@ -400,7 +400,7 @@
>>  Globals::serialize(ostream &os)
>>  {
>>      nameOut(os);
>> -    SERIALIZE_SCALAR(curTick());
>> +    paramOut(os, "curTick", curTick());
>>
>>      nameOut(os, "MainEventQueue");
>>      mainEventQueue.serialize(os);
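The SERIALIZE_*_AS idea from this thread could be sketched roughly as below. This is only an illustration under stated assumptions: the `paramOut()` here is a simplified stand-in that writes `name=value` lines (not M5's real implementation), `SERIALIZE_SCALAR_AS` is a hypothetical macro name, and `curTick()` is a dummy accessor returning a fixed value.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Simplified stand-in for paramOut(): writes one "name=value" line.
// (Assumption: the real function has more overloads and formatting.)
template <typename T>
void
paramOut(std::ostream &os, const std::string &name, const T &value)
{
    assert(!name.empty());  // a checkpoint key must not be empty
    os << name << "=" << value << "\n";
}

// Existing style: the checkpoint key is the stringified expression,
// so renaming or rewriting the expression silently changes the key.
#define SERIALIZE_SCALAR(scalar) paramOut(os, #scalar, scalar)

// Proposed style (hypothetical name): the key is given explicitly,
// decoupling it from whatever expression produces the value.
#define SERIALIZE_SCALAR_AS(name, scalar) paramOut(os, name, scalar)

typedef unsigned long long Tick;

// Dummy accessor standing in for the simulator's curTick().
inline Tick curTick() { return 1000; }

// Writes the current tick under the stable key "curTick".
std::string
serializeGlobals()
{
    std::ostringstream os;
    SERIALIZE_SCALAR_AS("curTick", curTick());
    return os.str();
}

// Shows the key the stringifying macro produces for the same expression.
std::string
serializeGlobalsOldStyle()
{
    std::ostringstream os;
    SERIALIZE_SCALAR(curTick());
    return os.str();
}
```

With the explicit-name form the key stays `curTick` regardless of how the value is obtained, whereas the stringifying form yields the key `curTick()`, which is the kind of mismatch the changeset above had to repair.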
[m5-dev] changeset in m5: IntDev: packet latency fix
changeset 8b05ff5ef958 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=8b05ff5ef958
description:
    IntDev: packet latency fix

    The x86 local apic now includes a separate latency parameter for interrupts.

diffstat:

 src/arch/x86/X86LocalApic.py | 2 ++
 src/arch/x86/interrupts.cc   | 3 ++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diffs (22 lines):

diff -r 38eca2df1124 -r 8b05ff5ef958 src/arch/x86/X86LocalApic.py
--- a/src/arch/x86/X86LocalApic.py Sun Feb 06 22:14:17 2011 -0800
+++ b/src/arch/x86/X86LocalApic.py Sun Feb 06 22:14:17 2011 -0800
@@ -34,3 +34,5 @@
     cxx_class = 'X86ISA::Interrupts'
     pio_latency = Param.Latency('1ns', 'Programmed IO latency in simticks')
     int_port = Port("Port for sending and receiving interrupt messages")
+    int_latency = Param.Latency('1ns', \
+            "Latency for an interrupt to propagate through this device.")
diff -r 38eca2df1124 -r 8b05ff5ef958 src/arch/x86/interrupts.cc
--- a/src/arch/x86/interrupts.cc Sun Feb 06 22:14:17 2011 -0800
+++ b/src/arch/x86/interrupts.cc Sun Feb 06 22:14:17 2011 -0800
@@ -595,7 +595,8 @@

 X86ISA::Interrupts::Interrupts(Params * p) :
-    BasicPioDevice(p), IntDev(this), latency(p->pio_latency), clock(0),
+    BasicPioDevice(p), IntDev(this, p->int_latency), latency(p->pio_latency),
+    clock(0),
     apicTimerEvent(this),
     pendingSmi(false), smiVector(0),
     pendingNmi(false), nmiVector(0),
[m5-dev] changeset in m5: x86: implements vtophys
changeset f9b675da608a in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=f9b675da608a
description:
    x86: implements vtophys

    Calls walker to look up virt. to phys. page mapping

diffstat:

 src/arch/x86/pagetable_walker.hh |  1 +
 src/arch/x86/system.cc           |  1 +
 src/arch/x86/vtophys.cc          | 29 ++++++++++++++++++++++++++---
 src/arch/x86/vtophys.hh          |  3 ---
 4 files changed, 28 insertions(+), 6 deletions(-)

diffs (87 lines):

diff -r 8b05ff5ef958 -r f9b675da608a src/arch/x86/pagetable_walker.hh
--- a/src/arch/x86/pagetable_walker.hh Sun Feb 06 22:14:17 2011 -0800
+++ b/src/arch/x86/pagetable_walker.hh Sun Feb 06 22:14:17 2011 -0800
@@ -48,6 +48,7 @@
 #include "mem/mem_object.hh"
 #include "mem/packet.hh"
 #include "params/X86PagetableWalker.hh"
+#include "sim/faults.hh"

 class ThreadContext;

diff -r 8b05ff5ef958 -r f9b675da608a src/arch/x86/system.cc
--- a/src/arch/x86/system.cc Sun Feb 06 22:14:17 2011 -0800
+++ b/src/arch/x86/system.cc Sun Feb 06 22:14:17 2011 -0800
@@ -39,6 +39,7 @@

 #include "arch/x86/bios/smbios.hh"
 #include "arch/x86/bios/intelmp.hh"
+#include "arch/x86/isa_traits.hh"
 #include "arch/x86/regs/misc.hh"
 #include "arch/x86/system.hh"
 #include "arch/vtophys.hh"
diff -r 8b05ff5ef958 -r f9b675da608a src/arch/x86/vtophys.cc
--- a/src/arch/x86/vtophys.cc Sun Feb 06 22:14:17 2011 -0800
+++ b/src/arch/x86/vtophys.cc Sun Feb 06 22:14:17 2011 -0800
@@ -39,19 +39,42 @@

 #include <string>

+#include "arch/x86/pagetable_walker.hh"
+#include "arch/x86/tlb.hh"
 #include "arch/x86/vtophys.hh"
+#include "base/trace.hh"
+#include "config/full_system.hh"
+#include "cpu/thread_context.hh"
+#include "sim/fault.hh"

 using namespace std;

 namespace X86ISA
 {

-Addr vtophys(Addr vaddr)
+Addr
+vtophys(Addr vaddr)
 {
+#if FULL_SYSTEM
+    panic("Need access to page tables\n");
+#endif
     return vaddr;
 }

-Addr vtophys(ThreadContext *tc, Addr addr)
+Addr
+vtophys(ThreadContext *tc, Addr vaddr)
 {
-    return addr;
+#if FULL_SYSTEM
+    Walker *walker = tc->getDTBPtr()->getWalker();
+    Addr size;
+    Addr addr = vaddr;
+    Fault fault = walker->startFunctional(tc, addr, size, BaseTLB::Read);
+    if (fault != NoFault)
+        panic("vtophys page walk returned fault\n");
+    Addr masked_addr = vaddr & (size - 1);
+    Addr paddr = addr | masked_addr;
+    DPRINTF(VtoPhys, "vtophys(%#x) -> %#x\n", vaddr, paddr);
+    return paddr;
+#endif
+    return vaddr;
 }

 }
diff -r 8b05ff5ef958 -r f9b675da608a src/arch/x86/vtophys.hh
--- a/src/arch/x86/vtophys.hh Sun Feb 06 22:14:17 2011 -0800
+++ b/src/arch/x86/vtophys.hh Sun Feb 06 22:14:17 2011 -0800
@@ -40,12 +40,9 @@
 #ifndef __ARCH_X86_VTOPHYS_HH__
 #define __ARCH_X86_VTOPHYS_HH__

-#include "arch/x86/isa_traits.hh"
-#include "arch/x86/pagetable.hh"
 #include "base/types.hh"

 class ThreadContext;
-class FunctionalPort;

 namespace X86ISA
 {
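The address arithmetic at the end of the new vtophys() can be looked at in isolation. The sketch below assumes (this is not confirmed by the changeset itself) that the functional walk leaves a page-aligned physical base in `addr` and the page size in `size`; `composePhysAddr` is a hypothetical helper name used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t Addr;

// Combine a page-aligned physical base with the in-page offset of a
// virtual address. pageSize must be a power of two (for example 4 KB
// or 2 MB), so that (pageSize - 1) is a mask of the offset bits.
inline Addr
composePhysAddr(Addr pageBase, Addr vaddr, Addr pageSize)
{
    // Power-of-two check: exactly one bit set.
    assert(pageSize != 0 && (pageSize & (pageSize - 1)) == 0);
    // The base must not overlap the offset bits.
    assert((pageBase & (pageSize - 1)) == 0);

    Addr offset = vaddr & (pageSize - 1);  // keep only the offset bits
    return pageBase | offset;              // splice offset into the base
}
```

For a 4 KB page only the low 12 bits of the virtual address survive the mask; for a 2 MB page the low 21 bits do, which is why the page size has to come back from the walk along with the base.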
[m5-dev] changeset in m5: Ruby: Add support for locked memory accesses in...
changeset 4e83ebb67794 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=4e83ebb67794 description: Ruby: Add support for locked memory accesses in X86_FS diffstat: src/mem/ruby/libruby.cc | 8 ++ src/mem/ruby/libruby.hh | 2 + src/mem/ruby/system/DMASequencer.cc | 2 + src/mem/ruby/system/RubyPort.cc | 36 -- src/mem/ruby/system/Sequencer.cc| 43 +--- 5 files changed, 70 insertions(+), 21 deletions(-) diffs (213 lines): diff -r d648b8409d4c -r 4e83ebb67794 src/mem/ruby/libruby.cc --- a/src/mem/ruby/libruby.cc Sun Feb 06 22:14:18 2011 -0800 +++ b/src/mem/ruby/libruby.cc Sun Feb 06 22:14:18 2011 -0800 @@ -58,6 +58,10 @@ return RMW_Read; case RubyRequestType_RMW_Write: return RMW_Write; + case RubyRequestType_Locked_RMW_Read: +return Locked_RMW_Read; + case RubyRequestType_Locked_RMW_Write: +return Locked_RMW_Write; case RubyRequestType_NULL: default: assert(0); @@ -82,6 +86,10 @@ return RubyRequestType_RMW_Read; else if (str == RMW_Write) return RubyRequestType_RMW_Write; +else if (str == Locked_RMW_Read) +return RubyRequestType_Locked_RMW_Read; +else if (str == Locked_RMW_Write) +return RubyRequestType_Locked_RMW_Write; else assert(0); return RubyRequestType_NULL; diff -r d648b8409d4c -r 4e83ebb67794 src/mem/ruby/libruby.hh --- a/src/mem/ruby/libruby.hh Sun Feb 06 22:14:18 2011 -0800 +++ b/src/mem/ruby/libruby.hh Sun Feb 06 22:14:18 2011 -0800 @@ -44,6 +44,8 @@ RubyRequestType_Store_Conditional, RubyRequestType_RMW_Read, RubyRequestType_RMW_Write, + RubyRequestType_Locked_RMW_Read, + RubyRequestType_Locked_RMW_Write, RubyRequestType_NUM }; diff -r d648b8409d4c -r 4e83ebb67794 src/mem/ruby/system/DMASequencer.cc --- a/src/mem/ruby/system/DMASequencer.cc Sun Feb 06 22:14:18 2011 -0800 +++ b/src/mem/ruby/system/DMASequencer.cc Sun Feb 06 22:14:18 2011 -0800 @@ -70,6 +70,8 @@ case RubyRequestType_Store_Conditional: case RubyRequestType_RMW_Read: case RubyRequestType_RMW_Write: + case RubyRequestType_Locked_RMW_Read: + case RubyRequestType_Locked_RMW_Write: 
case RubyRequestType_NUM: panic(DMASequencer::makeRequest does not support RubyRequestType); return RequestStatus_NULL; diff -r d648b8409d4c -r 4e83ebb67794 src/mem/ruby/system/RubyPort.cc --- a/src/mem/ruby/system/RubyPort.cc Sun Feb 06 22:14:18 2011 -0800 +++ b/src/mem/ruby/system/RubyPort.cc Sun Feb 06 22:14:18 2011 -0800 @@ -26,6 +26,10 @@ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ +#include config/the_isa.hh +#if THE_ISA == X86_ISA +#include arch/x86/insts/microldstop.hh +#endif // X86_ISA #include cpu/testers/rubytest/RubyTester.hh #include mem/physical.hh #include mem/ruby/slicc_interface/AbstractController.hh @@ -201,22 +205,38 @@ assert(pkt-isRead()); type = RubyRequestType_Load_Linked; } +} else if (pkt-req-isLocked()) { +if (pkt-isWrite()) { +DPRINTF(MemoryAccess, Issuing Locked RMW Write\n); +type = RubyRequestType_Locked_RMW_Write; +} else { +DPRINTF(MemoryAccess, Issuing Locked RMW Read\n); +assert(pkt-isRead()); +type = RubyRequestType_Locked_RMW_Read; +} } else { if (pkt-isRead()) { if (pkt-req-isInstFetch()) { type = RubyRequestType_IFETCH; } else { -type = RubyRequestType_LD; +#if THE_ISA == X86_ISA +uint32_t flags = pkt-req-getFlags(); +bool storeCheck = flags +(TheISA::StoreCheck TheISA::FlagShift); +#else +bool storeCheck = false; +#endif // X86_ISA +if (storeCheck) { +type = RubyRequestType_RMW_Read; +} else { +type = RubyRequestType_LD; +} } } else if (pkt-isWrite()) { +// +// Note: M5 packets do not differentiate ST from RMW_Write +// type = RubyRequestType_ST; -} else if (pkt-isReadWrite()) { -// Fix me. This conditional will never be executed -// because isReadWrite() is just an OR of isRead() and -// isWrite(). Furthermore, just because the packet is a -// read/write request does not necessary mean it is a -// read-modify-write atomic operation. 
-type = RubyRequestType_RMW_Write; } else { panic(Unsupported ruby packet type\n); } diff -r d648b8409d4c -r 4e83ebb67794 src/mem/ruby/system/Sequencer.cc --- a/src/mem/ruby/system/Sequencer.cc Sun Feb 06 22:14:18 2011 -0800 +++
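The request-type selection that this changeset adds to RubyPort.cc can be summarized in a small standalone sketch. The enum and `PacketInfo` struct below are hypothetical, reduced stand-ins for the Ruby request types and the M5 packet queries named in the diff, and the load-linked/store-conditional branch that precedes the locked check in the real code is left out for brevity.

```cpp
#include <cassert>

// Reduced, hypothetical version of the request types named in the diff.
enum RequestType {
    Req_IFETCH,
    Req_LD,
    Req_ST,
    Req_RMW_Read,
    Req_Locked_RMW_Read,
    Req_Locked_RMW_Write
};

// Hypothetical summary of the packet properties the diff inspects.
struct PacketInfo {
    bool isRead;       // packet is a read
    bool isWrite;      // packet is a write
    bool isLocked;     // request carries the locked flag
    bool isInstFetch;  // read is an instruction fetch
    bool storeCheck;   // x86 StoreCheck flag set on the request
};

// Map packet properties to a request type, following the order of the
// checks in the diff: locked accesses first, then reads, then writes.
RequestType
classify(const PacketInfo &pkt)
{
    assert(pkt.isRead != pkt.isWrite);  // sketch handles pure reads/writes

    if (pkt.isLocked)
        return pkt.isWrite ? Req_Locked_RMW_Write : Req_Locked_RMW_Read;

    if (pkt.isRead) {
        if (pkt.isInstFetch)
            return Req_IFETCH;
        // A read with StoreCheck set is the read half of an RMW.
        return pkt.storeCheck ? Req_RMW_Read : Req_LD;
    }

    // Plain writes are not distinguished from RMW writes at this level.
    return Req_ST;
}
```

As the comment in the diff notes, M5 packets do not differentiate ST from RMW_Write, which is why every unlocked write falls through to the plain store type in this sketch as well.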
[m5-dev] changeset in m5: Ruby: Fix to return cache block size to CPU for...
changeset eee578ed2130 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=eee578ed2130
description:
    Ruby: Fix to return cache block size to CPU for split data transfers

diffstat:

 src/mem/ruby/system/RubyPort.cc | 6 ++++++
 src/mem/ruby/system/RubyPort.hh | 2 ++
 2 files changed, 8 insertions(+), 0 deletions(-)

diffs (32 lines):

diff -r 4e83ebb67794 -r eee578ed2130 src/mem/ruby/system/RubyPort.cc
--- a/src/mem/ruby/system/RubyPort.cc Sun Feb 06 22:14:18 2011 -0800
+++ b/src/mem/ruby/system/RubyPort.cc Sun Feb 06 22:14:18 2011 -0800
@@ -370,3 +370,9 @@
     }
     return false;
 }
+
+unsigned
+RubyPort::M5Port::deviceBlockSize() const
+{
+    return (unsigned) RubySystem::getBlockSizeBytes();
+}
diff -r 4e83ebb67794 -r eee578ed2130 src/mem/ruby/system/RubyPort.hh
--- a/src/mem/ruby/system/RubyPort.hh Sun Feb 06 22:14:18 2011 -0800
+++ b/src/mem/ruby/system/RubyPort.hh Sun Feb 06 22:14:18 2011 -0800
@@ -36,6 +36,7 @@
 #include "mem/physical.hh"
 #include "mem/protocol/RequestStatus.hh"
 #include "mem/ruby/libruby.hh"
+#include "mem/ruby/system/System.hh"
 #include "mem/tport.hh"
 #include "params/RubyPort.hh"

@@ -54,6 +55,7 @@
         M5Port(const std::string &_name, RubyPort *_port);
         bool sendTiming(PacketPtr pkt);
         void hitCallback(PacketPtr pkt);
+        unsigned deviceBlockSize() const;

       protected:
         virtual bool recvTiming(PacketPtr pkt);
[m5-dev] changeset in m5: x86: Add checkpointing capability to devices
changeset 7fcfb515d7bf in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=7fcfb515d7bf description: x86: Add checkpointing capability to devices Add checkpointing capability to the Intel 8254 timer, CMOS, I8042, PS2 Keyboard and Mouse, I82094AA, I8237, I8254, I8259, and speaker devices diffstat: src/dev/intel_8254_timer.cc |4 +- src/dev/x86/cmos.cc | 20 src/dev/x86/cmos.hh |4 + src/dev/x86/i8042.cc| 109 src/dev/x86/i8042.hh| 11 src/dev/x86/i82094aa.cc | 29 +++ src/dev/x86/i82094aa.hh |3 + src/dev/x86/i8237.cc| 14 +- src/dev/x86/i8237.hh|3 + src/dev/x86/i8254.cc| 12 src/dev/x86/i8254.hh|4 + src/dev/x86/i8259.cc| 36 ++ src/dev/x86/i8259.hh|3 + src/dev/x86/speaker.cc | 15 ++ src/dev/x86/speaker.hh |4 + 15 files changed, 269 insertions(+), 2 deletions(-) diffs (truncated from 440 to 300 lines): diff -r aafb4a7384d4 -r 7fcfb515d7bf src/dev/intel_8254_timer.cc --- a/src/dev/intel_8254_timer.cc Sun Feb 06 22:14:17 2011 -0800 +++ b/src/dev/intel_8254_timer.cc Sun Feb 06 22:14:18 2011 -0800 @@ -247,7 +247,9 @@ paramIn(cp, section, base + .read_byte, read_byte); paramIn(cp, section, base + .write_byte, write_byte); -Tick event_tick; +Tick event_tick = 0; +if (event.scheduled()) +parent-deschedule(event); paramIn(cp, section, base + .event_tick, event_tick); if (event_tick) parent-schedule(event, event_tick); diff -r aafb4a7384d4 -r 7fcfb515d7bf src/dev/x86/cmos.cc --- a/src/dev/x86/cmos.cc Sun Feb 06 22:14:17 2011 -0800 +++ b/src/dev/x86/cmos.cc Sun Feb 06 22:14:18 2011 -0800 @@ -111,6 +111,26 @@ } } +void +X86ISA::Cmos::serialize(std::ostream os) +{ +SERIALIZE_SCALAR(address); +SERIALIZE_ARRAY(regs, numRegs); + +// Serialize the timer +rtc.serialize(rtc, os); +} + +void +X86ISA::Cmos::unserialize(Checkpoint *cp, const std::string section) +{ +UNSERIALIZE_SCALAR(address); +UNSERIALIZE_ARRAY(regs, numRegs); + +// Serialize the timer +rtc.unserialize(rtc, cp, section); +} + X86ISA::Cmos * CmosParams::create() { diff -r aafb4a7384d4 -r 7fcfb515d7bf 
src/dev/x86/cmos.hh --- a/src/dev/x86/cmos.hh Sun Feb 06 22:14:17 2011 -0800 +++ b/src/dev/x86/cmos.hh Sun Feb 06 22:14:18 2011 -0800 @@ -82,6 +82,10 @@ Tick read(PacketPtr pkt); Tick write(PacketPtr pkt); + +virtual void serialize(std::ostream os); +virtual void unserialize(Checkpoint *cp, const std::string section); + }; } // namespace X86ISA diff -r aafb4a7384d4 -r 7fcfb515d7bf src/dev/x86/i8042.cc --- a/src/dev/x86/i8042.cc Sun Feb 06 22:14:17 2011 -0800 +++ b/src/dev/x86/i8042.cc Sun Feb 06 22:14:18 2011 -0800 @@ -439,6 +439,115 @@ return latency; } +void +X86ISA::I8042::serialize(std::ostream os) +{ +uint8_t statusRegData = statusReg.__data; +uint8_t commandByteData = commandByte.__data; + +SERIALIZE_SCALAR(dataPort); +SERIALIZE_SCALAR(commandPort); +SERIALIZE_SCALAR(statusRegData); +SERIALIZE_SCALAR(commandByteData); +SERIALIZE_SCALAR(dataReg); +SERIALIZE_SCALAR(lastCommand); +mouse.serialize(mouse, os); +keyboard.serialize(keyboard, os); +} + +void +X86ISA::I8042::unserialize(Checkpoint *cp, const std::string section) +{ +uint8_t statusRegData; +uint8_t commandByteData; + +UNSERIALIZE_SCALAR(dataPort); +UNSERIALIZE_SCALAR(commandPort); +UNSERIALIZE_SCALAR(statusRegData); +UNSERIALIZE_SCALAR(commandByteData); +UNSERIALIZE_SCALAR(dataReg); +UNSERIALIZE_SCALAR(lastCommand); +mouse.unserialize(mouse, cp, section); +keyboard.unserialize(keyboard, cp, section); + +statusReg.__data = statusRegData; +commandByte.__data = commandByteData; +} + +void +X86ISA::PS2Keyboard::serialize(const std::string base, std::ostream os) +{ +paramOut(os, base + .lastCommand, lastCommand); +int bufferSize = outBuffer.size(); +paramOut(os, base + .outBuffer.size, bufferSize); +uint8_t *buffer = new uint8_t[bufferSize]; +for (int i = 0; i bufferSize; ++i) { +buffer[i] = outBuffer.front(); +outBuffer.pop(); +} +arrayParamOut(os, base + .outBuffer.elts, buffer, +bufferSize*sizeof(uint8_t)); +delete buffer; +} + +void +X86ISA::PS2Keyboard::unserialize(const std::string base, Checkpoint 
*cp, +const std::string section) +{ +paramIn(cp, section, base + .lastCommand, lastCommand); +int bufferSize; +paramIn(cp, section, base + .outBuffer.size, bufferSize); +uint8_t *buffer = new uint8_t[bufferSize]; +arrayParamIn(cp, section, base + .outBuffer.elts, buffer, +bufferSize*sizeof(uint8_t)); +for (int i = 0; i bufferSize; ++i) { +
[m5-dev] changeset in m5: x86: Timing support for pagetable walker
changeset a9f05ab40763 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=a9f05ab40763 description: x86: Timing support for pagetable walker Move page table walker state to its own object type, and make the walker instantiate state for each outstanding walk. By storing the states in a queue, the walker is able to handle multiple outstanding timing requests. Note that functional walks use separate state elements. diffstat: src/arch/x86/pagetable_walker.cc | 922 ++ src/arch/x86/pagetable_walker.hh | 181 --- src/arch/x86/tlb.cc |6 + src/arch/x86/tlb.hh |2 + 4 files changed, 647 insertions(+), 464 deletions(-) diffs (truncated from 1247 to 300 lines): diff -r 267e1e16e51b -r a9f05ab40763 src/arch/x86/pagetable_walker.cc --- a/src/arch/x86/pagetable_walker.cc Sun Feb 06 22:14:18 2011 -0800 +++ b/src/arch/x86/pagetable_walker.cc Sun Feb 06 22:14:18 2011 -0800 @@ -40,6 +40,7 @@ #include arch/x86/pagetable.hh #include arch/x86/pagetable_walker.hh #include arch/x86/tlb.hh +#include arch/x86/vtophys.hh #include base/bitfield.hh #include cpu/thread_context.hh #include cpu/base.hh @@ -67,328 +68,36 @@ EndBitUnion(PageTableEntry) Fault -Walker::doNext(PacketPtr write) +Walker::start(ThreadContext * _tc, BaseTLB::Translation *_translation, + RequestPtr _req, BaseTLB::Mode _mode) { -assert(state != Ready state != Waiting); -write = NULL; -PageTableEntry pte; -if (size == 8) -pte = read-getuint64_t(); -else -pte = read-getuint32_t(); -VAddr vaddr = entry.vaddr; -bool uncacheable = pte.pcd; -Addr nextRead = 0; -bool doWrite = false; -bool badNX = pte.nx mode == BaseTLB::Execute enableNX; -switch(state) { - case LongPML4: -DPRINTF(PageTableWalker, -Got long mode PML4 entry %#016x.\n, (uint64_t)pte); -nextRead = ((uint64_t)pte (mask(40) 12)) + vaddr.longl3 * size; -doWrite = !pte.a; -pte.a = 1; -entry.writable = pte.w; -entry.user = pte.u; -if (badNX || !pte.p) { -stop(); -return pageFault(pte.p); +// TODO: in timing mode, instead of blocking when there are other +// 
outstanding requests, see if this request can be coalesced with +// another one (i.e. either coalesce or start walk) +WalkerState * newState = new WalkerState(this, _translation, _req); +newState-initState(_tc, _mode, sys-getMemoryMode() == Enums::timing); +if (currStates.size()) { +assert(newState-isTiming()); +DPRINTF(PageTableWalker, Walks in progress: %d\n, currStates.size()); +currStates.push_back(newState); +return NoFault; +} else { +currStates.push_back(newState); +Fault fault = newState-startWalk(); +if (!newState-isTiming()) { +currStates.pop_front(); +delete newState; } -entry.noExec = pte.nx; -nextState = LongPDP; -break; - case LongPDP: -DPRINTF(PageTableWalker, -Got long mode PDP entry %#016x.\n, (uint64_t)pte); -nextRead = ((uint64_t)pte (mask(40) 12)) + vaddr.longl2 * size; -doWrite = !pte.a; -pte.a = 1; -entry.writable = entry.writable pte.w; -entry.user = entry.user pte.u; -if (badNX || !pte.p) { -stop(); -return pageFault(pte.p); -} -nextState = LongPD; -break; - case LongPD: -DPRINTF(PageTableWalker, -Got long mode PD entry %#016x.\n, (uint64_t)pte); -doWrite = !pte.a; -pte.a = 1; -entry.writable = entry.writable pte.w; -entry.user = entry.user pte.u; -if (badNX || !pte.p) { -stop(); -return pageFault(pte.p); -} -if (!pte.ps) { -// 4 KB page -entry.size = 4 * (1 10); -nextRead = -((uint64_t)pte (mask(40) 12)) + vaddr.longl1 * size; -nextState = LongPTE; -break; -} else { -// 2 MB page -entry.size = 2 * (1 20); -entry.paddr = (uint64_t)pte (mask(31) 21); -entry.uncacheable = uncacheable; -entry.global = pte.g; -entry.patBit = bits(pte, 12); -entry.vaddr = entry.vaddr ~((2 * (1 20)) - 1); -tlb-insert(entry.vaddr, entry); -stop(); -return NoFault; -} - case LongPTE: -DPRINTF(PageTableWalker, -Got long mode PTE entry %#016x.\n, (uint64_t)pte); -doWrite = !pte.a; -pte.a = 1; -entry.writable = entry.writable pte.w; -entry.user = entry.user pte.u; -if (badNX || !pte.p) { -stop(); -return pageFault(pte.p); -} -
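The queueing behaviour described in this changeset (start a walk immediately when the walker is idle, otherwise park it behind the walks already in flight) can be sketched independently of the simulator. `WalkState` and `WalkQueue` below are hypothetical stand-ins, and the rule that finishing the head walk starts the next queued one is an assumption, since that part of the diff is truncated above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for the per-walk state object.
struct WalkState {
    int id;        // identifies the translation being walked
    bool started;  // true once the walk has begun issuing reads
};

class WalkQueue
{
    std::deque<WalkState> walks;  // head is the walk in progress

  public:
    // Register a new walk. Returns true if it started immediately
    // (the walker was idle), false if it was queued behind others.
    bool
    start(int id)
    {
        bool idle = walks.empty();
        walks.push_back(WalkState{id, idle});
        return idle;
    }

    // Retire the walk at the head of the queue and start the next one,
    // if any. Returns the id of the newly started walk, or -1 if the
    // queue is now empty. (Assumed policy: strict FIFO order.)
    int
    finishFront()
    {
        assert(!walks.empty());
        walks.pop_front();
        if (walks.empty())
            return -1;
        walks.front().started = true;
        return walks.front().id;
    }

    // Number of walks that are in progress or waiting.
    std::size_t outstanding() const { return walks.size(); }
};
```

A strict FIFO like this never lets translations pass each other, which is one way to address the ordering concern raised in the review of this patch.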
[m5-dev] changeset in m5: TimingSimpleCPU: split data sender state fix
changeset 267e1e16e51b in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=267e1e16e51b
description:
    TimingSimpleCPU: split data sender state fix

    In sendSplitData, keep a pointer to the senderState that may be updated after
    the call to handle*Packet. This way, if the receiver updates the packet
    senderState, it can still be accessed in sendSplitData.

diffstat:

 src/cpu/simple/timing.cc | 16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diffs (38 lines):

diff -r 8a92b39be50e -r 267e1e16e51b src/cpu/simple/timing.cc
--- a/src/cpu/simple/timing.cc Sun Feb 06 22:14:18 2011 -0800
+++ b/src/cpu/simple/timing.cc Sun Feb 06 22:14:18 2011 -0800
@@ -325,26 +325,26 @@
         pkt1->makeResponse();
         completeDataAccess(pkt1);
     } else if (read) {
+        SplitFragmentSenderState * send_state =
+            dynamic_cast<SplitFragmentSenderState *>(pkt1->senderState);
         if (handleReadPacket(pkt1)) {
-            SplitFragmentSenderState * send_state =
-                dynamic_cast<SplitFragmentSenderState *>(pkt1->senderState);
             send_state->clearFromParent();
+            send_state = dynamic_cast<SplitFragmentSenderState *>(
+                    pkt2->senderState);
             if (handleReadPacket(pkt2)) {
-                send_state = dynamic_cast<SplitFragmentSenderState *>(
-                        pkt1->senderState);
                 send_state->clearFromParent();
             }
         }
     } else {
         dcache_pkt = pkt1;
+        SplitFragmentSenderState * send_state =
+            dynamic_cast<SplitFragmentSenderState *>(pkt1->senderState);
         if (handleWritePacket()) {
-            SplitFragmentSenderState * send_state =
-                dynamic_cast<SplitFragmentSenderState *>(pkt1->senderState);
             send_state->clearFromParent();
             dcache_pkt = pkt2;
+            send_state = dynamic_cast<SplitFragmentSenderState *>(
+                    pkt2->senderState);
             if (handleWritePacket()) {
-                send_state = dynamic_cast<SplitFragmentSenderState *>(
-                        pkt1->senderState);
                 send_state->clearFromParent();
             }
         }
[m5-dev] changeset in m5: garnet: Split network power in ruby.stats
changeset 3a02353d6e43 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=3a02353d6e43 description: garnet: Split network power in ruby.stats Split out dynamic and static power numbers for printing to ruby.stats diffstat: src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.cc | 12 +++ src/mem/ruby/network/garnet/fixed-pipeline/NetworkLink_d.hh | 5 src/mem/ruby/network/garnet/fixed-pipeline/Router_d.hh| 6 + src/mem/ruby/network/orion/NetworkPower.cc| 7 ++ 4 files changed, 30 insertions(+), 0 deletions(-) diffs (111 lines): diff -r 409a2692b8e6 -r 3a02353d6e43 src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.cc --- a/src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.cc Sun Feb 06 22:14:19 2011 -0800 +++ b/src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.cc Sun Feb 06 22:14:19 2011 -0800 @@ -319,16 +319,28 @@ out - endl; double m_total_link_power = 0.0; +double m_dynamic_link_power = 0.0; +double m_static_link_power = 0.0; double m_total_router_power = 0.0; +double m_dynamic_router_power = 0.0; +double m_static_router_power = 0.0; for (int i = 0; i m_link_ptr_vector.size(); i++) { m_total_link_power += m_link_ptr_vector[i]-calculate_power(); +m_dynamic_link_power += m_link_ptr_vector[i]-get_dynamic_power(); +m_static_link_power += m_link_ptr_vector[i]-get_static_power(); } for (int i = 0; i m_router_ptr_vector.size(); i++) { m_total_router_power += m_router_ptr_vector[i]-calculate_power(); +m_dynamic_router_power += m_router_ptr_vector[i]-get_dynamic_power(); +m_static_router_power += m_router_ptr_vector[i]-get_static_power(); } +out Link Dynamic Power = m_dynamic_link_power W endl; +out Link Static Power = m_static_link_power W endl; out Total Link Power = m_total_link_power W endl; +out Router Dynamic Power = m_dynamic_router_power W endl; +out Router Static Power = m_static_router_power W endl; out Total Router Power = m_total_router_power W endl; out - endl; m_topology_ptr-printStats(out); diff -r 409a2692b8e6 -r 
3a02353d6e43 src/mem/ruby/network/garnet/fixed-pipeline/NetworkLink_d.hh --- a/src/mem/ruby/network/garnet/fixed-pipeline/NetworkLink_d.hh Sun Feb 06 22:14:19 2011 -0800 +++ b/src/mem/ruby/network/garnet/fixed-pipeline/NetworkLink_d.hh Sun Feb 06 22:14:19 2011 -0800 @@ -54,6 +54,8 @@ int getLinkUtilization(); std::vectorint getVcLoad(); int get_id(){return m_id;} +double get_dynamic_power(){return m_power_dyn;} +double get_static_power(){return m_power_sta;} void wakeup(); double calculate_power(); @@ -73,6 +75,9 @@ int m_link_utilized; std::vectorint m_vc_load; int m_flit_width; + +double m_power_dyn; +double m_power_sta; }; #endif // __MEM_RUBY_NETWORK_GARNET_FIXED_PIPELINE_NETWORK_LINK_D_HH__ diff -r 409a2692b8e6 -r 3a02353d6e43 src/mem/ruby/network/garnet/fixed-pipeline/Router_d.hh --- a/src/mem/ruby/network/garnet/fixed-pipeline/Router_d.hhSun Feb 06 22:14:19 2011 -0800 +++ b/src/mem/ruby/network/garnet/fixed-pipeline/Router_d.hhSun Feb 06 22:14:19 2011 -0800 @@ -81,6 +81,9 @@ double calculate_power(); void calculate_performance_numbers(); +double get_dynamic_power(){return m_power_dyn;} +double get_static_power(){return m_power_sta;} + private: int m_id; int m_virtual_networks, m_num_vcs, m_vc_per_vnet; @@ -100,6 +103,9 @@ VCallocator_d *m_vc_alloc; SWallocator_d *m_sw_alloc; Switch_d *m_switch; + +double m_power_dyn; +double m_power_sta; }; #endif // __MEM_RUBY_NETWORK_GARNET_FIXED_PIPELINE_ROUTER_D_HH__ diff -r 409a2692b8e6 -r 3a02353d6e43 src/mem/ruby/network/orion/NetworkPower.cc --- a/src/mem/ruby/network/orion/NetworkPower.ccSun Feb 06 22:14:19 2011 -0800 +++ b/src/mem/ruby/network/orion/NetworkPower.ccSun Feb 06 22:14:19 2011 -0800 @@ -206,6 +206,7 @@ Pxbar_dyn + Pclk_dyn; +m_power_dyn = Ptotal_dyn; // Static Power Pbuf_sta = orion_rtr_ptr-get_static_power_buf(); @@ -215,6 +216,8 @@ Ptotal_sta += Pbuf_sta + Pvc_arb_sta + Psw_arb_sta + Pxbar_sta; +m_power_sta = Ptotal_sta; + Ptotal = Ptotal_dyn + Ptotal_sta; return Ptotal; @@ -250,9 +253,13 @@ double 
Plink_dyn = orion_link_ptr-calc_dynamic_energy(channel_width/2)* (m_link_utilized/ sim_cycles)*freq_Hz; +m_power_dyn = Plink_dyn; + // Static Power double Plink_sta = orion_link_ptr-get_static_power(); +m_power_sta = Plink_sta; + double Ptotal = Plink_dyn + Plink_sta; return Ptotal; ___ m5-dev mailing list m5-dev@m5sim.org
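The bookkeeping this changeset adds to the stats output amounts to accumulating dynamic and static power separately and reporting their sum as the total. A minimal sketch of that accumulation follows; `PowerSample`, `PowerTotals`, and `sumPower` are hypothetical names, not part of garnet or Orion.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-component reading (one link or one router), in watts.
struct PowerSample {
    double dynamicW;
    double staticW;
};

// Aggregated figures as they would be printed to the stats file.
struct PowerTotals {
    double dynamicW;
    double staticW;
    double totalW;
};

// Sum dynamic and static power over all components; the total is
// defined as dynamic plus static, mirroring Ptotal = Pdyn + Psta.
PowerTotals
sumPower(const std::vector<PowerSample> &components)
{
    PowerTotals t = {0.0, 0.0, 0.0};
    for (const PowerSample &c : components) {
        assert(c.dynamicW >= 0.0 && c.staticW >= 0.0);  // power is non-negative
        t.dynamicW += c.dynamicW;
        t.staticW += c.staticW;
    }
    t.totalW = t.dynamicW + t.staticW;
    return t;
}
```

Calling it once over the links and once over the routers would give the six figures the patched stats routine prints.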
Re: [m5-dev] Testing Functional Access
Hi Nilay,

I don't know if there is a regression for it, but the M5 utility (./util/m5/) sets up functional accesses to memory. For instance, in FS, if you specify an rcS script to fs.py and call

% /sbin/m5 readfile

from the command line of the simulated system, it will read the specified rcS file off the host machine's disk and send it to the memory of the simulated system using functional accesses. I think there are other functional access examples in the magic that the M5 utility provides.

Hope this helps,
Joel

On Tue, Mar 1, 2011 at 8:51 AM, Nilay ni...@cs.wisc.edu wrote:
> How can I test whether or not functional accesses to the memory are working correctly? Do we have some regression test for this?
>
> Thanks
> Nilay
Re: [m5-dev] changeset in m5: garnet: removed flit_width from Routers
OrionRouter( num_in_port, ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] [m5-users] Tracing does not work
Hey Nilay,

It looks like the tracing (debug) functionality is now working again, but the M5 help message is still incorrect (and extremely misleading). For instance, trace-flags, and trace-file are still accepted, but they don't do anything now. They should be eliminated from the message. We're also missing the equivalent of trace-start and trace-file. Do you mind cleaning that up?

Thanks,
Joel

PS. I haven't followed the tracing/debugging thread closely enough, but it seems like trace and debug should be different things (though they are currently implemented as the same thing). Is there a reason why we moved over to debug?

On Fri, Apr 29, 2011 at 8:28 AM, Nilay Vaish ni...@cs.wisc.edu wrote:
> On Fri, 29 Apr 2011, Korey Sewell wrote:
>> Is it not now debug-help and debug-flags instead of trace-help and trace-flags???
>>
>> On Fri, Apr 29, 2011 at 9:18 AM, Nilay Vaish ni...@cs.wisc.edu wrote:
>>> On Thu, 28 Apr 2011, Nilay wrote:
>>>> On Thu, April 28, 2011 7:55 pm, Andrea Pellegrini wrote:
>>>>> Hi all,
>>>>> I just downloaded the latest repo from: http://repo.m5sim.org/m5
>>>>> When I activate the trace functionalities through the flags nothing shows up in the output. The same command for older versions of m5 (few weeks ago) worked flawlessly. Can anybody help?
>>>>> Thanks
>>>>> -Andrea
>>>>
>>>> Andrea, we are aware of the problem. The solution is almost ready, and hopefully by tomorrow trace would start functioning again.
>>>> -- Nilay
>>>
>>> Andrea, trace facility is working now. In fact it was fixed yesterday itself.
>>> -- Nilay
>>
>> -- - Korey
>
> That's right, the option names have been changed. But there was some error in the trace facility itself that Nate corrected yesterday.
>
> Nilay
Re: [m5-dev] [m5-users] Tracing does not work
Hey guys,

I wasn't sure what the intended outcome with tracing vs. debugging was going to be. It sounds like the move is to debug as a more general term, though it will support all of the trace functionality. In that case, my confusion arose from the naming of the flags. Since trace-file and trace-start now go along with the other debug flags (i.e. you wouldn't use them unless you're using the debug flags), it probably makes sense to change the names to reflect the connection. For instance, debug-trace-file and debug-trace-start are clearer and still reflect that you'll be collecting a trace.

Joel

> I was thinking of doing it since Nate is not around. I'll do it soon.

>> instance, trace-flags, and trace-file are still accepted, but they don't do anything now. They should be eliminated from the message. We're also missing the equivalent of trace-start and trace-file. Do you mind cleaning that up?
>
> Are you sure that trace-file doesn't work? I've basically renamed --trace-help to --debug-help, so the former can be removed. Also I've renamed --trace-flags to --debug-flags, so that one can be removed too. (I intended to, I just screwed up.)
>
> The purpose of renaming trace flags to debug flags is that the flags themselves can be used for a lot more than tracing (I'm starting to use them to insert debugging breakpoints, they're used for exec trace which is really a different tracing facility, they can be used for whatever) and it seemed odd to have two different classes of flags (though we could do that if we wanted to).
>
> The only error that I know of right now is that --trace-help and --trace-flags still exist and silently act when they shouldn't. I'm compiling right now, but things are slow on my laptop. I'll test out --trace-file, but I'm not sure why that would have changed at all.
>
> Nate