The o3 regression with caches does run successfully. So, what is the difference between what you're running and what the regressions run?
Ali

On Nov 26, 2008, at 14:02, "Steve Reinhardt" <[EMAIL PROTECTED]> wrote:

> So far I've reproduced this bug all the way back to beta4. I can't
> get beta3 to compile on my machine because it apparently predates a
> bunch of necessary gcc 4.2 patches. I'm guessing this is something
> that's always been there in the new memory system and we've never seen
> it since we generally don't boot in detailed mode with caches.
>
> I think the workaround for Ryan (and anyone else) is to do what we do:
> boot in atomic mode, checkpoint, and then restart from the checkpoint
> and switch into detailed mode to run your program. (You could even
> skip the checkpoint and just boot and then switch over, but I believe
> the existing scripts support the checkpoint model better.)
>
> Meanwhile I'll keep poking at this, but probably not at high priority.
>
> Steve
>
> On Wed, Nov 26, 2008 at 11:38 AM, Steve Reinhardt <[EMAIL PROTECTED]>
> wrote:
>> OK, bad news... looks like the bug is there even in m5-stable. I'm
>> doing an hg bisect to figure out where it started.
>>
>> Steve
>>
>> On Wed, Nov 26, 2008 at 10:53 AM, Steve Reinhardt
>> <[EMAIL PROTECTED]> wrote:
>>> Is this with m5 or m5-stable? Assuming it's the former, then we still
>>> have some debugging to do, but the answer for Ryan may be to stick
>>> with m5-stable.
>>>
>>> If it's the latter, then that's just bad.
>>>
>>> Steve
>>>
>>> On Wed, Nov 26, 2008 at 10:26 AM, Lisa Hsu <[EMAIL PROTECTED]>
>>> wrote:
>>>> I just tried to replicate Ryan's problem - and I did, unfortunately.
>>>> With Turkey Day coming and travel approaching, I can only look at
>>>> this sporadically over the next few days, but it appears that
>>>> something is broken in FS.
>>>>
>>>> My repo isn't clean but all my patches shouldn't affect regular FS
>>>> simulation. Ryan ran ValMemLat and it breaks for me in the same way
>>>> he described, same with NetperfMaerts.
>>>>
>>>> here's the m5term output right before breaking:
>>>>
>>>> SMP: Total of 1 processors activated (4002.20 BogoMIPS).
>>>> NET: Registered protocol family 16
>>>> EISA bus registered
>>>>
>>>> I'll try to take a look over the weekend, but did anyone push
>>>> anything recently that might break busses/addr range stuff? Oh - my
>>>> last pull from the main repo was Steve's tracediff stuff, so
>>>> something happened between that push and the release when we tested
>>>> everything.
>>>>
>>>> Lisa
>>>>
>>>>
>>>> On Wed, Nov 26, 2008 at 1:11 PM, Ryan Markley <[EMAIL PROTECTED]>
>>>> wrote:
>>>>>
>>>>> Hello again,
>>>>>
>>>>> I have installed the simulator on another machine and I still get
>>>>> the same error. I have also run the regression tests, and
>>>>> twosys-tsunami-simple-atomic does not pass. This is the complete
>>>>> process that I follow to install the simulator.
>>>>>
>>>>> To install the FS mode - scons build/ALPHA_FS/m5.debug
>>>>>
>>>>> I download m5_system_2.0b3.tar.bz2 and untar it in one of my local
>>>>> directories. In the wiki this is done with the sudo command; I
>>>>> cannot do that because I am not root. Is the problem here?
>>>>>
>>>>> After that I change syspath.py to point to the local directory
>>>>> where I untarred the disk image.
>>>>>
>>>>> And finally I try this command: ./build/ALPHA_FS/m5.opt
>>>>> configs/example/fs.py -b ValMemLat --caches --detailed and I get
>>>>> the "unable to find destination for address" error. Am I doing
>>>>> something wrong in the installation?
>>>>>
>>>>> Thanks again.
>>>>>
>>>>> On Tue, Nov 25, 2008 at 3:26 PM, Ryan Markley
>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>> Hi Lisa, thank you for your answer. I will try to run the
>>>>>> simulator in another kernel and see what happens. This is part of
>>>>>> the output that I get with exec. I do not know what information
>>>>>> is relevant, so if you can tell me what to look at, I can give
>>>>>> you more useful information.
>>>>>>
>>>>>> This is the output for a hello program:
>>>>>>
>>>>>> 44873185000: system.cpu T0 : @vsnprintf+128 : extbl r2,r3,r1 : IntAlu : D=0x0000000000000025
>>>>>> 44873185000: system.cpu T0 : @vsnprintf+132 : sll r1,56,r1 : IntAlu : D=0x2500000000000000
>>>>>> 44873185000: system.cpu T0 : @vsnprintf+136 : sra r1,56,r1 : IntAlu : D=0x0000000000000025
>>>>>> 44873185000: system.cpu T0 : @vsnprintf+140 : nop (bis r31,r31,r31) : No_OpClass :
>>>>>> 44873185000: system.cpu T0 : @vsnprintf+144 : cmpeq r1,37,r1 : IntAlu : D=0x0000000000000001
>>>>>> 44873185000: system.cpu T0 : @vsnprintf+148 : bne r1,0xfffffc00004bd6e0 : IntAlu :
>>>>>> 44873192000: system.cpu T0 : @vsnprintf+656 : bis r31,r31,r12 : IntAlu : D=0x0000000000000000
>>>>>> 44873192000: system.cpu T0 : @vsnprintf+660 : nop (ldq_u r31,0(r30)) : No_OpClass :
>>>>>> 44873192000: system.cpu T0 : @vsnprintf+664 : nop (bis r31,r31,r31) : No_OpClass :
>>>>>> 44873192000: system.cpu T0 : @vsnprintf+668 : nop (ldq_u r31,0(r30)) : No_OpClass :
>>>>>> 44873192000: system.cpu T0 : @vsnprintf+672 : lda r18,1(r18) : IntAlu : D=0xfffffc000066dae9
>>>>>> 44873192000: system.cpu T0 : @vsnprintf+676 : stq r18,96(r30) : MemWrite : D=0xfffffc000066dae9 A=0xfffffc0000c3bd58
>>>>>> 44873192000: system.cpu T0 : @vsnprintf+680 : ldq_u r2,0(r18) : MemRead : D=0x3230253a78343025 A=0xfffffc000066dae8
>>>>>> 44873192000: system.cpu T0 : @vsnprintf+684 : extbl r2,r18,r1 : IntAlu : D=0x0000000000000030
>>>>>> 44873193500: system.cpu T0 : @vsnprintf+688 : sll r1,56,r1 : IntAlu : D=0x3000000000000000
>>>>>> 44873193500: system.cpu T0 : @vsnprintf+692 : sra r1,56,r2 : IntAlu : D=0x0000000000000030
>>>>>> 44873193500: system.cpu T0 : @vsnprintf+696 : lda r2,-32(r2) : IntAlu : D=0x0000000000000010
>>>>>> 44873193500: system.cpu T0 : @vsnprintf+700 : zapnot r2,15,r2 : IntAlu : D=0x0000000000000010
>>>>>> 44873193500: system.cpu T0 : @vsnprintf+704 : cmpule r2,16,r1 : IntAlu : D=0x0000000000000001
>>>>>> 44873193500: system.cpu T0 : @vsnprintf+708 : beq r1,0xfffffc00004bd560 : IntAlu :
>>>>>> panic: Unable to find destination for addr (user set default responder): 0x80c4dbc0
>>>>>> @ cycle 44873206500
>>>>>> [findPort:build/ALPHA_FS/mem/bus.cc, line 334]
>>>>>> Memory Usage: 197560 KBytes
>>>>>> Program aborted at cycle 44873206500
>>>>>> Aborted
>>>>>>
>>>>>> This is the output for the ValMemLat benchmark:
>>>>>>
>>>>>> 53177309000: system.cpu T0 : @vsnprintf+140 : nop (bis r31,r31,r31) : No_OpClass :
>>>>>> 53177309000: system.cpu T0 : @vsnprintf+144 : cmpeq r1,37,r1 : IntAlu : D=0x0000000000000001
>>>>>> 53177309000: system.cpu T0 : @vsnprintf+148 : bne r1,0xfffffc00004bd6e0 : IntAlu :
>>>>>> 53177316000: system.cpu T0 : @vsnprintf+656 : bis r31,r31,r12 : IntAlu : D=0x0000000000000000
>>>>>> 53177316000: system.cpu T0 : @vsnprintf+660 : nop (ldq_u r31,0(r30)) : No_OpClass :
>>>>>> 53177316000: system.cpu T0 : @vsnprintf+664 : nop (bis r31,r31,r31) : No_OpClass :
>>>>>> 53177316000: system.cpu T0 : @vsnprintf+668 : nop (ldq_u r31,0(r30)) : No_OpClass :
>>>>>> 53177316000: system.cpu T0 : @vsnprintf+672 : lda r18,1(r18) : IntAlu : D=0xfffffc000066dae9
>>>>>> 53177316000: system.cpu T0 : @vsnprintf+676 : stq r18,96(r30) : MemWrite : D=0xfffffc000066dae9 A=0xfffffc00010a3d58
>>>>>> 53177316000: system.cpu T0 : @vsnprintf+680 : ldq_u r2,0(r18) : MemRead : D=0x3230253a78343025 A=0xfffffc000066dae8
>>>>>> 53177316000: system.cpu T0 : @vsnprintf+684 : extbl r2,r18,r1 : IntAlu : D=0x0000000000000030
>>>>>> 53177317500: system.cpu T0 : @vsnprintf+688 : sll r1,56,r1 : IntAlu : D=0x3000000000000000
>>>>>> 53177317500: system.cpu T0 : @vsnprintf+692 : sra r1,56,r2 : IntAlu : D=0x0000000000000030
>>>>>> 53177317500: system.cpu T0 : @vsnprintf+696 : lda r2,-32(r2) : IntAlu : D=0x0000000000000010
>>>>>> 53177317500: system.cpu T0 : @vsnprintf+700 : zapnot r2,15,r2 : IntAlu : D=0x0000000000000010
>>>>>> 53177317500: system.cpu T0 : @vsnprintf+704 : cmpule r2,16,r1 : IntAlu : D=0x0000000000000001
>>>>>> 53177317500: system.cpu T0 : @vsnprintf+708 : beq r1,0xfffffc00004bd560 : IntAlu :
>>>>>> panic: Unable to find destination for addr (user set default responder): 0x81017bc0
>>>>>> @ cycle 53177330500
>>>>>> [findPort:build/ALPHA_FS/mem/bus.cc, line 334]
>>>>>> Memory Usage: 590828 KBytes
>>>>>> Program aborted at cycle 53177330500
>>>>>> Aborted
>>>>>>
>>>>>> For every program I always get the same error. It seems that the
>>>>>> error always comes after beq r1,0xfffffc00004bd560.
>>>>>>
>>>>>> Thanks again.
>>>>>>
>>>>>> On Tue, Nov 25, 2008 at 2:31 PM, Lisa Hsu <[EMAIL PROTECTED]>
>>>>>> wrote:
>>>>>>>
>>>>>>> What happens when you turn on the trace flags? It could be a lot
>>>>>>> of things, just asking "what could it be?" won't get any
>>>>>>> answers...if you could paste the relevant output from the
>>>>>>> Exectrace that would help.
>>>>>>>
>>>>>>> Also, I think you mentioned before that it was your native
>>>>>>> machine running 2.6.9 and your m5 is simulating 2.6.13. That's
>>>>>>> fine, though you can go newer if you want.
>>>>>>>
>>>>>>> Lisa
>>>>>>>
>>>>>>> On Tue, Nov 25, 2008 at 2:58 PM, Ryan Markley
>>>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>>>
>>>>>>>> Hello Ali, thanks again for your effort in helping me. I wasn't
>>>>>>>> able to find where the address is coming from. Where do you
>>>>>>>> think the problem could be? I am running such an old kernel
>>>>>>>> thanks to the administrator of my cluster. Do you have any
>>>>>>>> other ideas about what the problem is? Do you think it is a
>>>>>>>> problem with my old kernel? Thanks.
>>>>>>>>
>>>>>>>> On Mon, Nov 24, 2008 at 6:24 PM, Ali Saidi <[EMAIL PROTECTED]>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Why are you running such an old kernel?
>>>>>>>>>
>>>>>>>>> Add the O3CPUAll traceflag and start tracing a bit earlier.
>>>>>>>>> You should figure out where that address is coming from.
>>>>>>>>> Ali
>>>>>>>>>
>>>>>>>>> On Nov 24, 2008, at 8:14 PM, Ryan Markley wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> When I said the kernel in my last mail I meant the kernel of
>>>>>>>>>> the disk image; my kernel is 2.6.9. I have found this problem
>>>>>>>>>> with the disk image of the web site and the disk image of the
>>>>>>>>>> PARSEC benchmarks that Joel posted several days ago. This is
>>>>>>>>>> the output that I get with the exec flag.
>>>>>>>>>>
>>>>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+144 : cmpeq r1,37,r1 : IntAlu : D=0x0000000000000001
>>>>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+148 : bne r1,0xfffffc00004bd6e0 : IntAlu :
>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+656 : bis r31,r31,r12 : IntAlu : D=0x0000000000000000
>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+660 : nop (ldq_u r31,0(r30)) : No_OpClass :
>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+664 : nop (bis r31,r31,r31) : No_OpClass :
>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+668 : nop (ldq_u r31,0(r30)) : No_OpClass :
>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+672 : lda r18,1(r18) : IntAlu : D=0xfffffc000066dae9
>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+676 : stq r18,96(r30) : MemWrite : D=0xfffffc000066dae9 A=0xfffffc0000c3bd58
>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+680 : ldq_u r2,0(r18) : MemRead : D=0x3230253a78343025 A=0xfffffc000066dae8
>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+684 : extbl r2,r18,r1 : IntAlu : D=0x0000000000000030
>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+688 : sll r1,56,r1 : IntAlu : D=0x3000000000000000
>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+692 : sra r1,56,r2 : IntAlu : D=0x0000000000000030
>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+696 : lda r2,-32(r2) : IntAlu : D=0x0000000000000010
>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+700 : zapnot r2,15,r2 : IntAlu : D=0x0000000000000010
>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+704 : cmpule r2,16,r1 : IntAlu : D=0x0000000000000001
>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+708 : beq r1,0xfffffc00004bd560 : IntAlu :
>>>>>>>>>> panic: Unable to find destination for addr (user set default responder): 0x80c4dbc0
>>>>>>>>>> @ cycle 44873206500
>>>>>>>>>> [findPort:build/ALPHA_FS/mem/bus.cc, line 334]
>>>>>>>>>> Memory Usage: 197688 KBytes
>>>>>>>>>> Program aborted at cycle 44873206500
>>>>>>>>>> Aborted
>>>>>>>>>>
>>>>>>>>>> Thanks for the help.
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 24, 2008 at 4:52 PM, Ali Saidi
>>>>>>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>>>>> The Exec traceflag is very useful. You'll see the symbol
>>>>>>>>>> names for the function that is causing the read to be issued.
>>>>>>>>>> However, you should only enable tracing right before the
>>>>>>>>>> error (e.g. --trace-start=44873106500).
>>>>>>>>>>
>>>>>>>>>> Do you encounter the problem with the compiled kernel
>>>>>>>>>> available on the website?
>>>>>>>>>>
>>>>>>>>>> Ali
>>>>>>>>>>
>>>>>>>>>> On Nov 24, 2008, at 7:35 PM, Ryan Markley wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Ali thanks again,
>>>>>>>>>>>
>>>>>>>>>>> I have been trying several programs and the result is the
>>>>>>>>>>> same in all of them. Do you think it may be a bug related to
>>>>>>>>>>> the GCC version or other libraries? I did not make any
>>>>>>>>>>> changes to the simulator. My gcc version is 4.3.2 and my
>>>>>>>>>>> kernel is 2.6.13. I have enabled the bus trace flags and
>>>>>>>>>>> this is the output:
>>>>>>>>>>>
>>>>>>>>>>> 44873206500: system.iobus: recvTiming: src 0 dst -1 ReadReq 0x80c4dbc0
>>>>>>>>>>> panic: Unable to find destination for addr (user set default responder): 0x80c4dbc0
>>>>>>>>>>>
>>>>>>>>>>> I am a beginner with the simulator, so can you tell me which
>>>>>>>>>>> other trace flags I can use to give you more useful
>>>>>>>>>>> information? In addition, how can I show the trace output
>>>>>>>>>>> only after a certain number of cycles?
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 24, 2008 at 2:55 PM, Ali Saidi <[EMAIL PROTECTED]>
>>>>>>>>>>> wrote:
>>>>>>>>>>> Ok, now you're going to need to do some debugging. You know
>>>>>>>>>>> what cycle the panic occurs at, so you should enable some
>>>>>>>>>>> trace flags a few thousand cycles before that and figure out
>>>>>>>>>>> what the CPU is doing. Is it accessing a good address? Is
>>>>>>>>>>> there some bug with the address calculation? Where is the
>>>>>>>>>>> address coming from?
>>>>>>>>>>>
>>>>>>>>>>> Have you made any changes to the simulator? What kernel are
>>>>>>>>>>> you running?
>>>>>>>>>>>
>>>>>>>>>>> Ali
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Nov 24, 2008, at 5:34 PM, Ryan Markley wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Ali thanks for y
_______________________________________________
m5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
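
The advice quoted above (start tracing shortly before the panic rather than from tick 0) can be sketched as a small helper. This is only an illustration, not part of M5: the function name `trace_start_for` and the 100000-tick margin are assumptions, with the margin chosen because it reproduces the gap between the panic cycle 44873206500 and the --trace-start value 44873106500 suggested in the thread. How the Exec trace flag itself is enabled on the command line is not spelled out in the thread, so it is left out here.

```python
import re

# Matches the "@ cycle <N>" line that follows a panic message in the
# logs quoted in this thread.
PANIC_CYCLE_RE = re.compile(r"@ cycle (\d+)")


def trace_start_for(log_text, lead=100_000):
    """Return a tick `lead` ticks before the first panic found in log_text.

    `lead` is an assumed margin, not an M5 default. Returns None when
    the log contains no "@ cycle" line.
    """
    match = PANIC_CYCLE_RE.search(log_text)
    if match is None:
        return None
    # Clamp at zero so a very early panic does not give a negative tick.
    return max(0, int(match.group(1)) - lead)


if __name__ == "__main__":
    # Panic excerpt copied from the hello-program run in this thread.
    sample = (
        "panic: Unable to find destination for addr "
        "(user set default responder): 0x80c4dbc0\n"
        " @ cycle 44873206500\n"
        "[findPort:build/ALPHA_FS/mem/bus.cc, line 334]\n"
    )
    print(f"--trace-start={trace_start_for(sample)}")
```

The printed value can then be passed to the simulator on a rerun of the same command, so the trace covers only the last stretch before the failing bus access.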
