Looks like the main difference is that the regression has an L2 cache, whereas just passing --caches only gives you L1s. I'm rerunning the problematic version with --l2cache to see if that fixes the problem.
Steve

On Wed, Nov 26, 2008 at 12:59 PM, Ali Saidi <[EMAIL PROTECTED]> wrote:
> The o3 regression with caches does successfully run. So, what is the
> difference between what you're running and what the regressions run?
>
> Ali
>
> On Nov 26, 2008, at 14:02, "Steve Reinhardt" <[EMAIL PROTECTED]> wrote:
>
>> So far I've reproduced this bug all the way back to beta4. I can't
>> get beta3 to compile on my machine because it apparently predates a
>> bunch of necessary gcc 4.2 patches. I'm guessing this is something
>> that's always been there in the new memory system and we've never seen
>> it since we generally don't boot in detailed mode with caches.
>>
>> I think the workaround for Ryan (and anyone else) is to do what we do:
>> boot in atomic mode, checkpoint, and then restart from the checkpoint
>> and switch into detailed mode to run your program. (You could even
>> skip the checkpoint and just boot and then switch over, but I believe
>> the existing scripts support the checkpoint model better.)
>>
>> Meanwhile I'll keep poking at this, but probably not at high priority.
>>
>> Steve
>>
>> On Wed, Nov 26, 2008 at 11:38 AM, Steve Reinhardt <[EMAIL PROTECTED]> wrote:
>>> OK, bad news... looks like the bug is there even in m5-stable. I'm
>>> doing an hg bisect to figure out where it started.
>>>
>>> Steve
>>>
>>> On Wed, Nov 26, 2008 at 10:53 AM, Steve Reinhardt <[EMAIL PROTECTED]> wrote:
>>>> Is this with m5 or m5-stable? Assuming it's the former, then we still
>>>> have some debugging to do, but the answer for Ryan may be to stick
>>>> with m5-stable.
>>>>
>>>> If it's the latter, then that's just bad.
>>>>
>>>> Steve
>>>>
>>>> On Wed, Nov 26, 2008 at 10:26 AM, Lisa Hsu <[EMAIL PROTECTED]> wrote:
>>>>> I just tried to replicate Ryan's problem - and I did, unfortunately.
>>>>> With Turkey Day coming and travel approaching, I can only look at this
>>>>> sporadically over the next few days, but it appears that something is
>>>>> broken in FS.
>>>>>
>>>>> My repo isn't clean, but none of my patches should affect regular FS
>>>>> simulation. Ryan ran ValMemLat and it breaks for me in the same way he
>>>>> described, same with NetperfMaerts.
>>>>>
>>>>> Here's the m5term output right before breaking:
>>>>>
>>>>> SMP: Total of 1 processors activated (4002.20 BogoMIPS).
>>>>> NET: Registered protocol family 16
>>>>> EISA bus registered
>>>>>
>>>>> I'll try to take a look over the weekend, but did anyone push anything
>>>>> recently that might break busses/addr range stuff? Oh - my last pull
>>>>> from the main repo was Steve's tracediff stuff, so something happened
>>>>> between that push and the release when we tested everything.
>>>>>
>>>>> Lisa
>>>>>
>>>>> On Wed, Nov 26, 2008 at 1:11 PM, Ryan Markley <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>> Hello again,
>>>>>>
>>>>>> I have installed the simulator on another machine and I still get the
>>>>>> same error. I have also run the regression tests, and
>>>>>> twosys-tsunami-simple-atomic does not pass. This is the complete
>>>>>> process that I follow to install the simulator.
>>>>>>
>>>>>> To install the FS mode - scons build/ALPHA_FS/m5.debug
>>>>>>
>>>>>> I download m5_system_2.0b3.tar.bz2 and untar it in one of my local
>>>>>> directories. In the wiki this is done with the sudo command; I cannot
>>>>>> do that because I am not root. Is the problem here?
>>>>>>
>>>>>> After that I change syspath.py to point to the local directory where
>>>>>> I untarred the disk image.
>>>>>>
>>>>>> And finally I try this command:
>>>>>> ./build/ALPHA_FS/m5.opt configs/example/fs.py -b ValMemLat --caches --detailed
>>>>>> and I get the "unable to find destination for address" error. Am I
>>>>>> doing something wrong in the installation?
>>>>>>
>>>>>> Thanks again.
>>>>>>
>>>>>> On Tue, Nov 25, 2008 at 3:26 PM, Ryan Markley <[EMAIL PROTECTED]> wrote:
>>>>>>>
>>>>>>> Hi Lisa, thank you for your answer. I will try to run the simulator
>>>>>>> with another kernel and see what happens. This is part of the output
>>>>>>> that I get with Exec. I do not know what information may be relevant,
>>>>>>> so if you can tell me something to look at, I could give you more
>>>>>>> useful information.
>>>>>>>
>>>>>>> This is the output for a hello program:
>>>>>>>
>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+128 : extbl r2,r3,r1 : IntAlu : D=0x0000000000000025
>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+132 : sll r1,56,r1 : IntAlu : D=0x2500000000000000
>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+136 : sra r1,56,r1 : IntAlu : D=0x0000000000000025
>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+140 : nop (bis r31,r31,r31) : No_OpClass :
>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+144 : cmpeq r1,37,r1 : IntAlu : D=0x0000000000000001
>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+148 : bne r1,0xfffffc00004bd6e0 : IntAlu :
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+656 : bis r31,r31,r12 : IntAlu : D=0x0000000000000000
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+660 : nop (ldq_u r31,0(r30)) : No_OpClass :
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+664 : nop (bis r31,r31,r31) : No_OpClass :
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+668 : nop (ldq_u r31,0(r30)) : No_OpClass :
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+672 : lda r18,1(r18) : IntAlu : D=0xfffffc000066dae9
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+676 : stq r18,96(r30) : MemWrite : D=0xfffffc000066dae9 A=0xfffffc0000c3bd58
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+680 : ldq_u r2,0(r18) : MemRead : D=0x3230253a78343025 A=0xfffffc000066dae8
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+684 : extbl r2,r18,r1 : IntAlu : D=0x0000000000000030
>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+688 : sll r1,56,r1 : IntAlu : D=0x3000000000000000
>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+692 : sra r1,56,r2 : IntAlu : D=0x0000000000000030
>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+696 : lda r2,-32(r2) : IntAlu : D=0x0000000000000010
>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+700 : zapnot r2,15,r2 : IntAlu : D=0x0000000000000010
>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+704 : cmpule r2,16,r1 : IntAlu : D=0x0000000000000001
>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+708 : beq r1,0xfffffc00004bd560 : IntAlu :
>>>>>>> panic: Unable to find destination for addr (user set default responder): 0x80c4dbc0
>>>>>>> @ cycle 44873206500
>>>>>>> [findPort:build/ALPHA_FS/mem/bus.cc, line 334]
>>>>>>> Memory Usage: 197560 KBytes
>>>>>>> Program aborted at cycle 44873206500
>>>>>>> Aborted
>>>>>>>
>>>>>>> This is the output for the ValMemLat benchmark:
>>>>>>>
>>>>>>> 53177309000: system.cpu T0 : @vsnprintf+140 : nop (bis r31,r31,r31) : No_OpClass :
>>>>>>> 53177309000: system.cpu T0 : @vsnprintf+144 : cmpeq r1,37,r1 : IntAlu : D=0x0000000000000001
>>>>>>> 53177309000: system.cpu T0 : @vsnprintf+148 : bne r1,0xfffffc00004bd6e0 : IntAlu :
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+656 : bis r31,r31,r12 : IntAlu : D=0x0000000000000000
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+660 : nop (ldq_u r31,0(r30)) : No_OpClass :
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+664 : nop (bis r31,r31,r31) : No_OpClass :
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+668 : nop (ldq_u r31,0(r30)) : No_OpClass :
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+672 : lda r18,1(r18) : IntAlu : D=0xfffffc000066dae9
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+676 : stq r18,96(r30) : MemWrite : D=0xfffffc000066dae9 A=0xfffffc00010a3d58
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+680 : ldq_u r2,0(r18) : MemRead : D=0x3230253a78343025 A=0xfffffc000066dae8
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+684 : extbl r2,r18,r1 : IntAlu : D=0x0000000000000030
>>>>>>> 53177317500: system.cpu T0 : @vsnprintf+688 : sll r1,56,r1 : IntAlu : D=0x3000000000000000
>>>>>>> 53177317500: system.cpu T0 : @vsnprintf+692 : sra r1,56,r2 : IntAlu : D=0x0000000000000030
>>>>>>> 53177317500: system.cpu T0 : @vsnprintf+696 : lda r2,-32(r2) : IntAlu : D=0x0000000000000010
>>>>>>> 53177317500: system.cpu T0 : @vsnprintf+700 : zapnot r2,15,r2 : IntAlu : D=0x0000000000000010
>>>>>>> 53177317500: system.cpu T0 : @vsnprintf+704 : cmpule r2,16,r1 : IntAlu : D=0x0000000000000001
>>>>>>> 53177317500: system.cpu T0 : @vsnprintf+708 : beq r1,0xfffffc00004bd560 : IntAlu :
>>>>>>> panic: Unable to find destination for addr (user set default responder): 0x81017bc0
>>>>>>> @ cycle 53177330500
>>>>>>> [findPort:build/ALPHA_FS/mem/bus.cc, line 334]
>>>>>>> Memory Usage: 590828 KBytes
>>>>>>> Program aborted at cycle 53177330500
>>>>>>> Aborted
>>>>>>>
>>>>>>> For every program I always get the same error. It seems that the
>>>>>>> error always comes after beq r1,0xfffffc00004bd560.
>>>>>>>
>>>>>>> Thanks again.
>>>>>>>
>>>>>>> On Tue, Nov 25, 2008 at 2:31 PM, Lisa Hsu <[EMAIL PROTECTED]> wrote:
>>>>>>>>
>>>>>>>> What happens when you turn on the trace flags? It could be a lot of
>>>>>>>> things, just asking "what could it be?" won't get any answers...if
>>>>>>>> you could paste the relevant output from the Exectrace that would
>>>>>>>> help.
>>>>>>>>
>>>>>>>> Also, I think you mentioned before that it was your native machine
>>>>>>>> running 2.6.9 and your m5 is simulating 2.6.13. That's fine, though
>>>>>>>> you can go newer if you want.
>>>>>>>>
>>>>>>>> Lisa
>>>>>>>>
>>>>>>>> On Tue, Nov 25, 2008 at 2:58 PM, Ryan Markley <[EMAIL PROTECTED]> wrote:
>>>>>>>>>
>>>>>>>>> Hello Ali, thanks again for your effort in helping me. I wasn't
>>>>>>>>> able to find where the address is coming from. Where do you think
>>>>>>>>> the problem could be? I am running such an old kernel thanks to
>>>>>>>>> the administrator of my cluster. Have you got any other ideas
>>>>>>>>> about what the problem is? Do you think it is a problem with my
>>>>>>>>> old kernel? Thanks.
>>>>>>>>>
>>>>>>>>> On Mon, Nov 24, 2008 at 6:24 PM, Ali Saidi <[EMAIL PROTECTED]> wrote:
>>>>>>>>>>
>>>>>>>>>> Why are you running such an old kernel?
>>>>>>>>>>
>>>>>>>>>> Add the O3CPUAll traceflag and start tracing a bit earlier. You
>>>>>>>>>> should figure out where that address is coming from.
>>>>>>>>>> Ali
>>>>>>>>>>
>>>>>>>>>> On Nov 24, 2008, at 8:14 PM, Ryan Markley wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> When I said the kernel in my last mail I meant the kernel of the
>>>>>>>>>>> disk image; my own kernel is 2.6.9. I have found this problem
>>>>>>>>>>> with the disk image from the web site and with the disk image of
>>>>>>>>>>> the PARSEC benchmarks that Joel posted several days ago. This is
>>>>>>>>>>> the output that I get with the Exec flag.
>>>>>>>>>>>
>>>>>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+144 : cmpeq r1,37,r1 : IntAlu : D=0x0000000000000001
>>>>>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+148 : bne r1,0xfffffc00004bd6e0 : IntAlu :
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+656 : bis r31,r31,r12 : IntAlu : D=0x0000000000000000
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+660 : nop (ldq_u r31,0(r30)) : No_OpClass :
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+664 : nop (bis r31,r31,r31) : No_OpClass :
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+668 : nop (ldq_u r31,0(r30)) : No_OpClass :
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+672 : lda r18,1(r18) : IntAlu : D=0xfffffc000066dae9
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+676 : stq r18,96(r30) : MemWrite : D=0xfffffc000066dae9 A=0xfffffc0000c3bd58
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+680 : ldq_u r2,0(r18) : MemRead : D=0x3230253a78343025 A=0xfffffc000066dae8
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+684 : extbl r2,r18,r1 : IntAlu : D=0x0000000000000030
>>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+688 : sll r1,56,r1 : IntAlu : D=0x3000000000000000
>>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+692 : sra r1,56,r2 : IntAlu : D=0x0000000000000030
>>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+696 : lda r2,-32(r2) : IntAlu : D=0x0000000000000010
>>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+700 : zapnot r2,15,r2 : IntAlu : D=0x0000000000000010
>>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+704 : cmpule r2,16,r1 : IntAlu : D=0x0000000000000001
>>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+708 : beq r1,0xfffffc00004bd560 : IntAlu :
>>>>>>>>>>> panic: Unable to find destination for addr (user set default responder): 0x80c4dbc0
>>>>>>>>>>> @ cycle 44873206500
>>>>>>>>>>> [findPort:build/ALPHA_FS/mem/bus.cc, line 334]
>>>>>>>>>>> Memory Usage: 197688 KBytes
>>>>>>>>>>> Program aborted at cycle 44873206500
>>>>>>>>>>> Aborted
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the help.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 24, 2008 at 4:52 PM, Ali Saidi <[EMAIL PROTECTED]> wrote:
>>>>>>>>>>> The Exec traceflag is very useful. You'll see the symbol names
>>>>>>>>>>> for the function that is causing the read to be issued. However,
>>>>>>>>>>> you should only enable tracing right before the error (e.g.
>>>>>>>>>>> --trace-start=44873106500).
>>>>>>>>>>>
>>>>>>>>>>> Do you encounter the problem with the compiled kernel available
>>>>>>>>>>> on the website?
>>>>>>>>>>>
>>>>>>>>>>> Ali
>>>>>>>>>>>
>>>>>>>>>>> On Nov 24, 2008, at 7:35 PM, Ryan Markley wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Ali, thanks again,
>>>>>>>>>>>>
>>>>>>>>>>>> I have been trying several programs and the result is the same
>>>>>>>>>>>> in all of them. Do you think it may be a bug in the software
>>>>>>>>>>>> related to the GCC version or other libraries? I did not make
>>>>>>>>>>>> any changes to the simulator. My gcc version is 4.3.2 and my
>>>>>>>>>>>> kernel is 2.6.13. I have enabled the bus trace flags and this
>>>>>>>>>>>> is the output:
>>>>>>>>>>>>
>>>>>>>>>>>> 44873206500: system.iobus: recvTiming: src 0 dst -1 ReadReq 0x80c4dbc0
>>>>>>>>>>>> panic: Unable to find destination for addr (user set default responder): 0x80c4dbc0
>>>>>>>>>>>>
>>>>>>>>>>>> I am a beginner with the simulator, so can you tell me which
>>>>>>>>>>>> other trace flags I can use to give you more useful
>>>>>>>>>>>> information? In addition, how can I show the information only
>>>>>>>>>>>> after a certain number of cycles?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 24, 2008 at 2:55 PM, Ali Saidi <[EMAIL PROTECTED]> wrote:
>>>>>>>>>>>> Ok, now you're going to need to do some debugging. You know
>>>>>>>>>>>> what cycle the panic occurs at, so you should enable some trace
>>>>>>>>>>>> flags a few thousand cycles before that and figure out what the
>>>>>>>>>>>> CPU is doing. Is it accessing a good address? Is there some bug
>>>>>>>>>>>> with the address calculation? Where is the address coming from?
>>>>>>>>>>>>
>>>>>>>>>>>> Have you made any changes to the simulator? What kernel are you
>>>>>>>>>>>> running?
>>>>>>>>>>>>
>>>>>>>>>>>> Ali
>>>>>>>>>>>>
>>>>>>>>>>>> On Nov 24, 2008, at 5:34 PM, Ryan Markley wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Ali thanks for y
_______________________________________________
m5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
