Looks like the main difference is that the regression has an L2 cache
where just saying --caches only gives you L1s.  I'm rerunning the
problematic version with --l2cache to see if that fixes the problem.

Steve

On Wed, Nov 26, 2008 at 12:59 PM, Ali Saidi <[EMAIL PROTECTED]> wrote:
> The o3 regression with caches does successfully run. So, what us the
> difference between what you're running and what the regressions run?
>
> Ali
>
>
>
> On Nov 26, 2008, at 14:02, "Steve Reinhardt" <[EMAIL PROTECTED]> wrote:
>
>> So far I've reproduced this bug all the way back to beta4.  I can't
>> get beta3 to compile on my machine because it apparently predates a
>> bunch of necessary gcc 4.2 patches.  I'm guessing this is something
>> that's always been there in the new memory system and we've never seen
>> it since we generally don't boot in detailed mode with caches.
>>
>> I think the workaround for Ryan (and anyone else) is to do what we do:
>> boot in atomic mode, checkpoint, and then restart from the checkpoint
>> and switch into detailed mode to run your program.  (You could even
>> skip the checkpoint and just boot and then switch over, but I believe
>> the existing scripts support the checkpoint model better.)
>>
>> Meanwhile I'll keep poking at this, but probably not at high priority.
>>
>> Steve
>>
>> On Wed, Nov 26, 2008 at 11:38 AM, Steve Reinhardt <[EMAIL PROTECTED]>
>> wrote:
>>> OK, bad news... looks like the bug is there even in m5-stable.  I'm
>>> doing an hg bisect to figure out where it started.
>>>
>>> Steve
>>>
>>> On Wed, Nov 26, 2008 at 10:53 AM, Steve Reinhardt
>>> <[EMAIL PROTECTED]> wrote:
>>>> Is this with m5 or m5-stable?  Assuming it's the former, then we
>>>> still
>>>> have some debugging to do, but the answer for Ryan may be to stick
>>>> with m5-stable.
>>>>
>>>> If it's the latter, then that's just bad.
>>>>
>>>> Steve
>>>>
>>>> On Wed, Nov 26, 2008 at 10:26 AM, Lisa Hsu <[EMAIL PROTECTED]>
>>>> wrote:
>>>>> I just tried to replicate Ryan's problem - and I did,
>>>>> unfortunately.  With
>>>>> Turkey Day coming and travel approaching, I can only look at this
>>>>> sporadically over the next few days, but it appears that
>>>>> something is broken
>>>>> in FS.
>>>>>
>>>>> My repo isn't clean but all my patches shouldn't affect regular FS
>>>>> simulation.  Ryan ran ValMemLat and it breaks for me in the same
>>>>> way he
>>>>> described, same with NetperfMaerts.
>>>>>
>>>>> here's the m5term output right before breaking:
>>>>>
>>>>> SMP: Total of 1 processors activated (4002.20 BogoMIPS).
>>>>> NET: Registered protocol family 16
>>>>> EISA bus registered
>>>>>
>>>>> I'll try to take a look over the weekend, but did anyone push
>>>>> anything
>>>>> recently that might break busses/addr range stuff?  Oh - my last
>>>>> pull from
>>>>> the main repo was Steve's tracediff stuff, so something happened
>>>>> between
>>>>> that push and the release when we tested everything.
>>>>>
>>>>> Lisa
>>>>>
>>>>>
>>>>> On Wed, Nov 26, 2008 at 1:11 PM, Ryan Markley <[EMAIL PROTECTED]>
>>>>> wrote:
>>>>>>
>>>>>> Hello again,
>>>>>>
>>>>>> I have installed the simulator in other machine and I still get
>>>>>> the
>>>>>> same error also I have run the regression tests and I do not
>>>>>> pass the
>>>>>> twosys-tsunami-simple-atomic, this is the complete process that
>>>>>> I do
>>>>>> to install the simulator.
>>>>>>
>>>>>> To install the FS mode - scons build/ALPHA_FS/m5.debug
>>>>>>
>>>>>> I Download m5_system_2.0b3.tar.bz2 and I untar it in one of my
>>>>>> local
>>>>>> directories in the wiki is done with the sudo command, I cannot do
>>>>>> that because I am not root, is the problem here?.
>>>>>>
>>>>>> After I change syspath.py to point to my local directory where I
>>>>>> untar
>>>>>> the disk image.
>>>>>>
>>>>>> And finally I try this command  ./build/ALPHA_FS/m5.opt
>>>>>> configs/example/fs.py -b ValMemLat --caches --detailed and I get
>>>>>> the
>>>>>> error of unable to find destination for address. Am I doing
>>>>>> something
>>>>>> wrong in the installation?.
>>>>>>
>>>>>> Thanks again.
>>>>>>
>>>>>> On Tue, Nov 25, 2008 at 3:26 PM, Ryan Markley
>>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>>
>>>>>>> Hi Lisa thank you for your answer, I will try to run the
>>>>>>> simulator in
>>>>>>> another kenerl and see what happens, this is part of the output
>>>>>>> that I get
>>>>>>> with exec, I do not know what information it can be relevant,
>>>>>>> so if you can
>>>>>>> tell me something to look at, then I could give you more useful
>>>>>>> information.
>>>>>>>
>>>>>>> This is the output for a hello program:
>>>>>>>
>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+128 : extbl      r2,r3,r1
>>>>>>> : IntAlu :  D=0x00000000                           00000025
>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+132 : sll        r1,56,r1
>>>>>>> : IntAlu :  D=0x25000000                           00000000
>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+136 : sra        r1,56,r1
>>>>>>> : IntAlu :  D=0x00000000                           00000025
>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+140 : nop        (bis
>>>>>>> r31,r31,r31) : No_OpClass :
>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+144 : cmpeq      r1,37,r1
>>>>>>> : IntAlu :  D=0x00000000                           00000001
>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+148 : bne
>>>>>>> r1,0xfffffc00004bd6e0 : IntAlu :
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+656 : bis
>>>>>>> r31,r31,r12
>>>>>>> : IntAlu :  D=0x00000000                           00000000
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+660 : nop        (ldq_u
>>>>>>> r31,0(r30)) : No_OpClass :
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+664 : nop        (bis
>>>>>>> r31,r31,r31) : No_OpClass :
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+668 : nop        (ldq_u
>>>>>>> r31,0(r30)) : No_OpClass :
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+672 : lda
>>>>>>> r18,1(r18)
>>>>>>> : IntAlu :  D=0xfffffc00                           0066dae9
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+676 : stq
>>>>>>> r18,96(r30)
>>>>>>> : MemWrite :  D=0xfffffc                           000066dae9
>>>>>>> A=0xfffffc0000c3bd58
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+680 : ldq_u
>>>>>>> r2,0(r18)
>>>>>>> : MemRead :  D=0x3230253                           a78343025
>>>>>>> A=0xfffffc000066dae8
>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+684 : extbl
>>>>>>> r2,r18,r1
>>>>>>> : IntAlu :  D=0x00000000                           00000030
>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+688 : sll        r1,56,r1
>>>>>>> : IntAlu :  D=0x30000000                           00000000
>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+692 : sra        r1,56,r2
>>>>>>> : IntAlu :  D=0x00000000                           00000030
>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+696 : lda
>>>>>>> r2,-32(r2)
>>>>>>> : IntAlu :  D=0x00000000                           00000010
>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+700 : zapnot     r2,15,r2
>>>>>>> : IntAlu :  D=0x00000000                           00000010
>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+704 : cmpule     r2,16,r1
>>>>>>> : IntAlu :  D=0x00000000                           00000001
>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+708 : beq
>>>>>>> r1,0xfffffc00004bd560 : IntAlu :
>>>>>>> panic: Unable to find destination for addr (user set default
>>>>>>> responder):
>>>>>>> 0x80c4dbc0
>>>>>>> @ cycle 44873206500
>>>>>>> [findPort:build/ALPHA_FS/mem/bus.cc, line 334]
>>>>>>> Memory Usage: 197560 KBytes
>>>>>>> Program aborted at cycle 44873206500
>>>>>>> Aborted
>>>>>>>
>>>>>>> This is the output for the ValMemLat benchmark:
>>>>>>>
>>>>>>> 53177309000: system.cpu T0 : @vsnprintf+140 : nop        (bis
>>>>>>> r31,r31,r31) : No_OpClass :
>>>>>>> 53177309000: system.cpu T0 : @vsnprintf+144 : cmpeq      r1,37,r1
>>>>>>> : IntAlu :  D=0x0000000000000001
>>>>>>> 53177309000: system.cpu T0 : @vsnprintf+148 : bne
>>>>>>> r1,0xfffffc00004bd6e0 : IntAlu :
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+656 : bis
>>>>>>> r31,r31,r12
>>>>>>> : IntAlu :  D=0x0000000000000000
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+660 : nop        (ldq_u
>>>>>>> r31,0(r30)) : No_OpClass :
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+664 : nop        (bis
>>>>>>> r31,r31,r31) : No_OpClass :
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+668 : nop        (ldq_u
>>>>>>> r31,0(r30)) : No_OpClass :
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+672 : lda
>>>>>>> r18,1(r18)
>>>>>>> : IntAlu :  D=0xfffffc000066dae9
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+676 : stq
>>>>>>> r18,96(r30)
>>>>>>> : MemWrite :  D=0xfffffc000066dae9 A=0xfffffc00010a3d58
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+680 : ldq_u
>>>>>>> r2,0(r18)
>>>>>>> : MemRead :  D=0x3230253a78343025 A=0xfffffc000066dae8
>>>>>>> 53177316000: system.cpu T0 : @vsnprintf+684 : extbl
>>>>>>> r2,r18,r1
>>>>>>> : IntAlu :  D=0x0000000000000030
>>>>>>> 53177317500: system.cpu T0 : @vsnprintf+688 : sll        r1,56,r1
>>>>>>> : IntAlu :  D=0x3000000000000000
>>>>>>> 53177317500: system.cpu T0 : @vsnprintf+692 : sra        r1,56,r2
>>>>>>> : IntAlu :  D=0x0000000000000030
>>>>>>> 53177317500: system.cpu T0 : @vsnprintf+696 : lda
>>>>>>> r2,-32(r2)
>>>>>>> : IntAlu :  D=0x0000000000000010
>>>>>>> 53177317500: system.cpu T0 : @vsnprintf+700 : zapnot     r2,15,r2
>>>>>>> : IntAlu :  D=0x0000000000000010
>>>>>>> 53177317500: system.cpu T0 : @vsnprintf+704 : cmpule     r2,16,r1
>>>>>>> : IntAlu :  D=0x0000000000000001
>>>>>>> 53177317500: system.cpu T0 : @vsnprintf+708 : beq
>>>>>>> r1,0xfffffc00004bd560 : IntAlu :
>>>>>>> panic: Unable to find destination for addr (user set default
>>>>>>> responder):
>>>>>>> 0x81017bc0
>>>>>>> @ cycle 53177330500
>>>>>>> [findPort:build/ALPHA_FS/mem/bus.cc, line 334]
>>>>>>> Memory Usage: 590828 KBytes
>>>>>>> Program aborted at cycle 53177330500
>>>>>>> Aborted
>>>>>>>
>>>>>>> For every program I always get the same error It seems that the
>>>>>>> error
>>>>>>> always come after beq r1,0xfffffc00004bd560.
>>>>>>>
>>>>>>> Thanks again.
>>>>>>>
>>>>>>> On Tue, Nov 25, 2008 at 2:31 PM, Lisa Hsu <[EMAIL PROTECTED]>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> What happens with you turn on the trace flags?  It could be a
>>>>>>>> lot of
>>>>>>>> things, just asking "what could it be?" won't get any
>>>>>>>> answers...if you could
>>>>>>>> paste the relevant output from the Exectrace that would help.
>>>>>>>>
>>>>>>>> Also, I thik you mentioned before that it was your native
>>>>>>>> machine
>>>>>>>> running 2.6.9 and your m5 is simulating 2.6.13.  that's fine,
>>>>>>>> though you can
>>>>>>>> go newer if you want.
>>>>>>>>
>>>>>>>> Lisa
>>>>>>>>
>>>>>>>> On Tue, Nov 25, 2008 at 2:58 PM, Ryan Markley
>>>>>>>> <[EMAIL PROTECTED]>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hello Ali thanks again for your effort in helping me, I
>>>>>>>>> wasn't able to
>>>>>>>>> find where the address is coming from, where do you think
>>>>>>>>> that the problem
>>>>>>>>> it can be?, I am running such an old kernel thanks to the
>>>>>>>>> administrator of
>>>>>>>>> my cluster. Have you got any other ideas about what is the
>>>>>>>>> problem?. Do you
>>>>>>>>> think that is a problem of my old kernel?. Thanks.
>>>>>>>>>
>>>>>>>>> On Mon, Nov 24, 2008 at 6:24 PM, Ali Saidi <[EMAIL PROTECTED]>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Why are you running such an old kernel?
>>>>>>>>>>
>>>>>>>>>> Add the O3CPUAll traceflag and start tracing a bit earlier.
>>>>>>>>>> You
>>>>>>>>>> should
>>>>>>>>>> figure out where that address is coming from.
>>>>>>>>>> Ali
>>>>>>>>>>
>>>>>>>>>> On Nov 24, 2008, at 8:14 PM, Ryan Markley wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> When I said the kernel in my last mail I said the kernel of
>>>>>>>>>>> the
>>>>>>>>>>> disk
>>>>>>>>>>> image, my kernel is 2.6.9. I have found this problem with
>>>>>>>>>>> the disk
>>>>>>>>>>> image of the web site and the disk image of the PARSEC
>>>>>>>>>>> benchmarks
>>>>>>>>>>> that Joel post several days ago. This is the output that I
>>>>>>>>>>> with the
>>>>>>>>>>> exec flag.
>>>>>>>>>>>
>>>>>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+144 : cmpeq
>>>>>>>>>>> r1,37,r1        : IntAl                u :
>>>>>>>>>>> D=0x0000000000000001
>>>>>>>>>>> 44873185000: system.cpu T0 : @vsnprintf+148 : bne
>>>>>>>>>>> r1,0xfffffc00004bd6e0 :                 IntAlu :
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+656 : bis
>>>>>>>>>>> r31,r31,r12     : IntAl                u :
>>>>>>>>>>> D=0x0000000000000000
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+660 : nop
>>>>>>>>>>> (ldq_u
>>>>>>>>>>> r31,0(r30))                 : No_OpClass :
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+664 : nop        (bis
>>>>>>>>>>> r31,r31,r31                ) : No_OpClass :
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+668 : nop
>>>>>>>>>>> (ldq_u
>>>>>>>>>>> r31,0(r30))                 : No_OpClass :
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+672 : lda
>>>>>>>>>>> r18,1(r18)      : IntAl                u :
>>>>>>>>>>> D=0xfffffc000066dae9
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+676 : stq
>>>>>>>>>>> r18,96(r30)     : MemWr                ite :
>>>>>>>>>>> D=0xfffffc000066dae9
>>>>>>>>>>> A=0xfffffc0000c3bd58
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+680 : ldq_u
>>>>>>>>>>> r2,0(r18)       : MemRe                ad :
>>>>>>>>>>> D=0x3230253a78343025
>>>>>>>>>>> A=0xfffffc000066dae8
>>>>>>>>>>> 44873192000: system.cpu T0 : @vsnprintf+684 : extbl
>>>>>>>>>>> r2,r18,r1       : IntAl                u :
>>>>>>>>>>> D=0x0000000000000030
>>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+688 : sll
>>>>>>>>>>> r1,56,r1        : IntAl                u :
>>>>>>>>>>> D=0x3000000000000000
>>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+692 : sra
>>>>>>>>>>> r1,56,r2        : IntAl                u :
>>>>>>>>>>> D=0x0000000000000030
>>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+696 : lda
>>>>>>>>>>> r2,-32(r2)      : IntAl                u :
>>>>>>>>>>> D=0x0000000000000010
>>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+700 : zapnot
>>>>>>>>>>> r2,15,r2        : IntAl                u :
>>>>>>>>>>> D=0x0000000000000010
>>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+704 : cmpule
>>>>>>>>>>> r2,16,r1        : IntAl                u :
>>>>>>>>>>> D=0x0000000000000001
>>>>>>>>>>> 44873193500: system.cpu T0 : @vsnprintf+708 : beq
>>>>>>>>>>> r1,0xfffffc00004bd560 :                 IntAlu :
>>>>>>>>>>> panic: Unable to find destination for addr (user set default
>>>>>>>>>>> responder): 0x80c4d                bc0
>>>>>>>>>>> @ cycle 44873206500
>>>>>>>>>>> [findPort:build/ALPHA_FS/mem/bus.cc, line 334]
>>>>>>>>>>> Memory Usage: 197688 KBytes
>>>>>>>>>>> Program aborted at cycle 44873206500
>>>>>>>>>>> Aborted
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the help.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 24, 2008 at 4:52 PM, Ali Saidi
>>>>>>>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>>>>>> The Exec traceflag is very useful. You'll see the symbol
>>>>>>>>>>> names for
>>>>>>>>>>> the
>>>>>>>>>>> function that is causing the read to be issued. However,
>>>>>>>>>>> you should
>>>>>>>>>>> only enable tracing right before the error (e.g. --trace-
>>>>>>>>>>> start=
>>>>>>>>>>> 44873106500).
>>>>>>>>>>>
>>>>>>>>>>> Do you encounter the problem with the compiled kernel
>>>>>>>>>>> available on
>>>>>>>>>>> the
>>>>>>>>>>> website?
>>>>>>>>>>>
>>>>>>>>>>> Ali
>>>>>>>>>>>
>>>>>>>>>>> On Nov 24, 2008, at 7:35 PM, Ryan Markley wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Ali thanks again,
>>>>>>>>>>>>
>>>>>>>>>>>> I have been trying several programs and in all of them is
>>>>>>>>>>>> the
>>>>>>>>>>>> same,
>>>>>>>>>>>> do you think that maybe is a bug of the software for the GCC
>>>>>>>>>>>> version
>>>>>>>>>>>> or other libraries?, I did not do any changes to the
>>>>>>>>>>>> simulator.
>>>>>>>>>>>> My
>>>>>>>>>>>> gcc version is 4.3.2 and my kernel is .6.6.13. I have
>>>>>>>>>>>> enable the
>>>>>>>>>>>> bus
>>>>>>>>>>>> trace flags and this is the output:
>>>>>>>>>>>>
>>>>>>>>>>>> 44873206500: system.iobus: recvTiming: src 0 dst -1 ReadReq
>>>>>>>>>>> 0x80c4dbc0
>>>>>>>>>>>> panic: Unable to find destination for addr (user set default
>>>>>>>>>>>> responder): 0x80c4dbc0
>>>>>>>>>>>>
>>>>>>>>>>>> I am a beginner in the simulator so can you tell me other
>>>>>>>>>>>> trace
>>>>>>>>>>>> flags that I can use to give your more useful information,
>>>>>>>>>>>> in
>>>>>>>>>>>> addition how can I do to show the information after a
>>>>>>>>>>>> certain
>>>>>>>>>>>> number
>>>>>>>>>>>> of cycles?.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 24, 2008 at 2:55 PM, Ali Saidi <[EMAIL PROTECTED]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> Ok, now you're going to need to do some debugging. You
>>>>>>>>>>>> know what
>>>>>>>>>>> cycle
>>>>>>>>>>>> the panic occurs at, so you should enable some trace flags
>>>>>>>>>>>> a few
>>>>>>>>>>>> thousand cycles before that and figure out what the CPU is
>>>>>>>>>>>> doing.
>>>>>>>>>>>> Is
>>>>>>>>>>>> it accessing a good address? Is there some bug with the
>>>>>>>>>>>> address
>>>>>>>>>>>> calculation? Where is the address coming from?
>>>>>>>>>>>>
>>>>>>>>>>>> Have you made any changes to the simulator? What kernel
>>>>>>>>>>>> are you
>>>>>>>>>>>> running?
>>>>>>>>>>>>
>>>>>>>>>>>> Ali
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Nov 24, 2008, at 5:34 PM, Ryan Markley wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Ali thanks for y
> _______________________________________________
> m5-users mailing list
> [email protected]
> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
>
_______________________________________________
m5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users

Reply via email to