Ah yes, sorry about that. I have two directories: one is a clean
copy of the m5 repo, and the other is a private one into which I fetch
changes from the clean one. The changesets I referred to all correspond
to the strictly clean m5 repo, but nevertheless, here is a copy from an
hg log of the changesets I have referred to.

changeset:   7922:7532067f818e
user:        Brad Beckmann <[email protected]>
date:        Sun Feb 06 22:14:19 2011 -0800
summary:     ruby: support to stallAndWait the mandatory queue

changeset:   7921:351f1761765f
user:        Brad Beckmann <[email protected]>
date:        Sun Feb 06 22:14:19 2011 -0800
summary:     ruby: minor fix to deadlock panic message

changeset:   7920:39c86a8306d2
user:        Brad Beckmann <[email protected]>
date:        Sun Feb 06 22:14:19 2011 -0800
summary:     boot: script that creates a checkpoint after Linux boot up

changeset:   7919:3a02353d6e43
user:        Joel Hestness <[email protected]>
date:        Sun Feb 06 22:14:19 2011 -0800
summary:     garnet: Split network power in ruby.stats


changeset:   7910:8a92b39be50e
user:        Brad Beckmann <[email protected]>
date:        Sun Feb 06 22:14:18 2011 -0800
summary:     ruby: Fix RubyPort to properly handle retrys

changeset:   7909:eee578ed2130
user:        Joel Hestness <[email protected]>
date:        Sun Feb 06 22:14:18 2011 -0800
summary:     Ruby: Fix to return cache block size to CPU for split data transfers

changeset:   7908:4e83ebb67794
user:        Joel Hestness <[email protected]>
date:        Sun Feb 06 22:14:18 2011 -0800
summary:     Ruby: Add support for locked memory accesses in X86_FS

changeset:   7907:d648b8409d4c
user:        Joel Hestness <[email protected]>
date:        Sun Feb 06 22:14:18 2011 -0800
summary:     Ruby: Update the Ruby request type names for LL/SC

changeset:   7906:5ccd97218ca0
user:        Brad Beckmann <[email protected]>
date:        Sun Feb 06 22:14:18 2011 -0800
summary:     ruby: Assert for x86 misaligned access

changeset:   7905:00ad807ed2ca
user:        Brad Beckmann <[email protected]>
date:        Sun Feb 06 22:14:18 2011 -0800
summary:     ruby: x86 fs config support
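
Since the local revision numbers may not match across trees, here is a
minimal Python sketch for pulling the hex IDs out of hg log text like the
above, so the hashes can be quoted instead of the numbers. It is not part
of M5 or Mercurial; the name parse_hg_log and the SAMPLE text are only
for illustration, and it assumes the default hg log layout shown above.

```python
import re

# Matches lines such as "changeset:   7906:5ccd97218ca0"
CHANGESET_RE = re.compile(r"^changeset:\s+(\d+):([0-9a-f]+)$")


def parse_hg_log(text):
    """Parse default-format `hg log` text into a list of dicts.

    Each dict holds the local revision number ("rev"), the hex ID
    ("node"), and the summary line ("summary") of one changeset.
    """
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = CHANGESET_RE.match(line)
        if match:
            current = {"rev": int(match.group(1)),
                       "node": match.group(2),
                       "summary": ""}
            entries.append(current)
        elif current is not None and line.startswith("summary:"):
            current["summary"] = line[len("summary:"):].strip()
    return entries


# Two entries copied from the log above (user/date lines omitted).
SAMPLE = """\
changeset:   7906:5ccd97218ca0
summary:     ruby: Assert for x86 misaligned access

changeset:   7905:00ad807ed2ca
summary:     ruby: x86 fs config support
"""

# Map the tree-local number to the hex ID that is valid in any tree.
rev_to_node = {e["rev"]: e["node"] for e in parse_hg_log(SAMPLE)}
print(rev_to_node[7906])  # 5ccd97218ca0
```

With this, "7905" and "7906" can be reported as 00ad807ed2ca and
5ccd97218ca0, which identify the same changesets in any clone.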

Malek

On Thu, Feb 10, 2011 at 5:35 PM, Gabe Black <[email protected]> wrote:
> Numbers like 7905 are only meaningful in a strict sense in your own tree
> since different trees might number things differently. The longer hex
> value is universal. It's possible the trees are similar enough that
> those would match, but there's no guarantee.
>
> Gabe
>
> On 02/10/11 14:26, Malek Musleh wrote:
>> Hi Brad,
>>
>> I tested the different changesets and have narrowed down to where it begins.
>>
>> The last changeset that works (since 7842) is 7905.
>>
>> At 7906 this is the error:
>>
>> command line: ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt
>> ./configs/example/ruby\
>> _fs.py -n 4 --topology Crossbar
>> Global frequency set at 1000000000000 ticks per second
>> info: kernel located at: /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
>> Listening for system connection on port 3456
>>       0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1 00:00:00 
>> 2009
>> 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
>> 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
>> 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
>> 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
>> **** REAL SIMULATION ****
>> info: Entering event queue @ 0.  Starting simulation...
>> info: Launching CPU 1 @ 835461000
>> info: Launching CPU 2 @ 846156000
>> info: Launching CPU 3 @ 856768000
>> warn: Prefetch instrutions is Alpha do not do anything
>> For more information see: http://www.m5sim.org/warn/3e0eccba
>> 1349195500: system.terminal: attach terminal 0
>> warn: Prefetch instrutions is Alpha do not do anything
>> For more information see: http://www.m5sim.org/warn/3e0eccba
>> m5.opt: build/ALPHA_FS_MOESI_CMP_directory/mem/ruby/system/RubyPort.cc:230:
>> virt\
>> ual bool RubyPort::M5Port::recvTiming(Packet*): Assertion
>> `Address(ruby_request.\
>> paddr).getOffset() + ruby_request.len <=
>> RubySystem::getBlockSizeBytes()' failed\
>> .
>> Program aborted at cycle 2406378289516
>> Aborted
>>
>>
>> The same error occurs for 7907 - 7908.
>>
>> At changeset 7909 is where the dma_expiry error first shows up:
>>
>> 7909:
>>
>> hda: M5 IDE Disk, ATA DISK drive
>> hdb: M5 IDE Disk, ATA DISK drive
>> hda: UDMA/33 mode selected
>> hdb: UDMA/33 mode selected
>> ide0 at 0x8410-0x8417,0x8422 on irq 31
>> ide1 at 0x8418-0x841f,0x8426 on irq 31
>> ide_generic: please use "probe_mask=0x3f" module parameter for probing
>> all legac\
>> y ISA IDE ports
>> ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
>> ide3 at 0x170-0x177,0x376 on irq 15
>> hda: max request size: 128KiB
>> hda: 101808 sectors (52 MB), CHS=101/16/63
>>  hda:<4>hda: dma_timer_expiry: dma status == 0x65
>> hda: DMA interrupt recovery
>> hda: lost interrupt
>>  unknown partition table
>> hdb: max request size: 128KiB
>> hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
>>
>> I also tested changeset 7920, and that's where I notice the
>> handleResponse() error:
>>
>> 7920:
>>
>> M5 compiled Feb 10 2011 14:49:49
>> M5 revision 39c86a8306d2+ 7920+ default
>> M5 started Feb 10 2011 14:53:38
>> M5 executing on sherpa05
>> command line: ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt
>> ./configs/example/ruby\
>> _fs.py -n 4 --topology Crossbar
>> Global frequency set at 1000000000000 ticks per second
>> info: kernel located at: /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
>> Listening for system connection on port 3456
>>       0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1 00:00:00 
>> 2009
>> 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
>> 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
>> 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
>> 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
>> **** REAL SIMULATION ****
>> info: Entering event queue @ 0.  Starting simulation...
>> info: Launching CPU 1 @ 835461000
>> info: Launching CPU 2 @ 846156000
>> info: Launching CPU 3 @ 856768000
>> warn: Prefetch instrutions is Alpha do not do anything
>> For more information see: http://www.m5sim.org/warn/3e0eccba
>> 1128875500: system.terminal: attach terminal 0
>> warn: Prefetch instrutions is Alpha do not do anything
>> For more information see: http://www.m5sim.org/warn/3e0eccba
>> m5.opt: build/ALPHA_FS_MOESI_CMP_directory/mem/packet.hh:590: void
>> Packet::makeResponse(): Assertion `needsResponse()' failed.
>> Program aborted at cycle 36235566500
>> Aborted
>>
>> Note that I have not tested changesets 7911-7918.
>>
>> I have tested the MOESI_CMP_directory protocol on all of these with
>> m5.opt. I have tested MESI_CMP_directory on some of them and
>> got the same messages.
>>
>> This is my command line:
>>
>> ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt -
>> ./configs/example/ruby_fs.py -n 4 --topology Crossbar
>>
>> The error appears about 15 minutes into booting the kernel. Note that
>> it takes a while for the io to be scheduled:
>>
>> io scheduler noop registered
>> io scheduler anticipatory registered
>> io scheduler deadline registered
>> io scheduler cfq registered (default)
>>
>> In all cases though where the dma_expiry occurs (which does not
>> include changesets 7906-7908), the last thing that appears is this:
>>
>> ide0 at 0x8410-0x8417,0x8422 on irq 31
>> ide1 at 0x8418-0x841f,0x8426 on irq 31
>> ide_generic: please use "probe_mask=0x3f" module parameter for probing
>> all legacy ISA IDE ports
>> ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
>> ide3 at 0x170-0x177,0x376 on irq 15
>> hda: max request size: 128KiB
>> hda: 101808 sectors (52 MB), CHS=101/16/63
>>  hda:<4>hda: dma_timer_expiry: dma status == 0x65
>> hda: DMA interrupt recovery
>> hda: lost interrupt
>>  unknown partition table
>> hdb: max request size: 128KiB
>> hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
>>
>> Is it possible to generate a trace for Ruby in M5 the way it is for
>> Ruby in GEMS like something of this sort:
>>
>> http://www.cs.wisc.edu/gems/doc/gems-wiki/moin.cgi/How_do_I_understand_a_Protocol
>>
>> ?
>>
>> Let me know if you need any more information.
>>
>> Malek
>>
>> On Thu, Feb 10, 2011 at 4:43 PM, Beckmann, Brad <[email protected]> wrote:
>>> Hi Malek,
>>>
>>> Hmm...I have never seen that type of error before.  As you mentioned, I 
>>> don't think any of my recent patches changed how DMA is executed for 
>>> ALPHA_FS.
>>>
>>> How long does it take for you to encounter the error?  It would be great if 
>>> you could tell me how I can reproduce the error.  I would like to look at 
>>> this in more detail and get a protocol trace of what is going on.
>>>
>>> Thanks,
>>>
>>> Brad
>>>
>>>
>>>> -----Original Message-----
>>>> From: [email protected] [mailto:[email protected]]
>>>> On Behalf Of Malek Musleh
>>>> Sent: Thursday, February 10, 2011 5:05 AM
>>>> To: M5 Developer List
>>>> Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
>>>>
>>>> Hi Brad,
>>>>
>>>> I tested your latest changeset, and it seems that it 'solves' the
>>>> handleResponse error I was getting when running 3 or more cores, but the
>>>> dma_expiry error is still there.
>>>>
>>>> As a result, the error is now consistent, no matter how many cores I
>>>> try to run with:
>>>>
>>>> For more information see: http://www.m5sim.org/warn/3e0eccba
>>>> panic: Inconsistent DMA transfer state: dmaState = 2 devState = 1  @ cycle
>>>> 62411238889001
>>>> [doDmaTransfer:build/ALPHA_FS_MOESI_CMP_directory/dev/ide_disk.cc,
>>>> line 323] Memory Usage: 382600 KBytes
>>>>
>>>> ------------------------- M5 Terminal -------------------
>>>> hda: max request size: 128KiB
>>>> hda: 101808 sectors (52 MB), CHS=101/16/63
>>>>  hda:<4>hda: dma_timer_expiry: dma status == 0x65
>>>> hda: DMA interrupt recovery
>>>> hda: lost interrupt
>>>>  unknown partition table
>>>> hdb: max request size: 128KiB
>>>> hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
>>>>  hdb:<4>hdb: dma_timer_expiry: dma status == 0x65
>>>> hdb: DMA interrupt recovery
>>>> hdb: lost interrupt
>>>>
>>>> The panic error seems to suggest an inconsistent DMA state, so I tried
>>>> reverting to an older changeset (before DMA changes were pushed out)
>>>> such as 7936, and even 7930 but no such luck.
>>>>
>>>> The changeset that I know works from last week or so is changeset 7842.
>>>> The changeset summaries between 7842 and 7930 seem to indicate
>>>> a lot of changes 'unrelated' to the DMA, such as O3, InOrderCPU, and x86
>>>> changes. That being said, I did not diff those intermediate changesets
>>>> to check whether a related file was slightly modified in the process.
>>>>
>>>> I might be able to spend some more time trying changesets until I narrow
>>>> down which one it's coming from, but maybe the new panic message gives
>>>> you some indication of how to fix it?
>>>>
>>>> (I think the panic message appeared now and not before because I let the
>>>> simulation terminate itself when running overnight, as opposed to
>>>> killing it once I saw the dma_expiry message on the M5 Terminal.)
>>>>
>>>> Malek
>>>>
>>>> On Wed, Feb 9, 2011 at 7:00 PM, Beckmann, Brad
>>>> <[email protected]> wrote:
>>>>> Hi Malek,
>>>>>
>>>>> Yes, thanks for letting us know.  I'm pretty sure I know what the problem
>>>>> is.  Previously, if an SC operation failed, the RubyPort would convert the
>>>>> request packet to a response packet, bypass writing the functional view of
>>>>> memory, and pass it back up to the CPU.  In my most recent patches I
>>>>> generalized the mechanism that converts request packets to response
>>>>> packets and avoids writing functional memory.  However, I forgot to remove
>>>>> the duplicate request-to-response conversion for failed SC
>>>>> requests.  Therefore, I bet you are encountering that assertion error on
>>>>> that duplicate call.  It should be a simple one-line change that fixes your
>>>>> problem.  I'll push it momentarily and it would be great if you could
>>>>> confirm that my change does indeed fix your problem.
>>>>> Brad
>>>>>
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: [email protected] [mailto:[email protected]]
>>>>>> On Behalf Of Gabe Black
>>>>>> Sent: Wednesday, February 09, 2011 3:54 PM
>>>>>> To: M5 Developer List
>>>>>> Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
>>>>>>
>>>>>> Thanks for letting us know. If it wouldn't be too much trouble, could
>>>>>> you please try some other changesets near the one that isn't working
>>>>>> and try to determine which one specifically broke things? A bunch of
>>>>>> changes went in recently so it would be helpful to narrow things
>>>>>> down. I'm not very involved with Ruby right now personally, but I
>>>>>> assume that would be useful information for the people that are.
>>>>>>
>>>>>> Gabe
>>>>>>
>>>>>> On 02/09/11 14:51, Malek Musleh wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I first started using the Ruby Model in M5  about a week or so ago,
>>>>>>> and was able to boot in FS mode (up to 64 cores once applying the
>>>>>>> BigTsunami patches).
>>>>>>>
>>>>>>> In order to keep up with the changes in the Ruby code, I have
>>>>>>> started fetching recent updates from the devrepo.
>>>>>>>
>>>>>>> However, after fetching the recent changesets (from the last 2
>>>>>>> days), Ruby FS does not boot. I tried both MESI_CMP_directory
>>>>>>> and MOESI_CMP_directory.
>>>>>>>
>>>>>>> If running 2 cores or less I get this at the terminal screen after
>>>>>>> letting it run for some time:
>>>>>>>
>>>>>>> hda: M5 IDE Disk, ATA DISK drive
>>>>>>> hdb: M5 IDE Disk, ATA DISK drive
>>>>>>> hda: UDMA/33 mode selected
>>>>>>> hdb: UDMA/33 mode selected
>>>>>>> ide0 at 0x8410-0x8417,0x8422 on irq 31
>>>>>>> ide1 at 0x8418-0x841f,0x8426 on irq 31
>>>>>>> ide_generic: please use "probe_mask=0x3f" module parameter for
>>>>>>> probing all legacy ISA IDE ports
>>>>>>> ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
>>>>>>> ide3 at 0x170-0x177,0x376 on irq 15
>>>>>>> hda: max request size: 128KiB
>>>>>>> hda: 101808 sectors (52 MB), CHS=101/16/63
>>>>>>>  hda:<4>hda: dma_timer_expiry: dma status == 0x65
>>>>>>> <------------------------------------------------------- problem
>>>>>>>
>>>>>>>
>>>>>>> When running 3 or more cores, I get the following assertion failure:
>>>>>>>
>>>>>>>
>>>>>>> info: kernel located at:
>>>>>>> /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
>>>>>>> Listening for system connection on port 3456
>>>>>>>       0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1
>>>>>>> 00:00:00 2009
>>>>>>> 0: system.remote_gdb.listener: listening for remote gdb #0 on port
>>>>>>> 7000
>>>>>>> 0: system.remote_gdb.listener: listening for remote gdb #1 on port
>>>>>>> 7001
>>>>>>> 0: system.remote_gdb.listener: listening for remote gdb #2 on port
>>>>>>> 7002
>>>>>>> 0: system.remote_gdb.listener: listening for remote gdb #3 on port
>>>>>>> 7003
>>>>>>> **** REAL SIMULATION ****
>>>>>>> info: Entering event queue @ 0.  Starting simulation...
>>>>>>> info: Launching CPU 1 @ 834794000
>>>>>>> info: Launching CPU 2 @ 845489000
>>>>>>> info: Launching CPU 3 @ 856101000
>>>>>>> m5.opt: build/ALPHA_FS_MESI_CMP_directory/mem/packet.hh:590: void
>>>>>>> Packet::makeResponse(): Assertion `needsResponse()' failed.
>>>>>>> Program aborted at cycle 977160000
>>>>>>> Aborted
>>>>>>>
>>>>>>> The top of the tree is this last changeset:
>>>>>>>
>>>>>>> changeset:   7939:215c8be67063
>>>>>>> tag:         tip
>>>>>>> user:        Brad Beckmann <[email protected]>
>>>>>>> date:        Tue Feb 08 18:07:54 2011 -0800
>>>>>>> summary:     regess: protocol regression tester updates
>>>>>>>
>>>>>>> I am not sure whether those whom it concerns are aware of it, or
>>>>>>> whether a fix is already in the works, but I figured I would
>>>>>>> bring it to your attention.
>>>>>>>
>>>>>>> Malek
>>>>>>> _______________________________________________
>>>>>>> m5-dev mailing list
>>>>>>> [email protected]
>>>>>>> http://m5sim.org/mailman/listinfo/m5-dev