I should note that I have not tried the Mesh2D topology to see whether
it produces the same error. Also, although I specify Crossbar
explicitly, I believe Crossbar is already the default topology.

Malek

On Thu, Feb 10, 2011 at 5:26 PM, Malek Musleh <[email protected]> wrote:
> Hi Brad,
>
> I tested the different changesets and have narrowed down where the problem begins.
>
> The last changeset that works (since 7842) is 7905.
>
> At 7906 this is the error:
>
> command line: ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt ./configs/example/ruby_fs.py -n 4 --topology Crossbar
> Global frequency set at 1000000000000 ticks per second
> info: kernel located at: /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
> Listening for system connection on port 3456
>      0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1 00:00:00 2009
> 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
> 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
> 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
> 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
> **** REAL SIMULATION ****
> info: Entering event queue @ 0.  Starting simulation...
> info: Launching CPU 1 @ 835461000
> info: Launching CPU 2 @ 846156000
> info: Launching CPU 3 @ 856768000
> warn: Prefetch instrutions is Alpha do not do anything
> For more information see: http://www.m5sim.org/warn/3e0eccba
> 1349195500: system.terminal: attach terminal 0
> warn: Prefetch instrutions is Alpha do not do anything
> For more information see: http://www.m5sim.org/warn/3e0eccba
> m5.opt: build/ALPHA_FS_MOESI_CMP_directory/mem/ruby/system/RubyPort.cc:230:
> virtual bool RubyPort::M5Port::recvTiming(Packet*): Assertion
> `Address(ruby_request.paddr).getOffset() + ruby_request.len <=
> RubySystem::getBlockSizeBytes()' failed.
> Program aborted at cycle 2406378289516
> Aborted
>
>
> The same error occurs for 7907 - 7908.
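For reference, the condition in that RubyPort.cc assertion says that a request must fit inside a single cache block. Below is a minimal Python sketch of the same arithmetic. It is not the M5 code: the 64-byte block size, the example addresses, and the helper names are all assumptions made for illustration.

```python
# Toy model of the asserted condition:
#   Address(paddr).getOffset() + len <= RubySystem::getBlockSizeBytes()
# BLOCK_SIZE_BYTES is an assumed value, not read from any real configuration.
BLOCK_SIZE_BYTES = 64


def block_offset(paddr, block_size=BLOCK_SIZE_BYTES):
    """Byte offset of paddr within its block (block_size must be a power of two)."""
    return paddr & (block_size - 1)


def fits_in_one_block(paddr, length, block_size=BLOCK_SIZE_BYTES):
    """True when the access [paddr, paddr + length) does not cross a block boundary."""
    return block_offset(paddr, block_size) + length <= block_size


if __name__ == "__main__":
    # Aligned 8-byte access: offset 0, stays inside the block.
    print(fits_in_one_block(0x1000, 8))
    # 8-byte access starting 4 bytes before the block end: crosses the boundary.
    print(fits_in_one_block(0x103C, 8))
```

Under these assumptions, a request like the second one is the kind that would trip the assertion; which request actually triggers it at 7906 is not something the log above shows.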
>
> At changeset 7909 is where the dma_expiry error first shows up:
>
> 7909:
>
> hda: M5 IDE Disk, ATA DISK drive
> hdb: M5 IDE Disk, ATA DISK drive
> hda: UDMA/33 mode selected
> hdb: UDMA/33 mode selected
> ide0 at 0x8410-0x8417,0x8422 on irq 31
> ide1 at 0x8418-0x841f,0x8426 on irq 31
> ide_generic: please use "probe_mask=0x3f" module parameter for probing
> all legacy ISA IDE ports
> ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
> ide3 at 0x170-0x177,0x376 on irq 15
> hda: max request size: 128KiB
> hda: 101808 sectors (52 MB), CHS=101/16/63
>  hda:<4>hda: dma_timer_expiry: dma status == 0x65
> hda: DMA interrupt recovery
> hda: lost interrupt
>  unknown partition table
> hdb: max request size: 128KiB
> hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
>
> I also tested changeset 7920, and that's where I notice the
> handleResponse() error:
>
> 7920:
>
> M5 compiled Feb 10 2011 14:49:49
> M5 revision 39c86a8306d2+ 7920+ default
> M5 started Feb 10 2011 14:53:38
> M5 executing on sherpa05
> command line: ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt ./configs/example/ruby_fs.py -n 4 --topology Crossbar
> Global frequency set at 1000000000000 ticks per second
> info: kernel located at: /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
> Listening for system connection on port 3456
>      0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1 00:00:00 2009
> 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
> 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
> 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
> 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
> **** REAL SIMULATION ****
> info: Entering event queue @ 0.  Starting simulation...
> info: Launching CPU 1 @ 835461000
> info: Launching CPU 2 @ 846156000
> info: Launching CPU 3 @ 856768000
> warn: Prefetch instrutions is Alpha do not do anything
> For more information see: http://www.m5sim.org/warn/3e0eccba
> 1128875500: system.terminal: attach terminal 0
> warn: Prefetch instrutions is Alpha do not do anything
> For more information see: http://www.m5sim.org/warn/3e0eccba
> m5.opt: build/ALPHA_FS_MOESI_CMP_directory/mem/packet.hh:590: void
> Packet::makeResponse(): Assertion `needsResponse()' failed.
> Program aborted at cycle 36235566500
> Aborted
>
> Note that I have not tested changesets 7911-7918.
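The untested range could be covered with a binary search rather than one changeset at a time. Here is a minimal Python sketch of that search. The `is_bad` callback is hypothetical: in practice it would mean building and booting a given changeset by hand and reporting whether it fails. The stand-in predicate below only encodes the result already reported here (7905 works, 7906 does not).

```python
def first_bad(last_good, first_known_bad, is_bad):
    """Return the lowest changeset in (last_good, first_known_bad] for which is_bad is true.

    Assumes last_good passes, first_known_bad fails, and every changeset
    after the breaking one also fails.
    """
    lo, hi = last_good, first_known_bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid  # failure already present at mid; look earlier
        else:
            lo = mid  # mid still works; look later
    return hi


def simulated_is_bad(changeset):
    """Stand-in for a manual build-and-boot test (assumption for illustration)."""
    return changeset >= 7906


if __name__ == "__main__":
    print(first_bad(7842, 7939, simulated_is_bad))
```

Since different failures show up at different changesets (the RubyPort assertion at 7906, dma_expiry at 7909), each symptom would need its own search with its own predicate. Mercurial's built-in `hg bisect` command automates the same bookkeeping.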
>
> I have tested the MOESI_CMP_directory protocol on all of these with
> m5.opt. I have also tested MESI_CMP_directory on some of them and
> got the same messages.
>
> This is my command line:
>
> ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt ./configs/example/ruby_fs.py -n 4 --topology Crossbar
>
> The error appears about 15 minutes into the kernel boot. Note that
> it takes a while for the io to be scheduled.
>
> io scheduler noop registered
> io scheduler anticipatory registered
> io scheduler deadline registered
> io scheduler cfq registered (default)
>
> In all cases though where the dma_expiry occurs (which does not
> include changesets 7906-7908), the last thing that appears is this:
>
> ide0 at 0x8410-0x8417,0x8422 on irq 31
> ide1 at 0x8418-0x841f,0x8426 on irq 31
> ide_generic: please use "probe_mask=0x3f" module parameter for probing
> all legacy ISA IDE ports
> ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
> ide3 at 0x170-0x177,0x376 on irq 15
> hda: max request size: 128KiB
> hda: 101808 sectors (52 MB), CHS=101/16/63
>  hda:<4>hda: dma_timer_expiry: dma status == 0x65
> hda: DMA interrupt recovery
> hda: lost interrupt
>  unknown partition table
> hdb: max request size: 128KiB
> hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
>
> Is it possible to generate a trace for Ruby in M5 the way it is for
> Ruby in GEMS, something like this?
>
> http://www.cs.wisc.edu/gems/doc/gems-wiki/moin.cgi/How_do_I_understand_a_Protocol
>
> Let me know if you need any more information.
>
> Malek
>
> On Thu, Feb 10, 2011 at 4:43 PM, Beckmann, Brad <[email protected]> wrote:
>> Hi Malek,
>>
>> Hmm...I have never seen that type of error before.  As you mentioned, I 
>> don't think any of my recent patches changed how DMA is executed for 
>> ALPHA_FS.
>>
>> How long does it take for you to encounter the error?  It would be great if 
>> you could tell me how I can reproduce the error.  I would like to look at 
>> this in more detail and get a protocol trace of what is going on.
>>
>> Thanks,
>>
>> Brad
>>
>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]]
>>> On Behalf Of Malek Musleh
>>> Sent: Thursday, February 10, 2011 5:05 AM
>>> To: M5 Developer List
>>> Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
>>>
>>> Hi Brad,
>>>
>>> I tested your latest changeset, and it seems that it 'solves' the
>>> handleResponse error I was getting when running 3 or more cores, but the
>>> dma_expiry error is still there.
>>>
>>> As a result, the error is now consistent no matter how many cores I
>>> run with:
>>>
>>> For more information see: http://www.m5sim.org/warn/3e0eccba
>>> panic: Inconsistent DMA transfer state: dmaState = 2 devState = 1  @ cycle
>>> 62411238889001
>>> [doDmaTransfer:build/ALPHA_FS_MOESI_CMP_directory/dev/ide_disk.cc,
>>> line 323] Memory Usage: 382600 KBytes
>>>
>>> ------------------------- M5 Terminal -------------------
>>> hda: max request size: 128KiB
>>> hda: 101808 sectors (52 MB), CHS=101/16/63
>>>  hda:<4>hda: dma_timer_expiry: dma status == 0x65
>>> hda: DMA interrupt recovery
>>> hda: lost interrupt
>>>  unknown partition table
>>> hdb: max request size: 128KiB
>>> hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
>>>  hdb:<4>hdb: dma_timer_expiry: dma status == 0x65
>>> hdb: DMA interrupt recovery
>>> hdb: lost interrupt
>>>
>>> The panic error seems to suggest an inconsistent DMA state, so I tried
>>> reverting to an older changeset (before DMA changes were pushed out)
>>> such as 7936, and even 7930 but no such luck.
>>>
>>> The changeset that I know works from last week or so is changeset 7842.
>>> The changeset summaries between 7842 and 7930 seem to indicate a lot of
>>> changes 'unrelated' to the DMA, such as O3, InOrderCPU, and x86 changes.
>>> That being said, I did not diff those intermediate changesets to verify
>>> whether a related file was modified in the process.
>>>
>>> I might be able to spend some more time trying changesets until I narrow
>>> down which one it's coming from, but maybe the new panic message will
>>> give you some indication of how to fix it?
>>>
>>> (I think the panic message appeared now and not before because I let the
>>> simulation terminate itself when running overnight, as opposed to killing
>>> it once I saw the dma_expiry message on the M5 Terminal.)
>>>
>>> Malek
>>>
>>> On Wed, Feb 9, 2011 at 7:00 PM, Beckmann, Brad
>>> <[email protected]> wrote:
>>> > Hi Malek,
>>> >
>>> > Yes, thanks for letting us know.  I'm pretty sure I know what the
>>> > problem is.  Previously, if an SC operation failed, the RubyPort would
>>> > convert the request packet to a response packet, bypass writing the
>>> > functional view of memory, and pass it back up to the CPU.  In my most
>>> > recent patches I generalized the mechanism that converts request
>>> > packets to response packets and avoids writing functional memory.
>>> > However, I forgot to remove the duplicate request-to-response
>>> > conversion for failed SC requests.  Therefore, I bet you are
>>> > encountering that assertion error on that duplicate call.  It should
>>> > be a simple one-line change that fixes your problem.  I'll push it
>>> > momentarily, and it would be great if you could confirm that my change
>>> > does indeed fix your problem.
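To make the duplicate-conversion explanation concrete, here is a toy Python model of it. This is a sketch only, not the M5 `Packet` class or the RubyPort code: `ToyPacket`, `complete_failed_sc`, and the `duplicate_conversion` flag are hypothetical names, and the assumption is that a packet stops needing a response once it has been converted to one.

```python
class ToyPacket:
    """Hypothetical stand-in for a memory packet; starts life as a request."""

    def __init__(self):
        self.is_request = True

    def needs_response(self):
        # Assumption: only a packet that is still a request needs a response.
        return self.is_request

    def make_response(self):
        # Mirrors the failing check: Assertion `needsResponse()' failed.
        if not self.needs_response():
            raise AssertionError("needsResponse() failed")
        self.is_request = False


def complete_failed_sc(pkt, duplicate_conversion):
    """Finish a failed SC request; optionally keep the leftover extra conversion."""
    if duplicate_conversion:
        pkt.make_response()  # leftover SC-specific conversion
    pkt.make_response()      # generalized request-to-response conversion
    return pkt


if __name__ == "__main__":
    complete_failed_sc(ToyPacket(), duplicate_conversion=False)
    print("single conversion: ok")
    try:
        complete_failed_sc(ToyPacket(), duplicate_conversion=True)
    except AssertionError as err:
        print("double conversion:", err)
```

In this model, removing the extra call is the one-line fix: the packet is converted exactly once and the check passes.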
>>> >
>>> > Brad
>>> >
>>> >
>>> >
>>> >> -----Original Message-----
>>> >> From: [email protected] [mailto:m5-dev-[email protected]] On
>>> >> Behalf Of Gabe Black
>>> >> Sent: Wednesday, February 09, 2011 3:54 PM
>>> >> To: M5 Developer List
>>> >> Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
>>> >>
>>> >> Thanks for letting us know. If it wouldn't be too much trouble, could
>>> >> you please try some other changesets near the one that isn't working
>>> >> and try to determine which one specifically broke things? A bunch of
>>> >> changes went in recently so it would be helpful to narrow things
>>> >> down. I'm not very involved with Ruby right now personally, but I
>>> >> assume that would be useful information for the people that are.
>>> >>
>>> >> Gabe
>>> >>
>>> >> On 02/09/11 14:51, Malek Musleh wrote:
>>> >> > Hello,
>>> >> >
>>> >> > I first started using the Ruby Model in M5 about a week or so ago,
>>> >> > and was able to boot in FS mode (up to 64 cores after applying the
>>> >> > BigTsunami patches).
>>> >> >
>>> >> > In order to keep up with the changes in the Ruby code, I have
>>> >> > started fetching recent updates from the devrepo.
>>> >> >
>>> >> > However, after fetching the recent changesets (from the last 2
>>> >> > days), Ruby FS no longer boots. I tried both MESI_CMP_directory
>>> >> > and MOESI_CMP_directory.
>>> >> >
>>> >> > When running 2 cores or fewer, I get this at the terminal screen
>>> >> > after letting it run for some time:
>>> >> >
>>> >> > hda: M5 IDE Disk, ATA DISK drive
>>> >> > hdb: M5 IDE Disk, ATA DISK drive
>>> >> > hda: UDMA/33 mode selected
>>> >> > hdb: UDMA/33 mode selected
>>> >> > ide0 at 0x8410-0x8417,0x8422 on irq 31
>>> >> > ide1 at 0x8418-0x841f,0x8426 on irq 31
>>> >> > ide_generic: please use "probe_mask=0x3f" module parameter for
>>> >> > probing all legacy ISA IDE ports
>>> >> > ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
>>> >> > ide3 at 0x170-0x177,0x376 on irq 15
>>> >> > hda: max request size: 128KiB
>>> >> > hda: 101808 sectors (52 MB), CHS=101/16/63
>>> >> >  hda:<4>hda: dma_timer_expiry: dma status == 0x65
>>> >> > <------------------------------------------------------- problem
>>> >> >
>>> >> >
>>> >> > When running 3 or more cores, I get the following assertion failure:
>>> >> >
>>> >> >
>>> >> > info: kernel located at: /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
>>> >> > Listening for system connection on port 3456
>>> >> >       0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1 00:00:00 2009
>>> >> > 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
>>> >> > 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
>>> >> > 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
>>> >> > 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
>>> >> > **** REAL SIMULATION ****
>>> >> > info: Entering event queue @ 0.  Starting simulation...
>>> >> > info: Launching CPU 1 @ 834794000
>>> >> > info: Launching CPU 2 @ 845489000
>>> >> > info: Launching CPU 3 @ 856101000
>>> >> > m5.opt: build/ALPHA_FS_MESI_CMP_directory/mem/packet.hh:590: void
>>> >> > Packet::makeResponse(): Assertion `needsResponse()' failed.
>>> >> > Program aborted at cycle 977160000
>>> >> > Aborted
>>> >> >
>>> >> > The top of the tree is this last changeset:
>>> >> >
>>> >> > changeset:   7939:215c8be67063
>>> >> > tag:         tip
>>> >> > user:        Brad Beckmann <[email protected]>
>>> >> > date:        Tue Feb 08 18:07:54 2011 -0800
>>> >> > summary:     regess: protocol regression tester updates
>>> >> >
>>> >> > I am not sure whether those concerned are already aware of this,
>>> >> > or whether an updated changeset is already in the works, but I
>>> >> > figured I would bring it to your attention.
>>> >> >
>>> >> > Malek
>>> >> > _______________________________________________
>>> >> > m5-dev mailing list
>>> >> > [email protected]
>>> >> > http://m5sim.org/mailman/listinfo/m5-dev