I should note that I did not try the Mesh2D configuration to see whether it produces the same error. Also, although I specify the Crossbar topology explicitly, I believe Crossbar is already the default.
Malek

On Thu, Feb 10, 2011 at 5:26 PM, Malek Musleh <[email protected]> wrote:
> Hi Brad,
>
> I tested the different changesets and have narrowed down to where it begins.
>
> The last changeset that works (since 7842) is 7905.
>
> At 7906 this is the error:
>
> command line: ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt
> ./configs/example/ruby_fs.py -n 4 --topology Crossbar
> Global frequency set at 1000000000000 ticks per second
> info: kernel located at: /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
> Listening for system connection on port 3456
> 0: system.tsunami.io.rtc: Real-time clock set to Thu Jan 1 00:00:00 2009
> 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
> 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
> 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
> 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
> **** REAL SIMULATION ****
> info: Entering event queue @ 0. Starting simulation...
> info: Launching CPU 1 @ 835461000
> info: Launching CPU 2 @ 846156000
> info: Launching CPU 3 @ 856768000
> warn: Prefetch instrutions is Alpha do not do anything
> For more information see: http://www.m5sim.org/warn/3e0eccba
> 1349195500: system.terminal: attach terminal 0
> warn: Prefetch instrutions is Alpha do not do anything
> For more information see: http://www.m5sim.org/warn/3e0eccba
> m5.opt: build/ALPHA_FS_MOESI_CMP_directory/mem/ruby/system/RubyPort.cc:230:
> virtual bool RubyPort::M5Port::recvTiming(Packet*): Assertion
> `Address(ruby_request.paddr).getOffset() + ruby_request.len <=
> RubySystem::getBlockSizeBytes()' failed.
> Program aborted at cycle 2406378289516
> Aborted
>
>
> The same error occurs for 7907 - 7908.
>
> Changeset 7909 is where the dma_expiry error first shows up:
>
> 7909:
>
> hda: M5 IDE Disk, ATA DISK drive
> hdb: M5 IDE Disk, ATA DISK drive
> hda: UDMA/33 mode selected
> hdb: UDMA/33 mode selected
> ide0 at 0x8410-0x8417,0x8422 on irq 31
> ide1 at 0x8418-0x841f,0x8426 on irq 31
> ide_generic: please use "probe_mask=0x3f" module parameter for probing
> all legacy ISA IDE ports
> ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
> ide3 at 0x170-0x177,0x376 on irq 15
> hda: max request size: 128KiB
> hda: 101808 sectors (52 MB), CHS=101/16/63
> hda:<4>hda: dma_timer_expiry: dma status == 0x65
> hda: DMA interrupt recovery
> hda: lost interrupt
> unknown partition table
> hdb: max request size: 128KiB
> hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
>
> I tested changeset 7920, and that's where I notice the handleResponse()
> error:
>
> 7920:
>
> M5 compiled Feb 10 2011 14:49:49
> M5 revision 39c86a8306d2+ 7920+ default
> M5 started Feb 10 2011 14:53:38
> M5 executing on sherpa05
> command line: ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt
> ./configs/example/ruby_fs.py -n 4 --topology Crossbar
> Global frequency set at 1000000000000 ticks per second
> info: kernel located at: /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
> Listening for system connection on port 3456
> 0: system.tsunami.io.rtc: Real-time clock set to Thu Jan 1 00:00:00 2009
> 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
> 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
> 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
> 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
> **** REAL SIMULATION ****
> info: Entering event queue @ 0. Starting simulation...
> info: Launching CPU 1 @ 835461000
> info: Launching CPU 2 @ 846156000
> info: Launching CPU 3 @ 856768000
> warn: Prefetch instrutions is Alpha do not do anything
> For more information see: http://www.m5sim.org/warn/3e0eccba
> 1128875500: system.terminal: attach terminal 0
> warn: Prefetch instrutions is Alpha do not do anything
> For more information see: http://www.m5sim.org/warn/3e0eccba
> m5.opt: build/ALPHA_FS_MOESI_CMP_directory/mem/packet.hh:590: void
> Packet::makeResponse(): Assertion `needsResponse()' failed.
> Program aborted at cycle 36235566500
> Aborted
>
> Note that I have not tested changesets 7911-7918.
>
> I have tested the MOESI_CMP_directory protocol on all of these with
> m5.opt. I have tested MESI_CMP_directory on some of them and
> got the same messages.
>
> This is my command line:
>
> ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt -
> ./configs/example/ruby_fs.py -n 4 --topology Crossbar
>
> The error comes about 15 minutes into booting the kernel. Note that
> it takes a while for the io to be scheduled.
>
> io scheduler noop registered
> io scheduler anticipatory registered
> io scheduler deadline registered
> io scheduler cfq registered (default)
>
> In all cases where the dma_expiry occurs, though (which does not
> include changesets 7906-7908), the last thing that appears is this:
>
> ide0 at 0x8410-0x8417,0x8422 on irq 31
> ide1 at 0x8418-0x841f,0x8426 on irq 31
> ide_generic: please use "probe_mask=0x3f" module parameter for probing
> all legacy ISA IDE ports
> ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
> ide3 at 0x170-0x177,0x376 on irq 15
> hda: max request size: 128KiB
> hda: 101808 sectors (52 MB), CHS=101/16/63
> hda:<4>hda: dma_timer_expiry: dma status == 0x65
> hda: DMA interrupt recovery
> hda: lost interrupt
> unknown partition table
> hdb: max request size: 128KiB
> hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
>
> Is it possible to generate a trace for Ruby in M5 the way it is for
> Ruby in GEMS, something of this sort?
>
> http://www.cs.wisc.edu/gems/doc/gems-wiki/moin.cgi/How_do_I_understand_a_Protocol
>
> Let me know if you need any more information.
>
> Malek
>
> On Thu, Feb 10, 2011 at 4:43 PM, Beckmann, Brad <[email protected]> wrote:
>> Hi Malek,
>>
>> Hmm...I have never seen that type of error before. As you mentioned, I
>> don't think any of my recent patches changed how DMA is executed for
>> ALPHA_FS.
>>
>> How long does it take for you to encounter the error? It would be great if
>> you could tell me how I can reproduce the error. I would like to look at
>> this in more detail and get a protocol trace of what is going on.
>>
>> Thanks,
>>
>> Brad
>>
>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]]
>>> On Behalf Of Malek Musleh
>>> Sent: Thursday, February 10, 2011 5:05 AM
>>> To: M5 Developer List
>>> Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
>>>
>>> Hi Brad,
>>>
>>> I tested your latest changeset, and it seems that it 'solves' the
>>> handleResponse error I was getting when running 3 or more cores, but the
>>> dma_expiry error is still there.
>>>
>>> As a result, the error is now consistent, no matter how many cores I try
>>> to run with:
>>>
>>> For more information see: http://www.m5sim.org/warn/3e0eccba
>>> panic: Inconsistent DMA transfer state: dmaState = 2 devState = 1 @ cycle
>>> 62411238889001
>>> [doDmaTransfer:build/ALPHA_FS_MOESI_CMP_directory/dev/ide_disk.cc,
>>> line 323] Memory Usage: 382600 KBytes
>>>
>>> ------------------------- M5 Terminal -------------------
>>> hda: max request size: 128KiB
>>> hda: 101808 sectors (52 MB), CHS=101/16/63
>>> hda:<4>hda: dma_timer_expiry: dma status == 0x65
>>> hda: DMA interrupt recovery
>>> hda: lost interrupt
>>> unknown partition table
>>> hdb: max request size: 128KiB
>>> hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
>>> hdb:<4>hdb: dma_timer_expiry: dma status == 0x65
>>> hdb: DMA interrupt recovery
>>> hdb: lost interrupt
>>>
>>> The panic error seems to suggest an inconsistent DMA state, so I tried
>>> reverting to an older changeset (before DMA changes were pushed out)
>>> such as 7936, and even 7930, but no such luck.
>>>
>>> The changeset that I know works from last week or so is changeset 7842.
>>> Looking at the changeset summaries between 7842 and 7930 seems to indicate
>>> a lot of changes 'unrelated' to the DMA, such as O3, InOrderCPU, and x86
>>> changes. That being said, I did not do a diff on those intermediate
>>> changesets to check whether a related file was slightly modified in the
>>> process.
>>>
>>> I might be able to spend some more time trying changesets till I narrow down
>>> which one it's coming from, but maybe the new panic message might give
>>> you some indication on how to fix it?
>>>
>>> (I think the panic message appeared now and not before because I let the
>>> simulation terminate itself when running overnight, as opposed to me killing
>>> it once I saw the dma_expiry message on the M5 Terminal.)
>>>
>>> Malek
>>>
>>> On Wed, Feb 9, 2011 at 7:00 PM, Beckmann, Brad
>>> <[email protected]> wrote:
>>> > Hi Malek,
>>> >
>>> > Yes, thanks for letting us know. I'm pretty sure I know what the
>>> > problem is. Previously, if a SC operation failed, the RubyPort would
>>> > convert the request packet to a response packet, bypass writing the
>>> > functional view of memory, and pass it back up to the CPU. In my most
>>> > recent patches I generalized the mechanism that converts request
>>> > packets to response packets and avoids writing functional memory.
>>> > However, I forgot to remove the duplicate request to response
>>> > conversion for failed SC requests. Therefore, I bet you are
>>> > encountering that assertion error on that duplicate call. It should be
>>> > a simple one line change that fixes your problem. I'll push it
>>> > momentarily and it would be great if you could confirm that my change
>>> > does indeed fix your problem.
>>> >
>>> > Brad
>>> >
>>> >
>>> >
>>> >> -----Original Message-----
>>> >> From: [email protected] [mailto:m5-dev-
>>> >> [email protected]] On
>>> >> Behalf Of Gabe Black
>>> >> Sent: Wednesday, February 09, 2011 3:54 PM
>>> >> To: M5 Developer List
>>> >> Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
>>> >>
>>> >> Thanks for letting us know. If it wouldn't be too much trouble, could
>>> >> you please try some other changesets near the one that isn't working
>>> >> and try to determine which one specifically broke things? A bunch of
>>> >> changes went in recently so it would be helpful to narrow things
>>> >> down.
>>> >> I'm not very involved with Ruby right now personally, but I
>>> >> assume that would be useful information for the people that are.
>>> >>
>>> >> Gabe
>>> >>
>>> >> On 02/09/11 14:51, Malek Musleh wrote:
>>> >> > Hello,
>>> >> >
>>> >> > I first started using the Ruby Model in M5 about a week or so ago,
>>> >> > and was able to boot in FS mode (up to 64 cores after applying the
>>> >> > BigTsunami patches).
>>> >> >
>>> >> > In order to keep up with the changes in the Ruby code, I have
>>> >> > started fetching recent updates from the devrepo.
>>> >> >
>>> >> > However, after fetching the recent changesets (from the
>>> >> > last 2 days), Ruby FS does not boot. I tried both MESI_CMP_directory
>>> >> > and MOESI_CMP_directory.
>>> >> >
>>> >> > If running 2 cores or fewer I get this at the terminal screen after
>>> >> > letting it run for some time:
>>> >> >
>>> >> > hda: M5 IDE Disk, ATA DISK drive
>>> >> > hdb: M5 IDE Disk, ATA DISK drive
>>> >> > hda: UDMA/33 mode selected
>>> >> > hdb: UDMA/33 mode selected
>>> >> > ide0 at 0x8410-0x8417,0x8422 on irq 31
>>> >> > ide1 at 0x8418-0x841f,0x8426 on irq 31
>>> >> > ide_generic: please use "probe_mask=0x3f" module parameter for
>>> >> > probing all legacy ISA IDE ports
>>> >> > ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
>>> >> > ide3 at 0x170-0x177,0x376 on irq 15
>>> >> > hda: max request size: 128KiB
>>> >> > hda: 101808 sectors (52 MB), CHS=101/16/63
>>> >> > hda:<4>hda: dma_timer_expiry: dma status == 0x65
>>> >> > <------------------------------------------------------- problem
>>> >> >
>>> >> >
>>> >> > When running 3 or more cores, I get the following assertion failure:
>>> >> >
>>> >> >
>>> >> > info: kernel located at:
>>> >> > /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
>>> >> > Listening for system connection on port 3456
>>> >> > 0: system.tsunami.io.rtc: Real-time clock set to Thu Jan 1
>>> >> > 00:00:00 2009
>>> >> > 0: system.remote_gdb.listener: listening for remote gdb #0 on port
>>> >> > 7000
>>> >> > 0: system.remote_gdb.listener: listening for remote gdb #1 on port
>>> >> > 7001
>>> >> > 0: system.remote_gdb.listener: listening for remote gdb #2 on port
>>> >> > 7002
>>> >> > 0: system.remote_gdb.listener: listening for remote gdb #3 on port
>>> >> > 7003
>>> >> > **** REAL SIMULATION ****
>>> >> > info: Entering event queue @ 0. Starting simulation...
>>> >> > info: Launching CPU 1 @ 834794000
>>> >> > info: Launching CPU 2 @ 845489000
>>> >> > info: Launching CPU 3 @ 856101000
>>> >> > m5.opt: build/ALPHA_FS_MESI_CMP_directory/mem/packet.hh:590: void
>>> >> > Packet::makeResponse(): Assertion `needsResponse()' failed.
>>> >> > Program aborted at cycle 977160000
>>> >> > Aborted
>>> >> >
>>> >> > The top of the tree is this last changeset:
>>> >> >
>>> >> > changeset: 7939:215c8be67063
>>> >> > tag: tip
>>> >> > user: Brad Beckmann <[email protected]>
>>> >> > date: Tue Feb 08 18:07:54 2011 -0800
>>> >> > summary: regess: protocol regression tester updates
>>> >> >
>>> >> > I am not sure whether those concerned are already aware of this,
>>> >> > or whether an updated changeset is already in the works, but I
>>> >> > figured I would bring it to your attention.
>>> >> >
>>> >> > Malek
>>> >> > _______________________________________________
>>> >> > m5-dev mailing list
>>> >> > [email protected]
>>> >> > http://m5sim.org/mailman/listinfo/m5-dev
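
For anyone reading the RubyPort assertion quoted in the 7906 log, the condition it checks can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name fits_in_block and the 64-byte default block size are hypothetical and are not taken from the thread or from the M5 source. The only thing borrowed from the log is the shape of the asserted expression (offset within the block plus request length must not exceed the block size).

```python
def fits_in_block(paddr: int, length: int, block_size: int = 64) -> bool:
    """Return True if the request [paddr, paddr + length) stays in one block.

    Mirrors the shape of the assertion in the log:
        offset_within_block + request_length <= block_size
    block_size=64 is an assumed value, not one reported in the thread.
    """
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a positive power of two")
    # For a power-of-two block size, the low bits of the address are the
    # offset of the request inside its block.
    offset = paddr & (block_size - 1)
    return offset + length <= block_size


if __name__ == "__main__":
    # Hypothetical addresses, chosen only to exercise the check.
    for paddr, length in [(0x1000, 64), (0x1038, 8), (0x103C, 8), (0x1000, 128)]:
        print(hex(paddr), length, fits_in_block(paddr, length))
```

Under these assumptions, an 8-byte access starting at offset 60 of a 64-byte block would trip the check, as would any single request longer than one block.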
