Re: ata "fallback to PIO mode" on dual processor AMD systems
On Thu, 2 Jan 2003, Bruce Campbell wrote:

> I've manually set:
>
> atacontrol mode 0 UDMA33 UDMA33
>
> and the problem has not recurred.

That sort of hints that there's some issue with the cabling, as UDMA33 is the highest you can go on a 40-wire IDE cable. Going beyond requires an 80-wire cable (and no longer than 450mm/18" as I recall).

--
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: [EMAIL PROTECTED] (pref) | Snail: PO Box 370
        [EMAIL PROTECTED] (alt)  |        Belconnen ACT 2616
Web:    http://www.andymac.org/  |        Australia

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
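For anyone wanting to sanity-check a setting against the cable rule above, here is a minimal sketch. It is only an illustration of the 40-wire/80-wire limit, not anything the ata driver or atacontrol does; the names `UDMA_ORDER`, `CABLE_LIMIT` and `highest_safe_mode` are made up for the example, and only the modes mentioned in this thread are listed.

```python
# Transfer modes named in this thread, slowest to fastest.
UDMA_ORDER = ["UDMA33", "UDMA66", "UDMA100"]

# Fastest mode each cable type is rated for: 40-wire cables stop at UDMA33,
# anything faster needs the 80-conductor cable.
CABLE_LIMIT = {40: "UDMA33", 80: "UDMA100"}


def highest_safe_mode(requested, conductors):
    """Return the requested mode, capped at what the cable is rated for."""
    if conductors not in CABLE_LIMIT:
        raise ValueError("cable must have 40 or 80 conductors")
    if requested not in UDMA_ORDER:
        raise ValueError("unknown mode: %s" % requested)
    limit = CABLE_LIMIT[conductors]
    # Pick whichever of the two comes first in the slowest-to-fastest list.
    return min(requested, limit, key=UDMA_ORDER.index)


if __name__ == "__main__":
    print(highest_safe_mode("UDMA100", 40))
    print(highest_safe_mode("UDMA100", 80))
```

If the capped mode is lower than what the drive negotiated at boot, the cable is a plausible suspect, though it proves nothing by itself.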
Re: ata "fallback to PIO mode" on dual processor AMD systems
On Sun, Jan 05, 2003 at 03:02:46PM +0100, Francesco Casadei wrote:
> [snip]
> Yesterday I checked the drive ad6 with the Drive Fitness Test program
> from IBM. Both quick and advanced test returned that the drive is ok.
> I then ran the test against ad0 (the backup drive): the quick test
> showed that the drive was defective because of "Excessive Shock".
> Re-executing the test gave the same result. I rebooted the system and
> disabled the S.M.A.R.T. option for the drive attached to the
> motherboard's controller (i.e. the backup drive). Re-executing the
> quick test showed that the drive is ok!
>
> After 16 hours of uptime and one level-0 file system dump all drives
> are still using UDMA100.
>
> If for some reason the system falls back again to PIO4 mode I will try
> to remove the two following options from the kernel:
>
> # ISA optimization
> options AUTO_EOI_1
> options AUTO_EOI_2
>
> If the problem still isn't solved then I will try, in order, the
> following:
> - disable tagged queuing
> - buy different hardware!
>
> Francesco Casadei
> --
> You can download my public key from http://digilander.libero.it/fcasadei/
> or retrieve it from a keyserver (pgpkeys.mit.edu, wwwkeys.pgp.net, ...)
>
> Key fingerprint is: 1671 9A23 ACB4 520A E7EE 00B0 7EC3 375F 164E B17B
>
> end of the original message

Disabling S.M.A.R.T. capability on ad0 did not solve the problem :(

After ~5 days of uptime:

Jan 9 05:39:34 zeus /kernel: ad6: SERVICE timeout tag=24 s=c0 e=04
Jan 9 05:39:44 zeus /kernel: ad6: invalidating queued requests
Jan 9 05:39:44 zeus /kernel: ad6: timeout sending command=00 s=c0 e=04
Jan 9 05:39:44 zeus /kernel: ad6: flush queue failed
Jan 9 05:39:44 zeus /kernel: ad6: timeout sending command=c7 s=c0 e=04
Jan 9 05:39:44 zeus /kernel: ad6: error executing commandad6: invalidating queued requests
Jan 9 05:39:44 zeus /kernel: ad6: timeout sending command=00 s=c0 e=04
Jan 9 05:39:44 zeus /kernel: ad6: flush queue failed
Jan 9 05:39:44 zeus /kernel: - resetting
Jan 9 05:39:44 zeus /kernel: ata3: resetting devices .. ad6: invalidating queued requests
Jan 9 05:39:44 zeus /kernel: done
Jan 9 05:39:44 zeus /kernel: ad6: no request for tag=1
Jan 9 05:39:44 zeus /kernel: ad6: invalidating queued requests
Jan 9 05:39:34 zeus apcsmart[159]: Serial port read timed out
Jan 9 05:39:44 zeus upsd[162]: Data for UPS [Back-UPS_PRO_650] is stale - check support module (shm_ctime too old)
Jan 9 05:39:44 zeus upsmon[166]: Poll UPS [Back-UPS_Pro_650@localhost] failed - Data stale
Jan 9 05:39:44 zeus /kernel:
Jan 9 05:39:44 zeus upsmon[166]: Poll UPS [Back-UPS_Pro_650@localhost] failed - Data stale
Jan 9 05:39:44 zeus upsd[162]: Host 127.0.0.1 disconnected
Jan 9 05:39:44 zeus upsmon[166]: Communications with UPS Back-UPS_Pro_650@localhost lost
Jan 9 05:39:44 zeus apcsmart[159]: Serial port read ok again
Jan 9 05:39:46 zeus upsd[162]: Data for UPS [Back-UPS_PRO_650] is now OK
Jan 9 05:39:46 zeus upsd[162]: Data source for UPS [Back-UPS_PRO_650]: SHM (65536)
Jan 9 05:39:49 zeus upsd[162]: Connection from 127.0.0.1
Jan 9 05:39:49 zeus upsmon[166]: Communications with UPS Back-UPS_Pro_650@localhost established
Jan 9 05:39:49 zeus upsd[162]: Client 127.0.0.1 logged into UPS [Back-UPS_Pro_650]
Jan 9 05:39:54 zeus /kernel: ad6: READ command timeout tag=1 serv=0 - resetting
Jan 9 05:40:15 zeus /kernel: ad6: invalidating queued requests
Jan 9 05:40:15 zeus /kernel: ata3: resetting devices .. ad6: invalidating queued requests
Jan 9 05:40:15 zeus /kernel: done
Jan 9 05:40:15 zeus /kernel: ad6: READ command timeout tag=0 serv=1 - resetting
Jan 9 05:40:15 zeus /kernel: ad6: invalidating queued requests
Jan 9 05:40:15 zeus /kernel: ata3: resetting devices .. ad6: invalidating queued requests
Jan 9 05:40:15 zeus /kernel: done
Jan 9 05:40:15 zeus /kernel: ad6: timeout waiting for READY
Jan 9 05:40:15 zeus /kernel: ad6: invalidating queued requests
Jan 9 05:40:15 zeus /kernel: ad6: timeout sending command=00 s=d0 e=04
Jan 9 05:40:15 zeus /kernel: ad6: flush queue failed
Jan 9 05:40:15 zeus /kernel: - resetting
Jan 9 05:40:15 zeus /kernel: ata3: resetting devices .. ad6: invalidating queued requests
Jan 9 05:40:15 zeus /kernel: done
Jan 9 05:40:15 zeus /kernel: ad6: READ command timeout tag=1 serv=0 - resetting
Jan 9 05:40:15 zeus /kernel: ad6: invalidating queued requests
Jan 9 05:40:15 zeus /kernel: ata3: resetting devices .. ad6: invalidating queued requests
Jan 9 05:40:15 zeus /kernel: done
Jan 9 05:40:15 zeus /kernel: ad6: READ command timeout tag=0 serv=1 - resetting
Jan 9 05:40:15 zeus /kernel: ad6: invalidating queued requests
Jan 9 05:40:15 zeus /kernel: ata3: resetting devices .. ad6: invalidating queued requests
Jan 9 05:40:15 zeus /kernel: done
Jan 9 05:40:15 zeus /kernel: ad6: no request for tag=0
Jan 9 05:40:15 zeus /kernel: ad6: invalidating queued requests
Jan 9 05:40:15 zeus /kernel: ad6: WRITE command timeout tag=0 serv=0 - resetting
Jan 9
Re: ata "fallback to PIO mode" on dual processor AMD systems
This article from The Register may be of interest:

http://www.theregister.co.uk/content/3/18267.html

It talks about a bug in the VIA 686B Southbridge chipset that can cause data corruption when processing large amounts of data.

Guy

--
Guy Dawson, I.T. Manager
Crossflight Ltd
[EMAIL PROTECTED]
07973 797819  01753 776104

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-questions" in the body of the message
Re: ata "fallback to PIO mode" on dual processor AMD systems
This would be legacy behaviour from the days of buggy ATA33/UDMA implementations, where falling back to PIO mode would allow a device with a buggy UDMA implementation (unfortunately rather common at the time) to function.

--Adam

- Original Message -
From: "Bruce Campbell" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Sunday, January 05, 2003 10:01 PM
Subject: Re: ata "fallback to PIO mode" on dual processor AMD systems

> Quoting Bruce Campbell <[EMAIL PROTECTED]>:
>
> > Quoting Matthew Emmerton <[EMAIL PROTECTED]>:
> >
> > > [ cc'ing Soren since he's the ATA guru ]
> > >
> > > > Dec 30 23:27:00 ecserv13 /kernel: ad0: trying fallback to PIO mode
> > > > Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> > > >
> > > > The test continues to run with the ata controller in PIO mode, with
> > > > slower performance, and higher load average.
> > > >
> > > > Once the master drops to PIO, attempts to access the slave then cause
> > > > it to drop to PIO.
> > >
> > > Are you using 80-conductor cables on all your drives? These are
> > > required to get consistent high throughput, and running without them
> > > may cause the problems you're seeing.
> >
> > Thanks for the information about the design of IDE etc, and the suggestion
> > about the cables. I was about to shuffle things to get the disks
> > onto separate channels, but I now see that would be a mistake as my
> > CD drive would share a cable with a disk.
>
> ps. As an aside, I have since determined that putting a PIO device and
> a UDMA device on the same channel does not affect the performance
> of the UDMA device, unless the PIO device is in use. So, sharing
> a low-use CD-ROM drive with a disk wouldn't be so bad.
>
> I am puzzled about the fallback to PIO concept. If a disk
> gives some sort of timeout error or whatever, why would trying
> PIO correct the problem? That seems equivalent to asking the
> disk to do the same thing, just more slowly.
>
> In my case, some sort of timeout error occurs on ad0, so
> it falls back to PIO, and works. A later access to ad1
> also yields a timeout error, and then it drops to PIO,
> and works too. I'm fairly confident both disks did not
> experience media errors at the same time, which suggests
> a problem with the onboard IDE controller, or a driver bug.
>
> Tests continue...
>
> This mail sent through www.mywaterloo.ca
Re: ata "fallback to PIO mode" on dual processor AMD systems
Quoting Bruce Campbell <[EMAIL PROTECTED]>:

> Quoting Matthew Emmerton <[EMAIL PROTECTED]>:
>
> > [ cc'ing Soren since he's the ATA guru ]
> >
> > > Dec 30 23:27:00 ecserv13 /kernel: ad0: trying fallback to PIO mode
> > > Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> > >
> > > The test continues to run with the ata controller in PIO mode, with
> > > slower performance, and higher load average.
> > >
> > > Once the master drops to PIO, attempts to access the slave then cause
> > > it to drop to PIO.
> >
> > Are you using 80-conductor cables on all your drives? These are
> > required to get consistent high throughput, and running without them
> > may cause the problems you're seeing.
>
> Thanks for the information about the design of IDE etc, and the suggestion
> about the cables. I was about to shuffle things to get the disks
> onto separate channels, but I now see that would be a mistake as my
> CD drive would share a cable with a disk.

ps. As an aside, I have since determined that putting a PIO device and a UDMA device on the same channel does not affect the performance of the UDMA device, unless the PIO device is in use. So, sharing a low-use CD-ROM drive with a disk wouldn't be so bad.

I am puzzled about the fallback to PIO concept. If a disk gives some sort of timeout error or whatever, why would trying PIO correct the problem? That seems equivalent to asking the disk to do the same thing, just more slowly.

In my case, some sort of timeout error occurs on ad0, so it falls back to PIO, and works. A later access to ad1 also yields a timeout error, and then it drops to PIO, and works too. I'm fairly confident both disks did not experience media errors at the same time, which suggests a problem with the onboard IDE controller, or a driver bug.

Tests continue...
Re: ata "fallback to PIO mode" on dual processor AMD systems
On Thu, Jan 02, 2003 at 01:42:03PM -0500, Bruce Campbell wrote:
[snip]
>
> I don't have it enabled:
>
> hw.ata.tags: 0
>
> I've manually set:
>
> atacontrol mode 0 UDMA33 UDMA33
>
> and the problem has not recurred.
>
> --
> Bruce Campbell
> Engineering Computing
> CPH-2374B
> University of Waterloo
> (519)888-4567 ext 5889
>
> end of the original message

Yesterday I checked the drive ad6 with the Drive Fitness Test program from IBM. Both quick and advanced test returned that the drive is ok. I then ran the test against ad0 (the backup drive): the quick test showed that the drive was defective because of "Excessive Shock". Re-executing the test gave the same result. I rebooted the system and disabled the S.M.A.R.T. option for the drive attached to the motherboard's controller (i.e. the backup drive). Re-executing the quick test showed that the drive is ok!

After 16 hours of uptime and one level-0 file system dump all drives are still using UDMA100.

If for some reason the system falls back again to PIO4 mode I will try to remove the two following options from the kernel:

# ISA optimization
options AUTO_EOI_1
options AUTO_EOI_2

If the problem still isn't solved then I will try, in order, the following:
- disable tagged queuing
- buy different hardware!

Francesco Casadei
--
You can download my public key from http://digilander.libero.it/fcasadei/
or retrieve it from a keyserver (pgpkeys.mit.edu, wwwkeys.pgp.net, ...)

Key fingerprint is: 1671 9A23 ACB4 520A E7EE 00B0 7EC3 375F 164E B17B
Re: ata "fallback to PIO mode" on dual processor AMD systems
On Thu, Jan 02, 2003 at 01:42:03PM -0500, Bruce Campbell wrote:
> [snip]
> I don't have it enabled:
>
> hw.ata.tags: 0
>
> I've manually set:
>
> atacontrol mode 0 UDMA33 UDMA33
>
> and the problem has not recurred.
>
> --
> Bruce Campbell
> Engineering Computing
> CPH-2374B
> University of Waterloo
> (519)888-4567 ext 5889
>
> end of the original message

# atacontrol mode 3
Master = PIO4
Slave  = ???

# atacontrol mode 3 udma33 xxx
Master = UDMA33
Slave  = ???

# atacontrol mode 3
Master = UDMA33
Slave  = ???

# find / -name nonexistent -print

# atacontrol mode 3
Master = PIO4
Slave  = ???

After a little disk activity, like searching for a file throughout the entire filesystem, the second disk of the RAID array falls back to PIO4 mode.

I booted the system from the live system CD (2nd disk of the FreeBSD distribution set), then ran dd to read from and write to ad6: no errors were found.

Francesco Casadei
--
You can download my public key from http://digilander.libero.it/fcasadei/
or retrieve it from a keyserver (pgpkeys.mit.edu, wwwkeys.pgp.net, ...)

Key fingerprint is: 1671 9A23 ACB4 520A E7EE 00B0 7EC3 375F 164E B17B
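The before/after comparison in the transcript above is easy to automate. The following is a rough sketch only: it assumes the "Master = ..." / "Slave = ..." layout exactly as pasted in this thread (the real whitespace may differ), and `parse_modes` and `fell_back` are hypothetical helper names, not part of atacontrol.

```python
import re

# Matches lines such as "Master = PIO4" or "Slave  = ???".
MODE_RE = re.compile(r"(Master|Slave)\s*=\s*(\S+)")


def parse_modes(text):
    """Turn the text printed by `atacontrol mode N` into a dict."""
    return dict(MODE_RE.findall(text))


def fell_back(before, after):
    """List the devices that reported UDMA before and PIO after."""
    old, new = parse_modes(before), parse_modes(after)
    return [role for role in ("Master", "Slave")
            if old.get(role, "").startswith("UDMA")
            and new.get(role, "").startswith("PIO")]


if __name__ == "__main__":
    before = "Master = UDMA33\nSlave  = ???\n"
    after = "Master = PIO4\nSlave  = ???\n"
    print(fell_back(before, after))
```

Capturing the output once before the `find` run and once after, then passing both strings to `fell_back`, would flag the channel 3 master in the case shown above.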
Re: ata "fallback to PIO mode" on dual processor AMD systems
Quoting Francesco Casadei <[EMAIL PROTECTED]>:

> On Tue, Dec 31, 2002 at 03:57:16PM -0500, Bruce Campbell wrote:
> >
> > I am seeing a problem with ata disks on 4 new systems, which
> > I believe is either a bug in the ata driver, or a problem with
> > the onboard IDE controller, or something else. Systems are as follows:
> > ...
> > Motherboard: ASUS A7M266-D
> > CPUs       : 2 x 2000+ AMD MP
> > Memory     : 2 x 512MB Crucial part: CT6472Y265
>
> > Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting
> > Dec 30 23:26:59 ecserv13 /kernel: ata0: resetting devices .. done
> > Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> > Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> > Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> > Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> > Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> > Dec 30 23:27:00 ecserv13 /kernel: ad0: timeout waiting for cmd=ef s=d0 e=00
> > Dec 30 23:27:00 ecserv13 /kernel: ad0: trying fallback to PIO mode
>
> Same problem here, but slightly different configuration:
>
> # atacontrol list
> ATA channel 0:
>     Master: ad0 ATA/ATAPI rev 5
>     Slave:  no device present
> ATA channel 1:
>     Master: acd0 ATA/ATAPI rev 0
>     Slave:  no device present
> ATA channel 2:
>     Master: ad4 ATA/ATAPI rev 5
>     Slave:  no device present
> ATA channel 3:
>     Master: ad6 ATA/ATAPI rev 5
>     Slave:  no device present
>
> ad4 and ad6 are attached to a Promise FastTrak 100 TX2 ATA RAID controller.
>
> # atacontrol mode 0
> Master = UDMA100
> Slave  = ???
>
> # atacontrol mode 1
> Master = PIO4
> Slave  = ???
>
> # atacontrol mode 2
> Master = UDMA100
> Slave  = ???
>
> # atacontrol mode 3
> Master = PIO4
> Slave  = ???
>
> ad6 falls back to PIO mode on heavy I/O activity, i.e. when the system
> does a level 0 file systems dump from the RAID 1 array (ad4,ad6) to the
> backup disk ad0.
> Rebooting and rebuilding the array with the Promise BIOS utility
> temporarily solves the problem. The system may be up and running for 1-4
> weeks doing a level 0 dump every morning at 5:30am and then one day the
> drive ad6 falls back to PIO mode again (a little before the completion of
> the fs dump).
>
> Do the hard drives you are using support ATA tagged queuing? And if so,
> do you have TQ enabled?

I don't have it enabled:

hw.ata.tags: 0

I've manually set:

atacontrol mode 0 UDMA33 UDMA33

and the problem has not recurred.

--
Bruce Campbell
Engineering Computing
CPH-2374B
University of Waterloo
(519)888-4567 ext 5889
Re: ata "fallback to PIO mode" on dual processor AMD systems
On Tue, Dec 31, 2002 at 03:57:16PM -0500, Bruce Campbell wrote:
>
> I am seeing a problem with ata disks on 4 new systems, which
> I believe is either a bug in the ata driver, or a problem with
> the onboard IDE controller, or something else. Systems are as follows:
>
> Motherboard: ASUS A7M266-D
> CPUs       : 2 x 2000+ AMD MP
> Memory     : 2 x 512MB Crucial part: CT6472Y265
>
> Disks (all UDMA100):
>
>             Master        Slave
> System 1:   WDC WD400BB   WDC WD1000BB
> System 2:   WDC WD400BB   WDC WD1000BB
> System 3:   WDC WD400BB   WDC WD800BB
> System 4:   WDC WD400BB   Maxtor 98196H8
>
> Kernel : 4.7-RELEASE, custom kernel (compared to GENERIC):
>
> commented out:
>
> cpu I386_CPU
> cpu I486_CPU
>
> enabled
>
> options SMP      # Symmetric MultiProcessor Kernel
> options APIC_IO  # Symmetric (APIC) I/O
>
> I am running a test with "dbench" (/usr/ports/benchmarks/dbench)
> with a script which runs:
>
> dbench 1
> sleep for 5 minutes
> dbench 2
> sleep for 5 minutes
> dbench 3
> ...
>
> to simulate 1,2,3... clients.
>
> The following has happened on systems 2,3 and 4, after about 15 hours
> of running the test:
>
> Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting
> Dec 30 23:26:59 ecserv13 /kernel: ata0: resetting devices .. done
> Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> Dec 30 23:27:00 ecserv13 /kernel: ad0: timeout waiting for cmd=ef s=d0 e=00
> Dec 30 23:27:00 ecserv13 /kernel: ad0: trying fallback to PIO mode
> Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
>
> The test continues to run with the ata controller in PIO mode, with
> slower performance, and higher load average.
>
> Once the master drops to PIO, attempts to access the slave then cause
> it to drop to PIO.
>
> If I run:
>
> atacontrol mode 0 UDMA100 UDMA100
>
> attempts to access either drive result in a delay until the controller
> drops to PIO, and then operations resume. A soft reboot and things
> work in UDMA mode again. Also tried UDMA33 and UDMA66 with no change.
> I also tried "atacontrol reinit 0" with no help.
>
> Theories when I search the web for "fallback to PIO mode" include:
>
> - bad disks
> - something to do with thermal recalibration
>
> I don't believe the problems are bad disks, as the slave drops to PIO
> after the master does, and I can't get it back to UDMA, other than by
> soft reboot. Plus I see the problem on 6 of 8 disks.
>
> The problem is very repeatable.
>
> Can anyone offer any ideas, or suggest investigative steps? I have a
> system in PIO mode right now.
>
> Thanks,
>
> --
> Bruce Campbell
> Engineering Computing
> CPH-2374B
> University of Waterloo
> (519)888-4567 ext 5889
>
> end of the original message

Same problem here, but slightly different configuration:

# atacontrol list
ATA channel 0:
    Master: ad0 ATA/ATAPI rev 5
    Slave:  no device present
ATA channel 1:
    Master: acd0 ATA/ATAPI rev 0
    Slave:  no device present
ATA channel 2:
    Master: ad4 ATA/ATAPI rev 5
    Slave:  no device present
ATA channel 3:
    Master: ad6 ATA/ATAPI rev 5
    Slave:  no device present

ad4 and ad6 are attached to a Promise FastTrak 100 TX2 ATA RAID controller.

# atacontrol mode 0
Master = UDMA100
Slave  = ???

# atacontrol mode 1
Master = PIO4
Slave  = ???

# atacontrol mode 2
Master = UDMA100
Slave  = ???

# atacontrol mode 3
Master = PIO4
Slave  = ???

ad6 falls back to PIO mode on heavy I/O activity, i.e. when the system does a level 0 file systems dump from the RAID 1 array (ad4,ad6) to the backup disk ad0.

Rebooting and rebuilding the array with the Promise BIOS utility temporarily solves the problem. The system may be up and running for 1-4 weeks doing a level 0 dump every morning at 5:30am and then one day the drive ad6 falls back to PIO mode again (a little before the completion of the fs dump).

Do the hard drives you are using support ATA tagged queuing? And if so, do you have TQ enabled?

Francesco Casadei
--
You can download my public key from http://digilander.libero.it/fcasadei/
or retrieve it from a keyserver (pgpkeys.mit.edu, wwwkeys.pgp.net, ...)

Key fingerprint is: 1671 9A23 ACB4 520A E7EE 00B0 7EC3 375F 164E B17B
Re: ata "fallback to PIO mode" on dual processor AMD systems
Quoting Matthew Emmerton <[EMAIL PROTECTED]>:

> [ cc'ing Soren since he's the ATA guru ]
>
> > Dec 30 23:27:00 ecserv13 /kernel: ad0: trying fallback to PIO mode
> > Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> >
> > The test continues to run with the ata controller in PIO mode, with
> > slower performance, and higher load average.
> >
> > Once the master drops to PIO, attempts to access the slave then cause
> > it to drop to PIO.
>
> Are you using 80-conductor cables on all your drives? These are required to
> get consistent high throughput, and running without them may cause the
> problems you're seeing.

Thanks for the information about the design of IDE etc, and the suggestion about the cables. I was about to shuffle things to get the disks onto separate channels, but I now see that would be a mistake as my CD drive would share a cable with a disk.

Anyway, they all have the 80-conductor cable.

I forgot to add some environmental and other information.

The 4 AMD systems are in Aopen hx08 towers, with 400 watt power supplies, and 5 auxiliary fans (in addition to the power supply fan, and the fan on each CPU). They are in an air-conditioned machine room. The CPU and motherboard temperatures are within spec. I mention this as I note many reported AMD system problems traced to overheating.

All drives are installed in removable drive bays. I don't have the make/model on hand right now. They were $19 CAD ($13 USD). The low cost makes me suspicious now, but...

I'm running the same tests on 4 single-processor 2.4GHz Intel systems. They have not failed in this manner so far.

Initially, I had 1GB memory modules in the AMD systems (I can't remember the make) and the systems froze and rebooted randomly. I moved to Crucial 512MB modules to cure that problem.
Re: ata "fallback to PIO mode" on dual processor AMD systems
[ cc'ing Soren since he's the ATA guru ]

> I am seeing a problem with ata disks on 4 new systems, which
> I believe is either a bug in the ata driver, or a problem with
> the onboard IDE controller, or something else. Systems are as follows:
>
> Motherboard: ASUS A7M266-D
> CPUs       : 2 x 2000+ AMD MP
> Memory     : 2 x 512MB Crucial part: CT6472Y265
>
> Disks (all UDMA100):
>
>             Master        Slave
> System 1:   WDC WD400BB   WDC WD1000BB
> System 2:   WDC WD400BB   WDC WD1000BB
> System 3:   WDC WD400BB   WDC WD800BB
> System 4:   WDC WD400BB   Maxtor 98196H8
>
> Kernel : 4.7-RELEASE, custom kernel (compared to GENERIC):
>
> commented out:
>
> cpu I386_CPU
> cpu I486_CPU
>
> enabled
>
> options SMP      # Symmetric MultiProcessor Kernel
> options APIC_IO  # Symmetric (APIC) I/O
>
> I am running a test with "dbench" (/usr/ports/benchmarks/dbench)
> with a script which runs:
>
> dbench 1
> sleep for 5 minutes
> dbench 2
> sleep for 5 minutes
> dbench 3
> ...
>
> to simulate 1,2,3... clients.
>
> The following has happened on systems 2,3 and 4, after about 15 hours
> of running the test:
>
> Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting
> Dec 30 23:26:59 ecserv13 /kernel: ata0: resetting devices .. done
> Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> Dec 30 23:27:00 ecserv13 /kernel: ad0: timeout waiting for cmd=ef s=d0 e=00
> Dec 30 23:27:00 ecserv13 /kernel: ad0: trying fallback to PIO mode
> Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
>
> The test continues to run with the ata controller in PIO mode, with
> slower performance, and higher load average.
>
> Once the master drops to PIO, attempts to access the slave then cause
> it to drop to PIO.
>
> If I run:
>
> atacontrol mode 0 UDMA100 UDMA100
>
> attempts to access either drive result in a delay until the controller
> drops to PIO, and then operations resume. A soft reboot and things
> work in UDMA mode again. Also tried UDMA33 and UDMA66 with no change.
> I also tried "atacontrol reinit 0" with no help.
>
> Theories when I search the web for "fallback to PIO mode" include:
>
> - bad disks
> - something to do with thermal recalibration
>
> I don't believe the problems are bad disks, as the slave drops to PIO
> after the master does, and I can't get it back to UDMA, other than by
> soft reboot. Plus I see the problem on 6 of 8 disks.
>
> The problem is very repeatable.
>
> Can anyone offer any ideas, or suggest investigative steps? I have a
> system in PIO mode right now.

The reason the slave drops to PIO after the master does is by design - the master and slave have to use the same signalling mode since they're on the same cable. (People often report lackluster performance of fast UDMA hard drives with non-UDMA CD-ROMs on the same channel.)

Are you using 80-conductor cables on all your drives? These are required to get consistent high throughput, and running without them may cause the problems you're seeing.

--
Matt Emmerton
ata "fallback to PIO mode" on dual processor AMD systems
I am seeing a problem with ata disks on 4 new systems, which I believe is either a bug in the ata driver, or a problem with the onboard IDE controller, or something else. Systems are as follows:

Motherboard: ASUS A7M266-D
CPUs       : 2 x 2000+ AMD MP
Memory     : 2 x 512MB Crucial part: CT6472Y265

Disks (all UDMA100):

            Master        Slave
System 1:   WDC WD400BB   WDC WD1000BB
System 2:   WDC WD400BB   WDC WD1000BB
System 3:   WDC WD400BB   WDC WD800BB
System 4:   WDC WD400BB   Maxtor 98196H8

Kernel : 4.7-RELEASE, custom kernel (compared to GENERIC):

commented out:

cpu I386_CPU
cpu I486_CPU

enabled

options SMP      # Symmetric MultiProcessor Kernel
options APIC_IO  # Symmetric (APIC) I/O

I am running a test with "dbench" (/usr/ports/benchmarks/dbench) with a script which runs:

dbench 1
sleep for 5 minutes
dbench 2
sleep for 5 minutes
dbench 3
...

to simulate 1,2,3... clients.

The following has happened on systems 2,3 and 4, after about 15 hours of running the test:

Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting
Dec 30 23:26:59 ecserv13 /kernel: ata0: resetting devices .. done
Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
Dec 30 23:27:00 ecserv13 /kernel: ad0: timeout waiting for cmd=ef s=d0 e=00
Dec 30 23:27:00 ecserv13 /kernel: ad0: trying fallback to PIO mode
Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done

The test continues to run with the ata controller in PIO mode, with slower performance, and higher load average.

Once the master drops to PIO, attempts to access the slave then cause it to drop to PIO.

If I run:

atacontrol mode 0 UDMA100 UDMA100

attempts to access either drive result in a delay until the controller drops to PIO, and then operations resume. A soft reboot and things work in UDMA mode again. Also tried UDMA33 and UDMA66 with no change. I also tried "atacontrol reinit 0" with no help.

Theories when I search the web for "fallback to PIO mode" include:

- bad disks
- something to do with thermal recalibration

I don't believe the problems are bad disks, as the slave drops to PIO after the master does, and I can't get it back to UDMA, other than by soft reboot. Plus I see the problem on 6 of 8 disks.

The problem is very repeatable.

Can anyone offer any ideas, or suggest investigative steps? I have a system in PIO mode right now.

Thanks,

--
Bruce Campbell
Engineering Computing
CPH-2374B
University of Waterloo
(519)888-4567 ext 5889
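When comparing several machines it can help to reduce logs like the one above to a per-disk count. Below is a minimal sketch of such a scanner; it assumes the syslog line layout shown in this message, and `LINE_RE` and `summarize` are illustrative names rather than an existing tool.

```python
import re
from collections import Counter

# Picks the disk name (ad0, ad6, ...) and the message text out of a kernel
# syslog line. Channel lines such as "ata0: resetting devices" do not match.
LINE_RE = re.compile(r"/kernel: (ad\d+): (.*)")


def summarize(lines):
    """Count command timeouts per disk and list disks that fell back to PIO."""
    timeouts = Counter()
    fallbacks = []
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        disk, message = match.groups()
        if "command timeout" in message:
            timeouts[disk] += 1
        elif "trying fallback to PIO mode" in message:
            fallbacks.append(disk)
    return timeouts, fallbacks


if __name__ == "__main__":
    import sys
    counts, fell_back = summarize(sys.stdin)
    for disk, count in sorted(counts.items()):
        print("%s: %d command timeouts" % (disk, count))
    for disk in fell_back:
        print("%s: fell back to PIO" % disk)
```

It could be fed with something like `python scan.py < /var/log/messages` (the script name is arbitrary); a disk that shows timeouts but no fallback line is still running in DMA mode.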