Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Berck E. Nash wrote:
> Greetings,
>
> I get a few million of these on boot-- the system never actually boots.
> Works fine in 2.6.23-rc7.
>
> [ 50.456012] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> [ 50.462484] ata2.00: irq_stat 0x4001
> [ 50.466441] ata2.00: cmd e5/00:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0
> [ 50.466442]          res 51/04:00:01:01:80/00:00:00:00:00/a0 Emask 0x1 (device error)
> [ 50.481914] ata2.00: status: {DRDY ERR }
> [ 50.485876] ata2.00: error: {ABRT }
> [ 50.489533] ata2.00: configured for UDMA/133
> [ 50.493839] ata2: EH complete

FWIW I haven't had time to debug this, so I'm going to simply revert the
patch, and make sure it does not make it into 2.6.24.

	Jeff

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Berck E. Nash wrote:
> Bernd Schmidt wrote:
>> One of these appears in my system as well (ASUS P5W-DH Deluxe
>> mainboard). Here's the hdparm output:
>
> Yup, same mainboard here.
>
>> Since about 2.6.17 or 2.6.18, it has been causing long delays while
>> booting:
>> ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> ata2.00: qc timeout (cmd 0xec)
>> ata2.00: failed to IDENTIFY (I/O error, err_mask=0x5)
>> ata2: port is slow to respond, please be patient (Status 0x80)
>> ata2: COMRESET failed (errno=-16)
>> ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> ata2.00: ATA-6: Config Disk, RGL10364, max UDMA/133
>> ata2.00: 640 sectors, multi 1: LBA
>> ata2.00: configured for UDMA/133
>
> And yup, same problem with the painful boot delays since 2.6.18. Tejun
> indicated that a fix would get merged with 2.6.23, but that didn't
> happen. Here's hoping something makes it into .24!

Yeah, it is the sil4726 virtual device which is really crappy as an ATA
device. About the fix, I thought PMP support would fix it but the
controller on P5W-DH doesn't support PMP. It can only talk to the
virtual device or the device attached to the first port depending on how
the PMP chip is configured. It seems we'll have to blacklist the
mainboard and skip or use modified reset sequence on the affected port,
so that's why the fix was delayed.

I'm currently on the road but I'll look into it when I get back (next
week).

Thanks.

--
tejun
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Bernd Schmidt wrote:
> One of these appears in my system as well (ASUS P5W-DH Deluxe
> mainboard). Here's the hdparm output:

Yup, same mainboard here.

> Since about 2.6.17 or 2.6.18, it has been causing long delays while
> booting:
> ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata2.00: qc timeout (cmd 0xec)
> ata2.00: failed to IDENTIFY (I/O error, err_mask=0x5)
> ata2: port is slow to respond, please be patient (Status 0x80)
> ata2: COMRESET failed (errno=-16)
> ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata2.00: ATA-6: Config Disk, RGL10364, max UDMA/133
> ata2.00: 640 sectors, multi 1: LBA
> ata2.00: configured for UDMA/133

And yup, same problem with the painful boot delays since 2.6.18. Tejun
indicated that a fix would get merged with 2.6.23, but that didn't
happen. Here's hoping something makes it into .24!

Berck
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Jeff Garzik wrote:
> Would it also be possible for you to send along 'hdparm --Istdout'
> output for your config disk thingy, /dev/sdd ?

Sure, just don't ask me what it is! (I've generally assumed that
writing to it would be a bad idea.)

Berck

/dev/sdd:
0040 3fff c837 0010 003f 3030 3030 3030 315f 5f5f 5f5f 5f5f 5f5f 5f30
5f45 0003 3e00 0004 5247 4c31 3033 3634 436f 6e66 6967 2020 4469 736b
2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020
8001 2f00 4000 0200 0007 3fff 0010 003f fc10 00fb 0101 0280 0407 0003
0078 0078 0078 0078 0201 007e 001b 0068 5060 4000 1000 4000 407f fffe
c0fe 0002 0001 0017 2040 b4a5
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Jeff Garzik wrote:
> Would it also be possible for you to send along 'hdparm --Istdout'
> output for your config disk thingy, /dev/sdd ?

One of these appears in my system as well (ASUS P5W-DH Deluxe
mainboard). Here's the hdparm output:

/dev/sdb:
0040 3fff c837 0010 003f 3030 3030 3030 305f 5f5f 5f5f 5f5f 5f5f 5f30
5f41 0003 3e00 0004 5247 4c31 3033 3634 436f 6e66 6967 2020 4469 736b
2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020
8001 2f00 4000 0200 0007 3fff 0010 003f fc10 00fb 0101 0280 0407 0003
0078 0078 0078 0078 0201 007e 001b 0068 5060 4000 1000 4000 407f fffe
c0fe 0001 0001 0017 2040 baa5

Since about 2.6.17 or 2.6.18, it has been causing long delays while
booting:

ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x5)
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: COMRESET failed (errno=-16)
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-6: Config Disk, RGL10364, max UDMA/133
ata2.00: 640 sectors, multi 1: LBA
ata2.00: configured for UDMA/133

Bernd
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Would it also be possible for you to send along 'hdparm --Istdout'
output for your config disk thingy, /dev/sdd ?

	Jeff
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Berck E. Nash wrote:
> Jeff Garzik wrote:
>> Does the attached patch change behavior at all? You should be able to
>> apply it on top of libata-dev.git#upstream or -mm.
>
> Still broken, dmesg with ATA_DEBUG defined, attached.

Great, this will be useful output. It will probably be a couple days
before my next patch. In the meantime, you can extract the bad commit to
a patch

	git-diff-tree -p 268fe6f9f15551be9abedd44a237392675d529d5 > \
		/tmp/patch

and then revert it locally in your kernel tree

	patch -sp1 -R < /tmp/patch

to temporarily work around this.

I will definitely make sure this is either fixed or reverted before it
goes upstream to Linus.

Thanks,

	Jeff
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Berck E. Nash wrote:
> Jeff Garzik wrote:
>> Once the blame has been squarely fixed upon me :) you can use git-bisect
>> to locate the precise change that broke your setup.
>
> Okay, here's the problem:
>
> 268fe6f9f15551be9abedd44a237392675d529d5 is first bad commit
> commit 268fe6f9f15551be9abedd44a237392675d529d5
> Author: Jeff Garzik <[EMAIL PROTECTED]>
> Date:   Fri Sep 21 07:09:36 2007 -0400
>
>     [libata] SCSI: simple TEST UNIT READY simulation
>
>     It's trivial to ping the device, and that's a much more sane
>     behavior than no-op.
>
>     Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
>
> :040000 040000 44d34cdad073bd623545b8239aca9a113652c6d0
> df6d21f7ce56a4e796f8f856c1f647b0395ab4df M	drivers

Does the attached patch change behavior at all? You should be able to
apply it on top of libata-dev.git#upstream or -mm.

If there are still problems, an updated dmesg (w/ the attached patch) and
output from enabling ATA_DEBUG (include/linux/libata.h) would be very
helpful.

Thanks!

	Jeff

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 3882c72..c9838f1 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -2800,7 +2800,9 @@ static inline ata_xlat_func_t ata_get_xlat_func(struct ata_device *dev, u8 cmd)
 		return ata_scsi_start_stop_xlat;

 	case TEST_UNIT_READY:
-		return ata_scsi_tur_xlat;
+		if (ata_id_has_pm(dev->id))
+			return ata_scsi_tur_xlat;
+		return NULL;
 	}

 	return NULL;
@@ -3021,6 +3023,7 @@ void ata_scsi_simulate(struct ata_device *dev, struct scsi_cmnd *cmd,
 	case REZERO_UNIT:
 	case SEEK_6:
 	case SEEK_10:
+	case TEST_UNIT_READY: /* only for !PM devices */
 		ata_scsi_rbuf_fill(&args, ata_scsiop_noop);
 		break;
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Robert Hancock wrote:
> ATA spec says "The device shall return command aborted if the device
> does not support the Power Management feature set." Whereas TEST UNIT
> READY is required for SCSI. It seems the SAT authors didn't consider
> this case.

Dumb me -- I misread that as mandatory.

	Jeff
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Berck E. Nash wrote:
> hdparm output attached.

Whoops, it really is this time.

/dev/sde:
427a 3fff 0010 e100 0258 003f 000e 5744 2d57 4d41 4b48 3131 3235 3131
3700 0003 4000 004a 3331 2e30 3846 3331 5744 4320 5744 3336 3047 442d
3030 464c 4132 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020
8010 2f00 4001 0280 0007 3fff 0010 003f fc10 00fb 0110 44e0 044f 0007
0003 0078 0078 0078 0078 001f 0202 007e 74eb 7f63 4003 74e9 3e43 4003
407f 80fe 44e0 044f 0001 0141 0746 0002 0001 001f 001f 8da5
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Jeff Garzik wrote:
> Berck E. Nash wrote:
>> Jeff Garzik wrote:
>>> Once the blame has been squarely fixed upon me :) you can use
>>> git-bisect to locate the precise change that broke your setup.
>>
>> Okay, here's the problem:
>>
>> 268fe6f9f15551be9abedd44a237392675d529d5 is first bad commit
>> commit 268fe6f9f15551be9abedd44a237392675d529d5
>> Author: Jeff Garzik <[EMAIL PROTECTED]>
>> Date:   Fri Sep 21 07:09:36 2007 -0400
>>
>>     [libata] SCSI: simple TEST UNIT READY simulation
>>
>>     It's trivial to ping the device, and that's a much more sane
>>     behavior than no-op.
>>
>>     Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
>>
>> :040000 040000 44d34cdad073bd623545b8239aca9a113652c6d0
>> df6d21f7ce56a4e796f8f856c1f647b0395ab4df M	drivers
>
> Thanks for debugging!
>
> Can you tell me something about this device?
>
> [ 49.045635] ata2.00: ATA-6: Config Disk, RGL10364, max UDMA/133
> [ 49.051677] ata2.00: 640 sectors, multi 1: LBA
> [ 49.056321] ata2.00: configured for UDMA/133
>
> It seems like it does not support the 'check power mode' command.
>
> Can you post a text file attachment, containing the output of 'hdparm
> --Istdout' ?

ATA spec says "The device shall return command aborted if the device
does not support the Power Management feature set." Whereas TEST UNIT
READY is required for SCSI. It seems the SAT authors didn't consider
this case.

I assume we can tell from the identify data that the device doesn't
support power management and just fake success for TEST UNIT READY in
this case?

--
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Jeff Garzik wrote:
> Can you tell me something about this device?
>
> [ 49.045635] ata2.00: ATA-6: Config Disk, RGL10364, max UDMA/133
> [ 49.051677] ata2.00: 640 sectors, multi 1: LBA
> [ 49.056321] ata2.00: configured for UDMA/133
>
> It seems like it does not support the 'check power mode' command.
>
> Can you post a text file attachment, containing the output of 'hdparm
> --Istdout' ?

No problem. The device in question is a Western Digital Raptor WD360GD
36.7GB 10,000 RPM Serial ATA150 Hard Drive.

hdparm output attached.

Berck
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Berck E. Nash wrote:
> Jeff Garzik wrote:
>> Once the blame has been squarely fixed upon me :) you can use git-bisect
>> to locate the precise change that broke your setup.
>
> Okay, here's the problem:
>
> 268fe6f9f15551be9abedd44a237392675d529d5 is first bad commit
> commit 268fe6f9f15551be9abedd44a237392675d529d5
> Author: Jeff Garzik <[EMAIL PROTECTED]>
> Date:   Fri Sep 21 07:09:36 2007 -0400
>
>     [libata] SCSI: simple TEST UNIT READY simulation
>
>     It's trivial to ping the device, and that's a much more sane
>     behavior than no-op.
>
>     Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
>
> :040000 040000 44d34cdad073bd623545b8239aca9a113652c6d0
> df6d21f7ce56a4e796f8f856c1f647b0395ab4df M	drivers

Thanks for debugging!

Can you tell me something about this device?

[ 49.045635] ata2.00: ATA-6: Config Disk, RGL10364, max UDMA/133
[ 49.051677] ata2.00: 640 sectors, multi 1: LBA
[ 49.056321] ata2.00: configured for UDMA/133

It seems like it does not support the 'check power mode' command.

Can you post a text file attachment, containing the output of 'hdparm
--Istdout' ?

	Jeff
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Jeff Garzik wrote:
> Once the blame has been squarely fixed upon me :) you can use git-bisect
> to locate the precise change that broke your setup.

Okay, here's the problem:

268fe6f9f15551be9abedd44a237392675d529d5 is first bad commit
commit 268fe6f9f15551be9abedd44a237392675d529d5
Author: Jeff Garzik <[EMAIL PROTECTED]>
Date:   Fri Sep 21 07:09:36 2007 -0400

    [libata] SCSI: simple TEST UNIT READY simulation

    It's trivial to ping the device, and that's a much more sane
    behavior than no-op.

    Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>

:040000 040000 44d34cdad073bd623545b8239aca9a113652c6d0
df6d21f7ce56a4e796f8f856c1f647b0395ab4df M	drivers

Berck
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
On Tue, Sep 25 2007, Berck E. Nash wrote:
> Jens Axboe wrote:
> > On Tue, Sep 25 2007, Berck E. Nash wrote:
> >> Jeff Garzik wrote:
> >>
> >>> The first step would be to clone the "upstream" branch of
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git
> >>>
> >>> and see if the problem is reproducible there. If yes, then you have
> >>> narrowed down the problem to something my ATA devel tree has introduced
> >>> into -mm.
> >> Nope, you're off the hook. The libata tree works great, so it must be
> >> something else in -mm conflicting.
>
> Whoops, sorry! I just lied. I'm a git newbie, and failed to actually
> get the "upstream" branch the first time, so rc8 is clean, but it fails
> when I actually pull the upstream branch. I'll git bisect and get back
> to you.

OK, you probably realize this, but you can forget about the git-block
testing for now then.

--
Jens Axboe
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Jens Axboe wrote:
> On Tue, Sep 25 2007, Berck E. Nash wrote:
>> Jeff Garzik wrote:
>>
>>> The first step would be to clone the "upstream" branch of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git
>>>
>>> and see if the problem is reproducible there. If yes, then you have
>>> narrowed down the problem to something my ATA devel tree has introduced
>>> into -mm.
>> Nope, you're off the hook. The libata tree works great, so it must be
>> something else in -mm conflicting.

Whoops, sorry! I just lied. I'm a git newbie, and failed to actually
get the "upstream" branch the first time, so rc8 is clean, but it fails
when I actually pull the upstream branch. I'll git bisect and get back
to you.

Berck
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
On Tue, Sep 25 2007, Berck E. Nash wrote:
> Jeff Garzik wrote:
>
> > The first step would be to clone the "upstream" branch of
> > git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git
> >
> > and see if the problem is reproducible there. If yes, then you have
> > narrowed down the problem to something my ATA devel tree has introduced
> > into -mm.
>
> Nope, you're off the hook. The libata tree works great, so it must be
> something else in -mm conflicting.

Can you try 2.6.23-rc8 plus this patch:

http://brick.kernel.dk/git-block.patch.bz2

and see if that works?

--
Jens Axboe
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Jeff Garzik wrote:
> The first step would be to clone the "upstream" branch of
> git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git
>
> and see if the problem is reproducible there. If yes, then you have
> narrowed down the problem to something my ATA devel tree has introduced
> into -mm.

Nope, you're off the hook. The libata tree works great, so it must be
something else in -mm conflicting.
Re: 2.6.23-rc7-mm1 AHCI ATA errors -- won't boot
Berck E. Nash wrote:
> Greetings,
>
> I get a few million of these on boot-- the system never actually boots.
> Works fine in 2.6.23-rc7.
>
> [ 50.456012] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> [ 50.462484] ata2.00: irq_stat 0x4001
> [ 50.466441] ata2.00: cmd e5/00:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0
> [ 50.466442]          res 51/04:00:01:01:80/00:00:00:00:00/a0 Emask 0x1 (device error)
> [ 50.481914] ata2.00: status: {DRDY ERR }
> [ 50.485876] ata2.00: error: {ABRT }
> [ 50.489533] ata2.00: configured for UDMA/133
> [ 50.493839] ata2: EH complete
>
> I've attached the entire dmesg and lspci.

Are you "git-friendly"? A few quick kernel compiles and reboots would
help us narrow down the problem, given that it's a reproducible
regression.

The first step would be to clone the "upstream" branch of
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git

and see if the problem is reproducible there. If yes, then you have
narrowed down the problem to something my ATA devel tree has introduced
into -mm.

Once the blame has been squarely fixed upon me :) you can use git-bisect
to locate the precise change that broke your setup. Info at

	http://kerneltrap.org/node/11753
or
	http://www.kernel.org/pub/software/scm/git/docs/v1.3.3/howto/isolate-bugs-with-bisect.txt
or
	"man git-bisect"

	Jeff