Re: HSM violation errors on sata_promise

2007-12-27 Thread Mikael Pettersson
: enabled, read cache: enabled, doesn't support DPO or FUA ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 ata4.00: port_status 0x2008 ata4.00: cmd c8/00:08:3f:00:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 in res 50/00:00:46:00:00/00:00:00:00:00/e0 Emask 0x2 (HSM

Re: HSM violation errors

2007-12-25 Thread Robert Hancock
Jeff Mitchell wrote: I'm seeing errors in dmesg and the like. It appears to be somewhat similar to the issue reported here: http://kerneltrap.org/mailarchive/linux-kernel/2007/8/25/164711 except that my machine doesn't freeze, and everything seems normal -- hopefully nothing like silent

HSM violation errors

2007-12-24 Thread Jeff Mitchell
0x2 (HSM violation) ata1.00: cmd 61/08:10:a6:fb:c5/00:00:01:00:00/40 tag 2 cdb 0x0 data 4096 out res 50/00:08:46:4c:d4/00:00:01:00:00/40 Emask 0x2 (HSM violation) ata1.00: cmd 61/08:18:fe:00:c8/00:00:01:00:00/40 tag 3 cdb 0x0 data 4096 out res 50/00:08:46:4c:d4/00:00:01:00:00/40

Re: [PATCH 02/15] libata: zero xfer length on ATAPI data xfer IRQ is HSM violation

2007-12-05 Thread Albert Lee
Tejun Heo wrote: From: Albert Lee [EMAIL PROTECTED] Treat zero xfer length as HSM violation. While at it, add unlikely()'s to ATAPI ireason and transfer length checks. tj: Formatted patch and added unlikely()'s. Signed-off-by: Albert Lee [EMAIL PROTECTED] Signed-off-by: Tejun Heo

[PATCH 02/15] libata: zero xfer length on ATAPI data xfer IRQ is HSM violation

2007-12-04 Thread Tejun Heo
From: Albert Lee [EMAIL PROTECTED] Treat zero xfer length as HSM violation. While at it, add unlikely()'s to ATAPI ireason and transfer length checks. tj: Formatted patch and added unlikely()'s. Signed-off-by: Albert Lee [EMAIL PROTECTED] Signed-off-by: Tejun Heo [EMAIL PROTECTED] --- drivers

Hitachi SATA HSM violation (Was: Re: Adding SATA disk with broken NCQ)

2007-11-06 Thread Simos Xenitellis
(HSM violation) [ 51.294232] ata1.00: cmd 60/08:10:f9:5b:5a/00:00:06:00:00/40 tag 2 cdb 0x0 data 4096 in [ 51.294235] res 50/00:08:f9:5b:5a/00:00:06:00:00/40 Emask 0x2 (HSM violation) [ 51.294244] ata1.00: cmd 60/08:38:51:64:9f/00:00:06:00:00/40 tag 7 cdb 0x0 data 4096

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-29 Thread Alan Cox
Why 512 words ? Though I have queued Mark's patch to be applied, my gut feeling would lean towards a single DRQ block, rather than 512. Why not just work from the old IDE code. ata_altstatus(ap); - ata_chk_status(ap); + ata_drain_fifo(ap, qc); ap->ops->cleanup();

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation (try#2)

2007-09-29 Thread Jeff Garzik
Mark Lord wrote: I think this original patch still applies cleanly on at least 2.6.23-rc7. Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, rather than just getting stuck there forever. Signed-off-by: Mark Lord [EMAIL PROTECTED] --- --- old/drivers/ata/libata-sff.c

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-29 Thread Mark Lord
Alan Cox wrote: Why 512 words ? Though I have queued Mark's patch to be applied, my gut feeling would lean towards a single DRQ block, rather than 512. Why not just work from the old IDE code. ata_altstatus(ap); - ata_chk_status(ap); + ata_drain_fifo(ap, qc);

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-28 Thread Andrew Morton
On Fri, 28 Sep 2007 02:48:28 -0700 Tejun Heo [EMAIL PROTECTED] wrote: Mark Lord wrote: Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, rather than just getting stuck there forever. Signed-Off-By: Mark Lord [EMAIL PROTECTED] Acked-by: Tejun Heo [EMAIL

[PATCH] libata drain fifo on stuck DRQ HSM violation (try#2)

2007-09-28 Thread Mark Lord
Alan Cox wrote: Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, rather than just getting stuck there forever. Why 512 words ? ata_altstatus(ap); - ata_chk_status(ap); + ata_drain_fifo(ap, qc); ap->ops->cleanup(); might be wiser Actually, I

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-28 Thread Alan Cox
Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, rather than just getting stuck there forever. Why 512 words ? ata_altstatus(ap); - ata_chk_status(ap); + ata_drain_fifo(ap, qc); ap->ops->cleanup(); might be wiser - To unsubscribe from this list: send

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-28 Thread Tejun Heo
Mark Lord wrote: Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, rather than just getting stuck there forever. Signed-Off-By: Mark Lord [EMAIL PROTECTED] Acked-by: Tejun Heo [EMAIL PROTECTED] -- tejun - To unsubscribe from this list: send the line unsubscribe linux

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-28 Thread Tejun Heo
Nacked-by: scripts/checkpatch.pl Mark, it seems you'll have to get ACK from this dude first. :-) -- tejun - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-28 Thread Jeff Garzik
Alan Cox wrote: Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, rather than just getting stuck there forever. Why 512 words ? Though I have queued Mark's patch to be applied, my gut feeling would lean towards a single DRQ block, rather than 512

Re: Stardom SATA HSM violation

2007-09-27 Thread Alan Cox
I think there have been enough cases where this draining was necessary. IIRC, ata_piix was involved in those cases, right? If so, can you please submit a patch which applies this only to affected controllers? I don't feel too confident about applying this to all SFF controllers. Old IDE

Re: Stardom SATA HSM violation

2007-09-27 Thread Tejun Heo
Alan Cox wrote: I think there have been enough cases where this draining was necessary. IIRC, ata_piix was involved in those cases, right? If so, can you please submit a patch which applies this only to affected controllers? I don't feel too confident about applying this to all SFF

Re: Stardom SATA HSM violation

2007-09-27 Thread Jeff Garzik
Tejun Heo wrote: Alan Cox wrote: I think there have been enough cases where this draining was necessary. IIRC, ata_piix was involved in those cases, right? If so, can you please submit a patch which applies this only to affected controllers? I don't feel too confident about applying this to

[PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-27 Thread Mark Lord
be bothered to regenerate the patch and post it one more time (again)? It seems we all agree the update is needed. I think this original patch still applies cleanly on at least 2.6.23-rc7. Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, rather than just getting stuck
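The draining idea in this patch description can be modelled outside the kernel. The following is a minimal Python sketch, not the patch itself: `FakePort` and `drain_fifo` are hypothetical names invented for illustration, and the only details taken from the thread are the 512-word cap and the "keep reading while DRQ is set" behaviour. `ATA_DRQ` is bit 0x08 of the ATA status register.

```python
ATA_DRQ = 0x08  # DRQ (data request) bit in the ATA status register


class FakePort:
    """Hypothetical stand-in for a host/bridge FIFO holding leftover words."""

    def __init__(self, stuck_words):
        self.remaining = stuck_words

    def read_status(self):
        # 0x50 = DRDY | DSC; DRQ stays set while data is still pending.
        return 0x50 | (ATA_DRQ if self.remaining > 0 else 0)

    def read_data_word(self):
        # Reading the data register consumes one pending word.
        if self.remaining > 0:
            self.remaining -= 1
        return 0


def drain_fifo(port, max_words=512):
    """Read and discard words while DRQ is set, giving up after max_words."""
    drained = 0
    while drained < max_words and port.read_status() & ATA_DRQ:
        port.read_data_word()
        drained += 1
    return drained
```

The cap is what keeps the loop from spinning forever on a controller whose DRQ never clears; whether 512 words or a single DRQ block is the better bound is exactly what the replies in this thread debate.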

Re: Stardom SATA HSM violation

2007-09-27 Thread Mark Lord
Tejun Heo wrote: Alan Cox wrote: I think there have been enough cases where this draining was necessary. IIRC, ata_piix was involved in those cases, right? If so, can you please submit a patch which applies this only to affected controllers? I don't feel too confident about applying this to

Re: HSM violation with ahci+WDC WD1600BEVS-22RST0

2007-09-24 Thread Maurizio Monge
No, I did not manage to improve (it should NOT be a dangerous error BTW). I simply think that this issue is because of buggy firmware, so I posted to linux-ide a patch to blacklist this hard disk from using NCQ (because it is triggering spurious completions). I don't know what the blacklisting

Re: HSM violation on bootup, ICH7 + ata_piix 2.6.22

2007-09-14 Thread Bruce Allen
this isn't good :( Anybody got any suggestions? The violations are: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd b0/d2:f1:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 123392 in res 50/00:f1:00:4f:c2/00:00:00:00:00/00 Emask 0x202 (HSM violation

Re: HSM violation on bootup, ICH7 + ata_piix 2.6.22

2007-09-14 Thread Eamonn Hamilton
OK, On Fri, 2007-09-14 at 08:56 -0500, Bruce Allen wrote: ... Eamonn: could you please build the latest version of smartmontools from CVS HEAD source and see if the problem exists in that version? Then write back. I don't think this will help but want to eliminate obvious things. I

Re: HSM violation on bootup, ICH7 + ata_piix 2.6.22

2007-09-13 Thread Tejun Heo
suggestions? The violations are : ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd b0/d2:f1:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 123392 in res 50/00:f1:00:4f:c2/00:00:00:00:00/00 Emask 0x202 (HSM violation) -- tejun

Re: HSM violation on bootup, ICH7 + ata_piix 2.6.22

2007-09-10 Thread Eamonn Hamilton
Hi Tejun, Please disable or upgrade smartd. Thanks for that, I checked the disks and sure enough they were in an extended self test. I aborted that and it's all back to normal. The only problem, however, is that the system is already running version 5.37 of the smartmontools package, which

Re: HSM violation on bootup, ICH7 + ata_piix 2.6.22

2007-09-08 Thread Tejun Heo
(HSM violation) Please disable or upgrade smartd. -- tejun

Re: Stardom SATA HSM violation

2007-09-07 Thread Mark Lord
Tejun Heo wrote: Hello, Mark Lord wrote: I reported a very similar bug back a few releases ago. Anyone who wants to try it themselves, can do this with hdparm-7.7 (from sourceforge): hdparm --drq-hsm-error /dev/sda Whether or not it hangs the machine does depend upon exactly which SATA

Re: Stardom SATA HSM violation

2007-09-06 Thread Bryan Woods
and at different points in the process I get an HSM violation and the system becomes unresponsive. It looks like a similar situation to: http://lkml.org/lkml/2007/6/6/195 Will more recent kernels work with this hardware (should I keep it and try the install again) or should I switch

Re: Stardom SATA HSM violation

2007-09-06 Thread Tejun Heo
Bryan Woods wrote: The full dmesg and hdparm -I command output are attached. I have received word from the vendor that the Stardom 2611 will do RAID0 or 1 under windows, but only RAID1 under Linux. (Their manual said it worked with Linux but failed to mention the RAID mode restriction:

Re: HSM violation spew.

2007-09-06 Thread Tejun Heo
Dave Jones wrote: scsi 2:0:0:0: Direct-Access ATA WDC WD3200AAJS-0 12.0 PQ: 0 ANSI: 5 This could have been truncated, please post the result of 'hdparm -I /dev/sda'. Thanks. -- tejun

Re: Stardom SATA HSM violation

2007-09-06 Thread Tejun Heo
Hello, Mark Lord wrote: I reported a very similar bug back a few releases ago. Anyone who wants to try it themselves, can do this with hdparm-7.7 (from sourceforge): hdparm --drq-hsm-error /dev/sda Whether or not it hangs the machine does depend upon exactly which SATA LLD is used,

Re: Stardom SATA HSM violation

2007-09-05 Thread Andrew Morton
/Synetic-Products/Stardoms/SR-2611-SA/Stardom-2611.htm During the install and at different points in the process I get an HSM violation and the system becomes unresponsive. It looks like a similar situation to: http://lkml.org/lkml/2007/6/6/195 Will more recent kernels work

Re: Stardom SATA HSM violation

2007-09-05 Thread Mark Lord
-Products/Stardoms/SR-2611-SA/Stardom-2611.htm During the install and at different points in the process I get an HSM violation and the system becomes unresponsive. It looks like a similar situation to: http://lkml.org/lkml/2007/6/6/195 Will more recent kernels work with this hardware (should I

Re: Stardom SATA HSM violation

2007-09-05 Thread Andrew Morton
ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/40 tag 0 cdb 0x0 data 0 res 58/00:01:00:00:00/00:00:00:00:00/40 Emask 0x2 (HSM violation) ata3: soft resetting port ata3.00: configured for UDMA/100 ata3: EH complete sd 2:0:0:0: [sda] 195371568 512-byte hardware sectors (100030 MB) sd 2:0:0:0

Re: Stardom SATA HSM violation

2007-09-05 Thread Mark Lord
0x0 SErr 0x0 action 0x2 frozen ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/40 tag 0 cdb 0x0 data 0 res 58/00:01:00:00:00/00:00:00:00:00/40 Emask 0x2 (HSM violation) ata3: soft resetting port ata3.00: configured for UDMA/100 ata3: EH complete sd 2:0:0:0: [sda] 195371568 512-byte hardware

HSM violation on bootup, ICH7 + ata_piix 2.6.22

2007-09-03 Thread Eamonn Hamilton
:( Anybody got any suggestions? The violations are: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd b0/d2:f1:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 123392 in res 50/00:f1:00:4f:c2/00:00:00:00:00/00 Emask 0x202 (HSM violation) ata1: soft resetting port

HSM violation spew.

2007-08-29 Thread Dave Jones
:00:24:00:00/40 Emask 0x2 (HSM violation) ata3.00: cmd 61/10:10:b1:f4:09/00:00:24:00:00/40 tag 2 cdb 0x0 data 8192 out res 40/00:84:11:f8:09/00:00:24:00:00/40 Emask 0x2 (HSM violation) ata3.00: cmd 61/10:18:c9:f4:09/00:00:24:00:00/40 tag 3 cdb 0x0 data 8192 out res 40/00:84:11:f8:09

Re: HSM violation spew.

2007-08-29 Thread Dave Jones
On Wed, Aug 29, 2007 at 02:49:25PM -0400, Dave Jones wrote: Just noticed this in dmesg.. ata3.00: exception Emask 0x2 SAct 0x1fffd SErr 0x0 action 0x2 frozen ata3.00: spurious completions during NCQ issue=0x0 SAct=0x1fffd FIS=004040a1:0004 There's a bunch of these that have been
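The `SAct` value in reports like this one is a bitmask in which bit n corresponds to NCQ tag n. As a reading aid, here is a small Python sketch that expands such a mask into a tag list; `active_tags` is a hypothetical helper name, not something from libata.

```python
def active_tags(sact):
    """Return the NCQ tags whose bits are set in an SActive-style mask.

    Bit n of the mask corresponds to tag n; NCQ allows at most 32 tags.
    """
    return [tag for tag in range(32) if sact & (1 << tag)]
```

Applied to the masks quoted in this listing, `active_tags(0x1fffd)` yields tag 0 and tags 2 through 16, and `active_tags(0x2)` yields only tag 1.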

Re: Stardom SATA HSM violation

2007-08-26 Thread Michal Piotrowski
an HSM violation and the system becomes unresponsive. It looks like a similar situation to: http://lkml.org/lkml/2007/6/6/195 Will more recent kernels work with this hardware (should I keep it and try the install again) or should I switch hardware to something more compatible (like

Re: hsm violation

2007-06-24 Thread Andrew Morton
:00/40 tag 1 cdb 0x0 data 4096 in [ 61.176000] res 50/00:08:27:3c:ed/00:00:0b:00:00/40 Emask 0x2 (HSM violation) [ 61.488000] ata1: soft resetting port [ 61.66] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 61.66] ata1.00: ata_hpa_resize 1: sectors

Re: hsm violation

2007-06-24 Thread Robert Hancock
Andrew Morton wrote: On Sun, 24 Jun 2007 14:32:22 +0200 Enrico Sardi [EMAIL PROTECTED] wrote: [ 61.176000] ata1.00: exception Emask 0x2 SAct 0x2 SErr 0x0 action 0x2 frozen [ 61.176000] ata1.00: (spurious completions during NCQ issue=0x0 SAct=0x2 FIS=005040a1:0004) .. It's not

Re: hsm violation

2007-06-24 Thread Tejun Heo
Andrew Morton wrote: That great spew of set_level status: 0 is fairly annoying and useless. I don't know where those are coming from. It's not from libata. -- tejun

Re: libata fails to recover from HSM violation involving DRQ status

2007-05-10 Thread Mark Lord
to HSM violation caused by stuck DRQ. Yeah, so far it's just PIO FROM DEVICE on a SATA device on ata_piix. It *may* be more widespread than that, but we'll have to test some others. I retested this again today on my new pure-SATA notebook with ata_piix. In this case, the DRQ drain

Re: libata fails to recover from HSM violation involving DRQ status

2007-05-10 Thread Mark Lord
Mark Lord wrote: Mark Lord wrote: I retested this again today on my new pure-SATA notebook with ata_piix. In this case, the DRQ drain is not necessary, but also doesn't harm anything. Tested it both ways. This is with a Hitachi HTS541612J9SA00 SATA drive. The original fault was on ata_piix

Re: libata fails to recover from HSM violation involving DRQ status

2007-05-01 Thread Mark Lord
Mark Lord wrote: Tejun Heo wrote: So, this is specific to SATA (the host side at least) piix PIO READ, right? I think we can fit this code nicely into piix_sata_error_handler() if we make sure that it triggers under the right condition - after a PIO READ command fails due to HSM violation

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-30 Thread Mark Lord
:00:02/00:00:00:00:00/40 Emask 0x2 (HSM violation) ata4: soft resetting port ata4.00: configured for UDMA/66 ata4: EH complete And in this case, the first line of diagnostics (the cmd line) is always missing. Why? Hmmm... that's very weird. I've never seen such problems. Well, from looking

sata_nv and smartctl -o/-S trigger HSM violation (2.6.21.1, 2.6.20)

2007-04-30 Thread Robin H. Johnson
: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd b0/d2:f1:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 123392 in res 50/00:f1:00:4f:c2/00:00:00:00:00/00 Emask 0x202 (HSM violation) ata1: soft resetting port ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-30 Thread Mark Lord
Emask 0x2 (HSM violation) Why do we not always put a '\n' in front of that last line above ?? Sometimes it seems to have it, and lots of times it does not have a '\n'. Weird. ## Test stuck DRQ on VIA-pata (ATAPI DVD/RW): ## Notice how the first ata4.00: cmd ... line is *missing

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-30 Thread Tejun Heo
:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation) Why do we not always put a '\n' in front of that last line above ?? Sometimes it seems to have it, and lots of times it does not have a '\n'. Weird. ## Test stuck DRQ on VIA-pata (ATAPI DVD/RW): ## Notice how the first ata4.00: cmd

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Tejun Heo
Jeff Garzik wrote: Tejun Heo wrote: and thus clear DRQ, right? Stuck DRQ after SRST seems odd to me. Unfortunately not odd on ata_piix, which can get stuck DRQ-on somewhere deep inside its IDE emulation engine. And neither draining the FIFO nor SRST nor a couple other tricks ever helped.

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Tejun Heo wrote: Tejun Heo wrote: .. Anyways, can you try to hack it into ata_bmdma_error_handler() and see whether it actually works? You can check for AC_ERR_HSM there and drain data port if DRQ is set. After HSM, ATA_NIEN is set and the port should be quiescent at that point. Sure, I'll

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Tejun Heo wrote: Anyways, can you try to hack it into ata_bmdma_error_handler() From greping the code, I don't see how that function would ever be called from ata_piix. ??

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Mark Lord wrote: Tejun Heo wrote: Tejun Heo wrote: .. Anyways, can you try to hack it into ata_bmdma_error_handler() and see whether it actually works? You can check for AC_ERR_HSM there and drain data port if DRQ is set. After HSM, ATA_NIEN is set and the port should be quiescent at that

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Ah.. one more thing, is this draining also needed after DMA commands or only after PIO commands? My drive doesn't do IDENTIFY_DMA, so I fed it a READ_DMA instead with no data, and libata recovered without draining. More specifically, here's what happens for READ_DMA(1 sector) with NON_DATA

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Tejun Heo
Mark Lord wrote: Tejun Heo wrote: Anyways, can you try to hack it into ata_bmdma_error_handler() From greping the code, I don't see how that function would ever be called from ata_piix. ?? Yeah, I meant ata_bmdma_drive_eh(). You apparently have figured that out already. Sorry about the

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Tejun Heo
to SATA (the host side at least) piix PIO READ, right? I think we can fit this code nicely into piix_sata_error_handler() if we make sure that it triggers under the right condition - after a PIO READ command fails due to HSM violation caused by stuck DRQ. Can you please perform similar test

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Tejun Heo wrote: So, this is specific to SATA (the host side at least) piix PIO READ, right? I think we can fit this code nicely into piix_sata_error_handler() if we make sure that it triggers under the right condition - after a PIO READ command fails due to HSM violation caused by stuck DRQ

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
SErr 0x0 action 0x2 frozen ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 res 58/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation) ata1: soft resetting port ATA: abnormal status 0x7F on port 0x0001d807 ATA: abnormal status 0x7F on port 0x0001d807 ata1.00

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
:00:00:00:00/40 Emask 0x2 (HSM violation) ata1: soft resetting port ata1.00: configured for UDMA/100 ata1: EH complete SCSI device sda: 320173056 512-byte hdwr sectors (163929 MB) sda: Write Protect is off sda: Mode Sense: 00 3a 00 00 SCSI device sda: write cache: enabled, read cache: enabled

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Mark Lord wrote: ## Test stuck DRQ on VIA-sata (disk): ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 res 58/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation) Why do we not always

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Tejun Heo
Mark Lord wrote: .. And here is another test of un-hacked 2.6.21, this time for ata_piix with a pure PATA configuration. Again, it passes with flying colours. Thanks a lot. I'd also like to try but I'm on the road and not bored enough (yet) to do that on my only working machine. It's good to

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Tejun Heo
Mark Lord wrote: Mark Lord wrote: ## Test stuck DRQ on VIA-sata (disk): ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 res 58/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation) Why

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Tejun Heo wrote: Hmmm... that's very weird. I've never seen such problems. The report messages are printed in ata_eh_report() and both the cmd and res lines are printed by single invocation to printk(). Is the log captured using serial console? I think it could be transmission error or

libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Mark Lord
cache: enabled, doesn't support DPO or FUA ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 res 58/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation) ata1: soft resetting port ata1.00: configured
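The `cmd` and `res` fields in these reports carry the issued command opcode and the returned status register as their first byte. Below is a hedged Python sketch for pulling those out of one log line; `parse_report`, `REPORT_RE` and `STATUS_BITS` are names invented here, the regular expression only assumes the layout visible in the snippets of this listing, and only the six commonly named status bits are decoded (bits 0x04 and 0x02 are left out).

```python
import re

# Commonly named ATA status register bits (0x04 and 0x02 are omitted).
STATUS_BITS = {
    0x80: "BSY",
    0x40: "DRDY",
    0x20: "DF",
    0x10: "DSC",
    0x08: "DRQ",
    0x01: "ERR",
}

# Assumed layout: "cmd XX/... tag N ... res XX/... Emask 0xN"
REPORT_RE = re.compile(
    r"cmd (?P<cmd>[0-9a-f]{2})/\S+ tag (?P<tag>\d+).*?"
    r"res (?P<status>[0-9a-f]{2})/\S+ Emask (?P<emask>0x[0-9a-f]+)"
)


def parse_report(line):
    """Extract opcode, tag, status flags and Emask, or None if no match."""
    m = REPORT_RE.search(line)
    if m is None:
        return None
    status = int(m.group("status"), 16)
    return {
        "command": int(m.group("cmd"), 16),
        "tag": int(m.group("tag")),
        "status": status,
        "status_flags": [n for bit, n in STATUS_BITS.items() if status & bit],
        "emask": int(m.group("emask"), 16),
    }
```

For the report quoted in this message, the opcode is 0xec (IDENTIFY DEVICE) and the status byte 0x58 decodes to DRDY, DSC and DRQ, which is the stuck-DRQ condition the thread is about. The NCQ reports elsewhere in the listing show status 0x50, where DRQ is clear.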

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Mark Lord
Mark Lord wrote: Tejun, While working on the new hdparm (version 7.0, released today), I ran into trouble when a buggy SG_IO/ATA_16 packet caused the libata EH to get confused. I triggered this by accident, issuing an IDENTIFY command which incorrectly specified ATA_PROT_NODATA. My error, for

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Alan Cox
In the IDE driver, we had code to try and cope with stuck DRQ, by just looping and reading from the data port a few times. That could have been done better, but it worked a lot of the time, back in those simpler days. It works very well. The current old IDE has some changes in the area but

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Jeff Garzik
Mark Lord wrote: Tejun, While working on the new hdparm (version 7.0, released today), I ran into trouble when a buggy SG_IO/ATA_16 packet caused the libata EH to get confused. I triggered this by accident, issuing an IDENTIFY command which incorrectly specified ATA_PROT_NODATA. My error, for

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Mark Lord
Jeff Garzik wrote: Mark Lord wrote: .. I triggered this by accident, issuing an IDENTIFY command which incorrectly specified ATA_PROT_NODATA. My error, for sure, but libata never recovered from the stuck DRQ bit that resulted. .. Maybe we do need to recover from a stuck DRQ bit, but I'll wait

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Jeff Garzik
Mark Lord wrote: Actually, I'm not so sure that this problem hasn't *already* been posted to this very mailing list. http://lkml.org/lkml/2006/10/1/264 http://www.mail-archive.com/linux-ide@vger.kernel.org/msg05078.html ... What Tejun said at the end of that thread :) That one is a phy-level

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Alan Cox
I am reluctant to do anything about this. This one does need dealing with. It happens in the real world and the old IDE paths for this do get triggered and used now and then (we know this because bugs in them were found). All it takes is a device and a controller disagreeing about the length of

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Mark Lord
Alan Cox wrote: I am reluctant to do anything about this. This one does need dealing with. It happens in the real world and the old IDE paths for this do get triggered and used now and then (we know this because bugs in them were found). All it takes is a device and a controller disagreeing

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Jeff Garzik
Alan Cox wrote: I am reluctant to do anything about this. This one does need dealing with. It happens in the real world and the old IDE paths for this do get triggered and used now and then (we know this because bugs in them were found). All it takes is a device and a controller disagreeing

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Mark Lord
Jeff Garzik wrote: It's not really a good idea for SATA. The FIFO often co-emulated by the SATA controller and SATA phy. You just want to kick SATA really hard (i.e. bus reset and friends). Sure. So why don't we do that now?

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Mark Lord
: enabled, doesn't support DPO or FUA ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 res 58/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation) ata1: soft resetting port ata1.00: configured

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Alan Cox
This one does need dealing with. It happens in the real world and the old IDE paths for this do get triggered and used now and then (we know this because bugs in them were found). All it takes is a device and a controller disagreeing about the length of a data transfer to get in a How

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Tejun Heo
: enabled, read cache: enabled, doesn't support DPO or FUA ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 res 58/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation) ata1: soft resetting port

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Tejun Heo
Mark Lord wrote: Jeff Garzik wrote: It's not really a good idea for SATA. The FIFO often co-emulated by the SATA controller and SATA phy. You just want to kick SATA really hard (i.e. bus reset and friends). Sure. So why don't we do that now? We do that. It's just that ata_piix is

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Jeff Garzik
Tejun Heo wrote: and thus clear DRQ, right? Stuck DRQ after SRST seems odd to me. Unfortunately not odd on ata_piix, which can get stuck DRQ-on somewhere deep inside its IDE emulation engine. And neither draining the FIFO nor SRST nor a couple other tricks ever helped. The only thing that

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Tejun Heo
Tejun Heo wrote: Mark Lord wrote: Jeff Garzik wrote: It's not really a good idea for SATA. The FIFO often co-emulated by the SATA controller and SATA phy. You just want to kick SATA really hard (i.e. bus reset and friends). Sure. So why don't we do that now? We do that. It's just that

Re: [PATCH 5/5] ahci: consider SDB FIS containing spurious NCQ completions HSM violation (regenerated)

2007-02-23 Thread Jeff Garzik
errors. Consider spurious NCQ completions HSM violation and freeze the port after it. EH will turn off NCQ after this happens several times. Eventually drives which show this behavior should be blacklisted for NCQ. Signed-off-by: Tejun Heo [EMAIL PROTECTED] --- Regenerated against the current

Re: [PATCH 5/5] ahci: consider SDB FIS containing spurious NCQ completions HSM violation

2007-02-20 Thread Jeff Garzik
errors. Consider spurious NCQ completions HSM violation and freeze the port after it. EH will turn off NCQ after this happens several times. Eventually drives which show this behavior should be blacklisted for NCQ. Signed-off-by: Tejun Heo [EMAIL PROTECTED] --- drivers/ata/ahci.c | 26

[PATCH 5/5] ahci: consider SDB FIS containing spurious NCQ completions HSM violation (regenerated)

2007-02-20 Thread Tejun Heo
spurious NCQ completions HSM violation and freeze the port after it. EH will turn off NCQ after this happens several times. Eventually drives which show this behavior should be blacklisted for NCQ. Signed-off-by: Tejun Heo [EMAIL PROTECTED] --- Regenerated against the current #upstream

[PATCH 5/5] ahci: consider SDB FIS containing spurious NCQ completions HSM violation

2007-02-01 Thread Tejun Heo
spurious NCQ completions HSM violation and freeze the port after it. EH will turn off NCQ after this happens several times. Eventually drives which show this behavior should be blacklisted for NCQ. Signed-off-by: Tejun Heo [EMAIL PROTECTED] --- drivers/ata/ahci.c | 26