Re: [PATCH] Re: 2.6.19.1, sata_sil: sata dvd writer doesn't work

2007-06-19 Thread Tejun Heo
Harald Dunkel wrote:
>> Harald Dunkel wrote:
>>> Hi Tejun,
>>>
>>> Setting the timeout to 15 did not help, either :-(. All your patches
>>> are still in, of course.
>>
>> Hmm... I'm out of ideas.  I'll try it when I get back home.
>>
> 
> Any news about this? At the end I had the impression that this is
> a bug in the chip design. Is this correct?

I can't reproduce that with sil3112 and the same SH-S183A.  I don't know
what's going on here.  Mine behaves really well but yours seems to be
causing all sorts of problems.

> Is there any chance to get the 60-byte patch into the kernel?

We still don't know what's going on so

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Pioneer DVR-111 problems with DMA

2007-06-19 Thread Vlad
Hi,

I'd like to bring to your attention an article that claims the Pioneer
DVR-111 drive doesn't work with libata because the drive needs the
nodma option:

http://apcmag.com/6415/first_look_at_fedora_core_7
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=242956

The drive was released recently (around 2006). Perhaps it should be
added to the blacklist?

Thanks,
Vlad 


   



[PATCH 2.6.22-rc5] ahci: fix PORTS_IMPL override

2007-06-19 Thread Tejun Heo
If the PORTS_IMPL register is zero, ahci initializes it to the full mask
corresponding to nr_ports in the CAP register.  hpriv->cap, which is
initialized at the end of the function, is incorrectly used as value
of CAP causing ahci to always override PORTS_IMPL to 0x1 if it's zero.
Fix it.

This fixes a bug where early ich6 ahci can only access the first port.
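
The fixup described above can be sketched in isolation. This is a minimal user-space illustration, not the driver code: it assumes, as in the AHCI specification, that the low five bits of CAP hold the number of ports minus one, and the helper names nr_ports_from_cap and fixup_port_map are hypothetical.

```c
#include <stdint.h>

/* CAP.NP: the low five bits hold (number of ports - 1). */
unsigned int nr_ports_from_cap(uint32_t cap)
{
    return (cap & 0x1f) + 1;
}

/* If PORTS_IMPL reads back as zero, assume ports 0..n-1 are implemented. */
uint32_t fixup_port_map(uint32_t port_map, uint32_t cap)
{
    if (!port_map) {
        unsigned int n = nr_ports_from_cap(cap);

        /* n can be 32, so avoid shifting a 32-bit value by 32 bits */
        port_map = (n >= 32) ? 0xffffffffu : ((1u << n) - 1);
    }
    return port_map;
}
```

With a CAP value whose low bits are 0x3 (four ports) and a zero PORTS_IMPL, this yields 0xf. Passing a zero CAP, which is what an uninitialized hpriv->cap amounts to, yields 0x1, matching the symptom described above.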

Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
---
 drivers/ata/ahci.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 545f330..ca5229d 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -527,7 +527,7 @@ static void ahci_save_initial_config(struct pci_dev *pdev,
 
/* fixup zero port_map */
if (!port_map) {
-   port_map = (1 << ahci_nr_ports(hpriv->cap)) - 1;
+   port_map = (1 << ahci_nr_ports(cap)) - 1;
dev_printk(KERN_WARNING, &pdev->dev,
   "PORTS_IMPL is zero, forcing 0x%x\n", port_map);
 


Re: VIA VT6420: SATA disconnects

2007-06-19 Thread Vasily Averin
Jeff Garzik wrote:
> Vasily Averin wrote:
>> Jeff, Tejun,
>>
>> Our RHEL5-based OpenVZ Linux kernel reports SATA-related issues:
>> VIA VT6420 SATA RAID Controller on an MSI motherboard, x86_64 kernel based on
>> the latest RHEL5 kernel.
>> On boot the hardware is initialized properly and everything works fine for
>> some time, but then it detects a timeout and disables the devices. We have
>> replaced the SATA cables, but the issue is still present.
>>
>> I've googled and found a similar bug report on linux-ide@
>> http://www.mail-archive.com/linux-ide@vger.kernel.org/msg06011.html
>>
>> Do you know anything about this issue? I've seen that you fixed the SATA
>> reset procedure recently; perhaps this issue has been fixed already?
> 
> RHEL5 SATA is unfortunately way out of date :(  The next RHEL5 update 
> should include a boatload of fixes.
> 
> Try running the latest upstream kernel (2.6.21.3 or 2.6.22-rc2-git7), 
> and see if the problem is reproducible.

I've reproduced this issue. On this kernel, however, EH works well and the
node is still alive:

Linux version 2.6.22-rc4 ([EMAIL PROTECTED]) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #1 SMP Fri Jun 8 14:32:01 MSD 2007
...

hda: lost interrupt
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: cmd ca/00:78:30:1a:24/00:00:00:00:00/e2 tag 0 cdb 0x0 data 61440 out
 res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c007
ATA: abnormal status 0x7F on port 0x0001c007
ata1.00: qc timeout (cmd 0x27)
ata1.00: ata_hpa_resize 1: sectors = 156301488, hpa_sectors = 0
ata1.00: failed to set xfermode (err_mask=0x40)
ata1: failed to recover some devices, retrying in 5 secs
ata1: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c007
ATA: abnormal status 0x7F on port 0x0001c007
ata1.00: qc timeout (cmd 0x27)
ata1.00: ata_hpa_resize 1: sectors = 156301488, hpa_sectors = 0
ata1.00: failed to set xfermode (err_mask=0x40)
ata1.00: limiting speed to UDMA/133:PIO3
ata1: failed to recover some devices, retrying in 5 secs
ata1: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c007
ATA: abnormal status 0x7F on port 0x0001c007
ata1.00: qc timeout (cmd 0x27)
ata1.00: ata_hpa_resize 1: sectors = 156301488, hpa_sectors = 0
ata1.00: failed to set xfermode (err_mask=0x40)
ata1.00: disabled
ata1: EH complete
sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK

You can find some additional details in bug #8650
http://bugzilla.kernel.org/show_bug.cgi?id=8650
 
thank you,
Vasily Averin


Re: Pioneer DVR-111 problems with DMA

2007-06-19 Thread Alan Cox
On Tue, 19 Jun 2007 01:55:38 -0700 (PDT)
Vlad <[EMAIL PROTECTED]> wrote:
> I'd like to bring to your attention an article that claims the Pioneer
> DVR-111 drive doesn't work with libata because the drive needs the
> nodma option:
> 
> http://apcmag.com/6415/first_look_at_fedora_core_7
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=242956
> 
> The drive was released recently (around 2006). Perhaps it should be
> added to the blacklist?

The blacklist is for specific drive problems. At the moment I see no
evidence the Pioneer problem is that. 


ata1: soft resetting port

2007-06-19 Thread Soeren Sonnenburg
Dear List,

since the switch to 

CONFIG_ATA=y
CONFIG_ATA_ACPI=y
CONFIG_ATA_PIIX=y,

the ATA_PIIX driver manages both the internal SATA disk and the CD/DVD-ROM
drive. However, I am being flooded with the error messages below (they
appear from time to time, but dominate dmesg).

This happens on kernel 2.6.22-rc5, I am copying relevant parts from dmesg:

libata version 2.21 loaded.
ata_piix :00:1f.1: version 2.11
ata1: PATA max UDMA/133 cmd 0x000101f0 ctl 0x000103f6 bmdma 0x000140c0 irq 14
ata2: PATA max UDMA/133 cmd 0x00010170 ctl 0x00010376 bmdma 0x000140c8 irq 15
ata1.00: ATAPI: HL-DT-ST DVDRW GWA4080M, AA26, max UDMA/33
ata1.00: configured for UDMA/33
ATA: abnormal status 0x7F on port 0x00010177
scsi 0:0:0:0: CD-ROMHL-DT-ST DVDRW GWA4080M   AA26 PQ: 0 ANSI: 5
sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
sr 0:0:0:0: Attached scsi CD-ROM sr0
sr 0:0:0:0: Attached scsi generic sg0 type 5
ata_piix :00:1f.2: MAP [ P0 P2 XX XX ]
ata_piix :00:1f.2: invalid MAP value 0
PCI: Setting latency timer of device :00:1f.2 to 64
scsi2 : ata_piix
scsi3 : ata_piix
ata3: SATA max UDMA/133 cmd 0x000140d8 ctl 0x000140f6 bmdma 0x00014020 irq 0
ata4: SATA max UDMA/133 cmd 0x000140d0 ctl 0x000140f2 bmdma 0x00014028 irq 0
ata3.01: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
ata3.01: ATA-7: ST9120821AS, 7.01, max UDMA/133
ata3.01: 234441648 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata3.01: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
ata3.01: configured for UDMA/133
ATA: abnormal status 0x7F on port 0x000140d7


the actual errors:


ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: cmd a0/00:00:00:00:20/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0 
 res 51/24:03:00:00:20/00:00:00:00:00/a0 Emask 0x1 (device error)
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata1.00: (BMDMA stat 0x5)
ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x25 data 8 in
 res 00/24:03:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
ata1: soft resetting port
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata1.00: cmd a0/00:00:00:00:20/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0 
 res 00/24:03:00:00:20/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
ata1: soft resetting port
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata1.00: (BMDMA stat 0x5)
ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x25 data 8 in
 res 00/24:03:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
ata1: soft resetting port
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata1.00: (BMDMA stat 0x5)
ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x43 data 12 in
 res 00/24:03:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
ata1: soft resetting port
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: cmd a0/00:00:00:00:20/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0 
 res 51/24:03:00:00:20/00:00:00:00:00/a0 Emask 0x1 (device error)
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: (BMDMA stat 0x5)
ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x25 data 8 in
 res 51/24:03:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata1.00: (BMDMA stat 0x5)
ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x43 data 12 in
 res 00/24:03:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
ata1: soft resetting port
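
For readers decoding the dump above: in libata's "res" line the first two bytes are the Status and Error registers. A minimal sketch of decoding them for an ATAPI device follows. It assumes the standard ATA status bits and the ATAPI convention that the upper four bits of the Error register carry the SCSI sense key; the helper names are hypothetical and this is not libata code.

```c
#include <stdint.h>

enum {
    ST_ERR  = 0x01, /* ERR/CHK: command ended with an error */
    ST_DRQ  = 0x08, /* data request */
    ST_DRDY = 0x40, /* device ready */
    ST_BSY  = 0x80, /* device busy */
};

/* ATAPI: bits 7..4 of the Error register are the sense key. */
unsigned int atapi_sense_key(uint8_t error)
{
    return error >> 4;
}

/* ATAPI: bit 2 of the Error register is ABRT (command aborted). */
int atapi_aborted(uint8_t error)
{
    return (error & 0x04) != 0;
}

/* Map a sense key to its SCSI name. */
const char *sense_key_name(unsigned int key)
{
    static const char *const names[16] = {
        "NO SENSE", "RECOVERED ERROR", "NOT READY", "MEDIUM ERROR",
        "HARDWARE ERROR", "ILLEGAL REQUEST", "UNIT ATTENTION", "DATA PROTECT",
        "BLANK CHECK", "VENDOR SPECIFIC", "COPY ABORTED", "ABORTED COMMAND",
        "OBSOLETE", "VOLUME OVERFLOW", "MISCOMPARE", "RESERVED",
    };
    return names[key & 0xf];
}
```

Applied to "res 51/24", this reads as DRDY plus ERR in the status byte, with sense key 2 (NOT READY) and ABRT set in the error byte. For the cdb 0x0 command (TEST UNIT READY) that would be consistent with a drive being polled while it has no usable medium, though that reading is an interpretation and not something the log itself states.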

-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov

Hello.

[EMAIL PROTECTED] wrote:

> I think reading the IDE status register clears the interrupt in the IDE
> device, which might be causing the drive to think it's OK to generate
> another interrupt.

   This is not how IDE drives are supposed to act -- they won't proceed any 
further until "interrupt pending" condition is cleared, so these aren't 
supposed to be "stacked". This behavior however is not strictly specified by 
ATA standards IIRC, but I can't readily imagine such a situation anyway unless 
tagged command queueing  (which is not supported by IDE core) and/or ATAPI 
command overlapping is in action...

> This could either cause it to get stuck trying to
> service an interrupt that is never getting cleared as you suggested, or
> possibly when the next IRQ comes in the IDE IRQ handler gets stuck
> waiting for a spinlock that the code you're looking at already owns...?

   I could also imagine the HPT366 chip going mad and stalling the reads of 
the taskfile regs forever because of the incomplete DMA or even the drive 
going mad and not replying to I/O cycles with proper -IORDY handshake (i.e. 
holding it low all the time)...

> Perhaps a printk in the IDE IRQ handler would be informative?  It
> wouldn't help you figure out how it got where it is, but it might help
> you figure out why the system is hanging.
>
> Stuart
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Linas Vepstas
> Sent: Monday, June 18, 2007 12:57 PM
> To: linux-ide@vger.kernel.org; [EMAIL PROTECTED]
> Subject: [BUG] ide dma_timer_expiry, then hard lockup
>
> I've got a hard lockup in the ide subsystem, probably due to some irq
> spew or something like that.
>
> I've just bought a brand new Maxtor 320GB disk drive for the insane
> price of $70 US to replace another failing drive. It works well under
> light load; I was able to copy about 60GB to it. However, under heavy
> load, such as reconstruction of an MD
> RAID-1 array, it'll lock up the kernel.  Which means that my system
> won't boot :-(
>
> I'm running 2.6.21.1, although the problem seems to occur in 2.6.19 and
> 2.6.18 too; it's been there a while; I vaguely remember similar problems
> in 2.6.5 or 2.6.10.
>
> I get an
> "hdc: dma_timer_expiry: dma status == 0x21" 

   This means "DMA not complete".

> and 10 seconds later,

   The above condition causes another 10 sec timeout...

> "hdc: DMA Timeout error"
>
> at which point the system is locked up hard.
> Magic sysrq does not work at all. The hard drive activity light stays
> fully lit.  Inserting printk's into the kernel, I find the hang to be in
> a surprising place: 
>
> ide_dma_timeout_retry() in ide-io.c 
>   prints the "hdc: DMA Timeout error" then calls
>   HWIF(drive)->ide_dma_end(drive);
> which returns, and then calls 
>   hwif->INB(IDE_STATUS_REG) which is needed as an argument to
>   ide_error()
>
> But this hangs! -- The INB never returns.
> Now:  hwif->INB = ide_inb; in ide-iops.c
>
> So putting a printk into ide_inb() shows that
> the printk before the readb() is printed, and the
> printk after the readb is not (!!)
>
> I find this rather surprising, as I can't imagine how the
> readb can fail. My current vague theory is that doing this
> readb makes the hard drive go really nuts, and it probably

   As I said, this is not the only way how it all might have gone nuts... :-)

> ties some interrupt line high, and so the linux kernel 
> gets stuck trying to handle the irq flood. I just don't know
> enough about the i386 architecture, or about interrupts, to 
> prove or disprove this.
>
> Any suggestions, experiments, experimental patches, data gathering,
> etc. is welcome. The sooner, the better... 
>
> --linas


MBR, Sergei
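
The "dma status == 0x21" value quoted in this message can be decoded with a small sketch. It assumes the conventional bus-master IDE status register layout (bit 0 active, bit 1 error, bit 2 interrupt, bits 5 and 6 per-drive DMA capable); the enum and function names are hypothetical, not taken from the IDE driver.

```c
#include <stdint.h>

enum {
    BM_ACTIVE   = 0x01, /* bus master engine still running */
    BM_ERROR    = 0x02, /* bus master transfer error */
    BM_INTR     = 0x04, /* device raised its interrupt line */
    BM_DRV0_DMA = 0x20, /* drive 0 is DMA capable */
    BM_DRV1_DMA = 0x40, /* drive 1 is DMA capable */
};

/* Returns 1 if the transfer looks unfinished: engine active, no IRQ, no error. */
int bmdma_not_complete(uint8_t stat)
{
    return (stat & BM_ACTIVE) && !(stat & (BM_INTR | BM_ERROR));
}
```

Under that layout 0x21 is BM_ACTIVE plus BM_DRV0_DMA with neither the interrupt nor the error bit set, which is the "DMA not complete" case: the engine is still waiting and the drive never signalled completion.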


Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov

Hello.

Alan Cox wrote:

>> I can prepare a patch, but only with a lot of guidance. I can test 
>> & debug, I'm highly motivated just right now ... 

> If you've got a nice repeatable problem please try using the libata
> driver. That handles the error paths differently and doesn't try a FIFO
> drain which might matter in this case I guess.

   FIFO drain for DMA commands?

MBR, Sergei


Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Alan Cox
On Tue, 19 Jun 2007 18:10:04 +0400
Sergei Shtylyov <[EMAIL PROTECTED]> wrote:

> Hello.
> 
> Alan Cox wrote:
> >>I can prepare a patch, but only with a lot of guidance. I can test 
> >>& debug, I'm highly motivated just right now ... 
> 
> > If you've got a nice repeatable problem please try using the libata
> > driver. That handles the error paths differently and doesn't try a FIFO
> > drain which might matter in this case I guess.
> 
> FIFO drain for DMA commands?

Welcome to the old IDE layer which I am so glad I left behind 8)

ide_ata_error will try and do a PIO flush regardless of the command type
if DRQ_STAT is asserted. See ide_dma_intr -> ide_error -> ...
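
The behaviour described above can be illustrated with a deliberately reduced sketch. It is not the kernel's ide_ata_error; the function name wants_pio_drain and the xfer_type enum are hypothetical, and the condition is cut down to the single point being made here: the decision looks only at the DRQ bit of the status register, not at whether the failed command was a DMA or a PIO command.

```c
#include <stdint.h>

#define DRQ_STAT 0x08 /* device reports data ready to transfer */

enum xfer_type { XFER_PIO, XFER_DMA };

/* Returns 1 if the error path would attempt a PIO drain of leftover data. */
int wants_pio_drain(uint8_t stat, enum xfer_type type)
{
    (void)type; /* deliberately ignored, mirroring the behaviour described */
    return (stat & DRQ_STAT) != 0;
}
```

So a failed DMA command that leaves DRQ set (for example a status of 0x58) would still trigger the drain in this model, which is the case being questioned in the thread.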

Alan


Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov

Alan Cox wrote:

>>>> I can prepare a patch, but only with a lot of guidance. I can test 
>>>> & debug, I'm highly motivated just right now ... 

>>> If you've got a nice repeatable problem please try using the libata
>>> driver. That handles the error paths differently and doesn't try a FIFO
>>> drain which might matter in this case I guess.

>>    FIFO drain for DMA commands?

> Welcome to the old IDE layer which I am so glad I left behind 8)

> ide_ata_error will try and do a PIO flush regardless of the command type
> if DRQ_STAT is asserted. See ide_dma_intr -> ide_error -> ...

   Indeed... but the thing is we don't know what's asserted in this case -- 
remember, it's reading the status register that locks everything up...

> Alan


MBR, Sergei


bug in libata [was Re: [BUG] ide dma_timer_expiry, then hard lockup]

2007-06-19 Thread Linas Vepstas
On Mon, Jun 18, 2007 at 04:22:38PM -0500, linas wrote:
> On Mon, Jun 18, 2007 at 10:04:41PM +0100, Alan Cox wrote:
> > please try using the libata
> > driver. 

It's worse. I get a hard hang (sysrq doesn't work) during boot, 
just when the system goes to read the partition table.

Recap: this is an older dual cpu intel box, with a vintage
HPT366 (and not a newer HPT370) on the system planar.  I'm
testing a configuration that works fine with the old ide
drivers.

It looks like libata and scsi come up. The disk is correctly
recognized; i.e. its brand, model number & size are correctly 
reported. printk shows that it hangs in msdos_partition, trying 
to read the partition table. The drive light is on full-solid,
again suggesting a possible irq storm.

Same behaviour for both 2.6.22-rc5-git1 and for 2.6.22-rc4-mm2

I seem to have trouble turning on scsi logging; it doesn't
seem to generate any output ... 

Any suggestions on how to proceed? 

--linas


Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Linas Vepstas
Hi Sergei,

On Tue, Jun 19, 2007 at 06:07:07PM +0400, Sergei Shtylyov wrote:
> 
> [EMAIL PROTECTED] wrote:
> >I think reading the IDE status register clears the interrupt in the IDE
> >device, which might be causing the drive to think it's OK to generate
> >another interrupt.
> 
>    This is not how IDE drives are supposed to act -- they won't proceed any 
> further until "interrupt pending" condition is cleared, so these aren't 
> supposed to be "stacked". This behavior however is not strictly specified 
> by ATA standards IIRC, but I can't readily imagine such a situation anyway 
> unless tagged command queueing  (which is not supported by IDE core) and/or 
> ATAPI command overlapping is in action...

The problem only manifests during high io load; perhaps a missing mutex
somewhere is blasting one thing too many out to the hard drive?

> > This could either cause it to get stuck trying to
> >service an interrupt that is never getting cleared as you suggested, or
> >possibly when the next IRQ comes in the IDE IRQ handler gets stuck
> >waiting for a spinlock that the code you're looking at already owns...?
> 
>    I could also imagine the HPT366 chip going mad and stalling the reads of 
> the taskfile regs forever because of the incomplete DMA or even the drive 
> going mad and not replying to I/O cycles with proper -IORDY handshake (i.e. 
> holding it low all the time)...

In my case, ctrl-alt-sysrq doesn't work, which makes it hard to debug.

I'm thinking that trying to debug libata is a better idea, rather than
investing time in ide, right?  Although at the moment, libata works even 
less; see other email.

--linas



Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Mark Lord

Sergei Shtylyov wrote:
> Alan Cox wrote:
>
>>>>> I can prepare a patch, but only with a lot of guidance. I can test 
>>>>> & debug, I'm highly motivated just right now ... 
>
>>>> If you've got a nice repeatable problem please try using the libata
>>>> driver. That handles the error paths differently and doesn't try a FIFO
>>>> drain which might matter in this case I guess.
>
>>>    FIFO drain for DMA commands?
>
>> Welcome to the old IDE layer which I am so glad I left behind 8)
>
>> ide_ata_error will try and do a PIO flush regardless of the command type
>> if DRQ_STAT is asserted. See ide_dma_intr -> ide_error -> ...
>
>    Indeed... but the thing is we don't know what's asserted in this case 
> -- remember, it's reading the status register that locks everything up...

Exactly.  And IORDY shouldn't really apply there,
unless some nitwit standards person wrote it into a spec..

-ml


Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov

Mark Lord wrote:

>>>>>> I can prepare a patch, but only with a lot of guidance. I can test 
>>>>>> & debug, I'm highly motivated just right now ... 

>>>>> If you've got a nice repeatable problem please try using the libata
>>>>> driver. That handles the error paths differently and doesn't try a FIFO
>>>>> drain which might matter in this case I guess.

>>>>    FIFO drain for DMA commands?

>>> Welcome to the old IDE layer which I am so glad I left behind 8)

>>> ide_ata_error will try and do a PIO flush regardless of the command type
>>> if DRQ_STAT is asserted. See ide_dma_intr -> ide_error -> ...

>>    Indeed... but the thing is we don't know what's asserted in this 
>> case -- remember, it's reading the status register that locks 
>> everything up...

> Exactly.  And IORDY shouldn't really apply there,
> unless some nitwit standards person wrote it into a spec..

   Wrote what? IORDY throttling does *apply* to both data and non-data 
register accesses, of course.

> -ml


MBR, Sergei


Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov

Hello.

Linas Vepstas wrote:

>> [EMAIL PROTECTED] wrote:

>>> I think reading the IDE status register clears the interrupt in the IDE
>>> device, which might be causing the drive to think it's OK to generate
>>> another interrupt.

>>   This is not how IDE drives are supposed to act -- they won't proceed any 
>> further until "interrupt pending" condition is cleared, so these aren't 
>> supposed to be "stacked". This behavior however is not strictly specified 
>> by ATA standards IIRC, but I can't readily imagine such a situation anyway 
>> unless tagged command queueing  (which is not supported by IDE core) and/or 
>> ATAPI command overlapping is in action...

> The problem only manifests during high io load; perhaps a missing mutex
> somewhere is blasting one thing too many out to the hard drive?

   Hm... not sure about this.

>>> This could either cause it to get stuck trying to
>>> service an interrupt that is never getting cleared as you suggested, or
>>> possibly when the next IRQ comes in the IDE IRQ handler gets stuck
>>> waiting for a spinlock that the code you're looking at already owns...?

>>   I could also imagine the HPT366 chip going mad and stalling the reads of 
>> the taskfile regs forever because of the incomplete DMA or even the drive 
>> going mad and not replying to I/O cycles with proper -IORDY handshake (i.e. 
>> holding it low all the time)...

> In my case, ctrl-alt-sysrq doesn't work, which makes it hard to debug.

> I'm thinking that trying to debug libata is a better idea, rather than
> investing time in ide, right?  Although at the moment, libata works even 
> less; see other email.

   Which makes me think this really is some *hardware* issue.

> --linas


MBR, Sergei


Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Alan Cox
> >Indeed... but the thing is we don't know what's asserted in this case 
> > -- remember, it's reading the status register that locks everything up...
> 
> Exactly.  And IORDY shouldn't really apply there,
> unless some nitwit standards person wrote it into a spec..

Could it be we need to reset the state machine at this point before we
touch the registers again - that wouldn't be the first controller with
this limit and undocumented.

On the 370 we already 

Linas: to debug the libata case, turn on ATA_DEBUG and
ATA_VERBOSE_DEBUG in include/linux/libata.h and it should spew
diagnostics before the freeze. I suspect that's a different problem to the
hang you see now but I'd like to debug both.
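
As a sketch of the change suggested above: assuming the header in this kernel series ships both switches as #undef lines near the top of include/linux/libata.h (worth checking against the actual tree), enabling them amounts to the following fragment.

```c
/* include/linux/libata.h: flip the two #undef lines to #define */
#define ATA_DEBUG          /* debugging output */
#define ATA_VERBOSE_DEBUG  /* yet more debugging output */
```

libata and the low-level driver need to be rebuilt afterwards. Since the machine hangs hard, capturing the kernel log over a serial console or netconsole is probably the only way to see the last messages before the freeze.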

Alan


Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov

Alan Cox wrote:

>>>   Indeed... but the thing is we don't know what's asserted in this case 
>>> -- remember, it's reading the status register that locks everything up...

>> Exactly.  And IORDY shouldn't really apply there,
>> unless some nitwit standards person wrote it into a spec..

> Could it be we need to reset the state machine at this point before we
> touch the registers again - that wouldn't be the first controller with
> this limit and undocumented.

> On the 370 we already 

   Yeah, that could be. And because IORDY pin becomes DSTROBE for UltraDMA it 
might have stuck low due to this (if the chip never asserted STOP)...

> Alan


MBR, Sergei


Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Linas Vepstas
On Tue, Jun 19, 2007 at 08:10:25PM +0400, Sergei Shtylyov wrote:
> 
> >I'm thinking that trying to debug libata is a better idea, rather than
> >investing time in ide, right?  Although at the moment, libata works even 
> >less; see other email.
> 
>Which makes me think this really is some *hardware* issue.

There are two distinct issues.
-- libata locks up in partition table read on an hpt366+old maxtor disk
   that has been working fine for many years with old ide driver. (It
   still works fine when I boot to the alternate ide-based kernel).

-- ide driver locks up on hpt366+new maxtor disk under heavy 
   i/o load. I was able to copy 60GB from old to new disk without a
   problem; however, raid reconstruction locks it up, maybe after 5-15
   seconds.

   This probably is "hardware related"; it's something that the new 
   hard drive does. Given that it's being sold at a big discount, it
   may even be that the sellers know that this is a crappy disk. :-)

   All I want is some way of resetting the disk, and continuing on.

I'm stalled in debugging; I'm not sure what I'm looking for.

--linas




sata_promise error handling

2007-06-19 Thread Theo Baumgartner
Hello

I'm making a backup of some disks with dd (disk image) and sata_promise 
reported these errors:

--
ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata8.00: cmd c8/00:08:40:83:7a/00:00:00:00:00/e4 tag 0 cdb 0x0 data 4096 in
 res 50/00:00:47:83:7a/00:00:00:00:00/e4 Emask 0x1 (device error)
ata8.00: configured for UDMA/133
ata8: EH complete
sd 7:0:0:0: [sdh] 488397168 512-byte hardware sectors (250059 MB)
sd 7:0:0:0: [sdh] Write Protect is off
sd 7:0:0:0: [sdh] Mode Sense: 00 3a 00 00
sd 7:0:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: cmd 25/00:08:e8:af:02/00:00:10:00:00/e0 tag 0 cdb 0x0 data 4096 in
 res 50/00:00:ef:af:02/00:00:10:00:00/e0 Emask 0x1 (device error)
ata5.00: configured for UDMA/133
ata5: EH complete
sd 4:0:0:0: [sde] 488397168 512-byte hardware sectors (250059 MB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
--

Can I ignore them (EH handled them), or do I have to worry that the dd images 
are corrupted
(don't wanna make an md5sum of a 250gb disk and image)?
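
One way to narrow the question down is to work out which sectors the failing commands touched. The sketch below parses the taskfile dump that libata prints on the "cmd" and "res" lines and reconstructs the LBA. It assumes the field order command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device; the struct and function names are hypothetical and this is not libata code.

```c
#include <stdint.h>
#include <stdio.h>

struct tf_dump {
    unsigned int cmd, feature, nsect, lbal, lbam, lbah;
    unsigned int hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah;
    unsigned int device;
};

/* Parse e.g. "c8/00:08:40:83:7a/00:00:00:00:00/e4"; returns 0 on success. */
int tf_parse(const char *s, struct tf_dump *tf)
{
    int n = sscanf(s, "%2x/%2x:%2x:%2x:%2x:%2x/%2x:%2x:%2x:%2x:%2x/%2x",
                   &tf->cmd, &tf->feature, &tf->nsect,
                   &tf->lbal, &tf->lbam, &tf->lbah,
                   &tf->hob_feature, &tf->hob_nsect,
                   &tf->hob_lbal, &tf->hob_lbam, &tf->hob_lbah,
                   &tf->device);
    return n == 12 ? 0 : -1;
}

/* 28-bit LBA: the low nibble of the device register holds bits 27..24. */
uint64_t tf_lba28(const struct tf_dump *tf)
{
    return ((uint64_t)(tf->device & 0x0f) << 24) |
           ((uint64_t)tf->lbah << 16) | ((uint64_t)tf->lbam << 8) | tf->lbal;
}

/* 48-bit LBA: the "hob" (high order byte) fields hold bits 47..24. */
uint64_t tf_lba48(const struct tf_dump *tf)
{
    return ((uint64_t)tf->hob_lbah << 40) | ((uint64_t)tf->hob_lbam << 32) |
           ((uint64_t)tf->hob_lbal << 24) |
           ((uint64_t)tf->lbah << 16) | ((uint64_t)tf->lbam << 8) | tf->lbal;
}
```

For the ata8 lines (command 0xc8, READ DMA) the 28-bit decode gives LBA 0x47a8340 for the command and 0x47a8347 for the result, i.e. the last sector of the 8-sector read. For ata5 (command 0x25, READ DMA EXT) the 48-bit decode gives 0x1002afe8 and 0x1002afef. Re-reading just those sectors from the disk and comparing them with the same offsets in the image may be a cheaper check than checksumming the whole 250 GB.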

Theo


Re: Pioneer DVR-111 problems with DMA

2007-06-19 Thread Mark Lord

Vlad wrote:

> Hi,
>
> I'd like to bring to your attention an article that claims the Pioneer
> DVR-111 drive doesn't work with libata because the drive needs the
> nodma option:

Not on my systems -- works fine.


Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Bartlomiej Zolnierkiewicz

Hi,

On Tuesday 19 June 2007, Linas Vepstas wrote:
> On Tue, Jun 19, 2007 at 08:10:25PM +0400, Sergei Shtylyov wrote:
> > 
> > >I'm thinking that trying to debug libata is a better idea, rather than
> > >investing time in ide, right?  Although at the moment, libata works even 
> > >less; see other email.
> > 
> >Which makes me think this really is some *hardware* issue.

Linas, have you checked that there are no firmware updates available
for this drive?

> There are two distinct issues.
> -- libata locks up in partition table read on an hpt366+old maxtor disk
>    that has been working fine for many years with old ide driver. (It
>    still works fine when I boot to the alternate ide-based kernel).
> 
> -- ide driver locks up on hpt366+new maxtor disk under heavy 
>    i/o load. I was able to copy 60GB from old to new disk without a
>    problem; however, raid reconstruction locks it up, maybe after 5-15
>    seconds.
> 
>    This probably is "hardware related"; it's something that the new 
>    hard drive does. Given that it's being sold at a big discount, it
>    may even be that the sellers know that this is a crappy disk. :-)
> 
>    All I want is some way of resetting the disk, and continuing on.

It would be useful to see hdparm --Istdout output for *both* disks.

> I'm stalled in debugging; I'm not sure what I'm looking for.

Sergei, do you think that testing the drive with DMA disabled may
tell us something new?

Thanks,
Bart


[PATCH 2.6.22-rc5 1/2] sata_promise: cleanups

2007-06-19 Thread Mikael Pettersson
This patch applies some trivial cleanups to sata_promise:
- repair whitespace damage
- correct comment at board_2057x_pata definition
- pull SATAII TX4 support code out to separate functions
- rename ata_nr to ata_no for consistency with libata's port_no
- remove some init-time debug printks (requested by Jeff)

This patch should cause no behavioural changes, except for
the removed printks.

Signed-off-by: Mikael Pettersson <[EMAIL PROTECTED]>
--
 drivers/ata/sata_promise.c |   56 ++---
 1 files changed, 23 insertions(+), 33 deletions(-)

--- linux-2.6.22-rc5/drivers/ata/sata_promise.c.~1~ 2007-06-19 20:38:13.0 +0200
+++ linux-2.6.22-rc5/drivers/ata/sata_promise.c 2007-06-19 20:40:28.0 +0200
@@ -45,8 +45,7 @@
 #include "sata_promise.h"
 
 #define DRV_NAME   "sata_promise"
-#define DRV_VERSION"2.07"
-
+#define DRV_VERSION"2.08"
 
 enum {
PDC_MAX_PORTS   = 4,
@@ -94,7 +93,7 @@ enum {
board_20319 = 2,/* FastTrak S150 TX4 */
board_20619 = 3,/* FastTrak TX4000 */
board_2057x = 4,/* SATAII150 Tx2plus */
-   board_2057x_pata= 5,/* SATAII150 Tx2plus */
+   board_2057x_pata= 5,/* SATAII150 Tx2plus PATA port */
board_40518 = 6,/* SATAII150 Tx4 */
 
PDC_HAS_PATA= (1 << 1), /* PDC20375/20575 has PATA */
@@ -124,7 +123,6 @@ enum {
PDC_FLAG_4_PORTS= (1 << 26), /* 4 ports */
 };
 
-
 struct pdc_port_priv {
u8  *pkt;
dma_addr_t  pkt_dma;
@@ -340,7 +338,6 @@ static const struct pci_device_id pdc_at
{ } /* terminate list */
 };
 
-
 static struct pci_driver pdc_ata_pci_driver = {
.name   = DRV_NAME,
.id_table   = pdc_ata_pci_tbl,
@@ -348,7 +345,6 @@ static struct pci_driver pdc_ata_pci_dri
.remove = ata_pci_remove_one,
 };
 
-
 static int pdc_common_port_start(struct ata_port *ap)
 {
struct device *dev = ap->host->dev;
@@ -438,7 +434,6 @@ static u32 pdc_sata_scr_read (struct ata
return readl(ap->ioaddr.scr_addr + (sc_reg * 4));
 }
 
-
 static void pdc_sata_scr_write (struct ata_port *ap, unsigned int sc_reg,
   u32 val)
 {
@@ -657,8 +652,8 @@ static void pdc_error_intr(struct ata_po
ata_port_abort(ap);
 }
 
-static inline unsigned int pdc_host_intr( struct ata_port *ap,
-  struct ata_queued_cmd *qc)
+static inline unsigned int pdc_host_intr(struct ata_port *ap,
+struct ata_queued_cmd *qc)
 {
unsigned int handled = 0;
void __iomem *port_mmio = ap->ioaddr.cmd_addr;
@@ -685,10 +680,10 @@ static inline unsigned int pdc_host_intr
handled = 1;
break;
 
-default:
+   default:
ap->stats.idle_irq++;
break;
-}
+   }
 
return handled;
 }
@@ -701,6 +696,18 @@ static void pdc_irq_clear(struct ata_por
readl(mmio + PDC_INT_SEQMASK);
 }
 
+static inline int pdc_is_sataii_tx4(unsigned long flags)
+{
+   const unsigned long mask = PDC_FLAG_GEN_II | PDC_FLAG_4_PORTS;
+   return (flags & mask) == mask;
+}
+
+static inline unsigned int pdc_port_no_to_ata_no(unsigned int port_no, int is_sataii_tx4)
+{
+   static const unsigned char sataii_tx4_port_remap[4] = { 3, 1, 0, 2};
+   return is_sataii_tx4 ? sataii_tx4_port_remap[port_no] : port_no;
+}
+
 static irqreturn_t pdc_interrupt (int irq, void *dev_instance)
 {
struct ata_host *host = dev_instance;
@@ -807,7 +814,6 @@ static void pdc_tf_load_mmio(struct ata_
ata_tf_load(ap, tf);
 }
 
-
 static void pdc_exec_command_mmio(struct ata_port *ap, const struct ata_taskfile *tf)
 {
WARN_ON (tf->protocol == ATA_PROT_DMA ||
@@ -867,7 +873,6 @@ static void pdc_ata_setup_port(struct at
ap->ioaddr.scr_addr = scr_addr;
 }
 
-
 static void pdc_host_init(struct ata_host *host)
 {
void __iomem *mmio = host->iomap[PDC_MMIO_BAR];
@@ -955,10 +960,8 @@ static int pdc_ata_init_one (struct pci_
 
if (pi->flags & PDC_FLAG_SATA_PATA) {
u8 tmp = readb(base + PDC_FLASH_CTL+1);
-   if (!(tmp & 0x80)) {
+   if (!(tmp & 0x80))
ppi[n_ports++] = pi + 1;
-   dev_printk(KERN_INFO, &pdev->dev, "PATA port found\n");
-   }
}
 
host = ata_host_alloc_pinfo(&pdev->dev, ppi, n_ports);
@@ -968,22 +971,12 @@ static int pdc_ata_init_one (struct pci_
}
host->iomap = pcim_iomap_table(pdev);
 
-   is_sataii_tx4 = 0;
-   if ((pi->flags & (PDC_FLAG_GEN_II|PDC_FLAG_4_PORTS)) == (PDC_FLAG_GEN_II|PDC_FLAG_4_PORTS)) {
-   is_sataii_tx4 = 1;
-   dev_printk(KERN_INFO, &pdev->dev, "applying SAT

[PATCH 2.6.22-rc5 2/2] sata_promise: SATA hotplug support

2007-06-19 Thread Mikael Pettersson
This patch enables hotplugging of SATA devices in the
sata_promise driver. It's been tested successfully on
both first- and second-generation Promise SATA chips:
SATA150 TX2plus, SATAII150 TX2plus, SATA300 TX2plus,
and SATA300 TX4.

The only quirk I've seen is that hotplugging (insertion)
on the first-generation SATA150 TX2plus requires a lengthier
EH sequence than on the second-generation chips.
On the second-generation chips a simple soft reset seems
to suffice, but on the first-generation chip there's a
"port is slow to respond" after the initial soft reset,
after which libata issues a hard reset, and then the
device is recognised.

The hotplug checks are high up in the interrupt handling
path, not deep down in error_intr as in ahci/sata_sil24.
That's because the chip doesn't signal hotplug status changes
in the per-port status register: instead a global register
contains hotplug control and status flags for all ports.
I considered following the ahci/sata_sil24 structure, but
that would have required non-trivial changes to the interrupt
handling path, so I chose to keep the hotplug changes simple
and unobtrusive.

Signed-off-by: Mikael Pettersson <[EMAIL PROTECTED]>
--
This patch depends on patch 1/2: sata_promise: cleanups.

Changes since the preliminary version:
- cleanups
- added testing on the first-generation SATA150 TX2plus

 drivers/ata/sata_promise.c |   40 +++-
 1 files changed, 35 insertions(+), 5 deletions(-)

--- linux-2.6.22-rc5/drivers/ata/sata_promise.c.~1~ 2007-06-19 20:40:28.0 +0200
+++ linux-2.6.22-rc5/drivers/ata/sata_promise.c 2007-06-19 20:41:36.0 +0200
@@ -45,7 +45,7 @@
 #include "sata_promise.h"
 
 #define DRV_NAME   "sata_promise"
-#define DRV_VERSION"2.08"
+#define DRV_VERSION"2.09"
 
 enum {
PDC_MAX_PORTS   = 4,
@@ -716,6 +716,9 @@ static irqreturn_t pdc_interrupt (int ir
unsigned int i, tmp;
unsigned int handled = 0;
void __iomem *mmio_base;
+   unsigned int hotplug_offset, ata_no;
+   u32 hotplug_status;
+   int is_sataii_tx4;
 
VPRINTK("ENTER\n");
 
@@ -726,10 +729,20 @@ static irqreturn_t pdc_interrupt (int ir
 
mmio_base = host->iomap[PDC_MMIO_BAR];
 
+   /* read and clear hotplug flags for all ports */
+   if (host->ports[0]->flags & PDC_FLAG_GEN_II)
+   hotplug_offset = PDC2_SATA_PLUG_CSR;
+   else
+   hotplug_offset = PDC_SATA_PLUG_CSR;
+   hotplug_status = readl(mmio_base + hotplug_offset);
+   if (hotplug_status & 0xff)
+   writel(hotplug_status | 0xff, mmio_base + hotplug_offset);
+   hotplug_status &= 0xff; /* clear uninteresting bits */
+
/* reading should also clear interrupts */
mask = readl(mmio_base + PDC_INT_SEQMASK);
 
-   if (mask == 0xffffffff) {
+   if (mask == 0xffffffff && hotplug_status == 0) {
VPRINTK("QUICK EXIT 2\n");
return IRQ_NONE;
}
@@ -737,16 +750,33 @@ static irqreturn_t pdc_interrupt (int ir
spin_lock(&host->lock);
 
mask &= 0xffff; /* only 16 tags possible */
-   if (!mask) {
+   if (mask == 0 && hotplug_status == 0) {
VPRINTK("QUICK EXIT 3\n");
goto done_irq;
}
 
writel(mask, mmio_base + PDC_INT_SEQMASK);
 
+   is_sataii_tx4 = pdc_is_sataii_tx4(host->ports[0]->flags);
+
for (i = 0; i < host->n_ports; i++) {
VPRINTK("port %u\n", i);
ap = host->ports[i];
+
+   /* check for a plug or unplug event */
+   ata_no = pdc_port_no_to_ata_no(i, is_sataii_tx4);
+   tmp = hotplug_status & (0x11 << ata_no);
+   if (tmp && ap &&
+   !(ap->flags & ATA_FLAG_DISABLED)) {
+   struct ata_eh_info *ehi = &ap->eh_info;
+   ata_ehi_clear_desc(ehi);
+   ata_ehi_hotplugged(ehi);
+   ata_ehi_push_desc(ehi, "hotplug_status %#x", tmp);
+   ata_port_freeze(ap);
+   continue;
+   }
+
+   /* check for a packet interrupt */
tmp = mask & (1 << (i + 1));
if (tmp && ap &&
!(ap->flags & ATA_FLAG_DISABLED)) {
@@ -902,9 +932,9 @@ static void pdc_host_init(struct ata_hos
tmp = readl(mmio + hotplug_offset);
writel(tmp | 0xff, mmio + hotplug_offset);
 
-   /* mask plug/unplug ints */
+   /* unmask plug/unplug ints */
tmp = readl(mmio + hotplug_offset);
-   writel(tmp | 0xff, mmio + hotplug_offset);
+   writel(tmp & ~0xff, mmio + hotplug_offset);
 
/* don't initialise TBG or SLEW on 2nd generation chips */
if (is_gen2)

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov

Bartlomiej Zolnierkiewicz wrote:

>> There are two distinct issues.
>> -- libata locks up in partition table read on an hpt366+old maxtor disk
>>    that has been working fine for many years with old ide driver. (It
>>    still works fine when I boot to the alternate ide-based kernel).
>>
>> -- ide driver locks up on hpt366+new maxtor disk under heavy
>>    i/o load. I was able to copy 60GB from old to new disk without a
>>    problem; however, raid reconstruction locks it up, maybe after 5-15
>>    seconds.
>>
>>    This probably is "hardware related"; it's something that the new
>>    hard drive does. Given that it's being sold at a big discount, it
>>    may even be that the sellers know that this is a crappy disk. :-)
>>
>>    All I want is some way of resetting the disk, and continuing on.

> It would be useful to see hdparm --Istdout output for *both* disks.

>> I'm stalled in debugging; I'm not sure what I'm looking for.

> Sergei, do you think that testing the drive with DMA disabled may
> tell us something new?

   Not sure. I'll try to come up with a patch resetting the state machine in 
dma_timeout() method (following Alan's idea) -- HPT366 regs are different 
enough to use the one for HPT370.

> Thanks,
> Bart

MBR, Sergei


SiI 3124 bus reset issues

2007-06-19 Thread Jeff Gustafson
Hi all,
I found an issue similar to the one I am having in the archive of this
list, but I never found a resolution to the problem there.
I have a 4-port SiI 3124 64-bit PCI-X card in a Dell PowerEdge 750.  It
is connected to four SATA hard drives in an external drive bay that
provides power and hotswap rails.
There doesn't seem to be a problem running three drives, but as soon as
I do something with a fourth drive all hell breaks loose.  The problem
doesn't seem localized to a slot in the drive bay or a particular drive.
The issue seems to only happen when I try to access a fourth drive.
I have also tried the card in another system (in a 32-bit slot) and I
experience the same problem.

/proc/interrupts
           CPU0       CPU1
  0:        250          1   IO-APIC-edge      timer
  1:          8          0   IO-APIC-edge      i8042
  6:          2          0   IO-APIC-edge      floppy
  8:          1          0   IO-APIC-edge      rtc
  9:          1          0   IO-APIC-fasteoi   acpi
 12:          4          0   IO-APIC-edge      i8042
 14:       4002          0   IO-APIC-edge      libata
 15:       5192          0   IO-APIC-edge      libata
 16:          0          0   IO-APIC-fasteoi   uhci_hcd:usb1
 17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
 18:          0          0   IO-APIC-fasteoi   ehci_hcd:usb3
 20:          0          0   IO-APIC-fasteoi   libata
 21:          0          0   IO-APIC-fasteoi   __tmp1014268147
 22:       8737          0   IO-APIC-fasteoi   eth0
 23:       1434          0   IO-APIC-fasteoi   eth1
NMI:          0          0
LOC:      45044      36735
ERR:          0
MIS:          0


Jun 12 13:10:06 silo2 kernel: ata3: waiting for device to spin up (8 secs)
Jun 12 13:10:15 silo2 kernel: ata3: soft resetting port
Jun 12 13:10:15 silo2 kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jun 12 13:10:15 silo2 kernel: ata3.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
Jun 12 13:10:15 silo2 kernel: ata3.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
Jun 12 13:10:15 silo2 kernel: ata3.00: configured for UDMA/100
Jun 12 13:10:15 silo2 kernel: ata3: EH complete
Jun 12 13:10:15 silo2 kernel: SCSI device sde: 781422768 512-byte hdwr sectors (400088 MB)
Jun 12 13:10:15 silo2 kernel: sde: Write Protect is off
Jun 12 13:10:15 silo2 kernel: SCSI device sde: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jun 12 13:10:15 silo2 kernel: ata3.00: exception Emask 0x10 SAct 0x9ff SErr 0x8 action 0x2 frozen
Jun 12 13:10:15 silo2 kernel: ata3.00: (irq_stat 0x01100010, PHY RDY changed)
Jun 12 13:10:15 silo2 kernel: ata3.00: cmd 61/08:00:00:1c:00/00:00:00:00:00/40 tag 0 cdb 0x0 data 4096 out
Jun 12 13:10:15 silo2 kernel:  res 50/00:00:af:90:93/00:00:2e:00:00/e0 Emask 0x10 (ATA bus error)
Jun 12 13:10:15 silo2 kernel: ata3.00: cmd 61/e0:08:08:1c:00/00:00:00:00:00/40 tag 1 cdb 0x0 data 114688 out
Jun 12 13:10:15 silo2 kernel:  res 50/00:00:af:90:93/00:00:2e:00:00/e0 Emask 0x10 (ATA bus error)
Jun 12 13:10:15 silo2 kernel: ata3.00: cmd 61/c8:10:38:19:00/00:00:00:00:00/40 tag 2 cdb 0x0 data 102400 out
Jun 12 13:10:15 silo2 kernel:  res 50/00:00:af:90:93/00:00:2e:00:00/e0 Emask 0x10 (ATA bus error)
Jun 12 13:10:15 silo2 kernel: ata3.00: cmd 61/10:18:e8:1c:00/00:00:00:00:00/40 tag 3 cdb 0x0 data 8192 out
Jun 12 13:10:15 silo2 kernel:  res 50/00:00:af:90:93/00:00:2e:00:00/e0 Emask 0x10 (ATA bus error)
Jun 12 13:10:15 silo2 kernel: ata3.00: cmd 61/20:20:f8:1c:00/00:00:00:00:00/40 tag 4 cdb 0x0 data 16384 out
Jun 12 13:10:15 silo2 kernel:  res 50/00:00:af:90:93/00:00:2e:00:00/e0 Emask 0x10 (ATA bus error)
Jun 12 13:10:15 silo2 kernel: ata3.00: cmd 61/c8:28:00:1a:00/00:00:00:00:00/40 tag 5 cdb 0x0 data 102400 out
Jun 12 13:10:15 silo2 kernel:  res 50/00:00:af:90:93/00:00:2e:00:00/e0 Emask 0x10 (ATA bus error)
Jun 12 13:10:15 silo2 kernel: ata3.00: cmd 61/38:30:c8:1a:00/00:00:00:00:00/40 tag 6 cdb 0x0 data 28672 out
Jun 12 13:10:15 silo2 kernel:  res 50/00:00:af:90:93/00:00:2e:00:00/e0 Emask 0x10 (ATA bus error)
Jun 12 13:10:15 silo2 kernel: ata3.00: cmd 61/00:38:00:1b:00/01:00:00:00:00/40 tag 7 cdb 0x0 data 131072 out
Jun 12 13:10:15 silo2 kernel:  res 50/00:00:af:90:93/00:00:2e:00:00/e0 Emask 0x10 (ATA bus error)
Jun 12 13:10:15 silo2 kernel: ata3.00: cmd 61/48:40:18:1d:00/00:00:00:00:00/40 tag 8 cdb 0x0 data 36864 out
Jun 12 13:10:15 silo2 kernel:  res 50/00:00:af:90:93/00:00:2e:00:00/e0 Emask 0x10 (ATA bus error)
Jun 12 13:10:15 silo2 kernel: ata3.00: cmd 61/08:58:30:19:00/00:00:00:00:00/40 tag 11 cdb 0x0 data 4096 out
Jun 12 13:10:15 silo2 kernel:  res 50/00:00:af:90:93/00:00:2e:00:00/e0 Emask 0x10 (ATA bus error)
Jun 12 13:10:15 silo2 kernel: ata3: waiting for device to spin up (8 secs)
Jun 12 13:10:24 silo2 kernel: ata3: soft resetting port
Jun 12 13:10:24 silo2 kernel: ata3: SATA link up 1.5 

Re: sata_promise error handling

2007-06-19 Thread Mark
> I'm making a backup of some disks with dd (disk image) and sata_promise
> reported these errors:
> 
> --
> ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> ata8.00: cmd c8/00:08:40:83:7a/00:00:00:00:00/e4 tag 0 cdb 0x0 data 4096 in
>  res 50/00:00:47:83:7a/00:00:00:00:00/e4 Emask 0x1 (device error)
> ata8.00: configured for UDMA/133
> ata8: EH complete
> sd 7:0:0:0: [sdh] 488397168 512-byte hardware sectors (250059 MB)
> sd 7:0:0:0: [sdh] Write Protect is off
> sd 7:0:0:0: [sdh] Mode Sense: 00 3a 00 00
> sd 7:0:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> ata5.00: cmd 25/00:08:e8:af:02/00:00:10:00:00/e0 tag 0 cdb 0x0 data 4096 in
>  res 50/00:00:ef:af:02/00:00:10:00:00/e0 Emask 0x1 (device error)
> ata5.00: configured for UDMA/133
> ata5: EH complete
> sd 4:0:0:0: [sde] 488397168 512-byte hardware sectors (250059 MB)
> sd 4:0:0:0: [sde] Write Protect is off
> sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
> sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> --
> 
> Can I ignore them (EH handled them) or do I have to worry that the dd images
> are corrupted (don't wanna make an md5sum of a 250gb disk and image)?
> 
> Theo


I get the same errors with my Promise SATA300 TX4 with 3 Samsung SATA II 
drives. I've been playing with it for a few days and I couldn't get it to work 
reliably in anything but 2.6.22-rc5.

Now that I use 2.6.22-rc5, I get those errors for a few minutes, then the driver 
hard resets and locks at 1.5 Gb/s. It seems reliable after that, but I do get 
those errors every now and then under heavy I/O.

I read in another thread that, according to a developer, those errors are 
handled and no loss of data occurs; they make me uneasy though.

After a fresh reboot, I get these errors for a while:

Jun 19 13:28:15 phoenix ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Jun 19 13:28:15 phoenix ata2.00: (port_status 0x2008)
Jun 19 13:28:15 phoenix ata2.00: cmd c8/00:00:3f:7a:08/00:00:00:00:00/e0 tag 0 cdb 0x0 data 131072 in
Jun 19 13:28:15 phoenix res 50/00:00:3e:7b:08/00:00:00:00:00/e0 Emask 0x2 (HSM violation)
Jun 19 13:28:15 phoenix ata2: soft resetting port
Jun 19 13:28:15 phoenix ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jun 19 13:28:15 phoenix ata2.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
Jun 19 13:28:15 phoenix ata2.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
Jun 19 13:28:15 phoenix ata2.00: configured for UDMA/133
Jun 19 13:28:15 phoenix ata2: EH complete
Jun 19 13:28:15 phoenix sd 1:0:0:0: [sdb] 781422768 512-byte hardware sectors (400088 MB)
Jun 19 13:28:15 phoenix sd 1:0:0:0: [sdb] Write Protect is off
Jun 19 13:28:15 phoenix sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Jun 19 13:28:15 phoenix sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jun 19 13:28:16 phoenix ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Jun 19 13:28:16 phoenix ata2.00: (port_status 0x2008)
Jun 19 13:28:16 phoenix ata2.00: cmd c8/00:48:bf:68:09/00:00:00:00:00/e0 tag 0 cdb 0x0 data 36864 in
Jun 19 13:28:16 phoenix res 50/00:00:06:69:09/00:00:00:00:00/e0 Emask 0x2 (HSM violation)
Jun 19 13:28:17 phoenix ata2: soft resetting port
Jun 19 13:28:17 phoenix ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jun 19 13:28:17 phoenix ata2.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
Jun 19 13:28:17 phoenix ata2.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
Jun 19 13:28:17 phoenix ata2.00: configured for UDMA/133
Jun 19 13:28:17 phoenix ata2: EH complete
Jun 19 13:28:17 phoenix sd 1:0:0:0: [sdb] 781422768 512-byte hardware sectors (400088 MB)
Jun 19 13:28:17 phoenix sd 1:0:0:0: [sdb] Write Protect is off
Jun 19 13:28:17 phoenix sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Jun 19 13:28:17 phoenix sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jun 19 13:28:20 phoenix ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Jun 19 13:28:20 phoenix ata2.00: (port_status 0x2008)
Jun 19 13:28:20 phoenix ata2.00: cmd c8/00:58:67:a2:0c/00:00:00:00:00/e0 tag 0 cdb 0x0 data 45056 in
Jun 19 13:28:20 phoenix res 50/00:00:be:a2:0c/00:00:00:00:00/e0 Emask 0x2 (HSM violation)
Jun 19 13:28:20 phoenix ata2: soft resetting port
Jun 19 13:28:21 phoenix ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jun 19 13:28:21 phoenix ata2.0

Random freezes with libata + ICH7 PATA

2007-06-19 Thread Patrick Nagel
Hello,

I tried libata with my ICH7 PATA controller for some (five? seven?) days. 
During those days I experienced four random system freezes. I guess this 
message is not all that helpful - I have no idea what information I should 
have collected so it could be used for debugging. But at least now you know 
there is a problem at all. ;)
Since switching back to the old IDE driver, the system has been stable again 
for three days or so, so it's not a sudden coincidental hardware problem.

Some details about my system (a Samsung R65 laptop, custom harddisk):

$ lspci | grep IDE
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 02)

$ dmesg | grep "hd[ab]: "
hda: HTS721010G9AT00, ATA DISK drive
hdb: DV-W28EA, ATAPI CD/DVD-ROM drive
hda: max request size: 512KiB
hda: 195371568 sectors (100030 MB) w/7539KiB Cache, CHS=16383/255/63, UDMA(100)
hda: cache flushes supported
 hda: hda1 hda2 hda3
hdb: ATAPI 24X DVD-ROM DVD-R-RAM CD-R/RW drive, 1419kB Cache, UDMA(33)

Kernel versions I tried (the problem appeared in both):
- 2.6.21 with CK patchset
- 2.6.21.5 with Gentoo patchset (http://dev.gentoo.org/~dsd/genpatches) 
(gentoo-sources-2.6.21-r3 package in Gentoo's repository)

There was no particularly heavy IDE load during the freezes.

If I can be of help with more information, please tell me what you need.

Patrick.

-- 
Key ID: 0x86E346D4    http://patrick-nagel.net/key.asc
Fingerprint: 7745 E1BE FA8B FBAD 76AB 2BFC C981 E686 86E3 46D4

