Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-29 Thread Sergei Shtylyov
Hello, I wrote: I've got a hard lockup in the ide subsystem, probably due to some irq spew or something like that. I've just bought a brand new Maxtor 320GB disk drive for the insane price of $70 US to replace another failing drive. It works well under light load; I was able to copy about 6

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-22 Thread Sergei Shtylyov
Linas Vepstas wrote: I've got a hard lockup in the ide subsystem, probably due to some irq spew or something like that. I've just bought a brand new Maxtor 320GB disk drive for the insane price of $70 US to replace another failing drive. It works well under light load; I was able to copy ab

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-21 Thread Alan Cox
> sd 0:0:0:0 [sda] Done: 0xeff3aba0 TIMEOUT
> sd 0:0:0:0 [sda] Result: host_byte=DID_OK driver_byte=DRV_OK, SUG_OK
> sd 0:0:0:0 [sda] CDB: Read(10): 28 00 00 ... 00 08 00
> sd 0:0:0:0 [sda] scsi host busy 1 failed 0
> ata_scsi_timed_out: ENTER
> ata_scsi_timed_out: EXIT, ret=0
> ata_port_flush_task

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-21 Thread Alan Cox
> > queue ? You are overestimating IDE ;) > > He's not -- there is queued commands support since ATA[PI]-5. I'm not > sure > why but Linux decided not to support it. Almost no hardware supports it and the functionality is really really ugly to use when it works at all - NCQ is rather more e

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-21 Thread Linas Vepstas
On Wed, Jun 20, 2007 at 06:01:23PM +0100, Alan Cox wrote: > > It's unlikely the command got lost. The IRQ could have done but the error > path tries to spot that case by reading the status register - which > hangs. So in theory it could be a lost IRQ and if the reset works we'll > find that out. O

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-21 Thread Sergei Shtylyov
Hello. Alan Cox wrote: Google seems to show that there are no publicly available firmware updates for Maxtor disks. There are for some but only if you irritate the tech support people. hours at high cpu usage There were maybe a dozen DriveReady SeekComplete Timeout errors clustered

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-20 Thread Alan Cox
> Google seems to show that there are no publicly available > firmware updates for Maxtor disks. There are for some but only if you irritate the tech support people. > hours at high cpu usage There were maybe a dozen DriveReady > SeekComplete Timeout errors clustered a few minutes apart.

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-20 Thread Linas Vepstas
On Wed, Jun 20, 2007 at 12:07:19AM +0400, Sergei Shtylyov wrote: > Bartlomiej Zolnierkiewicz wrote: > > [...firmware...] Google seems to show that there are no publicly available firmware updates for Maxtor disks. > >It would be useful to see hdparm --Istdout output for *both* disks. Let's do o

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov
Bartlomiej Zolnierkiewicz wrote: There are two distinct issues. -- libata locks up in partition table read on an hpt366+old maxtor disk that has been working fine for many years with old ide driver. (It still works fine when I boot to the alternate ide-based kernel). -- ide driver locks up

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Bartlomiej Zolnierkiewicz
Hi, On Tuesday 19 June 2007, Linas Vepstas wrote: > On Tue, Jun 19, 2007 at 08:10:25PM +0400, Sergei Shtylyov wrote: > > > > >I'm thinking that trying to debug libata is a better idea, rather than > > >investing time in ide, right? Although at the moment, libata works even > > >less; see other

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Linas Vepstas
On Tue, Jun 19, 2007 at 08:10:25PM +0400, Sergei Shtylyov wrote: > > >I'm thinking that trying to debug libata is a better idea, rather than > >investing time in ide, right? Although at the moment, libata works even > >less; see other email. > >Which makes me think this really is some *hard

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov
Alan Cox wrote: Indeed... but the thing is we don't know what's asserted in this case -- remember, it's reading the status register that locks everything up... Exactly. And IORDY shouldn't really apply there, unless some nitwit standards person wrote it into a spec.. Could it be we need

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Alan Cox
> >Indeed... but the thing is we don't know what's asserted in this case > > -- remember, it's reading the status register that locks everything up... > > Exactly. And IORDY shouldn't really apply there, > unless some nitwit standards person wrote it into a spec.. Could it be we need to res

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov
Hello. Linas Vepstas wrote: [EMAIL PROTECTED] wrote: I think reading the IDE status register clears the interrupt in the IDE device, which might be causing the drive to think it's OK to generate another interrupt. This is not how IDE drives are supposed to act -- they won't proceed any

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov
Mark Lord wrote: I can prepare a patch, but only with a lot of guidance. I can test & debug, I'm highly motivated just right now ... If you've got a nice repeatable problem please try using the libata driver. That handles the error paths differently and doesn't try a FIFO drain which might

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Mark Lord
Sergei Shtylyov wrote: Alan Cox wrote: I can prepare a patch, but only with a lot of guidance. I can test & debug, I'm highly motivated just right now ... If you've got a nice repeatable problem please try using the libata driver. That handles the error paths differently and doesn't try a F

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Linas Vepstas
Hi Sergei, On Tue, Jun 19, 2007 at 06:07:07PM +0400, Sergei Shtylyov wrote: > > [EMAIL PROTECTED] wrote: > >I think reading the IDE status register clears the interrupt in the IDE > >device, which might be causing the drive to think it's OK to generate > >another interrupt. > >This is not ho

bug in libata [was Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Linas Vepstas
On Mon, Jun 18, 2007 at 04:22:38PM -0500, linas wrote: > On Mon, Jun 18, 2007 at 10:04:41PM +0100, Alan Cox wrote: > > please try using the libata > > driver. It's worse. I get a hard hang (sysrq doesn't work) during boot, just when the system goes to read the partition table. Recap: this is an

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov
Alan Cox wrote: I can prepare a patch, but only with a lot of guidance. I can test & debug, I'm highly motivated just right now ... If you've got a nice repeatable problem please try using the libata driver. That handles the error paths differently and doesn't try a FIFO drain which might ma

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Alan Cox
On Tue, 19 Jun 2007 18:10:04 +0400 Sergei Shtylyov <[EMAIL PROTECTED]> wrote: > Hello. > > Alan Cox wrote: > >>I can prepare a patch, but only with a lot of guidance. I can test > >>& debug, I'm highly motivated just right now ... > > > If you've got a nice repeatable problem please try using

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov
Hello. Alan Cox wrote: I can prepare a patch, but only with a lot of guidance. I can test & debug, I'm highly motivated just right now ... If you've got a nice repeatable problem please try using the libata driver. That handles the error paths differently and doesn't try a FIFO drain which m

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-19 Thread Sergei Shtylyov
as Sent: Monday, June 18, 2007 12:57 PM To: [EMAIL PROTECTED]; linux-kernel@vger.kernel.org Subject: [BUG] ide dma_timer_expiry, then hard lockup I've got a hard lockup in the ide subsystem, probably due to some irq spew or something like that. I've just bought a brand new Maxtor 320GB d

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-18 Thread Linas Vepstas
On Mon, Jun 18, 2007 at 10:04:41PM +0100, Alan Cox wrote: > > If you've got a nice repeatable problem Very highly repeatable :-( > please try using the libata > driver. That handles the error paths differently and doesn't try a FIFO > drain which might matter in this case I guess. Dohh, yes, o

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-18 Thread Alan Cox
> So what do you suggest? (I could buy an alternate ide controller, > and hope that goes away, or just buy a different hard drive. But > that's beside the point). The DMA timeout itself could be all sorts of things - crap driver, crap hardware, PCI bus contention, noise, problem disk, phase of the

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-18 Thread Linas Vepstas
On Mon, Jun 18, 2007 at 09:27:04PM +0100, Alan Cox wrote: > > ide_dma_timeout_retry() in ide-io.c > > prints the "hdc: DMA Timeout error" then calls > > HWIF(drive)->ide_dma_end(drive); > > which returns, and then calls > > hwif->INB(IDE_STATUS_REG) which is needed as an argument to ide

Re: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-18 Thread Alan Cox
> ide_dma_timeout_retry() in ide-io.c > prints the "hdc: DMA Timeout error" then calls > HWIF(drive)->ide_dma_end(drive); > which returns, and then calls > hwif->INB(IDE_STATUS_REG) which is needed as an argument to ide_error() > > But this hangs! -- The INB never returns. > Now: hwif

RE: [BUG] ide dma_timer_expiry, then hard lockup

2007-06-18 Thread Stuart_Hayes
y the system is hanging. Stuart -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Linas Vepstas Sent: Monday, June 18, 2007 12:57 PM To: [EMAIL PROTECTED]; linux-kernel@vger.kernel.org Subject: [BUG] ide dma_timer_expiry, then hard lockup I've got a ha

[BUG] ide dma_timer_expiry, then hard lockup

2007-06-18 Thread Linas Vepstas
I've got a hard lockup in the ide subsystem, probably due to some irq spew or something like that. I've just bought a brand new Maxtor 320GB disk drive for the insane price of $70 US to replace another failing drive. It works well under light load; I was able to copy about 60GB to it. However