Re: SATA exceptions

2007-07-13 Thread S.Çağlar Onur
On Fri, 13 Jul 2007, Tejun Heo wrote: > >> OS and driver can't really do much about the reallocation event. Some > >> number of reallocations is okay but if you see it going up constantly, you > >> probably have a dying disk. > > > > Hmm... cut the power while writing is doable from

Re: SATA exceptions

2007-07-13 Thread S.Çağlar Onur
On Fri, 13 Jul 2007, Tejun Heo wrote: OS and driver can't really do much about the reallocation event. Some number of reallocations is okay but if you see it going up constantly, you probably have a dying disk. Hmm... cut the power while writing is doable from OS and might

Re: SATA exceptions

2007-07-12 Thread Tejun Heo
Pavel Machek wrote: Your SMART log shows 309 reallocated sectors. That seems somewhat high.. >>> Ah sorry to misinterpret the content:), it's a quite new piece of hardware >>> (at >>> most ~1.5 months old) and "Reallocated_Event_Count" constantly increases >>> (currently it's increased to

Re: SATA exceptions

2007-07-12 Thread Pavel Machek
Hi! > >> Your SMART log shows 309 reallocated sectors. That seems somewhat high.. > > > > Ah sorry to misinterpret the content:), it's a quite new piece of hardware > > (at > > most ~1.5 months old) and "Reallocated_Event_Count" constantly increases > > (currently it's increased to 313) and

Re: SATA exceptions

2007-07-12 Thread Pavel Machek
Hi! Your SMART log shows 309 reallocated sectors. That seems somewhat high.. Ah sorry to misinterpret the content:), it's a quite new piece of hardware (at most ~1.5 months old) and Reallocated_Event_Count constantly increases (currently it's increased to 313) and although I'm not

Re: SATA exceptions

2007-07-12 Thread Tejun Heo
Pavel Machek wrote: Your SMART log shows 309 reallocated sectors. That seems somewhat high.. Ah sorry to misinterpret the content:), it's a quite new piece of hardware (at most ~1.5 months old) and Reallocated_Event_Count constantly increases (currently it's increased to 313) and although

Re: SATA exceptions

2007-07-11 Thread Tejun Heo
Mark Lord wrote: > I'm not even sure how to interpret those numbers. > It seems rather odd that nearly all fields are either "100" or "253", > so those are probably pre-programmed numbers rather than actual counts. > The raw value at the end of the line (for the various "Reallocated*" > fields) >

Re: SATA exceptions

2007-07-11 Thread Mark Lord
S.Çağlar Onur wrote: Hi; On Sat, 07 Jul 2007, Robert Hancock wrote: It's not the free space on the drive that matters, it's the number of free sectors in the spare sector pool on the drive, which is invisible to software. Your SMART log shows 309 reallocated sectors.

Re: SATA exceptions

2007-07-11 Thread Bill Davidsen
Tejun Heo wrote: Hello, S.Çağlar Onur wrote: On Sat, 07 Jul 2007, Robert Hancock wrote: It's not the free space on the drive that matters, it's the number of free sectors in the spare sector pool on the drive, which is invisible to software. Your SMART log shows 309

Re: SATA exceptions

2007-07-11 Thread Tejun Heo
Mark Lord wrote: I'm not even sure how to interpret those numbers. It seems rather odd that nearly all fields are either 100 or 253, so those are probably pre-programmed numbers rather than actual counts. The raw value at the end of the line (for the various Reallocated* fields) is probably

Re: SATA exceptions

2007-07-09 Thread S.Çağlar Onur
Hi; On Mon, 09 Jul 2007, Tejun Heo wrote: > > On Sat, 07 Jul 2007, Robert Hancock wrote: > >> It's not the free space on the drive that matters, it's the number of > >> free sectors in the spare sector pool on the drive, which is invisible > >> to software. > >>

Re: SATA exceptions

2007-07-09 Thread Tejun Heo
Hello, S.Çağlar Onur wrote: > On Sat, 07 Jul 2007, Robert Hancock wrote: >> It's not the free space on the drive that matters, it's the number of >> free sectors in the spare sector pool on the drive, which is invisible >> to software. >> >> Your SMART log shows 309 reallocated

Re: SATA exceptions

2007-07-09 Thread Tejun Heo
Hello, S.Çağlar Onur wrote: On Sat, 07 Jul 2007, Robert Hancock wrote: It's not the free space on the drive that matters, it's the number of free sectors in the spare sector pool on the drive, which is invisible to software. Your SMART log shows 309 reallocated sectors.

Re: SATA exceptions

2007-07-09 Thread S.Çağlar Onur
Hi; On Mon, 09 Jul 2007, Tejun Heo wrote: On Sat, 07 Jul 2007, Robert Hancock wrote: It's not the free space on the drive that matters, it's the number of free sectors in the spare sector pool on the drive, which is invisible to software. Your SMART

Re: SATA exceptions

2007-07-07 Thread S.Çağlar Onur
Hi; On Sat, 07 Jul 2007, Robert Hancock wrote: > It's not the free space on the drive that matters, it's the number of > free sectors in the spare sector pool on the drive, which is invisible > to software. > > Your SMART log shows 309 reallocated sectors. That seems somewhat

Re: SATA exceptions

2007-07-07 Thread Robert Hancock
S.Çağlar Onur wrote: On Fri, 06 Jul 2007, Tejun Heo wrote: S.Çağlar Onur wrote: [ 4260.278427] ata1.00: cmd ca/00:08:d0:88:bc/00:00:00:00:00/ee tag 0 cdb 0x0 data 4096 out [ 4260.278430] res 51/40:01:d7:88:bc/00:00:0e:00:00/ee Emask 0x9 (media error) That's media

Re: SATA exceptions

2007-07-07 Thread S.Çağlar Onur
Hi; On Sat, 07 Jul 2007, Robert Hancock wrote: It's not the free space on the drive that matters, it's the number of free sectors in the spare sector pool on the drive, which is invisible to software. Your SMART log shows 309 reallocated sectors. That seems somewhat high..

Re: SATA exceptions

2007-07-06 Thread S.Çağlar Onur
Hi; On Fri, 06 Jul 2007, Tejun Heo wrote: > S.Çağlar Onur wrote: > > [ 4260.278427] ata1.00: cmd ca/00:08:d0:88:bc/00:00:00:00:00/ee tag 0 cdb > > 0x0 data 4096 out > > [ 4260.278430] res 51/40:01:d7:88:bc/00:00:0e:00:00/ee Emask 0x9 > > (media error) > > That's media

Re: SATA exceptions

2007-07-06 Thread S.Çağlar Onur
Hi; On Fri, 06 Jul 2007, Tejun Heo wrote: S.Çağlar Onur wrote: [ 4260.278427] ata1.00: cmd ca/00:08:d0:88:bc/00:00:00:00:00/ee tag 0 cdb 0x0 data 4096 out [ 4260.278430] res 51/40:01:d7:88:bc/00:00:0e:00:00/ee Emask 0x9 (media error) That's media error on

Re: SATA exceptions

2007-07-05 Thread Tejun Heo
Hello, S.Çağlar Onur wrote: > [ 4260.278427] ata1.00: cmd ca/00:08:d0:88:bc/00:00:00:00:00/ee tag 0 cdb 0x0 > data 4096 out > [ 4260.278430] res 51/40:01:d7:88:bc/00:00:0e:00:00/ee Emask 0x9 > (media error) That's media error on sector 247236823 on WRITE. Media errors on write are

Re: SATA exceptions

2007-07-05 Thread Tejun Heo
Hello, S.Çağlar Onur wrote: [ 4260.278427] ata1.00: cmd ca/00:08:d0:88:bc/00:00:00:00:00/ee tag 0 cdb 0x0 data 4096 out [ 4260.278430] res 51/40:01:d7:88:bc/00:00:0e:00:00/ee Emask 0x9 (media error) That's media error on sector 247236823 on WRITE. Media errors on write are bad
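The sector number quoted in this reply can be recovered from the `res` taskfile in the log. The sketch below assumes the usual libata layout for these strings (status or command, then feature/error, sector count and the three low LBA bytes, then the five "hob" bytes, then the device register) and assumes an LBA28 command, where bits 24-27 of the address come from the low nibble of the device register. The function names are illustrative only and are not part of any kernel or tool.

```python
def parse_taskfile(tf: str) -> dict:
    """Split a libata-style taskfile string into named byte fields.

    Assumed layout: XX/FF:NS:LL:LM:LH/hF:hN:hL:hM:hH/DV
    where XX is the command (cmd line) or status (res line).
    """
    first, cur, hob, device = tf.split("/")
    feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in cur.split(":"))
    hob_bytes = [int(b, 16) for b in hob.split(":")]
    return {
        "first": int(first, 16),
        "feature": feature,   # error register on a 'res' line
        "nsect": nsect,
        "lbal": lbal,
        "lbam": lbam,
        "lbah": lbah,
        "hob": hob_bytes,
        "device": int(device, 16),
    }


def lba28(tf: str) -> int:
    """Return the 28-bit LBA encoded in a taskfile string (LBA28 commands only)."""
    f = parse_taskfile(tf)
    return ((f["device"] & 0x0F) << 24) | (f["lbah"] << 16) | (f["lbam"] << 8) | f["lbal"]


if __name__ == "__main__":
    start = lba28("ca/00:08:d0:88:bc/00:00:00:00:00/ee")   # command: first sector of the write
    failed = lba28("51/40:01:d7:88:bc/00:00:0e:00:00/ee")  # result: sector the drive reported
    print(start, failed)
```

With the two strings from the log, the result taskfile decodes to 247236823, the sector named in the reply, which is seven sectors past the start of the 8-sector (4096-byte) write.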

Re: SATA exceptions with 2.6.20-rc5

2007-02-09 Thread Björn Steinbrink
On 2007.02.04 02:13:51 +0100, Björn Steinbrink wrote: > On 2007.02.02 23:48:14 -0600, Robert Hancock wrote: > > There's a patch in -mm (sata_nv-use-adma-for-nodata-commands.patch) > > which should hopefully avoid this problem for the cache flush commands, > > at least - can you try that one out?

Re: SATA exceptions with 2.6.20-rc5

2007-02-09 Thread Björn Steinbrink
On 2007.02.04 02:13:51 +0100, Björn Steinbrink wrote: On 2007.02.02 23:48:14 -0600, Robert Hancock wrote: There's a patch in -mm (sata_nv-use-adma-for-nodata-commands.patch) which should hopefully avoid this problem for the cache flush commands, at least - can you try that one out? You'll

Re: SATA exceptions with 2.6.20-rc5

2007-02-03 Thread Björn Steinbrink
On 2007.02.02 23:48:14 -0600, Robert Hancock wrote: > Björn Steinbrink wrote: > >On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote: > >>On 2007.01.23 17:18:43 -0600, Robert Hancock wrote: > >>>Larry Walton wrote: > The last patch (sata_nv-force-int-dev-in-interrupt.patch) > seems to

Re: SATA exceptions with 2.6.20-rc5

2007-02-03 Thread Björn Steinbrink
On 2007.02.02 23:48:14 -0600, Robert Hancock wrote: Björn Steinbrink wrote: On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote: On 2007.01.23 17:18:43 -0600, Robert Hancock wrote: Larry Walton wrote: The last patch (sata_nv-force-int-dev-in-interrupt.patch) seems to have fixed the problem.

Re: SATA exceptions with 2.6.20-rc5

2007-02-02 Thread Robert Hancock
Björn Steinbrink wrote: On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote: On 2007.01.23 17:18:43 -0600, Robert Hancock wrote: Larry Walton wrote: The last patch (sata_nv-force-int-dev-in-interrupt.patch) seems to have fixed the problem. Much appreciated, thank you. I'd consider it a must

Re: SATA exceptions with 2.6.20-rc5

2007-02-02 Thread Björn Steinbrink
On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote: > On 2007.01.23 17:18:43 -0600, Robert Hancock wrote: > > Larry Walton wrote: > > >The last patch (sata_nv-force-int-dev-in-interrupt.patch) > > >seems to have fixed the problem. Much appreciated, > > >thank you. I'd consider it a must have in

Re: SATA exceptions with 2.6.20-rc5

2007-02-02 Thread Björn Steinbrink
On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote: On 2007.01.23 17:18:43 -0600, Robert Hancock wrote: Larry Walton wrote: The last patch (sata_nv-force-int-dev-in-interrupt.patch) seems to have fixed the problem. Much appreciated, thank you. I'd consider it a must have in 2.6.20.

Re: SATA exceptions with 2.6.20-rc5

2007-01-24 Thread Björn Steinbrink
On 2007.01.24 09:24:00 +0100, Ian Kumlien wrote: > On tis, 2007-01-23 at 17:18 -0600, Robert Hancock wrote: > > Larry Walton wrote: > > > The last patch (sata_nv-force-int-dev-in-interrupt.patch) > > > seems to have fixed the problem. Much appreciated, > > > thank you. I'd consider it a must have

Re: SATA exceptions with 2.6.20-rc5

2007-01-24 Thread Ian Kumlien
On tis, 2007-01-23 at 17:18 -0600, Robert Hancock wrote: > Larry Walton wrote: > > The last patch (sata_nv-force-int-dev-in-interrupt.patch) > > seems to have fixed the problem. Much appreciated, > > thank you. I'd consider it a must have in 2.6.20. > > Can any of the rest of you that have been

Re: SATA exceptions with 2.6.20-rc5

2007-01-24 Thread Ian Kumlien
On tis, 2007-01-23 at 17:18 -0600, Robert Hancock wrote: Larry Walton wrote: The last patch (sata_nv-force-int-dev-in-interrupt.patch) seems to have fixed the problem. Much appreciated, thank you. I'd consider it a must have in 2.6.20. Can any of the rest of you that have been seeing

Re: SATA exceptions with 2.6.20-rc5

2007-01-24 Thread Björn Steinbrink
On 2007.01.24 09:24:00 +0100, Ian Kumlien wrote: On tis, 2007-01-23 at 17:18 -0600, Robert Hancock wrote: Larry Walton wrote: The last patch (sata_nv-force-int-dev-in-interrupt.patch) seems to have fixed the problem. Much appreciated, thank you. I'd consider it a must have in 2.6.20.

Re: SATA exceptions with 2.6.20-rc5

2007-01-23 Thread Björn Steinbrink
On 2007.01.23 17:18:43 -0600, Robert Hancock wrote: > Larry Walton wrote: > >The last patch (sata_nv-force-int-dev-in-interrupt.patch) > >seems to have fixed the problem. Much appreciated, > >thank you. I'd consider it a must have in 2.6.20. > > Can any of the rest of you that have been seeing

Re: SATA exceptions with 2.6.20-rc5

2007-01-23 Thread Robert Hancock
Larry Walton wrote: The last patch (sata_nv-force-int-dev-in-interrupt.patch) seems to have fixed the problem. Much appreciated, thank you. I'd consider it a must have in 2.6.20. Can any of the rest of you that have been seeing this problem also confirm that this fixes it? -- Robert Hancock

Re: SATA exceptions with 2.6.20-rc5

2007-01-23 Thread Larry Walton
The last patch (sata_nv-force-int-dev-in-interrupt.patch) seems to have fixed the problem. Much appreciated, thank you. I'd consider it a must have in 2.6.20. -- *--* Mail: [EMAIL PROTECTED] *--* Voice: 206.892.6269 *--* Cell: 206.225.0154 *--* HTTP://real.com

Re: SATA exceptions with 2.6.20-rc5

2007-01-23 Thread Björn Steinbrink
On 2007.01.23 17:18:43 -0600, Robert Hancock wrote: Larry Walton wrote: The last patch (sata_nv-force-int-dev-in-interrupt.patch) seems to have fixed the problem. Much appreciated, thank you. I'd consider it a must have in 2.6.20. Can any of the rest of you that have been seeing this

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Robert Hancock
Björn Steinbrink wrote: Hm, I don't think it is unhappy about looking at NV_INT_STATUS_CK804. I'm running 2.6.20-rc5 with the INT_DEV check removed for 8 hours now without a single problem and that should still look at NV_INT_STATUS_CK804, right? I just noticed that my last email might not have

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Björn Steinbrink
On 2007.01.22 19:24:22 -0600, Robert Hancock wrote: > Björn Steinbrink wrote: > >>>Running a kernel with the return statement replaced by a line that prints > >>>the irq_stat instead. > >>> > >>>Currently I'm seeing lots of 0x10 on ata1 and 0x0 on ata2. > >>40 minutes stress test now and no

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Robert Hancock
Alistair John Strachan wrote: On Tuesday 23 January 2007 01:24, Robert Hancock wrote: As a final aside, this is another case where the hardware docs for this controller would really be useful, in order to know whether we are actually supposed to be reading that register in ADMA mode or not. I

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Alistair John Strachan
On Tuesday 23 January 2007 01:24, Robert Hancock wrote: > As a final aside, this is another case where the hardware docs for this > controller would really be useful, in order to know whether we are > actually supposed to be reading that register in ADMA mode or not. I > sent a query to Allen

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Robert Hancock
Björn Steinbrink wrote: Running a kernel with the return statement replaced by a line that prints the irq_stat instead. Currently I'm seeing lots of 0x10 on ata1 and 0x0 on ata2. 40 minutes stress test now and no exception yet. What's interesting is that ata1 saw exactly one interrupt with

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Eric D. Mudama
On 1/15/07, Jeff Garzik <[EMAIL PROTECTED]> wrote: Jens Axboe wrote: > On Mon, Jan 15 2007, Jeff Garzik wrote: >> Jens Axboe wrote: >>> I'd be surprised if the device would not obey the 7 second timeout rule >>> that seems to be set in stone and not allow more dirty in-drive cache >>> than it

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Björn Steinbrink
On 2007.01.22 17:57:08 +0100, Björn Steinbrink wrote: > On 2007.01.22 17:12:40 +0100, Björn Steinbrink wrote: > > On 2007.01.21 18:17:01 -0600, Robert Hancock wrote: > > > Hmm, another miss, apparently.. Has anyone tried removing these lines > > > >from nv_host_intr in 2.6.20-rc5 sata_nv.c and see

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Björn Steinbrink
On 2007.01.22 17:12:40 +0100, Björn Steinbrink wrote: > On 2007.01.21 18:17:01 -0600, Robert Hancock wrote: > > Björn Steinbrink wrote: > > >On 2007.01.21 13:58:01 -0600, Robert Hancock wrote: > > >>Björn Steinbrink wrote: > > >>>All kernels were bad using that approach. So back to square 1. :/ >

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Björn Steinbrink
On 2007.01.21 18:17:01 -0600, Robert Hancock wrote: > Björn Steinbrink wrote: > >On 2007.01.21 13:58:01 -0600, Robert Hancock wrote: > >>Björn Steinbrink wrote: > >>>All kernels were bad using that approach. So back to square 1. :/ > >>> > >>>Björn > >>> > >>OK guys, here's a new patch to try

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Chr
On Monday, 22. January 2007 03:39, Tejun Heo wrote: > Hello, > > Chr wrote: > > Ok, you won't believe this... I opened my case and rewired my drives... > > And guess what, my second (aka the "good") HDD is now failing! > > I guess, my mainboard has a (but maybe two, or three :( ) "bad" > >

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-22 Thread Paolo Ornati
On Mon, 22 Jan 2007 18:35:05 +0900 Tejun Heo <[EMAIL PROTECTED]> wrote: > Yeap, certainly. I'll ask people first before actually proceeding with > the blacklisting. I'm just getting a bit tired of tides of NCQ firmware > problems. Another interesting thing: it seems that I'm unable to

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-22 Thread Paolo Ornati
On Mon, 22 Jan 2007 18:35:05 +0900 Tejun Heo <[EMAIL PROTECTED]> wrote: > Yeap, certainly. I'll ask people first before actually proceeding with > the blacklisting. I'm just getting a bit tired of tides of NCQ firmware > problems. > > Anyways, for the time being, you can easily turn off NCQ

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-22 Thread Tejun Heo
Paolo Ornati wrote: === START OF INFORMATION SECTION === Model Family: Seagate Barracuda 7200.7 and 7200.7 Plus family Device Model: ST380817AS I'll blacklist it. Thanks. Ok. It will be better if someone else with the same HD could confirm. It looks so strange that an HD that works

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-22 Thread Paolo Ornati
On Mon, 22 Jan 2007 11:46:01 +0900 Tejun Heo <[EMAIL PROTECTED]> wrote: > > I don't know. It's a two years old ST380817AS. > > > > # smartctl -a -d ata /dev/sda > > > > smartctl version 5.36 [x86_64-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen > > Home page is

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-22 Thread Paolo Ornati
On Mon, 22 Jan 2007 01:53:21 +0059 Jiri Slaby <[EMAIL PROTECTED]> wrote: > >> 7 Seek_Error_Rate 0x000f 083 060 030 Pre-fail Always > >> - 204305750 > >> 1 Raw_Read_Error_Rate 0x000f 059 049 006 Pre-fail Always > >> - 215927244 > >> 195

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-22 Thread Paolo Ornati
On Mon, 22 Jan 2007 01:53:21 +0059 Jiri Slaby [EMAIL PROTECTED] wrote: 7 Seek_Error_Rate 0x000f 083 060 030 Pre-fail Always - 204305750 1 Raw_Read_Error_Rate 0x000f 059 049 006 Pre-fail Always - 215927244 195
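The attribute rows quoted in this message follow the column order that `smartctl -A` prints: ID, name, flag, normalized value, worst, threshold, type, update mode, when-failed, raw value. The sketch below splits one such row into those columns, assuming whitespace between every column as in smartctl's own output. The class and function names are made up for illustration, and how to interpret the raw value is vendor-specific, so the code only carries it through as a string.

```python
from typing import NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    flag: int
    value: int        # normalized current value
    worst: int        # lowest normalized value recorded
    thresh: int       # failure threshold for the normalized value
    attr_type: str    # "Pre-fail" or "Old_age"
    updated: str
    when_failed: str  # "-" if the attribute has never failed
    raw: str          # vendor-specific raw value, kept as text


def parse_smart_attr(line: str) -> SmartAttr:
    """Parse one whitespace-separated attribute row from a smartctl table."""
    p = line.split(None, 9)
    if len(p) != 10:
        raise ValueError(f"expected 10 columns, got {len(p)}: {line!r}")
    return SmartAttr(int(p[0]), p[1], int(p[2], 16), int(p[3]), int(p[4]),
                     int(p[5]), p[6], p[7], p[8], p[9].strip())


def at_or_below_threshold(attr: SmartAttr) -> bool:
    """True if the normalized value has reached the threshold (threshold 0 never trips)."""
    return attr.thresh > 0 and attr.value <= attr.thresh


if __name__ == "__main__":
    row = "7 Seek_Error_Rate 0x000f 083 060 030 Pre-fail Always - 204305750"
    attr = parse_smart_attr(row)
    print(attr.name, attr.value, attr.thresh, attr.raw, at_or_below_threshold(attr))
```

For the Seek_Error_Rate row above, the normalized value 83 is well clear of the threshold 30, even though the raw number looks alarming at first glance.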

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-22 Thread Paolo Ornati
On Mon, 22 Jan 2007 11:46:01 +0900 Tejun Heo [EMAIL PROTECTED] wrote: I don't know. It's a two years old ST380817AS. # smartctl -a -d ata /dev/sda smartctl version 5.36 [x86_64-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen Home page is http://smartmontools.sourceforge.net/ ===

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-22 Thread Paolo Ornati
On Mon, 22 Jan 2007 18:35:05 +0900 Tejun Heo [EMAIL PROTECTED] wrote: Yeap, certainly. I'll ask people first before actually proceeding with the blacklisting. I'm just getting a bit tired of tides of NCQ firmware problems. Anyways, for the time being, you can easily turn off NCQ using

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-22 Thread Paolo Ornati
On Mon, 22 Jan 2007 18:35:05 +0900 Tejun Heo [EMAIL PROTECTED] wrote: Yeap, certainly. I'll ask people first before actually proceeding with the blacklisting. I'm just getting a bit tired of tides of NCQ firmware problems. Another interesting thing: it seems that I'm unable to reproduce

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Chr
On Monday, 22. January 2007 03:39, Tejun Heo wrote: Hello, Chr wrote: Ok, you won't believe this... I opened my case and rewired my drives... And guess what, my second (aka the good) HDD is now failing! I guess, my mainboard has a (but maybe two, or three :( ) bad sata-port(s)!

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Björn Steinbrink
On 2007.01.21 18:17:01 -0600, Robert Hancock wrote: Björn Steinbrink wrote: On 2007.01.21 13:58:01 -0600, Robert Hancock wrote: Björn Steinbrink wrote: All kernels were bad using that approach. So back to square 1. :/ Björn OK guys, here's a new patch to try against 2.6.20-rc5: Right

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Björn Steinbrink
On 2007.01.22 17:12:40 +0100, Björn Steinbrink wrote: On 2007.01.21 18:17:01 -0600, Robert Hancock wrote: Björn Steinbrink wrote: On 2007.01.21 13:58:01 -0600, Robert Hancock wrote: Björn Steinbrink wrote: All kernels were bad using that approach. So back to square 1. :/ Björn

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Björn Steinbrink
On 2007.01.22 17:57:08 +0100, Björn Steinbrink wrote: On 2007.01.22 17:12:40 +0100, Björn Steinbrink wrote: On 2007.01.21 18:17:01 -0600, Robert Hancock wrote: Hmm, another miss, apparently.. Has anyone tried removing these lines from nv_host_intr in 2.6.20-rc5 sata_nv.c and see what that

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Eric D. Mudama
On 1/15/07, Jeff Garzik [EMAIL PROTECTED] wrote: Jens Axboe wrote: On Mon, Jan 15 2007, Jeff Garzik wrote: Jens Axboe wrote: I'd be surprised if the device would not obey the 7 second timeout rule that seems to be set in stone and not allow more dirty in-drive cache than it could flush out

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Alistair John Strachan
On Tuesday 23 January 2007 01:24, Robert Hancock wrote: As a final aside, this is another case where the hardware docs for this controller would really be useful, in order to know whether we are actually supposed to be reading that register in ADMA mode or not. I sent a query to Allen Martin

Re: SATA exceptions with 2.6.20-rc5

2007-01-22 Thread Björn Steinbrink
On 2007.01.22 19:24:22 -0600, Robert Hancock wrote: Björn Steinbrink wrote: Running a kernel with the return statement replaced by a line that prints the irq_stat instead. Currently I'm seeing lots of 0x10 on ata1 and 0x0 on ata2. 40 minutes stress test now and no exception yet. What's

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-21 Thread Tejun Heo
Paolo Ornati wrote: I don't know. It's a two years old ST380817AS. # smartctl -a -d ata /dev/sda smartctl version 5.36 [x86_64-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen Home page is http://smartmontools.sourceforge.net/ === START OF INFORMATION SECTION === Model Family: Seagate

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Tejun Heo
Hello, Chr wrote: Ok, you won't believe this... I opened my case and rewired my drives... And guess what, my second (aka the "good") HDD is now failing! I guess, my mainboard has a (but maybe two, or three :( ) "bad" sata-port(s)! Or, you have power related problem. Try to rewire the

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-21 Thread Jiri Slaby
Chr wrote: >> 7 Seek_Error_Rate 0x000f 083 060 030 Pre-fail Always >> - 204305750 >> 1 Raw_Read_Error_Rate 0x000f 059 049 006 Pre-fail Always >> - 215927244 >> 195 Hardware_ECC_Recovered 0x001a 059 049 000 Old_age Always

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Robert Hancock
Björn Steinbrink wrote: On 2007.01.21 13:58:01 -0600, Robert Hancock wrote: Björn Steinbrink wrote: All kernels were bad using that approach. So back to square 1. :/ Björn OK guys, here's a new patch to try against 2.6.20-rc5: Right now when switching between ADMA mode and legacy mode

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Robert Hancock
Björn Steinbrink wrote: On 2007.01.21 23:08:11 +0100, Björn Steinbrink wrote: On 2007.01.21 13:58:01 -0600, Robert Hancock wrote: Björn Steinbrink wrote: All kernels were bad using that approach. So back to square 1. :/ Björn OK guys, here's a new patch to try against 2.6.20-rc5: Right

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Björn Steinbrink
On 2007.01.21 13:58:01 -0600, Robert Hancock wrote: > Björn Steinbrink wrote: > >All kernels were bad using that approach. So back to square 1. :/ > > > >Björn > > > > OK guys, here's a new patch to try against 2.6.20-rc5: > > Right now when switching between ADMA mode and legacy mode (i.e. when

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Björn Steinbrink
On 2007.01.21 23:08:11 +0100, Björn Steinbrink wrote: > On 2007.01.21 13:58:01 -0600, Robert Hancock wrote: > > Björn Steinbrink wrote: > > >All kernels were bad using that approach. So back to square 1. :/ > > > > > >Björn > > > > > > > OK guys, here's a new patch to try against 2.6.20-rc5: > >

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Björn Steinbrink
On 2007.01.21 13:58:01 -0600, Robert Hancock wrote: > Björn Steinbrink wrote: > >All kernels were bad using that approach. So back to square 1. :/ > > > >Björn > > > > OK guys, here's a new patch to try against 2.6.20-rc5: > > Right now when switching between ADMA mode and legacy mode (i.e. when

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-21 Thread Chr
On Sunday, 21. January 2007 20:25, Paolo Ornati wrote: > On Sun, 21 Jan 2007 11:32:02 -0600 > Robert Hancock <[EMAIL PROTECTED]> wrote: > > > It looks like what you're getting is an actual NCQ write timing out. > > That makes the bisect result not very interesting since obviously it > >

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Chr
On Sunday, 21. January 2007 19:01, Björn Steinbrink wrote: > On 2007.01.21 18:34:40 +0100, Chr wrote: > > I run those two in parallel: > while /bin/true; do ls -lR / > /dev/null 2>&1; done > while /bin/true; do echo 255 > /proc/sys/vm/drop_caches; sleep 1; done > > Not sure if running them in

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Robert Hancock
Björn Steinbrink wrote: All kernels were bad using that approach. So back to square 1. :/ Björn OK guys, here's a new patch to try against 2.6.20-rc5: Right now when switching between ADMA mode and legacy mode (i.e. when going from doing normal DMA reads/writes to doing a FLUSH CACHE) we

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-21 Thread Paolo Ornati
On Sun, 21 Jan 2007 11:32:02 -0600 Robert Hancock <[EMAIL PROTECTED]> wrote: > It looks like what you're getting is an actual NCQ write timing out. > That makes the bisect result not very interesting since obviously it > wouldn't have issued any NCQ writes before NCQ support was > implemented.

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Björn Steinbrink
On 2007.01.21 09:36:18 +0100, Björn Steinbrink wrote: > On 2007.01.21 00:39:20 -0600, Robert Hancock wrote: > > Björn Steinbrink wrote: > > >On 2007.01.20 22:34:27 -0500, Jeff Garzik wrote: > > >>Robert Hancock wrote: > > >>>change in 2.6.20-rc is either causing or triggering this problem. It > >

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Björn Steinbrink
On 2007.01.21 18:34:40 +0100, Chr wrote: > On Sunday, 21. January 2007 09:36, Björn Steinbrink wrote: > > On 2007.01.21 00:39:20 -0600, Robert Hancock wrote: > > > > Ah, right... sata_nv.c of course interacts with the outside world, d'oh! > > > > Up to now, I only got bad kernels, latest tested

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Chr
On Sunday, 21. January 2007 09:36, Björn Steinbrink wrote: > On 2007.01.21 00:39:20 -0600, Robert Hancock wrote: > > Ah, right... sata_nv.c of course interacts with the outside world, d'oh! > > Up to now, I only got bad kernels, latest tested being: > 94fcda1f8ab5e0cacc381c5ca1cc9aa6ad523576 > >

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-21 Thread Robert Hancock
Paolo Ornati wrote: On Sun, 21 Jan 2007 15:29:32 +0100 Paolo Ornati <[EMAIL PROTECTED]> wrote: Sorry for starting a new thread, but I've deleted the messages from my mail-box, and I'm not sure it's the same problem as here: http://lkml.org/lkml/2007/1/14/108 Today I've decided to try

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Björn Steinbrink
On 2007.01.21 00:39:20 -0600, Robert Hancock wrote: > Björn Steinbrink wrote: > >On 2007.01.20 22:34:27 -0500, Jeff Garzik wrote: > >>Robert Hancock wrote: > >>>change in 2.6.20-rc is either causing or triggering this problem. It > >>>would be useful if you could try git bisect between 2.6.19 and

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Björn Steinbrink
On 2007.01.21 00:39:20 -0600, Robert Hancock wrote: Björn Steinbrink wrote: On 2007.01.20 22:34:27 -0500, Jeff Garzik wrote: Robert Hancock wrote: change in 2.6.20-rc is either causing or triggering this problem. It would be useful if you could try git bisect between 2.6.19 and 2.6.20-rc5,

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-21 Thread Robert Hancock
Paolo Ornati wrote: On Sun, 21 Jan 2007 15:29:32 +0100 Paolo Ornati [EMAIL PROTECTED] wrote: Sorry for starting a new thread, but I've deleted the messages from my mail-box, and I'm not sure it's the same problem as here: http://lkml.org/lkml/2007/1/14/108 Today I've decided to try

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Chr
On Sunday, 21. January 2007 09:36, Björn Steinbrink wrote: On 2007.01.21 00:39:20 -0600, Robert Hancock wrote: Ah, right... sata_nv.c of course interacts with the outside world, d'oh! Up to now, I only got bad kernels, latest tested being: 94fcda1f8ab5e0cacc381c5ca1cc9aa6ad523576 Which,

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Björn Steinbrink
On 2007.01.21 18:34:40 +0100, Chr wrote: On Sunday, 21. January 2007 09:36, Björn Steinbrink wrote: On 2007.01.21 00:39:20 -0600, Robert Hancock wrote: Ah, right... sata_nv.c of course interacts with the outside world, d'oh! Up to now, I only got bad kernels, latest tested being:

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Björn Steinbrink
On 2007.01.21 09:36:18 +0100, Björn Steinbrink wrote: On 2007.01.21 00:39:20 -0600, Robert Hancock wrote: Björn Steinbrink wrote: On 2007.01.20 22:34:27 -0500, Jeff Garzik wrote: Robert Hancock wrote: change in 2.6.20-rc is either causing or triggering this problem. It would be useful

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-21 Thread Paolo Ornati
On Sun, 21 Jan 2007 11:32:02 -0600 Robert Hancock [EMAIL PROTECTED] wrote: It looks like what you're getting is an actual NCQ write timing out. That makes the bisect result not very interesting since obviously it wouldn't have issued any NCQ writes before NCQ support was implemented. Seeing

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Chr
On Sunday, 21. January 2007 19:01, Björn Steinbrink wrote: On 2007.01.21 18:34:40 +0100, Chr wrote: I run those two in parallel: while /bin/true; do ls -lR / > /dev/null 2>&1; done while /bin/true; do echo 255 > /proc/sys/vm/drop_caches; sleep 1; done Not sure if running them in parallel is

Re: SATA exceptions triggered by XFS (since 2.6.18)

2007-01-21 Thread Chr
On Sunday, 21. January 2007 20:25, Paolo Ornati wrote: On Sun, 21 Jan 2007 11:32:02 -0600 Robert Hancock [EMAIL PROTECTED] wrote: It looks like what you're getting is an actual NCQ write timing out. That makes the bisect result not very interesting since obviously it wouldn't have

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Björn Steinbrink
On 2007.01.21 13:58:01 -0600, Robert Hancock wrote: Björn Steinbrink wrote: All kernels were bad using that approach. So back to square 1. :/ Björn OK guys, here's a new patch to try against 2.6.20-rc5: Right now when switching between ADMA mode and legacy mode (i.e. when going from

Re: SATA exceptions with 2.6.20-rc5

2007-01-21 Thread Björn Steinbrink
On 2007.01.21 23:08:11 +0100, Björn Steinbrink wrote: On 2007.01.21 13:58:01 -0600, Robert Hancock wrote: Björn Steinbrink wrote: All kernels were bad using that approach. So back to square 1. :/ Björn OK guys, here's a new patch to try against 2.6.20-rc5: Right now when
