Re: 2.6.22.6 sata_sil device errors timeouts

2007-09-27 Thread Tejun Heo
Bruce Allen wrote:
 Andrew: thanks for isolating this problem.
 
 Tejun: any thoughts?  The STRANGE_BUFFER_LENGTH problem is fixed in the
 code that Andrew is running.  The problem is provoked with '-o on' which
 goes via a TASKFILE ioctl.

I suppose you mean HDIO_DRIVE_TASK, right?  libata doesn't implement
HDIO_DRIVE_TASKFILE and it probably never will.  I'll test it next week
when I get back.  Thanks.
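For readers following along, here is a rough sketch of what a HDIO_DRIVE_TASK request for '-o on' looks like from user space. It assumes the seven-byte argument layout used by the legacy IDE ioctl (command, feature, sector count, LBA low, LBA mid, LBA high, device) and the register values visible in the "cmd b0/db:f8:00:4f:c2" log lines elsewhere in this thread. The function names are made up for illustration, and send_task() is defined but deliberately never called, since it needs root and a real disk.

```python
import os

# ioctl request number from <linux/hdreg.h>
HDIO_DRIVE_TASK = 0x031E

# ATA SMART register values, matching "cmd b0/db:f8:00:4f:c2" in the logs
ATA_CMD_SMART = 0xB0
SMART_AUTO_OFFLINE = 0xDB   # enable/disable automatic offline collection
SMART_LBA_MID = 0x4F        # SMART signature, low half
SMART_LBA_HIGH = 0xC2       # SMART signature, high half


def build_auto_offline_task(enable):
    """Return the 7 register bytes for SMART auto-offline on/off."""
    sector_count = 0xF8 if enable else 0x00
    return bytearray([
        ATA_CMD_SMART,       # [0] command
        SMART_AUTO_OFFLINE,  # [1] feature
        sector_count,        # [2] sector count
        0x00,                # [3] LBA low
        SMART_LBA_MID,       # [4] LBA mid
        SMART_LBA_HIGH,      # [5] LBA high
        0x00,                # [6] device/head
    ])


def send_task(device_path, task):
    """Issue the ioctl (hypothetical helper; needs root, not called here)."""
    import fcntl  # Unix only, imported lazily
    fd = os.open(device_path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        # The kernel writes the result registers back into the buffer.
        fcntl.ioctl(fd, HDIO_DRIVE_TASK, task)
    finally:
        os.close(fd)
    return task


task = build_auto_offline_task(True)
print(" ".join("%02x" % b for b in task))  # b0 db f8 00 4f c2 00
```

The printed bytes line up with the first six fields of the "cmd" line that libata logs when the drive aborts the command.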

-- 
tejun

-
To unsubscribe from this list: send the line unsubscribe linux-ide in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.22.6 sata_sil device errors timeouts

2007-09-27 Thread Bruce Allen

Hi Tejun,

Thanks! Yes I meant HDIO_DRIVE_TASK.  Please let me know what your tests 
reveal.


Cheers,
Bruce


On Thu, 27 Sep 2007, Tejun Heo wrote:


Bruce Allen wrote:

Andrew: thanks for isolating this problem.

Tejun: any thoughts?  The STRANGE_BUFFER_LENGTH problem is fixed in the
code that Andrew is running.  The problem is provoked with '-o on' which
goes via a TASKFILE ioctl.


I suppose you mean HDIO_DRIVE_TASK, right?  libata doesn't implement
HDIO_DRIVE_TASKFILE and it probably never will.  I'll test it next week
when I get back.  Thanks.





Re: 2.6.22.6 sata_sil device errors timeouts

2007-09-18 Thread Andrew Paprocki
It appears to be the '-o on' causing the problem. If I remove that,
the errors go away. The strange part is that according to the smartctl
documentation, my drives support it:

# smartctl -c /dev/sda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-7 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (4797) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  80) minutes.
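As a cross-check of the "capabilities" byte above, here is a small sketch that decodes 0x5b bit by bit. The bit assignments are my understanding of the SMART offline data collection capability byte as smartctl interprets it (bit 1 being the auto-offline on/off support that '-o on' relies on); the function name is only illustrative.

```python
# (mask, text when the bit is set, text when the bit is clear)
OFFLINE_CAPABILITY_BITS = [
    (0x01, "SMART execute Offline immediate.",
           "No SMART execute Offline immediate."),
    (0x02, "Auto Offline data collection on/off support.",
           "No Auto Offline data collection support."),
    (0x04, "Abort Offline collection upon new command.",
           "Suspend Offline collection upon new command."),
    (0x08, "Offline surface scan supported.",
           "No Offline surface scan supported."),
    (0x10, "Self-test supported.",
           "No Self-test supported."),
    (0x20, "Conveyance Self-test supported.",
           "No Conveyance Self-test supported."),
    (0x40, "Selective Self-test supported.",
           "No Selective Self-test supported."),
]


def decode_offline_capabilities(value):
    """Return one description per capability bit, lowest bit first."""
    return [set_text if value & mask else clear_text
            for mask, set_text, clear_text in OFFLINE_CAPABILITY_BITS]


for line in decode_offline_capabilities(0x5B):
    print(line)
```

For 0x5b this reproduces the seven capability lines smartctl printed, so the drive does advertise auto-offline support even though it aborts the command.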

Thanks, -Andrew

On 9/18/07, Bruce Allen [EMAIL PROTECTED] wrote:
 Does removing '-o on' and/or '-S on' eliminate the errors?


 On Mon, 17 Sep 2007, Andrew Paprocki wrote:

  Bruce,
 
  Just built it -- it eliminated the HSM violations, but I still get the
  device errors:
 
  smartmontools release 5.38 dated 2006/12/20 at 20:37:59 UTC
  (I see the above date, even though I verified it is built from CVS head)
 
  ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
  ata2.00: cmd b0/db:f8:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
  res 51/04:f8:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
  ata2.00: configured for UDMA/100
  ata2: EH complete
 
  This is what it is in smartd.conf:
  /dev/sda -d ata -a -o on -S on
  /dev/sdb -d ata -a -o on -S on
  /dev/sdc -d ata -a -o on -S on
 
  Thanks, -Andrew


Re: 2.6.22.6 sata_sil device errors timeouts

2007-09-18 Thread Bruce Allen

Andrew: thanks for isolating this problem.

Tejun: any thoughts?  The STRANGE_BUFFER_LENGTH problem is fixed in the 
code that Andrew is running.  The problem is provoked with '-o on' which 
goes via a TASKFILE ioctl.


Cheers,
Bruce

On Tue, 18 Sep 2007, Andrew Paprocki wrote:


It appears to be the '-o on' causing the problem. If I remove that,
the errors go away. The strange part is that according to the smartctl
documentation, my drives support it:

# smartctl -c /dev/sda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-7 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
   was never started.
   Auto Offline Data Collection: Disabled.
Self-test execution status:  (   0) The previous self-test routine completed
   without error or no self-test has ever
   been run.
Total time to complete Offline
data collection: (4797) seconds.
Offline data collection
capabilities:(0x5b) SMART execute Offline immediate.
   Auto Offline data collection on/off support.
   Suspend Offline collection upon new
   command.
   Offline surface scan supported.
   Self-test supported.
   No Conveyance Self-test supported.
   Selective Self-test supported.
SMART capabilities:(0x0003) Saves SMART data before entering
   power-saving mode.
   Supports SMART auto save timer.
Error logging capability:(0x01) Error logging supported.
   General Purpose Logging supported.
Short self-test routine
recommended polling time:(   1) minutes.
Extended self-test routine
recommended polling time:(  80) minutes.

Thanks, -Andrew

On 9/18/07, Bruce Allen [EMAIL PROTECTED] wrote:

Does removing '-o on' and/or '-S on' eliminate the errors?


On Mon, 17 Sep 2007, Andrew Paprocki wrote:


Bruce,

Just built it -- it eliminated the HSM violations, but I still get the
device errors:

smartmontools release 5.38 dated 2006/12/20 at 20:37:59 UTC
(I see the above date, even though I verified it is built from CVS head)

ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: cmd b0/db:f8:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
res 51/04:f8:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
ata2.00: configured for UDMA/100
ata2: EH complete

This is what it is in smartd.conf:
/dev/sda -d ata -a -o on -S on
/dev/sdb -d ata -a -o on -S on
/dev/sdc -d ata -a -o on -S on

Thanks, -Andrew





2.6.22.6 sata_sil device errors timeouts

2007-09-17 Thread Andrew Paprocki
I have a sata_sil 3114 integrated chipset with two Hitachi 250 GB SATA
drives connected, and I'm seeing errors printed during use. The
problems seem to get much worse when I switch from these 250 GB drives
to brand new Hitachi HDS721010KLA330 1 TB drives, and eventually the
system hangs. With the 250 GB drives, I haven't seen a hang, but I
still see the errors below.

Also, I'm seeing two other issues:

1) When built with modules disabled, and libata handling the sata +
pata (AMD CS5536) connections, the pata drives come _after_ the sata
drives (i.e. w/ 2 sata drives, the first IDE drive is sdc). This makes
boot configuration more complicated if booting off the pata drive. Is
there any way to control which order the drives are assigned when not
building w/ modules?

2) The drives display that they support udma6 in hdparm -I, but only
udma5 is being used. And hdparm -i only shows up to udma2.. ?

Any ideas? Thanks, -Andrew


ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x240 action 0x2 frozen
ata2.00: cmd 35/00:00:80:31:54/00:04:02:00:00/e0 tag 0 cdb 0x0 data 524288 out
 res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ata2: port is slow to respond, please be patient (Status 0xd1)
ata2: SRST failed (errno=-16)
ata2: hard resetting port
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata2.00: configured for UDMA/100
ata2: EH complete
sd 1:0:0:0: [sdb] 488397168 512-byte hardware sectors (250059 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA


ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x28 action 0x0
ata1.00: (BMDMA2 stat 0x617d9009)
ata1.00: cmd 25/00:80:00:d6:bd/00:02:0b:00:00/e0 tag 0 cdb 0x0 data 327680 in
 res 51/04:e0:9f:d7:bd/00:00:0b:00:00/eb Emask 0x1 (device error)
ata1.00: configured for UDMA/100
ata1: EH complete
sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA


# hdparm -i /dev/sda

/dev/sda:

 Model=HDT722525DLA380 , FwRev=V44OA96A, SerialNo=  VDK41GT5F3S4JK
 Config={ HardSect NotMFM HdSw15uSec Fixed DTR10Mbs }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=52
 BuffType=DualPortCache, BuffSize=7674kB, MaxMultSect=16, MultSect=?16?
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2
 AdvancedPM=yes: disabled (255) WriteCache=enabled
 Drive conforms to: ATA/ATAPI-7 T13 1532D revision 1:  ATA/ATAPI-2
ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 ATA/ATAPI-7

# hdparm -I /dev/sda | grep udma
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5 udma6

# lspci -vv -d 1095:3114
:00:11.0 0180: 1095:3114 (rev 02)
Subsystem: 1095:3114
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium
TAbort- TAbort- MAbort- SERR- PERR-
Latency: 64, Cache Line Size: 0x08 (32 bytes)
Interrupt: pin A routed to IRQ 11
Region 0: I/O ports at fd00 [size=8]
Region 1: I/O ports at fc00 [size=4]
Region 2: I/O ports at fb00 [size=8]
Region 3: I/O ports at fa00 [size=4]
Region 4: I/O ports at f900 [size=16]
Region 5: Memory at efffb000 (32-bit, non-prefetchable) [size=1K]
Expansion ROM at 2000 [disabled] [size=512K]
Capabilities: [60] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-


CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_SYSFS_DEPRECATED=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLAB=y
CONFIG_RT_MUTEXES=y
CONFIG_BLOCK=y
CONFIG_LBD=y
CONFIG_LSF=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y

Re: 2.6.22.6 sata_sil device errors timeouts

2007-09-17 Thread Tejun Heo
Andrew Paprocki wrote:
 1) When built with modules disabled, and libata handling the sata +
 pata (AMD CS5536) connections, the pata drives come _after_ the sata
 drives (i.e. w/ 2 sata drives, the first IDE drive is sdc). This makes
 boot configuration more complicated if booting off the pata drive. Is
 there any way to control which order the drives are assigned when not
 building w/ modules?

Please use mount-by-LABEL or UUID.
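To find the UUIDs to put in fstab or on the kernel command line, one option is to read the symlinks that udev maintains. This is only a sketch and assumes udev populates /dev/disk/by-uuid on the system in question; the function name is illustrative.

```python
import os


def uuid_to_device(by_uuid_dir="/dev/disk/by-uuid"):
    """Map each filesystem UUID symlink to the device node it resolves to."""
    mapping = {}
    if not os.path.isdir(by_uuid_dir):
        return mapping  # directory absent: udev is not providing these links
    for name in sorted(os.listdir(by_uuid_dir)):
        link = os.path.join(by_uuid_dir, name)
        mapping[name] = os.path.realpath(link)
    return mapping


for uuid, device in uuid_to_device().items():
    print("UUID=%s -> %s" % (uuid, device))
```

An fstab entry can then begin with UUID=<value> instead of /dev/sdc1, so it no longer matters whether the PATA disk is probed before or after the SATA ones.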

 2) The drives display that they support udma6 in hdparm -I, but only
 udma5 is being used. And hdparm -i only shows up to udma2.. ?

For SATA, UDMA mode doesn't matter at all.  As long as you're in DMA
mode, the only thing that matters is PHY link speed and whether NCQ is
enabled.
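The PHY link speed is visible in the "SATA link up 1.5 Gbps (SStatus 113 SControl 310)" line from the original report. As a sketch, SStatus can be split into its three 4-bit fields like this; the field meanings are taken from the SATA SStatus register definition as I understand it, and only the values I am sure of are listed.

```python
DET = {
    0x0: "no device detected",
    0x1: "device detected, PHY communication not established",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline",
}
SPD = {
    0x0: "no negotiated speed",
    0x1: "1.5 Gbps",
    0x2: "3.0 Gbps",
}
IPM = {
    0x0: "device not present",
    0x1: "active",
    0x2: "partial power management state",
    0x6: "slumber power management state",
}


def decode_sstatus(value):
    """Split SStatus into DET (bits 0-3), SPD (bits 4-7), IPM (bits 8-11)."""
    det = value & 0xF
    spd = (value >> 4) & 0xF
    ipm = (value >> 8) & 0xF
    return {
        "det": DET.get(det, "reserved (0x%x)" % det),
        "spd": SPD.get(spd, "reserved (0x%x)" % spd),
        "ipm": IPM.get(ipm, "reserved (0x%x)" % ipm),
    }


# libata prints SStatus in hex, so "113" is 0x113
print(decode_sstatus(int("113", 16)))
```

For 0x113 the speed field decodes to 1.5 Gbps, which agrees with the "link up" message; the "configured for UDMA/100" line is just the transfer mode setting and does not limit the link.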

 ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x240 action 0x2 frozen

 ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x28 action 0x0

In both cases, SError is indicating a transmission problem: Handshake
error and Unrecognized FIS type in the first case, 10b to 8b decode
error and CRC error in the second case.  I can't tell why, but signals
flying through those reddish cables are getting corrupted.
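A sketch of how such a decode works, for anyone who wants to read SErr values themselves. The bit positions are the SError DIAG/ERR bits as I understand them from the SATA register definition. One loud assumption: the values as they appear in this archive (0x240 and 0x28) do not map onto the flags named above, whereas 0x2400000 and 0x280000 do, so the example assumes the archived log lost trailing zeros. Treat that as a guess about the archive, not as a fact about the original log.

```python
SERROR_BITS = [
    (1 << 0, "recovered data integrity error"),
    (1 << 1, "recovered communication error"),
    (1 << 8, "transient data integrity error"),
    (1 << 9, "persistent communication or data integrity error"),
    (1 << 10, "protocol error"),
    (1 << 11, "internal error"),
    (1 << 16, "PhyRdy change"),
    (1 << 17, "PHY internal error"),
    (1 << 18, "comm wake"),
    (1 << 19, "10b to 8b decode error"),
    (1 << 20, "disparity error"),
    (1 << 21, "CRC error"),
    (1 << 22, "handshake error"),
    (1 << 23, "link sequence error"),
    (1 << 24, "transport state transition error"),
    (1 << 25, "unrecognized FIS type"),
    (1 << 26, "device exchanged"),
]


def decode_serror(value):
    """Return the names of all SError bits set in value, lowest bit first."""
    return [name for mask, name in SERROR_BITS if value & mask]


# Assumed full values (see the note above about trailing zeros).
print(decode_serror(0x2400000))  # ata2 case
print(decode_serror(0x280000))   # ata1 case
```

All four flags sit in the DIAG half of the register (bits 16 and up), which is the link and PHY layer, hence the suspicion of cables, power, or the controller rather than the drive's media.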

There have been quite a few cases of a bad PSU causing transmission
failures on SATA, or you might have a bad controller and/or cables.  The
best way to debug this kind of problem is by elimination - by swapping
hardware piece by piece you can find out which one is causing the problem.

-- 
tejun


Re: 2.6.22.6 sata_sil device errors timeouts

2007-09-17 Thread Andrew Paprocki
On 9/17/07, Tejun Heo [EMAIL PROTECTED] wrote:
 Andrew Paprocki wrote:
  boot configuration more complicated if booting off the pata drive. Is
  there any way to control which order the drives are assigned when not
  building w/ modules?

 Please use mount-by-LABEL or UUID.

Thanks, wasn't aware of that functionality. Works like a charm.

  ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x240 action 0x2 frozen
  ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x28 action 0x0

 In both cases, SError is indicating a transmission problem: Handshake
 error and Unrecognized FIS type in the first case, 10b to 8b decode
 error and CRC error in the second case.  I can't tell why, but signals
 flying through those reddish cables are getting corrupted.

I've replaced the cables with a different brand I had lying around,
and I haven't seen a problem yet. I'll need to test it heavily, though,
to see if I can trigger anything to pop up.

I didn't mention it before, but I'm also getting these errors every
time I boot. I'm thinking they're related to the drive not supporting
cmds that smartd is sending it. If so, is there any way that
libata/smartd can handle this more gracefully? This stuff spews into
dmesg and gives a scare that there is a real hardware problem that may
cause data corruption. I get exactly 6 instances of each of these two
blocks of output prior to reaching the login prompt:

ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: cmd b0/db:f8:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
 res 51/04:f8:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
ata1.00: configured for UDMA/100
ata1: EH complete

ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: cmd b0/db:f8:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
 res 51/04:f8:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
ata2.00: configured for UDMA/100
ata2: EH complete
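For reference, the "cmd" and "res" lines above can be unpacked mechanically. The sketch below assumes the libata layout of command/feature:nsect:lbal:lbam:lbah/(five HOB bytes)/device for "cmd", with status and error in the first two positions for "res". Only the status and error bits I am sure of are named; the helper names are illustrative.

```python
import re

CMD_FIELDS = ("command", "feature", "nsect", "lbal", "lbam", "lbah",
              "hob_feature", "hob_nsect", "hob_lbal", "hob_lbam",
              "hob_lbah", "device")
# In a result taskfile the first two registers are status and error.
RES_FIELDS = ("status", "error") + CMD_FIELDS[2:]

STATUS_BITS = [(0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"),
               (0x10, "DSC"), (0x08, "DRQ"), (0x01, "ERR")]
ERROR_BITS = [(0x80, "ICRC"), (0x40, "UNC"), (0x10, "IDNF"), (0x04, "ABRT")]


def parse_taskfile(text, fields):
    """Parse e.g. 'b0/db:f8:00:4f:c2/00:00:00:00:00/00' into named registers."""
    values = [int(part, 16) for part in re.split(r"[/:]", text.strip())]
    if len(values) != len(fields):
        raise ValueError("expected %d fields, got %d" % (len(fields), len(values)))
    return dict(zip(fields, values))


def flags(value, table):
    """Return the names of the bits from table that are set in value."""
    return [name for mask, name in table if value & mask]


cmd = parse_taskfile("b0/db:f8:00:4f:c2/00:00:00:00:00/00", CMD_FIELDS)
res = parse_taskfile("51/04:f8:00:4f:c2/00:00:00:00:00/00", RES_FIELDS)

print("command 0x%02x feature 0x%02x" % (cmd["command"], cmd["feature"]))
print(flags(res["status"], STATUS_BITS))  # ['DRDY', 'DSC', 'ERR']
print(flags(res["error"], ERROR_BITS))    # ['ABRT']
```

Command 0xb0 is SMART and feature 0xdb is the automatic off-line enable/disable subcommand, so status ERR with error ABRT means the drive rejected that one request rather than failing a read or write.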

Thanks, -Andrew


Re: 2.6.22.6 sata_sil device errors timeouts

2007-09-17 Thread Tejun Heo
[cc'ing Bruce Allen]

Andrew Paprocki wrote:
 I didn't mention it before, but I'm also getting these errors every
 time I boot. I'm thinking they're related to the drive not supporting
 cmds that smartd is sending it. If so, is there any way that
 libata/smartd can handle this more gracefully? This stuff spews into
 dmesg and gives a scare that there is a real hardware problem that may
 cause data corruption. I get exactly 6 instances of each of these two
 blocks of output prior to reaching the login prompt:
 
 ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
 ata1.00: cmd b0/db:f8:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
  res 51/04:f8:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
 ata1.00: configured for UDMA/100
 ata1: EH complete
 
 ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
 ata2.00: cmd b0/db:f8:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
  res 51/04:f8:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
 ata2.00: configured for UDMA/100
 ata2: EH complete

Upgrading smartd should fix it.  Which version are you using?

-- 
tejun


Re: 2.6.22.6 sata_sil device errors timeouts

2007-09-17 Thread Andrew Paprocki
On 9/17/07, Tejun Heo [EMAIL PROTECTED] wrote:
 [cc'ing Bruce Allen]

 Andrew Paprocki wrote:
  ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
  ata2.00: cmd b0/db:f8:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
   res 51/04:f8:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
  ata2.00: configured for UDMA/100
  ata2: EH complete

 Upgrading smartd should fix it.  Which version are you using?

smartmontools release 5.36 dated 2006/04/12 at 17:39:01 UTC
smartmontools configure arguments: '--prefix=/opt/smartmontools'

I see a newer experimental 5.37 is out. I'll give it a go and see if
the trace goes away.

Thanks, -Andrew


Re: 2.6.22.6 sata_sil device errors timeouts

2007-09-17 Thread Andrew Paprocki
On 9/17/07, Andrew Paprocki [EMAIL PROTECTED] wrote:
 On 9/17/07, Tejun Heo [EMAIL PROTECTED] wrote:
  Upgrading smartd should fix it.  Which version are you using?

 smartmontools release 5.36 dated 2006/04/12 at 17:39:01 UTC
 smartmontools configure arguments: '--prefix=/opt/smartmontools'

 I see a newer experimental 5.37 is out. I'll give it a go and see if
 the trace goes away.

Upgrading made it worse.. I now receive the same device errors as well
as a slew of new HSM violation errors when smartd starts up:

smartmontools release 5.37 dated 2006/12/20 at 20:37:59 UTC
smartmontools configure arguments:  '--prefix=/opt/smartmontools'

ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata5.00: cmd b0/db:f8:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 126976 in
 res 50/00:f8:00:4f:c2/00:00:00:00:00/a0 Emask 0x202 (HSM violation)
ata5: soft resetting port
ata5.00: configured for UDMA/100
ata5: EH complete
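The Emask value in these lines is a bit mask, and the text in parentheses names only one of the bits. A sketch of a decoder follows; the bit assignments are my recollection of libata's AC_ERR_* flags, of which 0x1, 0x2 and 0x4 are at least consistent with the "(device error)", "(HSM violation)" and "(timeout)" strings in this thread, so treat the rest as an assumption.

```python
EMASK_BITS = [
    (1 << 0, "device error"),
    (1 << 1, "HSM violation"),
    (1 << 2, "timeout"),
    (1 << 3, "media error"),
    (1 << 4, "ATA bus error"),
    (1 << 5, "host bus error"),
    (1 << 6, "internal error"),
    (1 << 7, "invalid argument"),
    (1 << 8, "other"),
    (1 << 9, "no device hint"),
    (1 << 10, "NCQ failure"),
]


def decode_emask(value):
    """Return the names of all error-mask bits set in value."""
    return [name for mask, name in EMASK_BITS if value & mask]


for emask in (0x1, 0x4, 0x202):
    print("Emask 0x%x: %s" % (emask, ", ".join(decode_emask(emask))))
```

Read this way, 0x202 is an HSM violation plus a hint bit, while the 0x1 seen with the CVS build is a plain device error.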

# smartctl -i /dev/sda
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi Deskstar T7K250 series
Device Model:     HDT722525DLA380
Serial Number:    VDK41GT5F3S4JK
Firmware Version: V44OA96A
User Capacity:    250,059,350,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 1
Local Time is:    Mon Sep 17 15:25:29 2007 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Thanks, -Andrew


Re: 2.6.22.6 sata_sil device errors timeouts

2007-09-17 Thread Bruce Allen

Hi Andrew,

Please build the CVS version (unreleased) of smartmontools.  The versions 
below are dated 2006/12/20 and 2006/04/12.  You need to build a code 
version based on the past few weeks of code.


Cheers,
Bruce


On Mon, 17 Sep 2007, Andrew Paprocki wrote:


On 9/17/07, Andrew Paprocki [EMAIL PROTECTED] wrote:

On 9/17/07, Tejun Heo [EMAIL PROTECTED] wrote:

Upgrading smartd should fix it.  Which version are you using?


smartmontools release 5.36 dated 2006/04/12 at 17:39:01 UTC
smartmontools configure arguments: '--prefix=/opt/smartmontools'

I see a newer experimental 5.37 is out. I'll give it a go and see if
the trace goes away.


Upgrading made it worse.. I now receive the same device errors as well
as a slew of new HSM violation errors when smartd starts up:
smartmontools release 5.37 dated 2006/12/20 at 20:37:59 UTC
smartmontools configure arguments:  '--prefix=/opt/smartmontools'

ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata5.00: cmd b0/db:f8:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 126976 in
res 50/00:f8:00:4f:c2/00:00:00:00:00/a0 Emask 0x202 (HSM violation)
ata5: soft resetting port
ata5.00: configured for UDMA/100
ata5: EH complete

# smartctl -i /dev/sda
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi Deskstar T7K250 series
Device Model:     HDT722525DLA380
Serial Number:    VDK41GT5F3S4JK
Firmware Version: V44OA96A
User Capacity:    250,059,350,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 1
Local Time is:    Mon Sep 17 15:25:29 2007 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Thanks, -Andrew




Re: 2.6.22.6 sata_sil device errors timeouts

2007-09-17 Thread Andrew Paprocki
Bruce,

Just built it -- it eliminated the HSM violations, but I still get the
device errors:

smartmontools release 5.38 dated 2006/12/20 at 20:37:59 UTC
(I see the above date, even though I verified it is built from CVS head)

ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: cmd b0/db:f8:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
 res 51/04:f8:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
ata2.00: configured for UDMA/100
ata2: EH complete

This is what it is in smartd.conf:
/dev/sda -d ata -a -o on -S on
/dev/sdb -d ata -a -o on -S on
/dev/sdc -d ata -a -o on -S on

Thanks, -Andrew

On 9/17/07, Bruce Allen [EMAIL PROTECTED] wrote:
 Hi Andrew,

 Please build the CVS version (unreleased) of smartmontools.  The versions
 below are dated 2006/12/20 and 2006/04/12.  You need to build a code
 version based on the past few weeks of code.

 Cheers,
 Bruce


Re: 2.6.22.6 sata_sil device errors timeouts

2007-09-17 Thread Bruce Allen

Does removing '-o on' and/or '-S on' eliminate the errors?


On Mon, 17 Sep 2007, Andrew Paprocki wrote:


Bruce,

Just built it -- it eliminated the HSM violations, but I still get the
device errors:

smartmontools release 5.38 dated 2006/12/20 at 20:37:59 UTC
(I see the above date, even though I verified it is built from CVS head)

ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: cmd b0/db:f8:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
res 51/04:f8:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
ata2.00: configured for UDMA/100
ata2: EH complete

This is what it is in smartd.conf:
/dev/sda -d ata -a -o on -S on
/dev/sdb -d ata -a -o on -S on
/dev/sdc -d ata -a -o on -S on

Thanks, -Andrew

On 9/17/07, Bruce Allen [EMAIL PROTECTED] wrote:

Hi Andrew,

Please build the CVS version (unreleased) of smartmontools.  The versions
below are dated 2006/12/20 and 2006/04/12.  You need to build a code
version based on the past few weeks of code.

Cheers,
Bruce


