Re: [OmniOS-discuss] How to disable ata module / driver at boot

2014-03-31 Thread John D Groenveld
In message <7409d33d8efc08eccda1cecdc31bd7ea.squir...@emailmg.netfirms.com>, ste...@linuxsuite.org writes:
> May not be related, but I would like to reboot so that OmniOS does not
> see the device by not loading the driver / module. I do not need the
> device after system install.

disable-ata=true
http://permalink.gmane.org/gmane.os.solaris.opensolaris.indiana/8851

John
groenv...@acm.org
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


[OmniOS-discuss] How to disable ata module / driver at boot

2014-03-31 Thread steve

  Howdy!

 I have omnios running on Dell R710, and get these warnings for
device ata0. This device is a TEAC DVD ROM.

kern.warning<4>: Nov 11 09:30:05 dfs2 #011timeout: reset target, target=0 lun=0
kern.warning<4>: Nov 11 09:30:05 dfs2 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci-ide@1f,2/ide@0 (ata0):
kern.warning<4>: Nov 11 09:30:05 dfs2 #011timeout: reset bus, target=0 lun=0
kern.info<6>: Nov 11 09:35:56 dfs2 pci_autoconfig: [ID 595143 kern.info] NOTICE: add io-range on subtractive ppb[0/1e/0]: 0x3000 ~ 0x3fff

 Then the system hangs and needs to be power cycled.

kern.info<6>: Nov 11 09:35:56 dfs2 genunix: [ID 936769 kern.info] pseudo0 is /pseudo

  May not be related, but I would like to reboot so that OmniOS does not
see the device by not loading the driver / module. I do not need the
device after system install.

 What is the best way to do this?

 thanx - steve





Re: [OmniOS-discuss] zpool degraded while smart sais disks are OK

2014-03-31 Thread Tobias Oetiker
Hi Richard,

Mar 23 Richard Elling wrote:

>
> On Mar 21, 2014, at 10:13 PM, Tobias Oetiker wrote:
>
> > Yesterday Richard Elling wrote:
> >
> >>
> >> On Mar 21, 2014, at 3:23 PM, Tobias Oetiker wrote:
> >
> > [...]
> >>>
> >>> it happened over time as you can see from the timestamps in the
> >>> log. The errors from zfs's point of view were 1 read and about 30 write
> >>>
> >>> but according to smart the disks are without flaw
> >>
> >> Actually, SMART is pretty dumb. In most cases, it only looks for
> >> uncorrectable errors that are related to media or heads. For a clue to
> >> more permanent errors, you will want to look at the read/write error
> >> reports for errors that are corrected with possible delays. You can
> >> also look at the grown defects list.
> >>
> >> This behaviour is expected for drives with errors that are not being
> >> quickly corrected or have firmware bugs (horrors!) and where the disk
> >> does not do TLER (or its vendor's equivalent)
> >> -- richard
> >
> > the error counters look like this:
> >
> >
> > Error counter log:
> >            Errors Corrected by           Total   Correction     Gigabytes    Total
> >                ECC          rereads/    errors   algorithm      processed    uncorrected
> >            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
> > read:   34940 0  3494  44904530.879   0
> > write:         0        0         0         0      39111       1793.323           0
> > verify:        0        0         0         0       8133          0.000           0
>
> Errors corrected without delay looks good. The problem lies elsewhere.
>
> >
> > the disk vendor is HGST in case anyone has further ideas ... the system
> > has 20 of these disks and the problems occurred with three of them. The
> > system has been running fine for two months previously.
>
> ...and yet there are aborted commands, likely due to a reset after a timeout.
> Resets aren't issued without cause.
>
> There are two different resets issued by the sd driver: LU and bus. If the
> LU reset doesn't work, the resets are escalated to bus. This is, of course,
> tunable, but is rarely tuned. A bus reset for SAS is a questionable practice,
> since SAS is a fabric, not a bus. But the effect of a device in the fabric
> being reset could be seen as aborted commands by more than one target. To
> troubleshoot these cases, you need to look at all of the devices in the data
> path and map the common causes: HBAs, expanders, enclosures, etc. Traverse
> the devices looking for errors, as you did with the disks. Useful tools:
> sasinfo, lsiutil/sas2ircu, smp_utils, sg3_utils, mpathadm, fmtopo.
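
The "map the common causes" step above can be sketched in a few lines of
Python. This is only an illustration: the disk, HBA, expander and enclosure
names are made up, and the topology table would have to be filled in by hand
from whatever the tools listed above report. The function lists every
component that all of the affected disks sit behind, most specific first
(fewest healthy disks sharing it).

```python
from collections import defaultdict

# Hypothetical topology: the components each disk sits behind.
# All names are invented for illustration only.
DATA_PATHS = {
    "disk01": {"hba": "hba0", "expander": "exp0", "enclosure": "enc0"},
    "disk02": {"hba": "hba0", "expander": "exp0", "enclosure": "enc0"},
    "disk03": {"hba": "hba0", "expander": "exp0", "enclosure": "enc0"},
    "disk04": {"hba": "hba0", "expander": "exp0", "enclosure": "enc0"},
    "disk05": {"hba": "hba0", "expander": "exp1", "enclosure": "enc1"},
    "disk06": {"hba": "hba0", "expander": "exp1", "enclosure": "enc1"},
}


def suspect_components(paths, affected):
    """Return (kind, name, healthy_count) for each component that lies in
    the data path of every affected disk, sorted so that the component
    shared with the fewest healthy disks comes first."""
    behind = defaultdict(set)
    for disk, path in paths.items():
        for kind, name in path.items():
            behind[(kind, name)].add(disk)

    affected = set(affected)
    suspects = [
        (kind, name, len(disks - affected))
        for (kind, name), disks in behind.items()
        if affected <= disks
    ]
    return sorted(suspects, key=lambda s: s[2])


if __name__ == "__main__":
    for kind, name, healthy in suspect_components(
        DATA_PATHS, ["disk01", "disk02", "disk03"]
    ):
        print(f"{kind} {name}: shared with {healthy} healthy disk(s)")
```

A component shared with few healthy disks is a more likely common cause than
one that every disk in the system sits behind, but this only narrows down
where to look for errors; it does not prove anything by itself.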

thanks for the hints ... after detaching/attaching the 'failed'
disks, they got resilvered and a subsequent scrub did not detect
any errors ...

all a bit mysterious ... will keep an eye on the box to see how it
fares in the future ...

cheers
tobi


-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
www.oetiker.ch t...@oetiker.ch +41 62 775 9902
*** We are hiring IT staff: www.oetiker.ch/jobs ***