Michelle Sullivan
http://www.mhix.org/
Sent from my iPad

> On 09 May 2019, at 19:41, Borja Marcos <bor...@sarenet.es> wrote:
> 
> 
> 
>> On 9 May 2019, at 00:55, Michelle Sullivan <miche...@sorbs.net> wrote:
>> 
>> 
>> 
>> This is true, but I agree with the ZFS devs that this might not be a good 
>> idea... if ZFS can’t work it out already, the best thing to do will probably 
>> be to get everything off it and reformat…  
> 
> That’s true, I would rescue what I could and create the pool again but after 
> testing the setup thoroughly.
> 
+1

> It would be worth having a look at the excellent guide offered by the 
> FreeNAS people. It’s full of excellent advice and a priceless list of 
> “don’ts” such as SATA port multipliers, etc. 
> 

Yeah, I already worked out over time that port multipliers can’t be good.

>> 
>>> That should not be hard to write if everything else on the disk has no
>>> issues. Didn't you say in another message that the system is now returning
>>> 100s of drive errors?
>> 
>> No, one disk in the 16-disk RAIDZ2 ...  previously unseen but it could be 
>> the errors have occurred in the last 6 weeks... every time I reboot it 
>> starts resilvering, gets to 761M resilvered and then stops.
> 
> That’s a really bad sign. It shouldn’t happen. 

That’s since the metadata corruption.  That is probably part of the problem.

> 
>>> How does that relate to the statement => Everything on
>>> the disk is fine except for a little bit of corruption in the freespace map?
>> 
>> Well, I think it goes through until it hits that little bit of corruption 
>> that stops it mounting...  then stops again..
>> 
>> I’m seeing 100s of hard errors at the beginning of one of the drives.. they 
>> were reported in syslog but only just, so it could be a new thing.  Could be 
>> previously undetected.. no way to know.
> 
> As for disk monitoring, smartmontools can be pretty good although only as an 
> indicator. I also monitor my systems using Orca (I wrote a crude “devilator” 
> many years ago) and I gather disk I/O statistics using GEOM of which the 
> read/write/delete/flush times are very valuable. An ailing disk can be 
> returning valid data but become very slow due to retries. 

Yes, though often these will show up in syslog (something I monitor 
religiously...  though I concede that when it hits syslog it’s probably already 
an urgent issue).
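
To make the point about a disk that still returns valid data but has gone 
slow a bit more concrete, here is a rough Python sketch of the kind of check 
I mean. It is only an illustration: the function name flag_slow_disks, the 
samples_ms layout and the factor of 3 are all made up for the example, the 
latency numbers are invented, and collecting the real per-disk read times 
(from GEOM statistics or similar) is left out entirely.

```python
from statistics import median


def flag_slow_disks(samples_ms, factor=3.0, floor_ms=1.0):
    """Return the names of disks whose median latency stands out from the pool.

    samples_ms is a hypothetical layout: a dict mapping a disk name to a list
    of per-interval read latencies in milliseconds. A disk is flagged when its
    median latency exceeds `factor` times the median of all disks' medians.
    `floor_ms` keeps a very fast, idle pool from flagging on tiny differences.
    """
    # Median per disk, so a single slow interval does not trigger a flag.
    per_disk = {name: median(vals) for name, vals in samples_ms.items() if vals}
    if len(per_disk) < 2:
        # Nothing to compare against with fewer than two disks.
        return []
    pool_median = max(median(per_disk.values()), floor_ms)
    return sorted(name for name, m in per_disk.items() if m > factor * pool_median)


if __name__ == "__main__":
    # Invented sample data: da2 answers correctly but needs many retries.
    samples = {
        "da0": [4.1, 3.9, 4.3],
        "da1": [4.0, 4.2, 4.4],
        "da2": [38.0, 52.5, 41.7],
        "da3": [3.8, 4.1, 4.0],
    }
    print(flag_slow_disks(samples))
```

Comparing each disk against its peers rather than against a fixed number is 
a deliberate choice here, since all members of a vdev should see broadly 
similar load. Like SMART, this is only an indicator and not a diagnosis.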


Michelle
_______________________________________________
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
