Michelle Sullivan
http://www.mhix.org/

Sent from my iPad
> On 09 May 2019, at 19:41, Borja Marcos <bor...@sarenet.es> wrote:
>
>> On 9 May 2019, at 00:55, Michelle Sullivan <miche...@sorbs.net> wrote:
>>
>> This is true, but I am of the thought, in alignment with the ZFS devs, that this
>> might not be a good idea... if ZFS can't work it out already, the best thing
>> to do will probably be to get everything off it and reformat…
>
> That's true, I would rescue what I could and create the pool again, but only after
> testing the setup thoroughly.

+1

> It would be worth having a look at the excellent guide offered by the
> FreeNAS people. It's full of excellent advice and a priceless list of
> "don'ts" such as SATA port multipliers, etc.

Yeah, I already worked out over time that port multipliers can't be good.

>>> That should not be hard to write if everything else on the disk has no
>>> issues. Didn't you say in another message that the system is now returning
>>> 100s of drive errors?
>>
>> No, one disk in the 16-disk RAID-Z2... previously unseen, but it could be that
>> the errors have occurred in the last 6 weeks... every time I reboot it
>> starts resilvering, gets to 761M resilvered and then stops.
>
> That's a really bad sign. It shouldn't happen.

That's since the metadata corruption. That is probably part of the problem.

>>> How does that relate to the statement => "Everything on the disk is fine
>>> except for a little bit of corruption in the freespace map"?
>>
>> Well, I think it goes through until it hits that little bit of corruption, which
>> stops it mounting... then stops again..
>>
>> I'm seeing 100s of hard errors at the beginning of one of the drives.. they
>> were reported in syslog but only just, so this could be a new thing. Could be
>> previously undetected.. no way to know.
>
> As for disk monitoring, smartmontools can be pretty good, although only as an
> indicator. I also monitor my systems using Orca (I wrote a crude "devilator"
> many years ago) and I gather disk I/O statistics using GEOM, of which the
> read/write/delete/flush times are very valuable. An ailing disk can be
> returning valid data but become very slow due to retries.

Yes, though often these will show up in syslog (something I monitor
religiously... though I concede that when it hits syslog it's probably
already an urgent issue).

Michelle
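
For context on the kind of monitoring Borja describes, here is a minimal
sketch (not from the thread, and not his actual tooling) that polls the SMART
overall-health verdict for a list of disks via smartmontools. The device list
and the way the output is summarised are illustrative assumptions; the
per-disk read/write/delete/flush latencies he mentions would come from GEOM
(e.g. gstat or the devstat layer), not from SMART.

    #!/usr/bin/env python3
    # Minimal sketch, illustrative only: check SMART overall health for a
    # fixed list of disks using smartmontools. Assumes smartctl is installed
    # and the script runs with enough privilege to query the drives.
    import subprocess

    DISKS = ["/dev/ada0", "/dev/ada1"]  # illustrative FreeBSD device names

    def smart_health(dev):
        """Run 'smartctl -H <dev>' and return its text output."""
        result = subprocess.run(["smartctl", "-H", dev],
                                capture_output=True, text=True)
        return result.stdout + result.stderr

    def main():
        for dev in DISKS:
            out = smart_health(dev)
            # For ATA drives smartctl prints a line like:
            #   SMART overall-health self-assessment test result: PASSED
            # Anything other than PASSED is worth a closer look.
            flag = "ok" if "PASSED" in out else "CHECK"
            print(f"{dev}: {flag}")

    if __name__ == "__main__":
        main()

As the thread notes, a drive can still report PASSED while retrying heavily,
so a latency-based check (for example, sampling gstat output periodically) is
a useful complement; the sketch above only covers the SMART side.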