On Wed, Nov 26, 2014 at 02:18:12PM +1100, Paul Ripke wrote:
> On Tue, Nov 25, 2014 at 12:00:48PM +0100, Edgar Fuß wrote:
> > > Nope, it's inaccessible - again, it was the daily security reports
> > > that alerted me:
> > >
> > > slave:ksh$ find /home/tmp > /dev/null
> > > find: /home/tmp/badfile: Bad file descriptor
> > I remember having run into the same thing.
> > I simultaneously had another FS problem (the kernel repeatedly panic()ing
> > on a directory missing . and ..), so I may mix up the two.
> > I attributed both to a silently corrupted FFS (after mpt(4) problems which
> > made me add the timeout recovery logic to it).
> > Fortunately, for me, the problems disappeared after deleting the entry in
> > question (maybe the parent directory also, my memory is fuzzy).
> > I'm afraid the only clean way out is dump, newfs, restore (I'm very happy
> > I didn't had to do that).
>
> I remember something similar either mucking with either async mounts or
> in the early days of soft updates... specifically, fsck reported:
> CANNOT FIX, FIRST ENTRY IN DIRECTORY CONTAINS xxx
> and sure enough, the directory didn't have '.' and '..' as the first
> entries. fsdb to the rescue...
>
>
> > Is there an mpt(4) controller involved? If yes, did you get any timeouts on
> > it?
>
> Nope, no mpt(4). All boring directly connected SATA thru ahcisatai(4).
> However, I notice upon closer inspection, I did get a SATA timeout the
> night the corruption was noticed:
>
> Nov 23 03:47:10 slave /netbsd: wd0a: device timeout writing fsbn 2165563424 of 2165563424-2165563455 (wd0 bn 2165563488; cn 2148376 tn 7 sn 39), retrying
> Nov 23 03:47:15 slave /netbsd: ahcisata0 port 0: device present, speed: 3.0Gb/s
> Nov 23 03:47:15 slave /netbsd: wd0: soft error (corrected)
>
> Next is to figure out the offset of the corrupted inode and see if
> this is in the vicinity... (I doubt it - it's a 31 sector write, for
> starters).

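For the "is it in the vicinity" check, here is a rough sketch that pulls the sector range out of the timeout message and tests a candidate sector against it. It assumes that "fsbn" is relative to the wd0a partition and "wd0 bn" is the absolute block on the disk, both in the same sector units. The names parse_timeout and in_failed_range are made up for illustration, and the candidate sector is a placeholder: turning the bad inode into a sector number needs the filesystem geometry, which is not shown in this thread.

```python
import re

# Kernel message from the log above, rejoined onto one line.
LOG = ("wd0a: device timeout writing fsbn 2165563424 of "
       "2165563424-2165563455 (wd0 bn 2165563488; cn 2148376 tn 7 sn 39), "
       "retrying")

# Captures: failing block, first and last block of the transfer, disk block.
PATTERN = re.compile(r"fsbn (\d+) of (\d+)-(\d+) \(\w+ bn (\d+);")


def parse_timeout(line):
    """Return (failed, first, last, disk_bn) as integers from a timeout line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a recognised timeout line")
    failed, first, last, disk_bn = map(int, match.groups())
    return failed, first, last, disk_bn


def in_failed_range(sector, first, last):
    """True if a partition-relative sector lies inside the timed-out write."""
    return first <= sector <= last


failed, first, last, disk_bn = parse_timeout(LOG)
sectors = last - first + 1       # both endpoints are part of the transfer
part_offset = disk_bn - failed   # assumed start of wd0a on the disk, in sectors

print("sectors in transfer:", sectors)
print("assumed partition offset:", part_offset)
```

Once the inode's partition-relative sector is known, in_failed_range(sector, first, last) says whether it was inside the write that timed out.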
Also, the drive reported success for the retried write so I would expect
it to be OK.
    ahcisata0 port 0: device present, speed: 3.0Gb/s
means that the drive did get a soft reset. I hope this didn't cause
it to drop its cache content.

-- 
Manuel Bouyer <bou...@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference