Yes, indeed. Checking dmesg should be the first thing of course. And I do see errors there:
hammer2_bulkfree: Scanning BACKUP chain 0000004186d3400a.01 (Inode) meth=32 CHECK FAIL (flags=00144002, bref/data ac9f8ef29097a55b/05e63f87e50fb2e2)
   Resides at/in inode 49686
   In pfs UNKNOWN on device serno/WCJ35GE0.s1d

No CRC errors though, but I will check the media anyway.
Thanks for the hint on how to repair the directory.

While I'm here, an unrelated question. On my server I have a hardware RAID
controller (LSI MegaSAS Gen2) without a battery. By default its write cache was
enabled. Do I understand correctly that it's not safe to have the write cache
enabled unless there is a battery? What kind of cache is that anyway? I guess
if it's NVM then maybe it's OK?

Another thing: the drive write cache is set to "default", which might mean that
it's on too. The drives' cache is probably volatile memory, so it definitely
should be off, right?

-- 
Aleksej Lebedev

On Mon, Oct 12, 2020, at 20:19, Matthew Dillon wrote:
> Generally speaking this error occurs if a directory entry is present but the
> related inode cannot be found. You can use a hammer2 directive to destroy
> the directory entry to clean it up. But before you do so you want to check
> the media for CHECK FAIL errors.
>
> The easiest way to do this is to just read off the entire directory structure
> with tar, e.g. 'tar cf /dev/null filesystem' and then check the dmesg output
> for errors. 'dmesg | fgrep CHECK'. Something like that.
>
> If the filesystem appears clean other than the disconnected directory entry,
> then you can use 'hammer2 destroy filename' to destroy the directory entry.
> Be very careful when doing that.
>
> If the filesystem has other problems, such as CRC errors, other CHECK errors,
> etc.... then it is best to make a full backup and reformat.
>
> Also make sure that bulkfree runs don't have errors. 'hammer2 bulkfree ...'
> and then check dmesg output as well.
>
> --
>
> In terms of how a disconnected inode can happen. It has become more rare but
> it might still be possible if a power failure or panic occurs during heavy
> filesystem activity. It shouldn't be possible for CRC errors to occur unless
> the media itself corrupted the data.
>
> -Matt
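
For reference, the read-everything-then-check-dmesg step quoted above could be wrapped up roughly like this. It is only a sketch: the function names are made up for illustration, and the mount point is whatever the caller passes in. It uses nothing beyond the tar, dmesg and fixed-string grep commands already mentioned in the thread.

```shell
#!/bin/sh
# Hypothetical helper: print only the log lines that mention CHECK.
# Reads log text on stdin, so it also works on saved dmesg output.
check_fail_lines() {
    # grep exits non-zero when nothing matches; treat that as "no errors".
    grep -F 'CHECK' || true
}

# Hypothetical helper: read every file under the given mount point so the
# filesystem verifies its data, then show any CHECK lines in the kernel log.
scan_for_check_errors() {
    tar cf /dev/null "$1"
    dmesg | check_fail_lines
}
```

It would be called as 'scan_for_check_errors /path/to/mountpoint', with the path being a placeholder. Empty output means no CHECK lines are currently in the kernel message buffer; it does not prove the media is healthy.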
