I should note here that salvage is like fsck: most Unix-like systems force you to run fsck periodically to catch any incidental filesystem damage that may have occurred. Hard drives are far from perfect; having worked with a number of storage researchers over the past 10+ years, I have learned that undetected disk errors are *much* more common than most people realize. (ZFS, for example, can do continuous background checking to try to catch and fix these kinds of disk errors before they destroy data.) OpenAFS does not currently force periodic salvages, but perhaps it should, because running fsck alone is not enough: fsck will catch problems with the filesystem structure, but it won't catch logical inconsistencies within AFS's metadata. Unfortunately, salvaging volumes can take a long time. The demand attach fileserver in 1.6 can help by deferring the salvage of a volume until that volume is actually used, instead of keeping the entire fileserver offline until all volumes have been salvaged.
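
As a rough illustration of what a periodic salvage might look like, here is a minimal sketch that only builds `bos salvage` command lines for a scheduler (cron or similar) to run. The server name `fs1.example.org`, the partition list, and the helper `salvage_command` are placeholders of mine, not anything from this cell. It assumes the standard `-server`, `-partition`, `-volume`, and `-all` options of `bos salvage`; check the man page for your release, and remember you need admin tokens (or `-localauth` on the server itself) to actually run it.

```python
import shlex


def salvage_command(server, partition=None, volume=None):
    """Build an argv list for `bos salvage`. Nothing is executed here."""
    if volume is not None and partition is None:
        raise ValueError("a single-volume salvage also needs its partition")
    argv = ["bos", "salvage", "-server", server]
    if partition is None:
        # No partition given: ask for every partition on the server.
        argv.append("-all")
    else:
        argv += ["-partition", partition]
    if volume is not None:
        argv += ["-volume", volume]
    return argv


if __name__ == "__main__":
    # Hypothetical server and partitions; substitute your own.
    for part in ("/vicepa", "/vicepb"):
        print(shlex.join(salvage_command("fs1.example.org", part)))
```

Printing the commands rather than running them keeps the sketch harmless; salvaging a whole partition or server takes volumes offline while it runs, so you would want to schedule the real thing for a quiet period.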
In short, these kinds of things *will* occur over time if you don't salvage periodically.

-----Original Message-----
From: openafs-info-ad...@openafs.org [mailto:openafs-info-ad...@openafs.org] On Behalf Of Lars Schimmer
Sent: Thursday, June 13, 2013 3:00 AM
To: openafs-info@openafs.org
Subject: Re: [OpenAFS] Salvaging user volumes

On 2013-06-13 03:24, Garance A Drosihn wrote:
> Hi.
>
> We have an odd situation come up in our AFS cell, and I'm not sure
> what I need to do to correct it.
> (aside: this is the first time I've had to salvage any AFS volumes in
> the few years that I've been responsible for our AFS cell, and I can't
> remember any time in the last 12 years that a volume has shown up in
> this state)

First, a few obvious things:
1. 1.4.6 is kind of old; 1.6.2 is recent. 1.4.6 should still work fine, though.
2. Only one RW copy of a user's home volume? No RO copy as a lazy man's backup?
3. bos salvage should be run from time to time (once every 6 months against the partitions); that's at least my experience.

Now for some specifics:

idec failed. inode 2308463244341149695 errno 22

I do not know that error code. It looks like one of iinc or idec (increment or decrement an inode's link count). And that is a very big inode number. Too many files on that partition? Too many inodes? (A bos salvage of the server partition would maybe have cleaned up a few unused ones.)

Salvaged user.92602 (537480981): 0 files, 0 blocks

An empty volume?

The second one:

No applicable vice inodes on vicepb; not salvaged
Temporary file /vicepb/salvage.inodes.vicepb.30835 is missing...

It looks like it does not find any data for that volume on that partition?

Best regards,
Lars Schimmer
--
-------------------------------------------------------------
TU Graz, Institut für ComputerGraphik & WissensVisualisierung
Tel: +43 316 873-5405
E-Mail: l.schim...@cgv.tugraz.at
Fax: +43 316 873-5402
PGP-Key-ID: 0x4A9B1723
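
On the error code in the quoted salvager output: errno 22 is EINVAL ("Invalid argument") on Linux and other POSIX systems. The sketch below looks that up, and also splits the reported inode number into two 32-bit halves. The split is only my guess at how the number might be packed, not a statement about the on-disk format, and the helper names are made up for illustration. For what it's worth, the upper half comes out exactly two above the volume ID printed on the "Salvaged user.92602" line.

```python
import errno
import os

INODE = 2308463244341149695   # from the "idec failed" line
VOLUME_ID = 537480981         # from the "Salvaged user.92602" line


def describe_errno(code):
    """Return the symbolic name and the platform's message for an errno."""
    return errno.errorcode.get(code, "unknown"), os.strerror(code)


def split_inode(inode):
    """Split a 64-bit number into (upper 32 bits, lower 32 bits).

    Assumption: treating the halves as separate fields is illustrative only.
    """
    return inode >> 32, inode & 0xFFFFFFFF


if __name__ == "__main__":
    print(describe_errno(22))
    high, low = split_inode(INODE)
    print(high, hex(low), high - VOLUME_ID)
```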