On Tue, 2009-11-10 at 12:10 -0500, Erik Garrison wrote:
> I recently encountered JFS filesystem corruption on a system which
> suffered memory corruption. I was unaware of this failure until
> fsck's of the filesystem in question failed and caused system
> instability. I ran memtest86+ and discovered that several bits of RAM
> had failed. The error messages from the first fsck failure were
> roughly "duplicate filehandle", but I don't have logs so I can't
> provide an exact report.
>
> I removed the bad RAM and began efforts to recover the system. I then
> booted the system using an Ubuntu Karmic live CD and tried to back up
> the data via a simple cp -a <src> <dest>. This failed upon reaching
> one of the corrupted files, and additionally left the target (also
> JFS) filesystem damaged. I had to reformat the target filesystem and
> try again.
>
> I was able to use ls -Rl to discover which files had been lost. But
> it provided very cryptic error messages:
>
> .....
> ls: cannot access ./lib/firmware/2.6.28-3-rt/emi62: Stale NFS file handle
> ls: cannot access ./lib/firmware/2.6.28-3-rt/korg: Stale NFS file handle
> ls: cannot access ./lib/firmware/2.6.28-3-rt/sun: Stale NFS file handle
> ls: cannot access ./lib/linux-restricted-modules/2.6.28-15-generic/wlan: Stale NFS file handle
> ls: cannot access ./lib/modules/2.6.27-14-generic/kernel/drivers/input/joystick/interact.ko: Stale NFS file handle
> ls: cannot access ./lib/modules/2.6.27-14-generic/kernel/drivers/input/joystick/joydump.ko: Stale NFS file handle
> ls: cannot access ./lib/modules/2.6.27-14-generic/kernel/drivers/input/joystick/magellan.ko: Stale NFS file handle
> ls: cannot access ./lib/modules/2.6.27-14-generic/kernel/drivers/input/joystick/sidewinder.ko: Stale NFS file handle
> .....
>
> To my knowledge none of these files were ever mounted via NFS.

JFS is incorrectly returning -ESTALE in some situations where the
metadata isn't what is expected. That error code should really only be
used by NFS or by interfaces that NFS uses. It's a bug, but not a
serious one, since the fix is simply to return a different, less
cryptic error code.

> I used this list of failures as an exclusion list for rsync and was
> then able to save almost everything from the ro-mounted filesystem.
>
> I am waiting to reformat the partition because I wanted to file a bug
> report to document the case. Are there any recommendations as to what
> data I should try to save from the disk and how I should find it? I
> want to complete this step quickly as I need to use the system for
> work purposes.

Thanks for reporting the bug. There really isn't a need to preserve the
damaged file system. The bugs in the code can easily be found by
grepping for ESTALE.

Thanks,
Shaggy
-- 
David Kleikamp
IBM Linux Technology Center

_______________________________________________
Jfs-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/jfs-discussion
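
For anyone landing in the same situation, the rsync workaround described
in the quoted report could be scripted roughly as follows. This is only a
sketch, not what Erik actually ran: the function name and the file names
in the usage note are hypothetical, and it assumes the ls error lines look
exactly like the ones quoted in this thread (unquoted paths starting with
"./", one error per line). Paths containing rsync wildcard characters such
as "*", "?" or "[" would need escaping, which this sketch does not do.

```python
import re

# Matches ls error lines in the form quoted in this thread, e.g.
#   ls: cannot access ./lib/firmware/2.6.28-3-rt/korg: Stale NFS file handle
# "NFS " is optional because some C libraries print "Stale file handle".
LS_ERROR = re.compile(
    r"^ls: cannot access (?P<path>.+): Stale (?:NFS )?file handle$"
)


def exclusions_from_ls_errors(lines):
    """Return rsync exclude patterns for every path ls could not access.

    Hypothetical helper: a leading "./" becomes "/", which rsync treats
    as anchored at the root of the transfer.
    """
    patterns = []
    for line in lines:
        match = LS_ERROR.match(line.strip())
        if match is None:
            continue  # ordinary ls output, not an error line
        path = match.group("path")
        if path.startswith("./"):
            path = path[1:]  # "./lib/x" -> "/lib/x"
        patterns.append(path)
    return patterns
```

One way to use it: capture the errors with "ls -Rl . 2> ls-errors.txt" from
the root of the read-only mount, write the returned patterns one per line
to a file such as exclude.txt, and pass that file to rsync with
--exclude-from=exclude.txt while copying from that same directory.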
