Data loss after power out - fsck: bad inode number to nextinode

2008-07-08 Thread Polytropon
Hi,

I have been in big trouble since last week: after a power outage my main
system would not boot anymore, so I checked its hard disk (FreeBSD
5.4) in my new system (FreeBSD 7.0).

I booted the system in single-user mode and ran fsck on the partitions.
/ on /dev/ad1s1a could be repaired, /var on 1d too; /usr on 1e lost
many directory entries (X11R6, for example), but all files and
directories were restored to lost+found. Okay, that's
how I know it should be. And it doesn't matter anyway, because everything
there can be reinstalled.

Problems occurred when checking /home on /dev/ad1s1f. After a lot of
messages like

1101472 DUP I=260035
UNEXPECTED SOFT UPDATE INCONSISTENCY

and

EXCESSIVE DUP BLKS I=260039
CONTINUE? yes

and

7310315658325879925 BAD I=260051
UNEXPECTED SOFT UPDATE INCONSISTENCY

fsck ended this way:

INCORRECT BLOCK COUNT I=290557 (3104 should be 736)
CORRECT? yes

fsck_4.2bsd: bad inode number 306176 to nextinode

The result: the home directories of all other users were present,
but mine (!) - /home/adec - was missing. To be a bit more
precise: when looking at the files using the Midnight Commander,
the name of my home directory was displayed, preceded by ? and
in red colour, with a strange date (the epoch?).

|?adec|  0|Jan  1  1970|

So I could not change into this directory and get my files out
of there.

In order not to damage the file system any further, I made a ddrescue
dump of the partition:

% ddrescue -d -r 3 -n /dev/ad1s1f home.ddrescue logfile

The data could be read without problems. The resulting file seemed
to be a 1:1 copy of the partition.

% file home.ddrescue
home.ddrescue: Unix Fast File system [v2] (little-endian) last mounted on /mnt,
last written at Wed Jul  2 18:51:06 2008,
clean flag 0,
readonly flag 0,
number of blocks 44322272,
number of data blocks 42925108,
number of cylinder groups 472,
block size 16384,
fragment size 2048,
average file size 16384,
average number of files in dir 64,
pending blocks to free 0,
pending inodes to free 0,
system-wide uuid 0,
minimum percentage of free blocks 8,
TIME optimization

When checking the image (attached as /dev/md10, see the mdconfig
command below) with

% fsck -t ufs -yf /dev/md10

fsck gives the same error message as above.

Then I mounted the image:

% sudo mdconfig -a -t vnode -u 10 -f home.ddrescue
% mount -t ufs -o ro /dev/md10 mnt

And guess what? Same problem: the directory name is shown, but I
cannot change into the directory.

But then, I noticed something interesting:

% df -h
Filesystem    Size    Used   Avail Capacity  Mounted on
/dev/md10      82G     75G    716M    99%    /export/home/adec/rescue/mnt

See the size differences? Something seems to be missing. I hope it
is the content of my home directory that's still on the disk. Some
checking:

% sudo du -sch mnt
du: mnt/adec: Bad file descriptor
du: mnt/archiv/cr/clips.w32/s01.wmv: Bad file descriptor
du: mnt/archiv/cr/clips.w32/s02.wmv: Bad file descriptor
 52G    mnt
 52G    total

This suggests that approx. 23 GB (75G used according to df, minus the
52G that du can reach) are still allocated but no longer reachable.
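
As a rough cross-check of these sizes, here is a small sketch in POSIX sh.
It assumes that the "number of blocks" and "number of data blocks" values
printed by file(1) are counted in fragments of 2048 bytes; that assumption
is not something the output itself states. The 75G and 52G figures are
copied from the df and du output above.

```shell
#!/bin/sh
# Values copied from the file(1) output of home.ddrescue.
frag_size=2048          # "fragment size"
total_frags=44322272    # "number of blocks" (ASSUMED to be in fragments)
data_frags=42925108     # "number of data blocks" (same assumption)

gib=$((1024 * 1024 * 1024))

# Convert a byte count to whole GiB, rounded to the nearest integer.
to_gib() { echo $(( ($1 + gib / 2) / gib )); }

partition_gib=$(to_gib $((total_frags * frag_size)))
data_gib=$(to_gib $((data_frags * frag_size)))

# Values copied from "df -h" (Used) and "du -sch mnt" (total).
used_gib=75
reachable_gib=52

echo "partition size:            ${partition_gib}G"
echo "data area:                 ${data_gib}G"
echo "allocated but unreachable: $((used_gib - reachable_gib))G"
```

Under that assumption the data area comes out at 82G, which matches the
Size column of df, and the gap between what df counts as used and what du
can still reach is 23G.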

% file mnt/adec
mnt/adec: cannot open `mnt/adec' (Bad file descriptor)

% cd mnt/adec
mnt/adec: Not a directory.

Before bothering anyone here on this list, I searched
the net and found that only one (!!!) person besides me seems to have had
this problem. And he got no help. Will I? =^_^=

Of course I took the time to read about the FFS architecture. If I
understood it correctly, fsck stops working, showing the informative
error message "bad inode number 306176 to nextinode", because it cannot
get the next inode from a linked list that represents the file
and directory hierarchy, so there must be a bad pointer. While the
names of the things represented by inodes reside within a data
structure at level N, the corresponding data entries reside at level
N + 1, where a pointer should lead to. This may explain why
the name adec is still in ad1s1f's root directory, while the data that
says "I'm a directory, this is my content" is no longer referenced.
So fsck cannot continue. The missing inodes need to be reconnected.
That's what lost+found usually contains: unreferenced
inodes that are not marked free. Their names are gone (N), but their
content is still there (N + 1), and the new file name is # plus
their inode number.

What should I do?

Help is VERY welcome! If you have any ideas what to do, I'd be glad
to save the money I would have to spend on sending the disk to a
data recovery service - 1000 Euro and more is nothing I can afford.
And when you're low on money, adequate tape backup systems are too
expensive (although such a device would be my first choice).

By the way, this must be the revenge of a 

Re: Data loss after power out - fsck: bad inode number to nextinode

2008-07-08 Thread Anish Mistry
On Tuesday 08 July 2008, Polytropon wrote:
 [...]

Re: Data loss after power out - fsck: bad inode number to nextinode

2008-07-08 Thread perryh
 What should I do?

In theory,

  clri {special-file} 306176

should wipe the inode containing the bad pointer and allow fsck to
continue, perhaps recovering the files pointed to by that directory
into lost+found.

Definitely try this on a copy first if at all possible.
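
A minimal sketch of what "try this on a copy" could look like, reusing the
mdconfig, fsck and mount invocations from the original post. The file name
home.copy, the md unit number 11 and the helper functions are made up for
illustration. By default the script only prints the commands (DRY_RUN=1);
nothing here is tested against the damaged file system.

```shell
#!/bin/sh
# Sketch only: operate on a second copy of the image, never on ad1s1f.
# home.copy, unit 11 and the function names are illustrative assumptions.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# try_clri IMAGE COPY UNIT INODE MOUNTPOINT
try_clri() {
    image=$1 copy=$2 unit=$3 inode=$4 mnt=$5
    run cp "$image" "$copy" &&                           # keep the dump intact
    run mdconfig -a -t vnode -u "$unit" -f "$copy" &&    # attach copy as mdN
    run clri "/dev/md$unit" "$inode" &&                  # clear the bad inode
    run fsck -t ufs -yf "/dev/md$unit" &&                # let fsck continue
    run mount -t ufs -o ro "/dev/md$unit" "$mnt"         # inspect lost+found
}

try_clri home.ddrescue home.copy 11 306176 mnt
```

Setting DRY_RUN=0 (as root) would actually run the steps. Keeping
home.ddrescue untouched means the attempt can be repeated if clri makes
things worse.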
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions