Re: [reiserfs-list] Filesystem corruption after resize

2002-06-12 Thread Baldur Norddahl

Quoting Vitaly Fertman ([EMAIL PROTECTED]):
> Hi,
>
> > Hello,
> > The exact commands used are:
> >
> > resize_reiserfs -s 400G /dev/vg01/stuff
> > lvreduce -l 16693 /dev/vg01/stuff
> > pvmove -v /dev/md1
> > vgreduce -v vg01 /dev/md1
> > resize_reiserfs /dev/vg01/stuff
> > reiserfsck --check /dev/vg01/stuff
> >
> > This all worked like a charm, until I noticed that a nightly script that
> > scans all files, no longer was able to access about 20 files (access denied
> > even though the script is running as root).
>
> Do you mean reiserfsck finished without any error/warning message?

Yes, it did not detect any errors after the resize. The errors turned up a
day after. So it might not be 100% that those two events are linked. But
since nothing else was done that could justify corruptions, that is the
theory I am working on.

> The progs I sent you are what is going to be the next release.
> Please run --check and tell me what is in fsck.log. You can run
> --fix-fixable if it says so, but it would be better to run
> --rebuild-tree on a copy (it is not a release). Or you can do the following:
>
> debugreiserfs/debugreiserfs -p /dev/vg01/stuff | gzip -c > stuff.gz
>
> It will pack the metadata (without file bodies); I will download it and test
> locally.

I will send you those two files in a separate mail.

I copied all the data over to the other raid device, so I am not so much
concerned about rescuing the filesystem - I could just reformat the whole
thing and copy the files back.

But I would very much like to find out what happened so I can take actions
to prevent it from happening again. Particularly I need to know if resizing
on lvm devices is working properly, since I will need to resize again
shortly when the replacement disk arrives.

Baldur



[reiserfs-list] Filesystem corruption after resize

2002-06-11 Thread Baldur Norddahl

Hello,

First something about my setup:

md0: 8x80 GB in a RAID5 configuration
md1: 4x160 GB in a RAID5 configuration
/dev/vg01/stuff: the union of md0 and md1 done with lvm.

dark:/mnt# reiserfsck -V

-reiserfsck, 2002-
reiserfsprogs 3.x.1a

dark:/mnt# resize_reiserfs -v

-resize_reiserfs, 2002-
reiserfsprogs 3.x.1a

Usage: resize_reiserfs  [-s[+|-]#[G|M|K]] [-fqv] device

dark:/mnt# cat /proc/version 
Linux version 2.4.18 (root@dark) (gcc version 2.95.4 20011006 (Debian
prerelease)) #1 SMP Fri Apr 12 13:40:03 CEST 2002

The system is a dual AMD Athlon(tm) MP 1800+ (1533 MHz), with 1 GB memory.

Now recently one of the 160 GB disks died. Since I still had enough free
space and I wanted to preserve the redundancy, I used resize_reiserfs to
shrink the filesystem. Then I used lvm to move it away from the
non-redundant md1 device.

The exact commands used are:

resize_reiserfs -s 400G /dev/vg01/stuff
lvreduce -l 16693 /dev/vg01/stuff
pvmove -v /dev/md1
vgreduce -v vg01 /dev/md1
resize_reiserfs /dev/vg01/stuff
reiserfsck --check /dev/vg01/stuff
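
The sizes in this sequence can be cross-checked with a little arithmetic. The
sketch below is Python and is not part of the original commands. It assumes a
32 MiB physical extent size and that "400G" means 400 GiB; neither is stated
here, so both should be replaced with what vgdisplay and resize_reiserfs
actually report.

```python
# Assumptions (not stated in the thread): extent size and the meaning of "G".
PE_SIZE_BYTES = 32 * 1024**2      # assumed LVM physical extent size (32 MiB)
GIB = 1024**3                     # assumed meaning of the "G" suffix


def lv_size_bytes(extents, pe_size=PE_SIZE_BYTES):
    """Size of a logical volume made of `extents` physical extents."""
    return extents * pe_size


def shrink_is_safe(fs_bytes, lv_bytes):
    """The volume must never become smaller than the filesystem on it."""
    return lv_bytes >= fs_bytes


fs_bytes = 400 * GIB              # resize_reiserfs -s 400G
lv_bytes = lv_size_bytes(16693)   # lvreduce -l 16693

print(f"filesystem: {fs_bytes / GIB:.2f} GiB")
print(f"volume:     {lv_bytes / GIB:.2f} GiB")
print("safe to reduce:", shrink_is_safe(fs_bytes, lv_bytes))
```

Under these assumptions the reduced volume is still larger than the shrunken
filesystem, so the lvreduce step alone would not have cut off any part of it.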

This all worked like a charm, until I noticed that a nightly script that
scans all files, no longer was able to access about 20 files (access denied
even though the script is running as root).

Dmesg is full of this:

vs-5150: search_by_key: invalid format found in block 66153. Fsck?
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
data of [163330 163334 0x0 SD]
is_leaf: free space seems wrong: level=1, nr_items=1, free_space=3040 rdkey 
vs-5150: search_by_key: invalid format found in block 72879. Fsck?
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
data of [168724 168732 0x0 SD]
is_tree_node: node level 29122 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 70647. Fsck?
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
data of [167220 167223 0x0 SD]
is_tree_node: node level 2 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 66153. Fsck?
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
data of [163330 163334 0x0 SD]

and so on; there is a lot of this stuff repeating.
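
Because the same messages repeat, it helps to reduce them to the set of
distinct block numbers. The following is a minimal Python sketch that counts
the vs-5150 reports per block; the regular expression is written against the
lines quoted above only, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches the block number in lines such as
# "vs-5150: search_by_key: invalid format found in block 66153. Fsck?"
BLOCK_RE = re.compile(r"vs-5150: .*?invalid format found in block (\d+)")


def count_bad_blocks(log_text):
    """Return a Counter mapping block number -> number of vs-5150 reports."""
    return Counter(int(m.group(1)) for m in BLOCK_RE.finditer(log_text))


# The four vs-5150 lines from the excerpt above.
sample = """\
vs-5150: search_by_key: invalid format found in block 66153. Fsck?
vs-5150: search_by_key: invalid format found in block 72879. Fsck?
vs-5150: search_by_key: invalid format found in block 70647. Fsck?
vs-5150: search_by_key: invalid format found in block 66153. Fsck?
"""

for block, hits in sorted(count_bad_blocks(sample).items()):
    print(block, hits)
```

Feeding it the full dmesg output instead of the sample would show whether the
damage is limited to a handful of blocks or spread more widely.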

reiserfsck --fix-fixable /dev/vg01/stuff crashes.

Btw., a separate problem: I am never able to unmount this filesystem
properly. I always get this error:

dark:/mnt# umount stuff
umount: /mnt/stuff: device is busy
dark:/mnt# fuser -v stuff

                     USER        PID ACCESS COMMAND
stuff                root     kernel mount  /mnt/stuff

So without rebooting I can't quote the exact output from --fix-fixable, but
it is approximately the same as when I just run it plain:

dark:/mnt# reiserfsck -l /root/reiserfsck.log /dev/vg01/stuff

-reiserfsck, 2002-
reiserfsprogs 3.x.1a

Will read-only check consistency of the filesystem on /dev/vg01/stuff
Will put log info to '/root/reiserfsck.log'

Do you want to run this program?[N/Yes] (note need to type Yes):Yes
###
reiserfsck --check started at Tue Jun 11 16:36:38 2002
###
Filesystem seems mounted read-only. Skipping journal replay..
Checking S+tree../  4 (of   6)/ 27 (of 132)/ 44 (of 152)bit 1359513587,
bitsize 136749056
reiserfsck: bitmap.c:168: reiserfs_bitmap_test_bit: Assertion `bit_number <
bm->bm_bit_size' failed.
Aborted
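
The two numbers printed just before the abort can be compared directly. The
Python sketch below assumes a 4096-byte filesystem block and a 32 MiB LVM
physical extent; neither value is stated in this mail, they are simply the
values that make the figures line up. The range check is what the failed
assertion appears to test.

```python
BLOCK_SIZE = 4096                 # assumed filesystem block size in bytes
PE_SIZE_BYTES = 32 * 1024**2      # assumed LVM physical extent size (32 MiB)

bit_number = 1359513587           # "bit" from the reiserfsck output
bm_bit_size = 136749056           # "bitsize" from the reiserfsck output
lv_extents = 16693                # from "lvreduce -l 16693"


def bit_in_range(bit, size):
    """A block number is valid only if it falls inside the bitmap."""
    return 0 <= bit < size


# Number of filesystem blocks that fit in the reduced logical volume.
blocks_in_lv = lv_extents * PE_SIZE_BYTES // BLOCK_SIZE

print("bit within bitmap:", bit_in_range(bit_number, bm_bit_size))
print("bitmap matches volume size:", blocks_in_lv == bm_bit_size)
print("bit / bitsize:", bit_number / bm_bit_size)
```

With these assumptions the bitmap size equals the block count of the
16693-extent volume, while the block number being tested is nearly ten times
larger. That would be consistent with a tree node pointing far beyond the end
of the filesystem rather than with a wrong filesystem size.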


What can I do to resolve this?

Thanks,
  Baldur