My e-mail server suffered a panic this afternoon. I'm not sure whether the
underlying problem is a bug in the RAID code or some sort of hardware
problem.
I am running raid0145-19990421-2.2.6 with kernel version 2.2.6. The system
has 3 disks of the same size: sda, sdb and sdc. Partitions on sda and sdc
are normally mirrored, and sdb provides spare partitions in case of disk
failure.
A few days ago, the following warnings started appearing in the logs:
May 1 12:49:31 postal kernel: EXT2-fs warning (device md(9,0)): ext2_free_blocks: bit already cleared for block 73716
May 2 10:23:18 postal kernel: EXT2-fs error (device md(9,0)): ext2_check_descriptors: Block bitmap for group 8 not in group (block 0)!
May 2 10:23:18 postal kernel: EXT2-fs: group descriptors corrupted !
e2fsck was run on May 2 to correct these problems.
Thinking that the messages I got today (EXT2-fs errors) would be logged, I
did not write down all of the error messages by hand; however, I did
scribble down much of the panic message:
load_block_bitmap: block_group >= groups_count - block_group = 524287,
group_count = 89.
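For reference, the group number in that panic message is far outside the valid range, and it is exactly 2^19 - 1, which looks more like a corrupted value than a real group index. The following is only a minimal sketch in Python of the kind of range check the message implies; the function names are hypothetical, and the filesystem geometry (1 KiB blocks, first data block 1, 8192 blocks per group) is an assumption, not something taken from this machine.

```python
# Hypothetical helpers; not the kernel's actual code.

def block_group_of(block, first_data_block, blocks_per_group):
    """Return the ext2 block group index that a block number falls into."""
    return (block - first_data_block) // blocks_per_group


def group_in_range(block_group, groups_count):
    """Return True if the group index is valid for this filesystem."""
    return 0 <= block_group < groups_count


GROUPS_COUNT = 89        # from the panic message
REPORTED_GROUP = 524287  # from the panic message

# The reported group fails the range check by a wide margin.
print(group_in_range(REPORTED_GROUP, GROUPS_COUNT))

# The value is all ones in its low 19 bits.
print(hex(REPORTED_GROUP), REPORTED_GROUP == 2**19 - 1)

# ASSUMED geometry: first data block 1, 8192 blocks per group.
# Under that assumption, the block from the May 1 warning maps to a group.
print(block_group_of(73716, first_data_block=1, blocks_per_group=8192))
```

If the assumed geometry happens to match, block 73716 from the May 1 warning would fall in group 8, the same group named in the May 2 error; with a different block size the result would differ.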
After I rebooted the machine, RAID reconstruction started, and I received
several messages such as:
md8 has overlapping physical units with md2!
md7 has overlapping physical units with md2!
md6 has overlapping physical units with md2!
[...]
md7 has overlapping physical units with md8!
md6 has overlapping physical units with md8!
I did not write down all the instances of this message.
After the rebuild, a couple of sdc partitions were removed, since the sdb
partitions were more "fresh."
I realize this report is pretty spotty. Please let me know what other
information I can provide to help narrow down what the problem could be.
I upgraded to 2.2.6 after my e-mail server crashed a couple of weeks ago.
Until then it had been running raid 1 + 2.1.115, and had run for several
months without a problem.
-- Paul ([EMAIL PROTECTED])