Re: [CentOS] Server offline :-( please help to repair software RAID
On 1.5.2011 08:52, Alexander Farber wrote:
> Hello Mark and others,
>
> On Thu, Apr 28, 2011 at 10:21 PM, wrote:
>> At this point, I'd run the long test on each drive, and (after coming
>> back an hour or two later) see the results.
>
> I have that dreaded warning again -
>
> /etc/cron.weekly/99-raid-check:
> WARNING: mismatch_cnt is not 0 on /dev/md0

This does not necessarily mean that something is wrong. Writes do not
occur on both disks at exactly the same time; there is a short timespan
where data is written to disk A but not yet to disk B, so it is possible
that two mirrored blocks hold different data.

http://marc.info/?l=linux-raid&m=117555829807542&w=2
http://marc.info/?l=linux-raid&m=117304688617916&w=2
https://bugzilla.redhat.com/show_bug.cgi?id=566828

> By the "long tests" do you mean some Linux command
> I could run while booted in "rescue mode"?

$ smartctl -t long /dev/sdX

No need for rescue mode.
--
Kind Regards,
Markus Falb
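A minimal sketch of the sequence Markus describes, run from the normal
system (the device names /dev/sda and /dev/sdb are taken from the outputs
posted elsewhere in this thread; the sysfs paths are the standard md
interface, shown here for md0):

  # Start the long (extended) SMART self-test on each member disk;
  # the test runs in the background inside the drive itself.
  smartctl -t long /dev/sda
  smartctl -t long /dev/sdb

  # An hour or two later, read each drive's self-test log for the verdict.
  smartctl -l selftest /dev/sda
  smartctl -l selftest /dev/sdb

  # Separately, have md recount the mirror mismatches; when the check
  # finishes, re-read the count.
  echo check > /sys/block/md0/md/sync_action
  cat /sys/block/md0/md/mismatch_cnt

  # If mismatches persist and both disks test healthy, a repair pass
  # rewrites the out-of-sync blocks.
  echo repair > /sys/block/md0/md/sync_action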
Re: [CentOS] Server offline :-( please help to repair software RAID
Hello Mark and others,

On Thu, Apr 28, 2011 at 10:21 PM, wrote:
> At this point, I'd run the long test on each drive, and (after coming
> back an hour or two later) see the results.

I have that dreaded warning again -

/etc/cron.weekly/99-raid-check:
WARNING: mismatch_cnt is not 0 on /dev/md0

By the "long tests" do you mean some Linux command
I could run while booted in "rescue mode"?

Or do you mean inserting a Seagate/WD/whatever CD?
(Because the Strato.de people refuse to do the latter - I only pay
EUR 29 + 59/month, locked in until Dec.; why would they do anything
for me /sarcasm)

Regards
Alex

PS: below is my disk info:

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      1023936 blocks [2/2] [UU]

md2 : active raid1 sdb5[1] sda5[0]
      277728192 blocks [2/2] [UU]

md3 : active raid1 sdb6[1] sda6[0]
      185151360 blocks [2/2] [UU]

md1 : active raid1 sdb3[1] sda3[0]
      20479936 blocks [2/2] [UU]

unused devices: <none>

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md1               20G  1.7G   17G   9% /
/dev/md3              176G  6.2G  161G   4% /var
/dev/md0              993M   42M  901M   5% /boot
/dev/md2              263G  2.0G  248G   1% /home
tmpfs                 2.0G     0  2.0G   0% /dev/shm
Re: [CentOS] Server offline :-( please help to repair software RAID
Alexander Farber wrote:
> Turned out smartd kept saying that it had no entries in smartd.conf.
>
> I've copied smartd.rpmnew over smartd.conf and restarted it;
> now I have (in /var/log/messages, date+hostname removed):

At this point, I'd run the long test on each drive and (after coming
back an hour or two later) see the results.

mark "yes, it does take that long"
Re: [CentOS] Server offline :-( please help to repair software RAID
Turned out smartd kept saying that it had no entries in smartd.conf.

I've copied smartd.rpmnew over smartd.conf and restarted it;
now I have (in /var/log/messages, date+hostname removed):

smartd version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Opened configuration file /etc/smartd.conf
Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
Problem creating device name scan list
Device: /dev/sda, opened
Device /dev/sda: using '-d sat' for ATA disk behind SAT layer.
Device: /dev/sda, opened
Device: /dev/sda, not found in smartd database.
Device: /dev/sda, is SMART capable. Adding to "monitor" list.
Device: /dev/sdb, opened
Device /dev/sdb: using '-d sat' for ATA disk behind SAT layer.
Device: /dev/sdb, opened
Device: /dev/sdb, not found in smartd database.
Device: /dev/sdb, is SMART capable. Adding to "monitor" list.
Monitoring 0 ATA and 2 SCSI devices
smartd has fork()ed into background mode. New PID=3427.

And /etc/smartd.conf contains:

DEVICESCAN -H -m root

and the rest are comments. Do you think it is configured okay this way?

My disk info is:

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md1               20G  1.7G   17G   9% /
/dev/md3              176G  7.0G  160G   5% /var
/dev/md0              993M   42M  901M   5% /boot
/dev/md2              263G  2.0G  248G   1% /home
tmpfs                 2.0G     0  2.0G   0% /dev/shm

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      1023936 blocks [2/2] [UU]

md2 : active raid1 sdb5[1] sda5[0]
      277728192 blocks [2/2] [UU]

md3 : active raid1 sdb6[1] sda6[0]
      185151360 blocks [2/2] [UU]

md1 : active raid1 sdb3[1] sda3[0]
      20479936 blocks [2/2] [UU]

unused devices: <none>
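For comparison, the DEVICESCAN line above only checks overall health (-H)
and mails root on failure. A hedged sketch of a more thorough line, based
on the directive syntax documented in smartd.conf(5) - the options and
the self-test schedule here are illustrative, not what this CentOS release
necessarily shipped:

  # Monitor all SMART attributes (-a), mail root on trouble (-m), and let
  # smartd schedule self-tests itself (-s): a short test every day at
  # 02:00 and a long test every Saturday at 03:00.
  DEVICESCAN -a -m root -s (S/../.././02|L/../../6/03)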
Re: [CentOS] Server offline :-( please help to repair software RAID
On Thu, 2011-04-28 at 21:52 +0200, Alexander Farber wrote:
> On the 2nd try it has booted and seems to work.

Did it give an error on the first try, and if so, which one?

You should check /var/log/messages for I/O errors and check your disks
with smartctl.

I have had my RAID1 arrays rebuild sometimes without a (known to me)
reason. I even had a defective network card kernel-panic the machine for
two hours, and the RAIDs were still working afterwards ;)

Regards,

Michel
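A minimal sketch of those two checks (the grep pattern and the attribute
names are illustrative; they vary by drive and kernel):

  # Scan the kernel log for disk I/O trouble on the mirror members.
  grep -iE 'i/o error|ata[0-9]|sd[ab]' /var/log/messages | tail -n 50

  # Overall SMART health verdict, then the attributes that most often
  # precede a failure (reallocated, pending, uncorrectable sectors).
  smartctl -H /dev/sda
  smartctl -A /dev/sda | grep -iE 'realloc|pending|uncorrect'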
Re: [CentOS] Server offline :-( please help to repair software RAID
on 4/28/2011 12:40 PM Alexander Farber spake the following:
> Thank you all, it seems to have finished - I'm rebooting.
>
> Just curious, why is the State of md3 "active" while the others are "clean"?

If I remember right, "clean" means it is completely synced and not being
written to or mounted. "Active" means it is or has been written to and
is synced. Usually "dirty" means that there is un-synced data on one or
the other drive.
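A quick way to see the state Scott is describing, as a sketch (md3 is
the array from this thread; the sysfs array_state file is part of the
standard md interface on reasonably recent kernels):

  # Per-array state as mdadm reports it...
  mdadm -D /dev/md3 | grep 'State :'

  # ...and the kernel's own view (clean, active, etc.) via sysfs.
  cat /sys/block/md3/md/array_state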
Re: [CentOS] Server offline :-( please help to repair software RAID
Alexander Farber wrote:
> On the 2nd try it has booted and seems to work.
>
> The /var/log/mcelog is (and was) empty.

To be expected - mcelog records machine-check (CPU/memory) errors, and
this looks like a hard-drive error. Check your logfiles for info from
smartd.

mark
Re: [CentOS] Server offline :-( please help to repair software RAID
On the 2nd try it has booted and seems to work.

The /var/log/mcelog is (and was) empty.

# sudo cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      1023936 blocks [2/2] [UU]

md2 : active raid1 sdb5[1] sda5[0]
      277728192 blocks [2/2] [UU]

md3 : active raid1 sdb6[1] sda6[0]
      185151360 blocks [2/2] [UU]

md1 : active raid1 sdb3[1] sda3[0]
      20479936 blocks [2/2] [UU]

unused devices: <none>

Below is the output from the remote console of my hoster. If you notice
anything or have any advice, please share.

GNU GRUB version 0.97 (636K lower / 3635904K upper memory)

 +------------------------------+
 | CentOS (2.6.18-238.9.1.el5)  |
 | CentOS (2.6.18-238.5.1.el5)  |
 | CentOS (2.6.18-194.32.1.el5) |
 | CentOS 5                     |
 | CentOS 5 Disk 2              |
 |                              |
 |                              |
 |                              |
 |                              |
 |                              |
 |                              |
 +------------------------------+

Use the ^ and v keys to select which entry is highlighted.
Press enter to boot the selected OS, 'e' to edit the commands
before booting, 'a' to modify the kernel arguments before booting,
or 'c' for a command-line.

The highlighted entry will be booted automatically in 1 seconds.

Booting 'CentOS (2.6.18-238.9.1.el5)'

root (hd0,0)
 Filesystem type is ext2fs, partition type 0xfd
kernel /vmlinuz-2.6.18-238.9.1.el5 root=/dev/md1 console=tty0 console=ttyS0,57600
   [Linux-bzImage, setup=0x1e00, size=0x1fd9fc]
initrd /initrd-2.6.18-238.9.1.el5.img
   [Linux-initrd @ 0x37d5f000, 0x290aac bytes]

Linux version 2.6.18-238.9.1.el5 (mockbu...@builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)) #1 SMP Tue Apr 12 18:10:13 EDT 2011
Command line: root=/dev/md1 console=tty0 console=ttyS0,57600
BIOS-provided physical RAM map:
 BIOS-e820: 0001 - 0009f000 (usable)
 BIOS-e820: 0009f000 - 000a (reserved)
 BIOS-e820: 000e4000 - 0010 (reserved)
 BIOS-e820: 0010 - ddfb (usable)
 BIOS-e820: ddfb - ddfbe000 (ACPI data)
 BIOS-e820: ddfbe000 - ddfe (ACPI NVS)
 BIOS-e820: ddfe - ddfee000 (reserved)
 BIOS-e820: ddff - de00 (reserved)
 BIOS-e820: ff70 - 0001 (reserved)
 BIOS-e820: 0001 - 00012000 (usable)
DMI present.
No NUMA configuration found
Faking a node at -00012000
Bootmem setup node 0 -00012000
Memory for crash kernel (0x0 to 0x0) not within permissible range
disabling kdump
ACPI: PM-Timer IO Port: 0x808
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 0:4 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 0:4 APIC version 16
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
Processor #2 0:4 APIC version 16
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
Processor #3 0:4 APIC version 16
ACPI: IOAPIC (id[0x04] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 4, version 33, address 0xfec0, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
Setting APIC routing to flat
ACPI: HPET id: 0x8300 base: 0xfed0
Using ACPI (MADT) for SMP configuration information
Nosave address range: 0009f000 - 000a
Nosave address range: 000a - 000e4000
Nosave address range: 000e4000 - 0010
Nosave address range: ddfb - ddfbe000
Nosave address range: ddfbe000 - ddfe
Nosave address range: ddfe - ddfee000
Nosave address range: ddfee000 - ddff
Nosave address range: ddff - de00
Nosave address range: de00 - ff70
Nosave address range: ff70 - 0001
Allocating PCI resources starting at e000 (gap: de00:2170)
SMP: Allowing 4 CPUs, 0 hotplug CPUs
Built 1 zonelists. Total pages: 1022573
Kernel command line: root=/dev/md1 console=tty0 console=ttyS0,57600
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
Console: colour VGA+ 80x25
Re: [CentOS] Server offline :-( please help to repair software RAID
Thank you all, it seems to have finished - I'm rebooting.

Just curious, why is the State of md3 "active" while the others are "clean"?

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      1023936 blocks [2/2] [UU]

md1 : active raid1 sda3[0] sdb3[1]
      20479936 blocks [2/2] [UU]
      [=================>...]  resync = 86.6% (17746816/20479936) finish=0.3min speed=131514K/sec

md2 : active raid1 sda5[0] sdb5[1]
      277728192 blocks [2/2] [UU]

md3 : active raid1 sda6[0] sdb6[1]
      185151360 blocks [2/2] [UU]

unused devices: <none>

...Then after some wait:...

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      1023936 blocks [2/2] [UU]

md1 : active raid1 sda3[0] sdb3[1]
      20479936 blocks [2/2] [UU]

md2 : active raid1 sda5[0] sdb5[1]
      277728192 blocks [2/2] [UU]

md3 : active raid1 sda6[0] sdb6[1]
      185151360 blocks [2/2] [UU]

unused devices: <none>

# mdadm -D /dev/md3
/dev/md3:
        Version : 00.90
  Creation Time : Sat Mar 19 22:53:25 2011
     Raid Level : raid1
     Array Size : 185151360 (176.57 GiB 189.59 GB)
  Used Dev Size : 185151360 (176.57 GiB 189.59 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 3
    Persistence : Superblock is persistent

    Update Time : Thu Apr 28 21:31:12 2011
          State : active
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 1b3668a3:4b6c5593:3d186b3c:53958f34
         Events : 0.39

    Number   Major   Minor   RaidDevice State
       0       8        6        0      active sync   /dev/sda6
       1       8       22       1      active sync   /dev/sdb6

# mdadm -D /dev/md1
/dev/md1:
        Version : 00.90
  Creation Time : Sat Mar 19 22:52:20 2011
     Raid Level : raid1
     Array Size : 20479936 (19.53 GiB 20.97 GB)
  Used Dev Size : 20479936 (19.53 GiB 20.97 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Thu Apr 28 21:33:56 2011
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 8812725b:ea156bc6:3d186b3c:53958f34
         Events : 0.48

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19       1      active sync   /dev/sdb3

# mdadm -D /dev/md0
/dev/md0:
        Version : 00.90
  Creation Time : Sat Mar 19 22:52:12 2011
     Raid Level : raid1
     Array Size : 1023936 (1000.11 MiB 1048.51 MB)
  Used Dev Size : 1023936 (1000.11 MiB 1048.51 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Apr 28 21:06:24 2011
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 87db17c2:d806a38c:3d186b3c:53958f34
         Events : 0.14

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17       1      active sync   /dev/sdb1

# mdadm -D /dev/md2
/dev/md2:
        Version : 00.90
  Creation Time : Sat Mar 19 22:52:32 2011
     Raid Level : raid1
     Array Size : 277728192 (264.86 GiB 284.39 GB)
  Used Dev Size : 277728192 (264.86 GiB 284.39 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Thu Apr 28 21:17:54 2011
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 2db0174b:e45768d5:3d186b3c:53958f34
         Events : 0.14

    Number   Major   Minor   RaidDevice State
       0       8        5        0      active sync   /dev/sda5
       1       8       21       1      active sync   /dev/sdb5
Re: [CentOS] Server offline :-( please help to repair software RAID
On 04/28/2011 03:31 PM, Michel van Deventer wrote:
> Hi,
>
> On Thu, 2011-04-28 at 21:26 +0200, Alexander Farber wrote:
>> Hello, I didn't touch anything, just booted the hoster's "rescue image".
> Cool :)
>
>> # cat /proc/mdstat
>> Personalities : [linear] [raid0] [raid1]
>> md0 : active raid1 sda1[0] sdb1[1]
>>       1023936 blocks [2/2] [UU]
>>
>> md1 : active raid1 sda3[0] sdb3[1]
>>       20479936 blocks [2/2] [UU]
>>         resync=DELAYED
>>
>> md2 : active raid1 sda5[0] sdb5[1]
>>       277728192 blocks [2/2] [UU]
>>
>> md3 : active raid1 sda6[0] sdb6[1]
>>       185151360 blocks [2/2] [UU]
>>       [=================>...]  resync = 85.3% (158109056/185151360) finish=5.3min speed=83532K/sec
>
> Let md3 rebuild, wait for md1 to rebuild (check regularly with
> cat /proc/mdstat) and reboot your machine without the rescue, it should
> come up again.
>
> Regards,
>
> Michel

Run 'watch cat /proc/mdstat'. :)

--
Digimer
E-Mail: digi...@alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin: http://nodeassassin.org
Re: [CentOS] Server offline :-( please help to repair software RAID
Hi,

On Thu, 2011-04-28 at 21:26 +0200, Alexander Farber wrote:
> Hello, I didn't touch anything, just booted the hoster's "rescue image".

Cool :)

> # cat /proc/mdstat
> Personalities : [linear] [raid0] [raid1]
> md0 : active raid1 sda1[0] sdb1[1]
>       1023936 blocks [2/2] [UU]
>
> md1 : active raid1 sda3[0] sdb3[1]
>       20479936 blocks [2/2] [UU]
>         resync=DELAYED
>
> md2 : active raid1 sda5[0] sdb5[1]
>       277728192 blocks [2/2] [UU]
>
> md3 : active raid1 sda6[0] sdb6[1]
>       185151360 blocks [2/2] [UU]
>       [=================>...]  resync = 85.3% (158109056/185151360) finish=5.3min speed=83532K/sec

Let md3 rebuild, wait for md1 to rebuild (check regularly with
cat /proc/mdstat) and reboot your machine without the rescue; it should
come up again.

Regards,

Michel
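A sketch of scripting that wait instead of re-running cat by hand (the
loop only assumes the resync/recovery lines shown in /proc/mdstat above;
mdadm's -W/--wait misc option, where the installed mdadm has it, does
the same for named arrays):

  # Poll until no resync or recovery is in progress, then reboot.
  while grep -qE 'resync|recover' /proc/mdstat; do sleep 60; done
  reboot

  # Roughly equivalent with a recent enough mdadm:
  mdadm --wait /dev/md1 /dev/md3 && reboot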
Re: [CentOS] Server offline :-( please help to repair software RAID
On 04/28/2011 03:26 PM, Alexander Farber wrote:
> Hello, I didn't touch anything, just booted the hoster's "rescue image".
>
> # cat /etc/mdadm.conf
> cat: /etc/mdadm.conf: No such file or directory
>
> # cat /proc/mdstat
> Personalities : [linear] [raid0] [raid1]
> md0 : active raid1 sda1[0] sdb1[1]
>       1023936 blocks [2/2] [UU]
>
> md1 : active raid1 sda3[0] sdb3[1]
>       20479936 blocks [2/2] [UU]
>         resync=DELAYED
>
> md2 : active raid1 sda5[0] sdb5[1]
>       277728192 blocks [2/2] [UU]
>
> md3 : active raid1 sda6[0] sdb6[1]
>       185151360 blocks [2/2] [UU]
>       [=================>...]  resync = 85.3% (158109056/185151360) finish=5.3min speed=83532K/sec
>
> unused devices: <none>

I'd wait for it to finish and then try rebooting normally. Post back
after md1 and md3 have completed syncing.

--
Digimer
E-Mail: digi...@alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin: http://nodeassassin.org
Re: [CentOS] Server offline :-( please help to repair software RAID
Hi,

what is the output of 'cat /proc/mdstat'? A healthy raid should look
something like below:

[root@janeway ~]# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdb1[0] sda1[1]
      256896 blocks [2/2] [UU]

md0 : active raid1 sdd1[0] sdc1[1]
      1465135936 blocks [2/2] [UU]

md3 : active raid1 sdb3[1] sda3[0]
      730218432 blocks [2/2] [UU]

I have 3 RAID1 arrays (over 4 disks).

On Thu, 2011-04-28 at 21:10 +0200, Alexander Farber wrote:
> Additional info (how many RAID arrays do I have??):
>
> # mdadm -D /dev/md3
> /dev/md3:
>         Version : 00.90
>   Creation Time : Sat Mar 19 22:53:25 2011
>      Raid Level : raid1
>      Array Size : 185151360 (176.57 GiB 189.59 GB)
>   Used Dev Size : 185151360 (176.57 GiB 189.59 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 3
>     Persistence : Superblock is persistent
>
>     Update Time : Thu Apr 28 21:09:12 2011
>           State : clean, resyncing
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>  Rebuild Status : 38% complete
>
>            UUID : 1b3668a3:4b6c5593:3d186b3c:53958f34
>          Events : 0.15
>
>     Number   Major   Minor   RaidDevice State
>        0       8        6        0      active sync   /dev/sda6
>        1       8       22       1      active sync   /dev/sdb6
Re: [CentOS] Server offline :-( please help to repair software RAID
On 4/28/2011 2:07 PM, Alexander Farber wrote:
> Hello,
>
> for weeks I have been ignoring this warning on my CentOS 5.6/64-bit machine -
>
> /etc/cron.weekly/99-raid-check:
> WARNING: mismatch_cnt is not 0 on /dev/md0
>
> in the hope that the software RAID would slowly repair itself.
>
> I also had executed "echo 10 > /proc/sys/dev/raid/speed_limit_max"
> on the advice from the mailing list.
>
> But now my web server is offline - I had to boot it remotely with the
> rescue system.
>
> Does anybody please have advice on what commands to run,
> and do you think it is a RAID problem at all?

A 'cat /proc/mdstat' should show the state of the raid mirroring. I
don't see anything that would explain not booting, though. RAID1 works
normally even when only one member is available and should continue to
work while rebuilding. Maybe the problem that caused the mismatch has
corrupted the drive the system normally boots from.

--
Les Mikesell
lesmikes...@gmail.com
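For reference, a degraded mirror is easy to spot in /proc/mdstat; the
sample output below is an illustrative sketch, not from the poster's
machine:

  # A healthy mirror shows [2/2] [UU]; a degraded one drops to [2/1]
  # and shows [U_] (or [_U]), with a failed member marked (F), e.g.:
  #
  #   md0 : active raid1 sdb1[1](F) sda1[0]
  #         1023936 blocks [2/1] [U_]
  #
  # A rough one-liner to flag any two-disk mirror that is not fully
  # populated:
  grep -E '\[[0-9]+/[0-9]+\]' /proc/mdstat | grep -v '\[UU\]'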
Re: [CentOS] Server offline :-( please help to repair software RAID
Hello, I didn't touch anything, just booted the hoster's "rescue image".

# cat /etc/mdadm.conf
cat: /etc/mdadm.conf: No such file or directory

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      1023936 blocks [2/2] [UU]

md1 : active raid1 sda3[0] sdb3[1]
      20479936 blocks [2/2] [UU]
        resync=DELAYED

md2 : active raid1 sda5[0] sdb5[1]
      277728192 blocks [2/2] [UU]

md3 : active raid1 sda6[0] sdb6[1]
      185151360 blocks [2/2] [UU]
      [=================>...]  resync = 85.3% (158109056/185151360) finish=5.3min speed=83532K/sec

unused devices: <none>

# mdadm -D /dev/md3
/dev/md3:
        Version : 00.90
  Creation Time : Sat Mar 19 22:53:25 2011
     Raid Level : raid1
     Array Size : 185151360 (176.57 GiB 189.59 GB)
  Used Dev Size : 185151360 (176.57 GiB 189.59 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 3
    Persistence : Superblock is persistent

    Update Time : Thu Apr 28 21:23:48 2011
          State : active, resyncing
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

 Rebuild Status : 85% complete

           UUID : 1b3668a3:4b6c5593:3d186b3c:53958f34
         Events : 0.31

    Number   Major   Minor   RaidDevice State
       0       8        6        0      active sync   /dev/sda6
       1       8       22       1      active sync   /dev/sdb6
Re: [CentOS] Server offline :-( please help to repair software RAID
On 04/28/2011 03:10 PM, Alexander Farber wrote:
> Rebuild Status : 38% complete

That's potentially promising. What does 'cat /proc/mdstat' show? Did you
have to recover the array, or were you able to use /etc/mdadm.conf?

--
Digimer
E-Mail: digi...@alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin: http://nodeassassin.org
Re: [CentOS] Server offline :-( please help to repair software RAID
Additional info (how many RAID arrays do I have??):

# mdadm -D /dev/md3
/dev/md3:
        Version : 00.90
  Creation Time : Sat Mar 19 22:53:25 2011
     Raid Level : raid1
     Array Size : 185151360 (176.57 GiB 189.59 GB)
  Used Dev Size : 185151360 (176.57 GiB 189.59 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 3
    Persistence : Superblock is persistent

    Update Time : Thu Apr 28 21:09:12 2011
          State : clean, resyncing
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

 Rebuild Status : 38% complete

           UUID : 1b3668a3:4b6c5593:3d186b3c:53958f34
         Events : 0.15

    Number   Major   Minor   RaidDevice State
       0       8        6        0      active sync   /dev/sda6
       1       8       22       1      active sync   /dev/sdb6
[CentOS] Server offline :-( please help to repair software RAID
Hello,

for weeks I have been ignoring this warning on my CentOS 5.6/64-bit
machine -

/etc/cron.weekly/99-raid-check:
WARNING: mismatch_cnt is not 0 on /dev/md0

in the hope that the software RAID would slowly repair itself.

I also had executed "echo 10 > /proc/sys/dev/raid/speed_limit_max"
on the advice from the mailing list.

But now my web server is offline - I had to boot it remotely with the
rescue system.

Does anybody please have advice on what commands to run,
and do you think it is a RAID problem at all?

# dmesg
Linux version 2.6.34 (root@imagemaster30) (gcc version 4.3.2 (Debian 4.3.2-1.1)) #20 SMP Mon Jul 19 18:35:15 CEST 2010
Command line: ramdisk_size=81920 initrd=rescue-image-2.6-64 root=/dev/ram BOOT_IMAGE=rescue-kernel-2.6-64
BIOS-provided physical RAM map:
 BIOS-e820: - 0009f000 (usable)
 BIOS-e820: 0009f000 - 000a (reserved)
 BIOS-e820: 000e4000 - 0010 (reserved)
 BIOS-e820: 0010 - ddfb (usable)
 BIOS-e820: ddfb - ddfbe000 (ACPI data)
 BIOS-e820: ddfbe000 - ddfe (ACPI NVS)
 BIOS-e820: ddfe - ddfee000 (reserved)
 BIOS-e820: ddff - de00 (reserved)
 BIOS-e820: ff70 - 0001 (reserved)
 BIOS-e820: 0001 - 00012000 (usable)
NX (Execute Disable) protection: active
DMI present.
AMI BIOS detected: BIOS may corrupt low RAM, working around it.
e820 update range: - 0001 (usable) ==> (reserved)
e820 update range: - 1000 (usable) ==> (reserved)
e820 remove range: 000a - 0010 (usable)
No AGP bridge found
last_pfn = 0x12 max_arch_pfn = 0x4
MTRR default type: uncachable
MTRR fixed ranges enabled:
  0-9 write-back
  A-E uncachable
  F-F write-protect
MTRR variable ranges enabled:
  0 base mask 8000 write-back
  1 base 8000 mask C000 write-back
  2 base C000 mask E000 write-back
  3 disabled
  4 disabled
  5 disabled
  6 disabled
  7 disabled
TOM2: 00012000 aka 4608M
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
e820 update range: e000 - 0001 (usable) ==> (reserved)
last_pfn = 0xddfb0 max_arch_pfn = 0x4
initial memory mapped : 0 - 2000
found SMP MP-table at [880ff780] ff780
Using GB pages for direct mapping
init_memory_mapping: -ddfb
 00 - 00c000 page 1G
 00c000 - 00dde0 page 2M
 00dde0 - 00ddfb page 4k
kernel direct mapping tables up to ddfb @ 12000-15000
init_memory_mapping: 0001-00012000
 01 - 012000 page 2M
kernel direct mapping tables up to 12000 @ 14000-16000
RAMDISK: 7d792000 - 8000
ACPI: RSDP 000faf80 00014 (v00 ACPIAM)
ACPI: RSDT ddfb 0003C (v01 032510 RSDT1503 20100325 MSFT 0097)
ACPI: FACP ddfb0200 00084 (v02 032510 FACP1503 20100325 MSFT 0097)
ACPI: DSDT ddfb0440 0447E (v01 A96B3 A96B3210 0210 INTL 20051117)
ACPI: FACS ddfbe000 00040
ACPI: APIC ddfb0390 0006C (v01 032510 APIC1503 20100325 MSFT 0097)
ACPI: MCFG ddfb0400 0003C (v01 032510 OEMMCFG 20100325 MSFT 0097)
ACPI: OEMB ddfbe040 00071 (v01 032510 OEMB1503 20100325 MSFT 0097)
ACPI: HPET ddfb48c0 00038 (v01 032510 OEMHPET 20100325 MSFT 0097)
ACPI: SSDT ddfb4900 0088C (v01 A M I POWERNOW 0001 AMD 0001)
ACPI: Local APIC address 0xfee0
Scanning NUMA topology in Northbridge 24
No NUMA configuration found
Faking a node at -00012000
Initmem setup node 0 -00012000
  NODE_DATA [0001 - 00014fff]
  [ea00-ea0003ff] PMD -> [88010020-880103bf] on node 0
Zone PFN ranges:
  DMA      0x0010 -> 0x1000
  DMA32    0x1000 -> 0x0010
  Normal   0x0010 -> 0x0012
Movable zone start PFN for each node
early_node_map[3] active PFN ranges
  0: 0x0010 -> 0x009f
  0: 0x0100 -> 0x000ddfb0
  0: 0x0010 -> 0x0012
On node 0 totalpages: 1040191
  DMA zone: 56 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 3927 pages, LIFO batch:0
  DMA32 zone: 14280 pages used for memmap
  DMA32 zone: 890856 pages, LIFO batch:31
  Normal zone: 1792 pages used for memmap
  Normal zone: 129280 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
ACPI: IOAPIC (id[0x04] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 4, version 33, address 0xfec0, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
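For what it's worth, the resync throttle the poster mentions is a pair
of /proc knobs; a sketch of inspecting and raising it (the 100000 KB/s
value below is illustrative, not a recommendation from the thread):

  # Current floor and ceiling for md resync speed, in KB/s per device.
  cat /proc/sys/dev/raid/speed_limit_min
  cat /proc/sys/dev/raid/speed_limit_max

  # Let a resync run faster while watching it (illustrative value).
  echo 100000 > /proc/sys/dev/raid/speed_limit_max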