raid 0.90 on 2.0.37 ?
I am using raid-1 0.90 on the Red Hat 2.2.5-22 kernel, and I have experienced a lot of instability: services (http, syslog, named ...) stay up for some hours, then suddenly go down. So I am waiting for a new raid patch for newer kernels. But I have seen on the mailing list that the 2.2 kernel has filesystem stability problems, so I would like to try a 2.0 kernel instead. I know that the 2.0.36 kernel is unsafe against internet attacks, so I tried to patch the 2.0.37 kernel with the 1999-04-21-2.0.36 0.90 patches, but I ended up with several .rej files. How can I obtain a really stable, really safe 0.90 raid system? (I don't need any of the new 2.2 features.)
Help planning Raid-1 system -and- problem with loadlin
Linux and software-raid are new to me, so I thought I'd run my plans by experienced users to help me find holes in my understanding and thinking.

I am building a web server using RH 6.0 boxed on a system with 3 IDE drives. I want to use software raid-1 mirroring on two of the drives. I plan to use the third drive for a combination of a configuration/emergency system and a mountable filesystem for storing system and application archives for disaster recovery. I hope to use raid-1 on the prod system for all but the /boot and swap filesystems. It should look as follows:

pri-mast  -> hda: prod sys: -/boot, swap1 -plus "left mirror": /, /usr, /home, /var, /var/lib/mysql
pri-slave -> hdb: config/emergency sys -plus /backup (tar archive) filesystem
sec-mast  -> hdc: prod sys (right mirror) -/boot, swap2 -plus "right mirror": /, /usr, /home, /var, /var/lib/mysql

Recovery scenarios I've thought of are as follows:

o If I lose one of the raid drives and my system survives (unlikely due to swap):
  -continue to run on surviving drive until maint is convenient. Then:
  -power down
  -replace failed drive
  -boot emergency system (hdb) using Lilo if hdc failed -or- loadlin.exe if hda failed ("loadlin vmlinuz root=/dev/hdb8")
  -mount and tar backup surviving filesystems
  -partition new drive
  -rebuild mirrors
  -tar restore surviving filesystems
  -reboot production system.

o If I lose one of the raid drives and the system crashes (more likely):
  -boot the emergency system (as above)
  -edit the /etc/fstab on the surviving drive to point to the surviving drive's filesystems (i.e. 'md0'->'hdc5')
  -boot production system (hda or hdc) using Lilo if hdc failed -or- loadlin.exe if hda failed ("loadlin vmlinuz root=/dev/hdc8")
  -run unmirrored until maint is convenient
  -perform 1st recovery procedure above.

o If data corruption has occurred:
  -boot the emergency system
  -perform any necessary preparatory steps (i.e. break mirrors if necessary (as above))
  -tar restore system and application archives from periodic backup (daily)
  -reboot production system
  -run until maint is convenient
  -perform 1st recovery procedure above (if appropriate).

What do you think?

Also, I'm having some trouble starting Linux with loadlin, which I will run from an emergency diskette that I created as follows:

from W98-DOS window ->
  format a: /s

from Linux ->:
  mount /mnt/cdrom
  mount -t vfat /dev/fd0 /mnt/floppy
  cp /mnt/cdrom/dosutils/loadlin.exe /mnt/floppy
  cp /boot/vmlinuz /mnt/floppy
  umount /mnt/floppy

To boot emergency linux system without Lilo:
  -boot emergency diskette
  -enter following command-> a:loadlin vmlinuz root=/dev/hdb8

When I enter the loadlin command I get a message saying that vmlinuz is not an image file. This procedure came from the RH Linux Unleashed book.

Thanks for any help,
Joel Fowler
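
One way to narrow down the "not an image file" message is to confirm that the file copied to the floppy is byte-identical to /boot/vmlinuz and still carries the markers of an x86 kernel image. The sketch below is only a sanity check under stated assumptions: it looks for the 0xAA55 boot flag at offset 0x1FE and the "HdrS" setup-header magic at offset 0x202, as described by the Linux x86 boot protocol. It does not reproduce whatever test loadlin itself performs, and the function names are made up for illustration.

```python
import hashlib
import struct

BOOT_FLAG_OFFSET = 0x1FE  # 2-byte boot sector signature, little-endian 0xAA55
HDRS_OFFSET = 0x202       # 4-byte setup header magic "HdrS"


def looks_like_x86_kernel_image(path):
    """Return True if the file has the x86 boot flag and the "HdrS" magic."""
    with open(path, "rb") as f:
        head = f.read(HDRS_OFFSET + 4)
    if len(head) < HDRS_OFFSET + 4:
        return False  # too short to contain a setup header at all
    (boot_flag,) = struct.unpack_from("<H", head, BOOT_FLAG_OFFSET)
    magic = head[HDRS_OFFSET:HDRS_OFFSET + 4]
    return boot_flag == 0xAA55 and magic == b"HdrS"


def sha1_of(path):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example use (paths taken from the post; the floppy must be mounted first):
#   looks_like_x86_kernel_image("/mnt/floppy/vmlinuz")
#   sha1_of("/boot/vmlinuz") == sha1_of("/mnt/floppy/vmlinuz")
```

If the two digests differ, the copy on the diskette is damaged or truncated; if they match and the header check still fails, the source file is probably not a bootable kernel image in the first place.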
another RAID0 oops, same as before
Just to report that I got the same oops I get about once a month using the RAID0 driver.

System is: Dual PII-450, 1gb ram, Adaptec 2940U2W with 6 9GB Cheetah, kernel 2.2.9 carefully patched with latest (April?) raid patch. System also has DAC960 hardware raid, which was also patched in.

Always in raid0_map... Hmmm, maybe this is the same as the 'Attempt to access beyond end of device...' which plagues generic 2.2.x for x > 7. I have seen this too on this system, but she's still up and running.

Ouch. I just got it again. Twice in 5 minutes, that's never happened before. Same as the one below, basically, except now accesses to the file result in the process being stuck in 'D' state. Damn.

Here's the oops:

Unable to handle kernel paging request at virtual address 3a6c6f6a
current->tss.cr3 = 1d67f000, %cr3 = 1d67f000
*pde =
Oops:
CPU:    1
EIP:    0010:[]
EFLAGS: 00010212
eax: fc8670a0 ebx: 3a6c6f62 ecx: 0020 edx: 000a
esi: 0010 edi: 35333134 ebp: 0004 esp: d74ebdd4
ds: 0018 es: 0018 ss: 0018
Process wc (pid: 4898, process nr: 131, stackpage=d74eb000)
Stack: fb537310 fb53730e 0060 fc865000 fc8670a0 fc863000 0060 0009
       c016f231 c0085be0 0900 fb53730e fb537310 0002 0001 d74ebe80
       0009 0004 c016b25d 0900 fb53730e fb537310 0002
Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] []
Code: 8b 53 08 03 13 39 d7 7c 25 8b 58 04 85 db 75 1e 57 68 4e 1d

Warning: trailing garbage ignored on Code: line
  Text: 'Code: 8b 53 08 03 13 39 d7 7c 25 8b 58 04 85 db 75 1e 57 68 4e 1d '
  Garbage: ' '

>>EIP: c01780b6
Trace: fc865000
Trace: fc8670a0
Trace: fc863000
Trace: c016f231
Trace: c016b25d
Trace: c0129e48
Trace: c012a00f
Trace: c011ecd3
Code: c01780b6 <_EIP>: <===
Code: c01780b6 0:8b 53 08 movl 0x8(%ebx),%edx <===
Code: c01780b9 3:03 13 addl (%ebx),%edx
Code: c01780bb 5:39 d7 cmpl %edx,%edi
Code: c01780bd 7:7c 25 jl c01780e4
Code: c01780bf 9:8b 58 04 movl 0x4(%eax),%ebx
Code: c01780c2 c:85 db testl %ebx,%ebx
Code: c01780c4 e:75 1e jne c01780e4
Code: c01780c6 10:57 pushl %edi
Code: c01780c7 11:68 4e 1d 00 00 pushl $0x1d4e

--
/==\
| David Mansfield |
| [EMAIL PROTECTED] |
\==/
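
A possible reading of the register dump above, offered as a sketch rather than a diagnosis: the faulting instruction is `movl 0x8(%ebx),%edx`, so the address being read is ebx + 8, which matches the reported fault address. The values in ebx and edi also happen to decode as printable ASCII when read as little-endian bytes, which may hint that the structure raid0_map was walking had been overwritten with text. The helper name `as_ascii` is made up for this example, and the interpretation is an assumption, not something stated in the report.

```python
import struct


def as_ascii(value):
    """Decode a 32-bit value as 4 little-endian bytes.

    Returns the text if every byte is printable ASCII, otherwise None.
    """
    raw = struct.pack("<I", value)
    if all(0x20 <= byte < 0x7F for byte in raw):
        return raw.decode("ascii")
    return None


# Register values copied from the oops report.
registers = {
    "eax": 0xFC8670A0,
    "ebx": 0x3A6C6F62,
    "edi": 0x35333134,
    "esp": 0xD74EBDD4,
}
fault_address = 0x3A6C6F6A

# "movl 0x8(%ebx),%edx" reads from ebx + 8, which is the faulting address.
ebx_plus_8_matches = registers["ebx"] + 8 == fault_address

for name, value in registers.items():
    print(name, hex(value), as_ascii(value))
print("fault address == ebx + 8:", ebx_plus_8_matches)
```

Here eax and esp look like ordinary kernel addresses and decode to nothing, while ebx and edi decode to short text fragments, so ebx was most likely not a valid pointer at the time of the fault.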
Re: RAID under 2.2.10
That would be a good idea; if there is a problem with 2.2.10, then confusing it with raid isn't ideal.

I don't think people understood my previous message, which appears to have started this thread... I was only really wanting more information about the development status rather than rushing a release!

On Tue, 6 Jul 1999 [EMAIL PROTECTED] wrote:
> Christoph Martin wrote:
> > Ingo Molnar talked about working on the patches for 2.2.10 and that he
> > fixed the problem with resync, but he did not yet release the code.
>
> Perhaps he is "waiting" for the fs-corruption bug in 2.2.10 to go away?
>
> René
>

A.J. ([EMAIL PROTECTED])
Sometimes you're ahead, sometimes you're behind. The race is long, and in the end it's only with yourself.
Re: Smart Controller problems
Well, the only reason I have it is because I got the card and disks for free :) So I'm in a no-lose situation here. However, the card has some nice specs and it would be great to get it working. Too bad I need a Compaq for it... :(

/ Tomas

Compaq makes "compatibles", not "clones". It is the difference between androids and humans. Walk and talk the same, but under the skin it's a different story. Talk to anyone who's ever cussed at Compaq hardware. Sorry if this sounds like an anti-Compaq rant, it isn't. Just a warning.
Re: RAID under 2.2.10
Christoph Martin wrote:
> Ingo Molnar talked about working on the patches for 2.2.10 and that he
> fixed the problem with resync, but he did not yet release the code.

Perhaps he is "waiting" for the fs-corruption bug in 2.2.10 to go away?

René
Hardware RAID problems
Hello,

I've got an external hardware RAID connected via SCSI to my Linux box (Mandrake 6.0). Linux sees it, and lets me partition and format it. The problems start when I reboot: it complains that something is wrong with the superblock on /dev/sda (the raid) and tells me to try e2fsck -b 8193. I do, and it still won't work. When I e2fsck /dev/sda1, everything is normal. Any ideas what I've done wrong?

Kent R. Nilsen
[EMAIL PROTECTED]
Re: RAID under 2.2.10
Robert Stuart <[EMAIL PROTECTED]> writes:
>
> I'm wanting to use the latest kernel with raid patches and I'm new to
> the mailing list... Is raid with 2.2.10 a matter of applying the 2.2.6
> raid patches, and adding that code above? What are the "AC" patches?
> Is the fix in the second paragraph above required?
>
> What are good sites for raid info - can I find digests of this list
> anywhere?
>

You can apply the 2.2.6 patches to 2.2.10, but the result does not work correctly. Normal operation is ok, but if a raid gets out of sync and needs a resync (like when you reboot without a proper shutdown), the resync fails.

I have a 2.2.10 with the 2.2.6 raid patches running at the moment, because I need 2.2.10 to get Informix IDS running. But when the machine crashed, I had to boot my old 2.2.6 to get the rebuild done and then reboot with 2.2.10 for Informix.

Ingo Molnar talked about working on the patches for 2.2.10 and that he fixed the problem with resync, but he did not yet release the code.

Christoph

--
Christoph Martin, Uni-Mainz, Germany
Internet-Mail: [EMAIL PROTECTED]
--export-a-crypto-system-sig -RSA-3-lines-PERL--
#!/usr/bin/perl -sp0777ihttp://www.dcs.ex.ac.uk/~aba/rsa/