Re: RAID questions
On Mon, 7 Aug 2000, Adam McKenna wrote:

> 2) If I do, will it still be broken unless I apply the "2.2.16combo" patch?
> 3) If it will, then how do I resolve the problem with the md.c hunk failing
>    with "2.2.16combo"?

If I remember correctly, 2.2.16combo was there to make it possible to use
Ingo's older raid patches on 2.2.16 (before raid-2.2.16-A0 was released).
I'm not 100% sure, though.

> This is a production system I am working on here. I can't afford to have it
> down for an hour or two to test a new kernel. I'd rather not be working with
> this mess to begin with, but unfortunately this box was purchased before I
> started this job, and whoever ordered it decided that software raid was
> "Good enough".

A test machine comes in handy. Not to actually test the new RAID code (we
did/do that already ;) ), but just to train handling of SW raid.

> I am not subscribed to either list, so CCs are desirable. However, if you
> don't want to CC then you don't have to -- I'll just read the archives.
> That is, if someone fixes the "Mailing list archives" link on www.linux.org
> to point to someplace that exists and actually has archives.

IMHO, if you need (or want) to work with SW raid, it would be better to
subscribe. It's not all that much traffic here, and (usually) the stuff we
get here is relevant (with the exception of too many questions about patch
locations, but that should be fixed anyway). Besides, any real problems,
bug reports, and warnings appear here very soon.

D.
RE: RAID questions
> -----Original Message-----
> From: Adam McKenna [mailto:[EMAIL PROTECTED]]
> Sent: Monday, August 07, 2000 9:27 PM
> To: Gregory Leblanc
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: Re: RAID questions
>
> On Mon, Aug 07, 2000 at 08:07:58PM -0700, Gregory Leblanc wrote:
> > I'm a little verbose, but this should answer most of your questions,
> > although sometimes in a slightly annoyed tone. Don't take it personally.
>
> There's a difference between being annoyed and being immature. You seem to
> have answered everything with maturity, so no offense taken.

Phew. Sometimes I come off poorly, and people flip out. I hate that.

> I did a search on Google. The majority of posts I was able to find
> mentioned a 2.2.15 patch which could be applied to 2.2.16 as long as
> several hunks were hand-patched. Personally, I don't particularly like
> hand-patching code, especially when the data that my job depends on is
> involved.

Hmm, there have been some recently (I think) for the 2.2.16 kernels. I've
not kept up on my testing (no time), so I'm still running 2.2.14.

> > > 2) The current 2.2.16 errata lists a problem with md.c which is fixed
> > > by the patch "2.2.16combo".
> >
> > I believe that md software RAID applies to the old RAID code. The RAID
> > stuff has been VERY good for quite a while now.
>
> The howto on linux.org listed
> ftp://www.fi.kernel.org/pub/linux/daemons/raid as the "official" location
> for the RAID patches. The patches located there only went up to 2.2.11.
> In fact, looking now, the linuxdoc.org howto lists the same location.

True enough, it's out of date. I'm going to try to get Jacob to point to my
FAQ, but I haven't gotten enough feedback just yet.

> the first place. In retrospect, I suppose it was a stupid question, but
> I'd rather be safe than sorry.

There are no stupid questions, only stupid answers. :-) Amongst the other
45 kernel compiles I've got to do this week, I'll try to find some time to
look at the 2.2.16 patch, and see if it works nicely with Ingo's RAID patch
on my system.

> Thanks for the link. However, as mentioned above, the howto there still
> gives the incorrect location for current kernel patches.

Sorry, I can't fix that. I just help put the HOWTOs online, I don't write
them (at least not much).

> > > So, I have the following questions.
> > >
> > > 1) Do I need to apply the RAID patch to 2.2.16 or not?
> >
> > Do you want new RAID, or old RAID?
>
> Well, the box won't boot with the stock MD driver.

In that case, you need to patch your kernel. :) I think you mentioned that
you'd already found the 2.2.16 patch, so run with that, and see what
happens.

> > > 2) If I do, will it still be broken unless I apply the
> > > "2.2.16combo" patch?

D'uh, I'll look and see what it does for me, and report back. Probably NOT
tomorrow, but some time this week. Maybe somebody else will step forward
with results before I get to it.

> I was hoping my post would serve as a reminder to those on the list who
> are in charge of maintaining those resources.

I dunno, the kernel list just scares me. There's too much extraneous stuff
that goes through there anyway, and 90% of it is over my head. Speaking of
which, I'll trim them from the list after this email (since somebody there
might have tried more patches than myself).

> > If you don't know what you're doing, GET A TEST MACHINE. Sorry to yell,
> > but don't play with things on production boxes. Find a nice cheapie
> > P-133 type box, grab a couple of drives, and test out RAID that way.
> > Don't do that on production boxes. If somebody can't come up with $200
> > to get you a test box, then spring for it yourself, and get a decent X
> > term for home.

[SNIP]

> My current prime objective is getting rid of the current kernel we are
> running, as I am having other problems with the box that I think are
> kernel related. (EAGAIN errors -- resource temporarily unavailable when
> trying to make a TCP connection to a remote host after about 5 days of
> uptime.) A test box would be nice, but it could take weeks to obtain one.
> Personally, I'd rather avoid having to go in at 2:30 in the morning again
> to reboot the box.

Ah, sorry about that one, it might have been a little out of line. However,
do get yourself a test box; it doesn't even need to be the same hardware,
just something that you can break.

> I looked at geocrawler, but I found their site to be really slow and their
> search engine to be crap. I didn't have the time or the patience to go
> wading through messages one by one.

It seemed good to me, but I've already read everything that I've been
looking for. :-) From the aforementioned FAQ:

1.1. Where can I find archives for the linux-raid mailing list?

My favorite archives are at Geocrawler.
http://www.geocrawler.com/lists/3/Linux/57/0/

Other archives are available at
Re: RAID questions
On Mon, Aug 07, 2000 at 08:07:58PM -0700, Gregory Leblanc wrote:
> I'm a little verbose, but this should answer most of your questions,
> although sometimes in a slightly annoyed tone. Don't take it personally.

There's a difference between being annoyed and being immature. You seem to
have answered everything with maturity, so no offense taken.

> > Hello,
> >
> > I consider the current state of affairs with Software-RAID to be
> > unbelievable.
>
> It's not as bad as you think. :-)

Maybe not to someone who follows the list regularly, but for someone who
needs to get things accomplished, it's pretty bad.

> > 1) The current RAID-Howto (on www.linux.org) does not indicate the
> > correct location of RAID patches. I had to go searching all over the
> > web to find the 2.2.16 RAID patch.
>
> Did you try reading the archives for the Linux-RAID list? I've started on
> a FAQ that will be updated at the very least monthly, and posted to
> linux-raid.

I did a search on Google. The majority of posts I was able to find mentioned
a 2.2.15 patch which could be applied to 2.2.16 as long as several hunks
were hand-patched. Personally, I don't particularly like hand-patching
code, especially when the data that my job depends on is involved.

> > 2) The current 2.2.16 errata lists a problem with md.c which is fixed by
> > the patch "2.2.16combo".
>
> I believe that md software RAID applies to the old RAID code. The RAID
> stuff has been VERY good for quite a while now.

The howto on linux.org listed ftp://www.fi.kernel.org/pub/linux/daemons/raid
as the "official" location for the RAID patches. The patches located there
only went up to 2.2.11. In fact, looking now, the linuxdoc.org howto lists
the same location.

> > 3) The patch "2.2.16combo" FAILS if the RAID patch has already been
> > applied. Ditto with the RAID patches to md.c if the 2.2.16combo patch
> > has already been applied.
>
> Perhaps they're not compatible, or perhaps one includes the other? Have
> you looked at the patches to try to figure out why they don't work? I'm
> NOT a hacker, but I can certainly try to figure out why patches don't
> work.

I looked at them. It appears as though the RAID patch changes the relevant
section to something totally different than it was before, so that the
patch can't be applied, even with an offset. This is why I asked the
question in the first place. In retrospect, I suppose it was a stupid
question, but I'd rather be safe than sorry.

> > 4) The kernel help for all of the MD drivers lists a nonexistent
> > Software-RAID mini-howto, which is supposedly located at
> > ftp://metalab.unc.edu/pub/Linux/docs/HOWTO/mini. There is no such
> > document at this location.
>
> There are 2 Software-RAID HOWTOs available there, although they are 1
> directory higher than that URL. For the code included in the stock
> kernels, see
> ftp://metalab.unc.edu/pub/Linux/docs/HOWTO/Software-RAID-0.4x-HOWTO.
> For the new RAID code by Ingo and others, see
> ftp://metalab.unc.edu/pub/Linux/docs/HOWTO/Software-RAID-HOWTO. Both of
> these documents are easily available from http://www.LinuxDoc.org/

Thanks for the link. However, as mentioned above, the howto there still
gives the incorrect location for current kernel patches.

> > 5) The kernel help also does not make it clear that you even need a RAID
> > patch with current kernels. It is implied that if you "Say Y here" then
> > your kernel will support RAID. This problem is exacerbated by the
> > missing RAID patches at the location specified in the actual
> > Software-RAID-Howto.
>
> No, you don't NEED to patch your kernel to get RAID (md raid, that is)
> working. You DO need to patch the kernel if you want the new RAID code.
> Everyone on the Linux-RAID list will recommend the new code; I don't know
> about anybody else.
>
> > So, I have the following questions.
> >
> > 1) Do I need to apply the RAID patch to 2.2.16 or not?
>
> Do you want new RAID, or old RAID?

Well, the box won't boot with the stock MD driver.

> > 2) If I do, will it still be broken unless I apply the "2.2.16combo"
> > patch?
>
> If you apply the combo patch, that will fix things with the old code (I
> think, have not verified this yet). If you apply the RAID patch (from the
> location above), then you don't need to worry about the fixes in the
> 2.2.16combo.
>
> > 3) If it will, then how do I resolve the problem with the md.c hunk
> > failing with "2.2.16combo"?
>
> Apply manually? Just take a look at the .rej files (from /usr/src/linux do
> a 'find . -name "*rej*"') and see what failed to apply. I generally open a
> split-pane editor (for emacs, just put two file names on the command
> line), see if I can find where the patch failed, and try to add the
> missing/remove the extraneous lines by hand. It's worked so far.

See above.

> > 4) Is there someone I can contact who can
RE: RAID questions
I'm a little verbose, but this should answer most of your questions,
although sometimes in a slightly annoyed tone. Don't take it personally.

> -----Original Message-----
> From: Adam McKenna [mailto:[EMAIL PROTECTED]]
> Sent: Monday, August 07, 2000 12:10 PM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: RAID questions
>
> Hello,
>
> I consider the current state of affairs with Software-RAID to be
> unbelievable.

It's not as bad as you think. :-)

> 1) The current RAID-Howto (on www.linux.org) does not indicate the correct
> location of RAID patches. I had to go searching all over the web to find
> the 2.2.16 RAID patch.

Did you try reading the archives for the Linux-RAID list? I've started on a
FAQ that will be updated at the very least monthly, and posted to
linux-raid.

> 2) The current 2.2.16 errata lists a problem with md.c which is fixed by
> the patch "2.2.16combo".

I believe that md software RAID applies to the old RAID code. The RAID
stuff has been VERY good for quite a while now.

> 3) The patch "2.2.16combo" FAILS if the RAID patch has already been
> applied. Ditto with the RAID patches to md.c if the 2.2.16combo patch has
> already been applied.

Perhaps they're not compatible, or perhaps one includes the other? Have you
looked at the patches to try to figure out why they don't work? I'm NOT a
hacker, but I can certainly try to figure out why patches don't work.

> 4) The kernel help for all of the MD drivers lists a nonexistent
> Software-RAID mini-howto, which is supposedly located at
> ftp://metalab.unc.edu/pub/Linux/docs/HOWTO/mini. There is no such document
> at this location.

There are 2 Software-RAID HOWTOs available there, although they are 1
directory higher than that URL. For the code included in the stock kernels,
see ftp://metalab.unc.edu/pub/Linux/docs/HOWTO/Software-RAID-0.4x-HOWTO.
For the new RAID code by Ingo and others, see
ftp://metalab.unc.edu/pub/Linux/docs/HOWTO/Software-RAID-HOWTO. Both of
these documents are easily available from http://www.LinuxDoc.org/

> 5) The kernel help also does not make it clear that you even need a RAID
> patch with current kernels. It is implied that if you "Say Y here" then
> your kernel will support RAID. This problem is exacerbated by the missing
> RAID patches at the location specified in the actual Software-RAID-Howto.

No, you don't NEED to patch your kernel to get RAID (md raid, that is)
working. You DO need to patch the kernel if you want the new RAID code.
Everyone on the Linux-RAID list will recommend the new code; I don't know
about anybody else.

> So, I have the following questions.
>
> 1) Do I need to apply the RAID patch to 2.2.16 or not?

Do you want new RAID, or old RAID?

> 2) If I do, will it still be broken unless I apply the "2.2.16combo"
> patch?

If you apply the combo patch, that will fix things with the old code (I
think, have not verified this yet). If you apply the RAID patch (from the
location above), then you don't need to worry about the fixes in the
2.2.16combo.

> 3) If it will, then how do I resolve the problem with the md.c hunk
> failing with "2.2.16combo"?

Apply manually? Just take a look at the .rej files (from /usr/src/linux do
a 'find . -name "*rej*"') and see what failed to apply. I generally open a
split-pane editor (for emacs, just put two file names on the command line),
see if I can find where the patch failed, and try to add the missing/remove
the extraneous lines by hand. It's worked so far.

> 4) Is there someone I can contact who can update publicly available
> documentation to make it easier for people to find what they're looking
> for?

Not sure about the stuff in the Linux kernel sources, but I'd assume that
somebody on the Linux-kernel list can do that. As for the Software-RAID
HOWTO, tell Jacob (he IS on the raid list). Again, I've created a FAQ for
the Linux-raid mailing list, which should cover many of these questions.
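The .rej workflow described a few paragraphs up can be sketched in a few
shell commands. The tree below is a toy stand-in for /usr/src/linux, and
the file names are purely illustrative:

```shell
# Toy stand-in for a kernel tree after a partially failed patch run;
# on a real system you would cd /usr/src/linux instead.
mkdir -p /tmp/patch-demo/drivers/block
echo "rejected hunk" > /tmp/patch-demo/drivers/block/md.c.rej
echo "current code"  > /tmp/patch-demo/drivers/block/md.c

cd /tmp/patch-demo

# Every *.rej file is a hunk that patch(1) could not apply; each one has
# to be merged into the corresponding source file by hand.
find . -name "*rej*"
```

Each listed reject can then be opened next to its source file in a
split-pane editor, exactly as described above.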
I'll be asking the list maintainer about putting a footer onto posts to the
list, but I'm not sure about the feasibility of that just yet.

> This is a production system I am working on here. I can't afford to have
> it down for an hour or two to test a new kernel. I'd rather not be working
> with this mess to begin with, but unfortunately this box was purchased
> before I started this job, and whoever ordered it decided that software
> raid was "Good enough".

If you don't know what you're doing, GET A TEST MACHINE. Sorry to yell, but
don't play with things on production boxes. Find a nice cheapie P-133 type
box, grab a couple of drives, and test out RAID that way. Don't do that on
production boxes. If somebody can't come up with $200 to get you a test
box, then spring for it yourself, and get a decent X term for home. As for
Software RAID being good enough, I find that to be true. If I needed hot
swap,
Re: RAID questions
I found it sufficient to apply
http://people.redhat.com/mingo/raid-patches/raid-2.2.16-A0 to the stock
2.2.16 kernel. Works fine with rh6.2 raid tools.

Hope it helps,
--andrew

> So, I have the following questions.
>
> 1) Do I need to apply the RAID patch to 2.2.16 or not?
> 2) If I do, will it still be broken unless I apply the "2.2.16combo" patch?
> 3) If it will, then how do I resolve the problem with the md.c hunk failing
>    with "2.2.16combo"?
> 4) Is there someone I can contact who can update publicly available
>    documentation to make it easier for people to find what they're looking
>    for?
>
> This is a production system I am working on here. I can't afford to have it
> down for an hour or two to test a new kernel. I'd rather not be working with
> this mess to begin with, but unfortunately this box was purchased before I
> started this job, and whoever ordered it decided that software raid was
> "Good enough".
>
> I am not subscribed to either list, so CCs are desirable. However, if you
> don't want to CC then you don't have to -- I'll just read the archives.
> That is, if someone fixes the "Mailing list archives" link on www.linux.org
> to point to someplace that exists and actually has archives.
>
> Thanks for your time,
> --Adam

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
RAID questions
Hello,

I consider the current state of affairs with Software-RAID to be
unbelievable.

1) The current RAID-Howto (on www.linux.org) does not indicate the correct
   location of RAID patches. I had to go searching all over the web to
   find the 2.2.16 RAID patch.

2) The current 2.2.16 errata lists a problem with md.c which is fixed by
   the patch "2.2.16combo".

3) The patch "2.2.16combo" FAILS if the RAID patch has already been
   applied. Ditto with the RAID patches to md.c if the 2.2.16combo patch
   has already been applied.

4) The kernel help for all of the MD drivers lists a nonexistent
   Software-RAID mini-howto, which is supposedly located at
   ftp://metalab.unc.edu/pub/Linux/docs/HOWTO/mini. There is no such
   document at this location.

5) The kernel help also does not make it clear that you even need a RAID
   patch with current kernels. It is implied that if you "Say Y here" then
   your kernel will support RAID. This problem is exacerbated by the
   missing RAID patches at the location specified in the actual
   Software-RAID-Howto.

So, I have the following questions.

1) Do I need to apply the RAID patch to 2.2.16 or not?
2) If I do, will it still be broken unless I apply the "2.2.16combo" patch?
3) If it will, then how do I resolve the problem with the md.c hunk failing
   with "2.2.16combo"?
4) Is there someone I can contact who can update publicly available
   documentation to make it easier for people to find what they're looking
   for?

This is a production system I am working on here. I can't afford to have it
down for an hour or two to test a new kernel. I'd rather not be working with
this mess to begin with, but unfortunately this box was purchased before I
started this job, and whoever ordered it decided that software raid was
"Good enough".

I am not subscribed to either list, so CCs are desirable. However, if you
don't want to CC then you don't have to -- I'll just read the archives.
That is, if someone fixes the "Mailing list archives" link on www.linux.org
to point to someplace that exists and actually has archives.

Thanks for your time,
--Adam
Re: fsck & fstab
Hi,

On 07-Aug-00 octave klaba wrote:
> raiddev /dev/md0
>     device /dev/sda2
>     device /dev/sdb2
>
> /dev/hda6   /                       ext2  defaults           1 1
> /dev/hda1   /boot                   ext2  defaults           1 2
> /dev/hda5   /usr/local/apache/logs  ext2  defaults           1 3
> /dev/md0    /home                   ext2  defaults,usrquota  1 0
> /dev/sda1   swap                    swap  defaults           0 0
> /dev/sdb1   swap                    swap  defaults           0 0

So you do not have any other partitions on sda and sdb which could get
checked simultaneously. Then my first guess is not the cause, it seems.

Did you have a look at what flags are given to fsck at boot time? Try to
run it by hand with the same flags (it's fsck -C -a -t $type in my case,
SuSE 6.3). Here the '-a' could be the cause of longer check times; there
may be more tests and automatic decisions involved that way.

       -a     Automatically repair the file system without any questions
              (use this option with caution). Note that e2fsck(8) supports
              -a for backwards compatibility only. This option is mapped
              to e2fsck's -p option, which is safe to use, unlike the -a
              option that most file system checkers support.

If it also runs much faster by hand than at boot time, then I'm afraid I
have no more ideas. Maybe somebody else on the list knows something.

regards,
Karl-Heinz

-
Karl-Heinz Herrmann
E-Mail: [EMAIL PROTECTED]
-
Re: FAQ update
On Mon, Aug 07, 2000 at 08:47:47AM -0700, Gregory Leblanc wrote:
> > -----Original Message-----
> > From: James Manning [mailto:[EMAIL PROTECTED]]
> > Sent: Saturday, August 05, 2000 6:08 AM
> > To: Linux Raid list (E-mail)
> > Subject: Re: FAQ update
> >
> > [Luca Berra]
> > > >The patches for 2.2.14 and later kernels are at
> > > >http://people.redhat.com/mingo/raid-patches/. Use the right patch for
> > > >your kernel, these patches haven't worked on other kernel revisions
> > > >yet.
> > >
> > > i'd add: don't use netscape to fetch patches from mingo's site, it
> > > hurts; use lynx/wget/curl/lftp
> >
> > Yes, *please* *please* *please*
>
> I need some clarification on this. I couldn't make lynx work, it chopped
> off long lines or something. wget works, I've never heard of the other
> two. Why exactly is NetScrape bad? That server load thing sounds fishy
> to me...
>
> Greg

ok, i'll clarify.

NutScrape may not work for the same reason lynx failed for you: the redhat
server says the file is text/plain, so both netscape and lynx fail if you
view the file and then save it to a local file. If you Shift-click in
netscape, or press 'd' in lynx, it should work.

i don't give a damn about the load on the redhat http server, but i don't
like receiving tons of mails saying that the patch from mingo's site fails
for them.

L.

P.S. someone could suggest mingo gzips the blasted patches :

--
Luca Berra -- [EMAIL PROTECTED]
Communication Media & Services S.r.l.
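A quick way to catch the kind of mangled download Luca describes: a kernel
patch is a plain unified diff, so its first line should be a "--- " or
"diff " header rather than HTML. The sample file below is fabricated for
illustration; in practice you would run the check on the file you just
fetched:

```shell
# Fabricate a stand-in for a correctly downloaded patch (the first lines
# of a unified diff); a browser-mangled download would start differently.
cat > /tmp/raid-patch.sample <<'EOF'
--- linux/drivers/block/md.c.orig
+++ linux/drivers/block/md.c
EOF

# Sanity-check the first line before feeding the file to patch(1).
if head -1 /tmp/raid-patch.sample | grep -qE '^(--- |diff )'; then
    echo "looks like a patch"
else
    echo "mangled download -- refetch with wget or curl"
fi
```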
RE: raid-2.2.17-A0 cleanup for LVM
> -----Original Message-----
> From: Carlos Carvalho [mailto:[EMAIL PROTECTED]]
> Sent: Monday, August 07, 2000 10:57 AM
> To: Andrea Arcangeli
> Cc: [EMAIL PROTECTED]
> Subject: Re: raid-2.2.17-A0 cleanup for LVM
>
> >In 2.2.x that's not possible but for _very_ silly reasons.
>
> So can't this be fixed?

I wouldn't expect it to be fixed. 2.4 is well on its way, and seems to have
quite a few "silly" things fixed.

> >On 2.4.x we now have a modular and recursive make_request callback, that
> >will allow us to handle all the volume management layering correctly (so
> >if raid5 on top of raid0 isn't working right now in 2.4.x send a bug
> >report ;).
>
> Yes, but it's useless because of the abysmal (absence of) speed. And
> all the VM problems... The machine I need raid50 on is a central
> server; if it stops, everything else goes down. In fact I'm not using
> 2.4 on it precisely because of the VM/raid problems!! :-( :-(
>
> If I can't do raid50 on our server I'll have to resort to raid10,
> losing 50% of our so expensive disks...

No, DASD (disks) are cheap compared with other things, like upgrading the
processor(s) on your Oracle or DB2 server. If you're dealing with SCSI
(which you must be, for that many drives), and using RAID 5, speed can't be
that paramount. Just put another drive on each bus. I know, nobody likes to
spend money on disks, but they're cheaper than losing data.

Greg
RE: FAQ update
> -----Original Message-----
> From: James Manning [mailto:[EMAIL PROTECTED]]
> Sent: Saturday, August 05, 2000 6:08 AM
> To: Linux Raid list (E-mail)
> Subject: Re: FAQ update
>
> [Luca Berra]
> > >The patches for 2.2.14 and later kernels are at
> > >http://people.redhat.com/mingo/raid-patches/. Use the right patch for
> > >your kernel, these patches haven't worked on other kernel revisions
> > >yet.
> >
> > i'd add: don't use netscape to fetch patches from mingo's site, it
> > hurts; use lynx/wget/curl/lftp
>
> Yes, *please* *please* *please*

I need some clarification on this. I couldn't make lynx work, it chopped
off long lines or something. wget works, I've never heard of the other two.
Why exactly is NetScrape bad? That server load thing sounds fishy to me...

Greg
Re: fsck & fstab
Hi,

> > I realized after a crash that if the fsck in /etc/fstab is on, it takes
> > about 45-50 minutes to check 2x18GB SCSI in software raid.
> > [...] and doing it all by hand (unmount, fsck, reboot) takes 6 minutes.
>
> Hmmm... Do you have *only one* raid partition on that drive, or are there
> other partitions in use (and checked) as well?

I have this:

/dev/hda6   /                       ext2  defaults           1 1
/dev/hda1   /boot                   ext2  defaults           1 2
/dev/hda5   /usr/local/apache/logs  ext2  defaults           1 3
/dev/md0    /home                   ext2  defaults,usrquota  1 0
/dev/sda1   swap                    swap  defaults           0 0
/dev/sdb1   swap                    swap  defaults           0 0

md0 was in pass 4; now it is 0. It was very, very slow between 80% and 95%,
then it was very quick. Done by hand, it is quick the whole time:

/dev/md0              17251748   9969024   6406384  61% /home

> If there is a /dev/md0 on /dev/sda1 and /dev/sdb1, and /dev/sda2,
> /dev/sdb2 to check, it will cause e2fsck to run on md0 and sda2, sdb2 at
> the same time, because it doesn't know it's the same physical drive. This
> will lead to a lot of head movements (it should get quite loud) and will
> slow down fsck tremendously.

So why is fsck quicker when done by hand?

Amicalement,
oCtAvE

"Internet ? Welcome in the slave economy."
Re: raid-2.2.17-A0 cleanup for LVM
On Mon, 7 Aug 2000, Carlos Carvalho wrote:

>So can't this be fixed?

Everything can be fixed; the question is whether it's worth it. We'd better
spend our efforts on making 2.4.x more stable than on backporting new stuff
to 2.2.x... The fix precisely to allow raid5 on raid0 could be pretty
localized if done in the wrong way, though (by "wrong way" I mean not in
the 2.4.x way).

Andrea
Re: raid-2.2.17-A0 cleanup for LVM
Andrea Arcangeli ([EMAIL PROTECTED]) wrote on 7 August 2000 16:50:

>On Sun, 6 Aug 2000, Carlos Carvalho wrote:
>
>>Does this patch allow raid5 over raid0? That'd be really wonderful...
>
>Whether it's useful or not, which 2.?.x?

The latest if possible, but the one your patch applies to if I have no
other choice...

>In 2.2.x that's not possible but for _very_ silly reasons.

So can't this be fixed?

>On 2.4.x we now have a modular and recursive make_request callback, that
>will allow us to handle all the volume management layering correctly (so
>if raid5 on top of raid0 isn't working right now in 2.4.x send a bug
>report ;).

Yes, but it's useless because of the abysmal (absence of) speed. And all
the VM problems... The machine I need raid50 on is a central server; if it
stops, everything else goes down. In fact I'm not using 2.4 on it precisely
because of the VM/raid problems!! :-( :-(

If I can't do raid50 on our server I'll have to resort to raid10, losing
50% of our so expensive disks...
Re: fsck & fstab
> > /dev/md0              17251748   9969024   6406384  61% /home
>
> Which partitions are included in your md0?

raiddev /dev/md0
    raid-level              1
    nr-raid-disks           2
    nr-spare-disks          0
    chunk-size              32
    persistent-superblock   1
    device                  /dev/sda2
    raid-disk               0
    device                  /dev/sdb2
    raid-disk               1

Amicalement,
oCtAvE

"Internet ? Welcome in the slave economy."
RE: FAQ
Here's one more update of the FAQ. Assuming not too many objections, I'll
send it to Jacob, and see if I can contact the list owner and get a footer
onto this list.

Greg

Linux-RAID FAQ
Gregory Leblanc
gleblanc (at) cu-portland.edu

Revision History
Revision v0.03  7 August 2000  Revised by: gml
    Added a request to use a wget-type program to fetch the patch. Tried
    to make things look a little bit better, failed miserably.
Revision v0.02  4 August 2000  Revised by: gml
    Revised the "How do I patch?" and the "What does /proc/mdstat look
    like?" questions.

This is a FAQ for the Linux-RAID mailing list, hosted on vger.rutgers.edu.
It's intended as a supplement to the existing Linux-RAID HOWTO, to cover
questions that keep occurring on the mailing list. PLEASE read this
document before you post to the list.
_

1. General
   1.1. Where can I find archives for the linux-raid mailing list?
2. Kernel
   2.1. I'm running [insert your linux distribution here]. Do I need to
        patch my kernel to make RAID work?
   2.2. How can I tell if I need to patch my kernel?
   2.3. Where can I get the latest RAID patches for my kernel?
   2.4. How do I apply the patch to a kernel that I just downloaded from
        ftp.kernel.org?

1. General

1.1. Where can I find archives for the linux-raid mailing list?

My favorite archives are at Geocrawler.
http://www.geocrawler.com/lists/3/Linux/57/0/

Other archives are available at
http://marc.theaimsgroup.com/?l=linux-raid&r=1&w=2

Another archive site is
http://www.mail-archive.com/linux-raid@vger.rutgers.edu/

2. Kernel

2.1. I'm running [insert your linux distribution here]. Do I need to patch
my kernel to make RAID work?

Well, the short answer is: it depends. Distributions that are keeping up to
date have the RAID patches included in their kernels. The kernel that
RedHat distributes includes them, as do some others. If you download a
2.2.x kernel from ftp.kernel.org, then you will need to patch your kernel.

2.2. How can I tell if I need to patch my kernel?

The easiest way is to check what's in /proc/mdstat. Here's a sample from a
2.2.x kernel with the RAID patches applied:

[gleblanc@grego1 gleblanc]$ cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [translucent]
read_ahead not set
unused devices:

If the contents of /proc/mdstat look like the above, then you don't need to
patch your kernel. Here's a sample from a 2.2.x kernel without the RAID
patches applied:

[root@finch root]$ cat /proc/mdstat
Personalities : [1 linear] [2 raid0] [3 raid1] [4 raid5]
read_ahead not set
md0 : inactive
md1 : inactive
md2 : inactive
md3 : inactive

If your /proc/mdstat looks like this one, then you need to patch your
kernel.

2.3. Where can I get the latest RAID patches for my kernel?

The patches for the 2.2.x kernels up to, and including, 2.2.13 are
available from ftp.kernel.org. Use the kernel patch that most closely
matches your kernel revision; for example, the 2.2.11 patch can also be
used on 2.2.12 and 2.2.13. The patches for 2.2.14 and later kernels are at
http://people.redhat.com/mingo/raid-patches/. Use the right patch for your
kernel; these patches haven't worked on other kernel revisions yet. Please
use something like wget/curl/lftp to retrieve the patch, as it's easier on
the server than using a client like Netscape. Downloading patches with
Lynx has been unsuccessful for me; wget may be the easiest way.

2.4. How do I apply the patch to a kernel that I just downloaded from
ftp.kernel.org?

First, unpack the kernel into some directory; generally people use
/usr/src/linux. Change to this directory, and type
patch -p1 < /path/to/raid-version.patch. On my RedHat 6.2 system, I
decompressed the 2.2.16 kernel into /usr/src/linux-2.2.16. From
/usr/src/linux-2.2.16, I type in
patch -p1 < /home/gleblanc/raid-2.2.16-A0. Then I rebuild the kernel using
make menuconfig and related builds.
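The mechanics of section 2.4 can be tried safely on a toy tree before
touching a real kernel. The paths and file names below are made up for the
demonstration; only the patch -p1 invocation carries over:

```shell
# Build two tiny trees that differ in one file, and make a patch from them.
mkdir -p /tmp/tree-a /tmp/tree-b
echo "old line" > /tmp/tree-a/file.c
echo "new line" > /tmp/tree-b/file.c
# diff exits 1 when the files differ, which is expected here
(cd /tmp && diff -u tree-a/file.c tree-b/file.c > demo.patch; true)

# Apply it the same way as a kernel RAID patch: from inside the tree,
# with -p1 stripping the leading "tree-a/" path component.
cd /tmp/tree-a
patch -p1 < /tmp/demo.patch
cat file.c    # now contains "new line"
```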
Re: newbie question
you could write a simple shell or perl script to do this using
/proc/mdstat as a reference, but it is a bad idea to put in a drive and
have the kernel _assume_ you want to put things back the way they were. i
prefer the control, rather than having the kernel assume.

allan

Emmanuel Galanos <[EMAIL PROTECTED]> said:
> Greetings,
> If this is documented somewhere, feel free to tell me where it is:
>
> I just set up a software RAID 1 using 2 IDE disks and no spares. I'm
> using the kernel that comes with the RH beta (md 0.90.0, raidtools 0.90).
>
> Anyway, to test it, I halted the machine then disconnected one of the
> drives. Booted the machine, it goes into degraded mode. Everything fine.
> Power down. Reconnect drive. Restart. The array still stays in degraded
> mode :( (timecounter was out by 2).
>
> Looking at the source, this is the intended behaviour if the md devices
> are out of sync by more than one time increment. I managed to then find
> the command raidhotadd, and was thus able to add the extra partitions
> back into the array (I am using 5 md devices) manually, and everything
> was peachy again. Only problem was that it was ugly having to specify
> each of the individual partitions/md devices.
>
> Question: Without having spare disks, is there a way to get md to
> automatically start a reconstruction using the "freshest" copy? (besides
> getting rid of the test in the source). Is there a reason why it should
> not do this?
>
> Thanks.
> emmanuel
> --
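A minimal sketch of the kind of script allan mentions, run here against a
fabricated /proc/mdstat sample in the md 0.90 format; on a real box you
would read /proc/mdstat itself, and the raidhotadd device names at the end
are only illustrative:

```shell
# Fabricated sample of /proc/mdstat; a real script would use /proc/mdstat.
cat > /tmp/mdstat.sample <<'EOF'
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hdc1[1] 1953280 blocks [2/1] [_U]
md1 : active raid1 hdc2[1] hda2[0] 1953280 blocks [2/2] [UU]
unused devices:
EOF

# A degraded array shows an underscore for each missing member disk in
# its [UU...] status string, so look for md lines containing one.
degraded=$(awk '/^md/ && /\[[U_]*_[U_]*\]/ { print $1 }' /tmp/mdstat.sample)
echo "degraded: $degraded"

# One would then re-add the right partition to each degraded array by
# hand, e.g.:  raidhotadd /dev/md0 /dev/hda1   (device names illustrative)
```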
Re: fsck & fstab
On 07-Aug-00 octave klaba wrote:
> /dev/md0 17251748 9969024 6406384 61% /home

Which partitions are included in your md0?

K.-H.

E-Mail: Karl-Heinz Herrmann <[EMAIL PROTECTED]>
http://www.kfa-juelich.de/icg/icg7/FestFluGre/transport/khh/general.html
Sent: 07-Aug-00, 17:52:38

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
newbie question
Greetings,

If this is documented somewhere, feel free to tell me where it is:

I just set up a software RAID 1 using 2 IDE disks and no spares. I'm using the kernel that comes with the RH beta (md 0.90.0, raidtools 0.90).

Anyway, to test it, I halted the machine then disconnected one of the drives. Booted the machine, it goes into degraded mode. Everything fine. Power down. Reconnect drive. Restart. The array still stays in degraded mode :( (timecounter was out by 2).

Looking at the source, this is the intended behaviour if the md devices are out of sync by more than one time increment. I managed to then find the command raidhotadd, and was thus able to add the extra partitions back into the array (I am using 5 md devices) manually, and everything was peachy again. Only problem was that it was ugly having to specify each of the individual partitions/md devices.

Question: Without having spare disks, is there a way to get md to automatically start a reconstruction using the "freshest" copy (besides getting rid of the test in the source)? Is there a reason why it should not do this?

Thanks.
emmanuel
RE: fsck & fstab
Hi!

On 07-Aug-00 octave klaba wrote:
> I realized after a crash, if in /etc/fstab the fsck is
> on, it takes about 45-50 minutes to check 2x18GB scsi in raidsoft.
> [...] and do everything by hand (unmount, fsck, reboot) and it takes 6 minutes.

Hmmm... Do you have *only one* raid partition on each drive, or are there other partitions in use (and checked) as well? If there is a /dev/md0 on /dev/sda1 and /dev/sdb1, and /dev/sda2 and /dev/sdb2 also need checking, fsck will run e2fsck on md0 and on sda2, sdb2 at the same time, because it doesn't know they're on the same physical drives. This leads to a lot of head movement (it should get quite loud) and slows down fsck tremendously. I changed my fsck order to first check root, then my raids, and then everything else with -a (the raids come back immediately as clean).

K.-H.

E-Mail: Karl-Heinz Herrmann <[EMAIL PROTECTED]>
http://www.kfa-juelich.de/icg/icg7/FestFluGre/transport/khh/general.html
Sent: 07-Aug-00, 17:31:53
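As an illustration of that ordering, here is a hypothetical fstab sketch (device names and mount points invented, not from Karl-Heinz's system). The sixth field serializes the checks: root in pass 1, the array in pass 2, and the remaining partitions on the same spindles in pass 3, so fsck never runs two checks against one physical drive at once.

```
# /etc/fstab -- hypothetical sketch, not a real system's table.
# <device>   <mount>  <type>  <options>  <dump>  <fsck pass>
/dev/sda3    /        ext2    defaults   1       1
/dev/md0     /home    ext2    defaults   1       2
/dev/sda2    /var     ext2    defaults   1       3
/dev/sdb2    /usr     ext2    defaults   1       3
```

fsck runs filesystems with the same pass number in parallel, which is exactly what hurts here since it can't see that md0 shares spindles with sda2 and sdb2.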
Re: raid-2.2.17-A0 cleanup for LVM
On Sun, 6 Aug 2000, Carlos Carvalho wrote:
>Does this patch allow raid5 over raid0? That'd be really wonderful...

Useful or not, which 2.x are we talking about? In 2.2.x that's not possible, but for _very_ silly reasons. Raid0 in general is a no-brainer, a fully transparent layer that we can place anywhere, anytime we want, definitely also behind raid5. Fixing this in 2.2.x is ugly (it's already ugly enough supporting LVM (lvm does linear and raid0) on top of RAID{[015],linear} ;). (Note the other way around doesn't work, for the same silly reasons raid5 on top of raid0 doesn't work.) Other raid levels (1/5), which need to generate additional requests, are a little bit more problematic though.

On 2.4.x we now have a modular and recursive make_request callback that will allow us to handle all the volume-management layering correctly (so if raid5 on top of raid0 isn't working right now in 2.4.x, send a bug report ;).

Andrea
fsck & fstab
Hi,

I realized after a crash that if in /etc/fstab the fsck is on, it takes about 45-50 minutes to check 2x18GB scsi in raidsoft. So I put 0 in /etc/fstab, and it mounts /dev/md0 directly (which is no good, I agree). But after that I go to init 1 and do everything by hand (unmount, fsck, reboot), and it takes 6 minutes.

Any idea why it is so long?

Octave
--
Amicalement,
oCtAvE
"Internet ? Welcome in the slave economy."
RE: Problems booting from RAID
> -----Original Message-----
> From: Jane Dawson [mailto:[EMAIL PROTECTED]]
> Sent: Monday, August 07, 2000 3:13 AM
> To: [EMAIL PROTECTED]
> Subject: Problems booting from RAID
>
> Hi,
>
> I decided to set up a completely RAID-based system using two identical IDE
> hard disks, each with 3 partitions (boot, swap and data).

Be careful about swap on RAID. There are lots of details in the archives, but swap on RAID during reconstruction is a really bad thing, so don't let it happen.

> But I am having appalling problems in getting the machine to boot from
> RAID! I've been through the Software-RAID-HOWTO so many times I can almost
> recite it, but still things aren't going as they should.
>
> Does anyone have any pointers as to where I'm going wrong? At the moment,
> all six partitions are set in 'cfdisk' to type 'fd'. What should I put in
> lilo.conf, etc.?

You probably need to patch lilo, and then have a look at the LDP document on the subject: http://www.LinuxDoc.org/HOWTO/Boot+Root+Raid+LILO.html. This one looked good, but I haven't dug into it very much, as RH works wonderfully to install onto RAID.

Greg
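For flavour, here is a hypothetical sketch of the per-disk lilo.conf approach that HOWTO describes (all paths and option values invented; check the document and your patched lilo for the real syntax). The idea is one config per disk, each installing a boot record on its own drive, with root pointed at the array, so either disk can boot alone if its twin dies.

```
# /etc/lilo.conf.hda -- hypothetical sketch; a twin file for the second
# disk would say boot=/dev/hdc. Install with: lilo -C /etc/lilo.conf.hda
boot=/dev/hda          # boot record on this disk's MBR
image=/boot/vmlinuz
    label=linux
    root=/dev/md0      # root filesystem lives on the raid1 array
    read-only
```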
RE: owie, disk failure
> > > disks are less than two weeks old, although I have heard of people
> > > having similar problems (disks failing in less than a month new from
> > > the factory) with this brand and model I would like to get the
> >
> > In my experience 95% of drive failures occur in the first couple of
> > weeks. If they get out of this timeframe, then I find they usually last
> > for a long time. I don't think this is a failing of this brand and/or
> > model.

Well, of the drives that I've had, they either fail after a few weeks, or after several years (like 5+). Almost never in between. We keep a spare drive of each size around anyway. :-)

> > To check and see if the drive is actually in good condition, grab the
> > diagnostic utility from the support site of your drive manufacturer,
> > boot from a DOS floppy, and run diagnostics on the drive.
>
> I have to confess I've never heard of manufacturers offering diagnostic
> utilities for disks... Gregory, can you point me at any examples? Am I
> just being a complete dumbass here?

Yes, you are. :-) From Maxtor's site (since I just RMA'd a drive last week) (http://www.maxtor.com/) click on software download. Right on that page is info about the MaxDiag utility. It does a little more than badblocks and friends, at least for IDE drives. It will return drive-specific error codes, and if you've run all of those tests by the time you call support, you can just give them the error numbers and they issue an RMA. The other nice feature is that it gives you the tech support number to call as soon as it shows the error. :-)

> > In order for them to replace my drives, I've had to do "write"
> > testing, which destroys all data on the drive, so you may want to
> > disconnect power from one of the drives before you play around with that.
>
> If you don't trust yourself to get the right disk for a write test then
> you need to do this. However, if you check *EXACTLY* what you are doing
> before running a write-test, then I don't see any reason to go so far as
> to unplug the disks. YMMV.

Well, that's true, but if you don't trust yourself to get the right drive, then you should unplug the one that still has the data intact. Depending on the value of the data, it may be worth unplugging it just for safety's sake, although if it's that important, it should be backed up.

Later,
Greg
RE: RAID1 problem under 2.2.16 linux kernel.
Hi!

On 07-Aug-00 Tomasz Gralewski wrote:
> I need help.
> I use on a corporate machine a linux kernel of version 2.2.16.
> The problem is:
> This version of the kernel does not support CONFIG_AUTODETECT_RAID.
> I'd like to start the root filesystem from /dev/md0, but the kernel tells
> me during boot time that it cannot mount the
> root filesystem /dev/md0.

You seem to have the stock 2.2.16 kernel without the newer raid patches. You can find a 2.2.16 patch at: http://people.redhat.com/mingo/raid-patches/ Apply that to your kernel source (add --dry-run to test whether the patch will succeed before actually changing your kernel source).

> Is there a different method to set up mirroring under the 2.2.16 kernel?
> Maybe the patches, lilo configuration, initrd would help.
> Has this problem been solved?

In my kernel configuration there is an option for autodetecting and booting from raid. There is also a Raid-Howto which describes how to set up a raid1 to boot from. I don't boot from mine, but my raid is detected and started at boot time, right after the scsi-device scan and partition check.

I hope this helps,
K.-H.

E-Mail: Karl-Heinz Herrmann <[EMAIL PROTECTED]>
http://www.kfa-juelich.de/icg/icg7/FestFluGre/transport/khh/general.html
Sent: 07-Aug-00, 13:34:37
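To make the --dry-run step concrete, here is a self-contained toy session (the file and patch contents are made up for the example; substitute your kernel tree and the raid patch). The point is that the rehearsal reports success or failure without touching anything:

```shell
# Rehearse a patch with --dry-run, confirm nothing changed, then apply.
mkdir -p /tmp/patch-demo && cd /tmp/patch-demo
printf 'old line\n' > md.c
cat > fix.patch <<'EOF'
--- a/md.c
+++ b/md.c
@@ -1 +1 @@
-old line
+new line
EOF
patch -p1 --dry-run < fix.patch   # reports success/failure, touches nothing
grep -q 'old line' md.c           # the file is still untouched
patch -p1 < fix.patch             # apply for real once the rehearsal is clean
grep -q 'new line' md.c
```

If the dry run reports failed hunks, stop and sort that out before applying anything to a kernel tree you care about.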
RAID1 problem under 2.2.16 linux kernel.
I need help. I use on a corporate machine a linux kernel of version 2.2.16. The problem is: this version of the kernel does not support CONFIG_AUTODETECT_RAID. I'd like to start the root filesystem from /dev/md0, but the kernel tells me during boot time that it cannot mount the root filesystem /dev/md0.

Is there a different method to set up mirroring under the 2.2.16 kernel? Maybe the patches, lilo configuration, initrd would help. Has this problem been solved?

Tomasz Gralewski
Problems booting from RAID
Hi,

I decided to set up a completely RAID-based system using two identical IDE hard disks, each with 3 partitions (boot, swap and data). The setup is:

hda1 and hdc1 = 800Mb boot
hda2 and hdc2 = 128Mb swap
hda3 and hdc3 = 3Gb data

I'm using kernel 2.4.0-test5 with Ingo's 'dangerous' raidtools and have successfully managed to get the data partitions running in RAID-1 as /dev/md2. Great :)

But I am having appalling problems in getting the machine to boot from RAID! I've been through the Software-RAID-HOWTO so many times I can almost recite it, but still things aren't going as they should.

Does anyone have any pointers as to where I'm going wrong? At the moment, all six partitions are set in 'cfdisk' to type 'fd'. What should I put in lilo.conf, etc.? Any help / common problems would be most appreciated, because this thing is driving me crazy!

Kind regards,
Jane
RE: owie, disk failure
On Mon, 7 Aug 2000, Corin Hartland-Swann wrote:
> I have to confess I've never heard of manufacturers offering diagnostic
> utilities for disks... Gregory, can you point me at any examples? Am I
> just being a complete dumbass here?

At least Western Digital does, at ftp://ftp.wdc.com/pub/drivers/hdutil; however, I don't know what those utils do better than badblocks & friends, or how.

D.
RE: owie, disk failure
Jeffrey,

On Sun, 6 Aug 2000, Gregory Leblanc wrote:
> On Sun, 6 Aug 2000, Jeffrey Paul wrote:
> > h, the day i had hoped would never arrive has...

It's _always_ waiting :(

> > Aug 2 07:38:27 chrome kernel: raid1: Disk failure on hdg1,
> > disabling device.

OK, so it thinks hdg1 is faulty...

> > Aug 2 07:38:27 chrome kernel: raid1: md0: rescheduling block 8434238
> > Aug 2 07:38:27 chrome kernel: md0: no spare disk to reconstruct
> > array! -- continuing in degraded mode
> > Aug 2 07:38:27 chrome kernel: raid1: md0: redirecting sector 8434238
> > to another mirror
> >
> > my setup is a two-disk (40gb each) raid1 configuration... hde1 and
> > hdg1. I didn't have measures in place to notify me of such an
> > event, so I didnt notice it until i looked at the console today and
> > noticed it there...

In 'degraded' mode it is basically just a normal disk without redundancy. Nothing bad is going to happen just because you're still running it in degraded mode.

> I think I ran for about 2 weeks on a dead drive. Thankfully it wasn't a
> production system, but notification isn't quite as "out of the box" as it
> needs to be just yet. A simple cron script is probably the way to go.

> > I ran raidhotremove /dev/md0 /dev/hdg1 and then raidhotadd /dev/md0
> > /dev/hdg1 and it seemed to begin reconstruction:

I don't understand why you did this... it thinks the disk has failed, and yet you are using raidhotadd to reinsert it into the array. The idea is that you replace the disk, and _then_ raidhotadd the new disk. Having said that, there's nothing wrong with what you did - it will presumably just fail again at some later date.

> > but I got scared and decided to stop it... so now it's sitting idle
> > unmounted spun down (both disks) awaiting professional advice (rather
> > than me stumbling around in the dark before i hose my data). Both

I think what you need to do is to test the disk to see if it's really faulty.
If it's a Maxtor DiamondMax Plus 40 (the only 40G disk I'm aware of), then try:

badblocks -s -v /dev/hdg1 40017915

(or substitute the correct number of blocks for your partition). If this succeeds, you may want to try it with the '-w' option (enable writes). This takes a *VERY* *LONG* *TIME* though. I believe it could be several *DAYS* on a disk this size, since it repeatedly writes to the disk and then reads the data back to check for errors.

> > disks are less than two weeks old, although I have heard of people
> > having similar problems (disks failing in less than a month new from
> > the factory) with this brand and model I would like to get the

In my experience 95% of drive failures occur in the first couple of weeks. If they get out of this timeframe, then I find they usually last for a long time. I don't think this is a failing of this brand and/or model.

> > drives back to the way they were before the system decided that the
> > disk had failed (what causes it to think that, anyways?) and see if
> > it continues to work, as I find it hard to believe that the drive
> > would have died so quickly. What is the proper course of action?

It is entirely possible that it has failed (but luckily you'll get a replacement really quickly when it fails so early). You can continue to run the system in degraded mode for the moment, as long as you're aware that there's no redundancy. If you confirm it's faulty, then I'd return it, get the new disk, and then raidhotadd it back into the array.

> First, do you have ANY log messages from anything other than RAID indicating
> a failed disk? Since these are IDE drives, I'd expect some messages from
> the IDE subsystem if the drive really had died (my SCSI messages went pretty
> wild when I had a disk fail).

I'd agree with Gregory here - I'd definitely expect something else in the logs (IDE bus resets, perhaps). The disk may well be fine, and just got ejected from the array by gremlins...
> To check and see if the drive is actually in good condition, grab the
> diagnostic utility from the support site of your drive manufacturer,
> boot from a DOS floppy, and run diagnostics on the drive.

I have to confess I've never heard of manufacturers offering diagnostic utilities for disks... Gregory, can you point me at any examples? Am I just being a complete dumbass here?

> In order for them to replace my drives, I've had to do "write"
> testing, which destroys all data on the drive, so you may want to disconnect
> power from one of the drives before you play around with that.

If you don't trust yourself to get the right disk for a write test then you need to do this. However, if you check *EXACTLY* what you are doing before running a write-test, then I don't see any reason to go so far as to unplug the disks. YMMV.

Regards,
Corin

/+-\
| Corin Hartland-Swann | Direct: +44 (0) 20 7544 4676 |
| Commerce Internet Ltd | Mobile: +44 (0) 79 5854 0027