raid10 far layout outperforms offset at writing? (was: Help with chunksize on raid10 -p o3 array)
Peter Rabbitson wrote:

I have been trying to figure out the best chunk size for raid10 before migrating my server to it (currently raid1). I am looking at 3 offset stripes, as I want two-drive-failure redundancy, and offset striping is said to have the best write performance, with read performance equal to far. Incorporating suggestions from previous posts (thank you, everyone), I used this modified script: http://rabbit.us/pool/misc/raid_test2.txt

To negate the effects of caching, memory was kept below 200 MB free by filling a tmpfs mount, with no swap. Here is what I got with the far layout (-p f3): http://rabbit.us/pool/misc/raid_far.html

The clear winner is 1M chunks, and it is very consistent at any block size. I was even more surprised to see that my read speed was identical to that of a raid0, getting near the _maximum_ physical speed of 4 drives (roughly 55 MB/s sustained across 1.2 GB). Unlike the offset layout, far really shines at reading stuff back, and the write speed did not suffer noticeably compared to offset striping. Here are the offset results (-p o3) for comparison: http://rabbit.us/pool/misc/raid_offset.html; they roughly correlate with my earlier testing using dd.

So I guess the way to go for this system will be f3, although md(4) says that the offset layout should be more beneficial. Is there anything I missed while setting up my o3 array, such that I got worse performance for both read and write compared to f3?

Once again, thanks everyone for the help.

Peter

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
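For reference, the two layouts being compared can be sketched as below. This is a hypothetical reconstruction, not the poster's actual commands: the device names are placeholders, and the O_DIRECT read replaces the tmpfs trick for defeating the page cache.

```shell
# Hypothetical sketch; /dev/sd[a-d]1 are placeholder partitions.
# mdadm takes --chunk in KiB, so 1024 means the 1 MiB chunks that
# won in the far-layout tests.
mdadm --create /dev/md0 --level=10 --layout=f3 --chunk=1024 \
      --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

# The offset variant under discussion differs only in the layout flag:
#   mdadm --create /dev/md0 --level=10 --layout=o3 --chunk=1024 ...

# Quick sequential read check; iflag=direct bypasses the page cache
# instead of pinning memory with a full tmpfs:
dd if=/dev/md0 of=/dev/null bs=1M count=1200 iflag=direct
```

With f3 on 4 drives, every chunk exists on 3 disks, so any two drives can fail, which matches the redundancy goal stated above.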
strange test results
Hi!

I am running tests on our new test machine. It has two dual-core Xeons, an Intel 5000 chipset, two 3ware SATA RAID cards on PCIe, and 15 SATA2 disks, running Debian Etch. More info at the bottom.

The first phase of the test is probing the various raid levels, so I configured the cards as 15 JBOD disks and hacked together a testing script. The script builds raid arrays, waits for the sync, and then runs this command:

iozone -eM -s 4g -r 1024 -i0 -i1 -i2 -i8 -t16 -+u

The graphs of the results are here: http://gergely.tomka.hu/dt/index.html

And I have a lot of questions.

http://gergely.tomka.hu/dt/1.html
This graph is crazy, like thunderbolts. And raid50 is generally slower than raid5. Why?

http://gergely.tomka.hu/dt/3.html
This is the only graph I can explain :)

http://gergely.tomka.hu/dt/4.html
With random readers, why does raid0 slow down? And why is raid10 faster than raid0?

http://gergely.tomka.hu/dt/2.html
Why can't raid6 become faster with more disks, as raid5 and raid50 do?

So, lots of questions. I am generally surprised by the non-linearity of some results, and by the lack of speedup from more disks in others.
And now, the details.

Hardware (from dmidecode):
  Base Board: Supermicro X7DB8
  Processors: two Intel Xeon (ID 64 0F 00 00 FF FB EB BF; Type 0, Family 15, Model 6, Stepping 4), LGA771 sockets
  Memory: two 1024 MB DDR2 DIMMs, 533 MHz (1.9 ns), 72-bit total width / 64-bit data width

ursula:~# tw_cli show
Ctl  Model       Ports  Drives  Units  NotOpt  RRate  VRate  BBU
c0   9590SE-8ML  8      7       7      0       1      1      -
c1   9590SE-8ML  8      8       8      0       1      1      -

The tests generally:
  mdadm (create the array)
  mkfs.xfs
  blockdev --setra 524288 on the md device (maybe not a good idea with multiple arrays)
  run the iozone test

Here raid10 is two-disk raid1s in a raid0, and raid50 is three-disk raid5s in a raid0. These tests run for a week and are now slowly finishing; for this reason, replicating the test to filter out accidents is not a good option.

Any comments?

--
Tomka Gergely, [EMAIL PROTECTED]
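The build-wait-test cycle described above could look roughly like this. A hypothetical sketch, not the actual script: the member disks, mount point, and raid level are placeholders.

```shell
# Hypothetical harness sketch; device names and paths are placeholders.
# Build one array from five JBOD disks:
mdadm --create /dev/md0 --level=5 --raid-devices=5 \
      /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf --run

# Wait for the initial sync to finish before benchmarking;
# running iozone during a resync would badly skew the numbers:
while grep -Eq 'resync|recovery' /proc/mdstat; do
    sleep 30
done

mkfs.xfs -f /dev/md0
blockdev --setra 524288 /dev/md0
mount /dev/md0 /mnt/test

cd /mnt/test
iozone -eM -s 4g -r 1024 -i0 -i1 -i2 -i8 -t16 -+u
```

The wait loop matters for interpreting the graphs: any run started before /proc/mdstat went quiet would be competing with the rebuild for disk bandwidth.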
Raid1 replaced with raid10?
Hi,

I just tried an idea I got after fiddling with raid10, and to my surprise it worked as I thought it would. I used two small partitions on separate disks to create a raid1 array, then ran dd if=/dev/md2 of=/dev/null. Only one of the disks was reading; nothing unexpected. Then I created a raid10 array on the same two partitions with the options -l10 -n2 -pf2. The same dd executed at twice the speed, reading _simultaneously_ from both drives. Some bonnie++ benchmarking gave the same result: raid1 reads only from a single disk, raid10 reads from both. Write performance is somewhat worse (about 10% slower) with raid10, but you get twice the read speed.

In this light, the obvious question is: can raid10 be used as a drop-in replacement for raid1, or is there a caveat in having the number of disks equal the number of chunk copies?

Peter
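The experiment above can be sketched as follows. Partition names are placeholders; this is a hedged reconstruction of the steps described, not the poster's exact commands.

```shell
# Hypothetical reproduction; /dev/sdb1 and /dev/sdc1 are placeholders.

# Plain two-disk mirror: a single sequential reader is served
# from one member only.
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
dd if=/dev/md2 of=/dev/null bs=1M

# Same two partitions as RAID10 "far 2": the far layout places the
# first copy of the data striped across both disks, so a sequential
# read streams from both members at once.
mdadm --stop /dev/md2
mdadm --create /dev/md2 --level=10 --raid-devices=2 --layout=f2 \
      /dev/sdb1 /dev/sdc1
dd if=/dev/md2 of=/dev/null bs=1M
```

The modest write penalty reported above is a known property of the far layout: each write must land in two widely separated zones of each disk, costing extra seeks.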
Re: [Linux-usb-users] Failed reads from RAID-0 array (from newbie who has read the FAQ)
Michael Schwarz wrote:
> More than ever, I am convinced that it is actually a hardware problem,
> but I am curious for the opinions of both of you on whether the system
> (meaning, I guess, the combination of the usb-storage driver and raid)
> is really doing the best with what it has.

See below, but the short answer is that there is probably room for improvement.

> My last effort was to switch to a different computer. When I did, I got
> in the dmesg log (unfortunately not preserved, although I should be able
> to recreate it) a report that one of the flash drives had bad blocks.
> Some part of the system eventually decided it was a dead device (I
> believe dmesg indicated the scsi subsystem said so). The device (it
> happened to be /dev/sdc) was peremptorily dropped from the system. This
> appears to be what hung the raid system. (Why these messages never
> appeared on the other computer is beyond me; obviously some difference
> in how the actual USB controller reports errors, but, as I said, I've
> never studied USB drivers or hardware. In fact, once you get beyond the
> UARTs you are getting sophisticated to me.)
>
> I've built an array of five known-good devices and so far it works
> swimmingly (at least on the hardware that was better at error
> reporting). So it seems to me that there is probably nothing actually
> wrong with the drivers or their interactions, and it leaves me only
> asking whether there should be some sort of improvement in error
> reporting/recovery up to userland. If I am right and the scsi system
> was marking a device as dead, shouldn't the userland read against the
> md device get an error instead of an indefinite hang?

Let me make sure I have this scenario right... one write process (dd or cp) hangs, but you can still access data on the array, so the devices (all of them?) are working. It would be useful at that point to see if /proc/mdstat shows one device as failed.

Given the behavior you have described, I would think that there is still a problem in the driver or md somewhere; hangs should time out, errors should be reported up, and if this is caused by a lost write completion, I would hope that would be timed out and reported. That's my read on it: these "just hangs" cases are probably undetected or mishandled errors which should be passed up and reported to the application, or retried and completed, or handled in some better way than what you describe. Bad hardware is a fact of life; if you feel like chasing this more, an understanding of what the hardware did wrong and what the kernel didn't do right would be helpful. Of course, the failure mode may be so rare, and the fix so time-consuming, that it won't get fixed, but it can get documented.

> Beyond this question, which I leave to you (although I'd love to hear
> your answers/thoughts), I think we can safely say that the problem was
> hardware (even if hard to find). If either of you would like, I'd be
> happy to find time this week to recreate the error on my better PC and
> send that along.
>
> As for rolling a custom kernel with more message buffer, well, I'm
> going to be getting into a new device driver in the coming months, so
> a custom debug kernel is definitely in my future, but I'm not sure
> when. I must say, the kernel has become a much more complex beastie
> since 2.2.x! (Although it also appears to be improved and somewhat
> more organized, but definitely MUCH larger!) Thank you both so much!
> I wouldn't even have diagnosed my hardware problem without your
> prompts. I'm very grateful. Let me know if you'd like those dmesg logs
> or if you'd just like to let it go!

--
bill davidsen [EMAIL PROTECTED]
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
Re: [Linux-usb-users] Failed reads from RAID-0 array (from newbie who has read the FAQ)
I'm going to hang on to the hardware. This is a pilot/demo that may lead to development of a new device, and, if so, I'll be getting back into device driver writing. Working this problem would be great practice for that, so I will do it. The only problem is I don't know when! I believe I can replicate the problem, so I'll find time (perhaps next weekend) to capture the data of interest.

Mr. Stern: Where might I go for low-level programming information on USB devices? I'm interested in registers/DMA/packet formats, etc. I've found info on the USB protocol itself, but I haven't found info on devices. Obviously I can dig through kernel source, but documents would be nice! Again, if this is an unreasonable request for you to do my homework, just say so! I won't be offended. I'm sure I can find it myself given time, but if you happen to have some URLs handy, they'd be appreciated.

YET AGAIN thank you both! You've been of great help.

--
Michael Schwarz
Re: [Linux-usb-users] Failed reads from RAID-0 array (from newbie who has read the FAQ)
On Mon, 19 Mar 2007, Michael Schwarz wrote:
> I'm going to hang on to the hardware. This is a pilot/demo that may
> lead to development of a new device, and, if so, I'll be getting back
> into device driver writing. Working this problem would be great
> practice for that. So I will do it. The only problem is I don't know
> when! I believe I can replicate the problem, so I'll find time
> (perhaps next weekend) to capture the data of interest.

Michael, you don't seem to appreciate the basic principles for tracking down problems.

First: Simplify. Get rid of everything that isn't relevant to the problem and could serve to distract you. In particular, don't run X. That will eliminate around half of your running processes and shrink the stack dump down so that it might fit in the kernel buffer without overflowing.

Second: Simplify. Don't run kernels that have been modified by Fedora or anybody else. Use a plain vanilla kernel from kernel.org.

Third: Simplify. Try not to collect the same data over and over again (take a look at the starts of all those dmesg files you compressed and emailed). You can clear the kernel's log buffer after dumping it by doing "dmesg -c >/dev/null".

Fourth: Be prepared to make changes. This means making changes to the kernel configuration or source code, another reason for using a stock kernel.

To get some really useful data, you need to build a kernel with CONFIG_USB_DEBUG turned on. Without that setting there won't be any helpful debugging information in the log. Then you should run a minimal system. Single-user mode would be best, but that can be _too_ bare-bones; simply running without a GUI will suffice. Then you should clear the kernel log before starting the big file copy; basically nothing that happens before then is important, because nothing has gone wrong. Then, after the hang occurs, see what shows up in the dmesg log. And get a stack dump.

> Mr. Stern: Where might I go for low level programming information on
> USB devices? I'm interested in registers/DMA/packet formats, etc.

Are you interested in USB devices (i.e., flash drives, webcams, and so on, the things you plug in to a USB connection) or USB controllers (the hardware in your computer that manages the USB bus)?

> I've found info on the USB protocol itself, but I haven't found info
> on devices. Obviously I can dig through kernel source, but documents
> would be nice! Again, if this is an unreasonable request for you to do
> my homework, just say so! I won't be offended. I'm sure I can find it
> myself given time, but if you happen to have some URLs handy, they'd
> be appreciated.

There are three types of USB controllers used in personal computers: UHCI, OHCI, and EHCI. Links to their specifications are available here:

    http://www.usb.org/developers/resources/

Specifications for various classes of USB devices are available here:

    http://www.usb.org/developers/devclass_docs

Alan Stern
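The capture workflow Alan outlines can be sketched as a short session. A hedged sketch: the log filename is a placeholder, and the SysRq step assumes the kernel was built with CONFIG_MAGIC_SYSRQ (the stock default).

```shell
# Hypothetical capture session; run as root on a minimal, non-GUI
# system, with a vanilla kernel built with CONFIG_USB_DEBUG=y.

# Clear the kernel log buffer so only the failure is captured:
dmesg -c >/dev/null

# ... now reproduce the hang (the big file copy onto the md array) ...

# After the hang, save what the kernel reported:
dmesg > usb-raid-hang.log

# Get a stack dump of all tasks via magic SysRq (needs
# CONFIG_MAGIC_SYSRQ), then append it to the log:
echo t > /proc/sysrq-trigger
dmesg >> usb-raid-hang.log
```

Doing the SysRq dump while the hung process is still stuck is what reveals where in the kernel (md, scsi, or usb-storage) the request is blocked.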
Re: [Linux-usb-users] Failed reads from RAID-0 array (from newbie who has read the FAQ)
Comments below.

-- Michael Schwarz

> On Mon, 19 Mar 2007, Michael Schwarz wrote:
>> I'm going to hang on to the hardware. This is a pilot/demo that may
>> lead to development of a new device, and, if so, I'll be getting back
>> into device driver writing. Working this problem would be great
>> practice for that. So I will do it. The only problem is I don't know
>> when! I believe I can replicate the problem, so I'll find time
>> (perhaps next weekend) to capture the data of interest.
>
> Michael, you don't seem to appreciate the basic principles for
> tracking down problems.

I want to bristle at this; I've been a professional software developer for nearly 20 years. But I can't, because all of your points below are, of course, dead on for tracking down a device-level problem.

> First: Simplify. Get rid of everything that isn't relevant to the
> problem and could serve to distract you. In particular, don't run X.
> That will eliminate around half of your running processes and shrink
> the stack dump down so that it might fit in the kernel buffer without
> overflowing.

Right on. And I know this; I should have had two boxes where I was working, one where I could do browsy-emaily things, separate from the problem I was working.

> Second: Simplify. Don't run kernels that have been modified by Fedora
> or anybody else. Use a plain vanilla kernel from kernel.org.

Yeah, but here was where I lacked confidence. I used to know every inch of my kernel and my hardware, but, as previously stated, that was back in the 2.2.x days. I wasn't confident that I could run my hardware with a plain-vanilla kernel or that I could successfully roll my own working 2.6.x kernel in a timely manner. But, of course, I understand why this is a good idea.

> Third: Simplify. Try not to collect the same data over and over again
> (take a look at the starts of all those dmesg files you compressed and
> emailed). You can clear the kernel's log buffer after dumping it by
> doing "dmesg -c >/dev/null".

Thanks, I actually didn't know that flag. Makes me feel pretty stupid...

> Fourth: Be prepared to make changes. This means making changes to the
> kernel configuration or source code, another reason for using a stock
> kernel.

I agree; I just lacked confidence doing so with newer kernels. I used to ALWAYS build my own kernel right up through the 2.2.x series, building the kernel to exactly match my hardware. I just haven't kept up. And if you compare the 2.2.x kernel's configuration parameter list to the 2.6.x one, well, you can maybe understand why I was reluctant to launch on that when under time pressure. But your point (I gather) is that if I had, it might well have taken less time than it did...

> To get some really useful data, you need to build a kernel with
> CONFIG_USB_DEBUG turned on. Without that setting there won't be any
> helpful debugging information in the log.

Before I send any more info on this problem, I will do this and all of the above.

> Then you should run a minimal system. Single-user mode would be best,
> but that can be _too_ bare-bones; simply running without a GUI will
> suffice.

Will do.

> Then you should clear the kernel log before starting the big file
> copy. Basically nothing that happens before then is important, because
> nothing has gone wrong. Then after the hang occurs, see what shows up
> in the dmesg log. And get a stack dump.
>
>> Mr. Stern: Where might I go for low level programming information on
>> USB devices? I'm interested in registers/DMA/packet formats, etc.
>
> Are you interested in USB devices (i.e., flash drives, webcams, and so
> on, the things you plug in to a USB connection) or USB controllers
> (the hardware in your computer that manages the USB bus)?

Firstly the controllers, then specific devices.

> There are three types of USB controllers used in personal computers:
> UHCI, OHCI, and EHCI. Links to their specifications are available
> here: http://www.usb.org/developers/resources/

Thanks. This is just what I wanted.

> Specifications for various classes of USB devices are available here:
> http://www.usb.org/developers/devclass_docs

And this. Thank you much. I won't post on this issue again until I've cleared the decks of the items you mention above. Thanks again.

> Alan Stern
Re: [Linux-usb-users] Failed reads from RAID-0 array (from newbie who has read the FAQ)
On Mon, 19 Mar 2007, Michael Schwarz wrote:
> Yeah, but here was where I lacked confidence. I used to know every
> inch of my kernel and my hardware, but, as previously stated, that was
> back in the 2.2.x days. I wasn't confident that I could run my
> hardware with a plain-vanilla kernel or that I could successfully roll
> my own working 2.6.x kernel in a timely manner. But, of course, I
> understand why this is a good idea.

It's not so hard to do, if you start from a known-good configuration. For instance, you could take the config your current distribution's kernel is built from and just use it, although it would take a long time to build because it includes so many drivers. Whittling it down to just the drivers you need would be tedious but not very difficult.

Alan Stern
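The start-from-a-known-good-config approach can be sketched as below. A hedged sketch for a 2.6-era vanilla tree; the source directory and config filename are illustrative and depend on the distro kernel actually running.

```shell
# Hypothetical sketch; paths depend on your system.
cd /usr/src/linux-2.6.20            # vanilla tree from kernel.org

# Reuse the distribution kernel's config as a known-good baseline:
cp /boot/config-$(uname -r) .config

# Answer prompts only for options new in this tree:
make oldconfig

# Enable debugging before building, e.g. via:
#   make menuconfig   (set CONFIG_USB_DEBUG under USB support)

make && make modules_install install
```

Whittling the config down afterwards (disabling drivers for absent hardware) only shortens the build; the known-good baseline is what guarantees the kernel still boots.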
Re: [PATCH] [PPC32] ADMA support for PPC 440SPe processors.
On Mon, 2007-03-19 at 17:13 +0100, Benjamin Herrenschmidt wrote:
> BTW folks. Would it be hard to change your spe_ prefixes to something
> else? There's already enough confusion between the freescale SPE unit
> and the cell SPEs :-) (such confusion is annoying when grepp'ing for
> code that might touch a given functionality, for example).

Please please please!

cheers

--
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person
Re: [PATCH] [PPC32] ADMA support for PPC 440SPe processors.
On Tuesday 20 March 2007 04:06, Michael Ellerman wrote:
> On Mon, 2007-03-19 at 17:13 +0100, Benjamin Herrenschmidt wrote:
>> BTW folks. Would it be hard to change your spe_ prefixes to something
>> else? There's already enough confusion between the freescale SPE unit
>> and the cell SPEs :-) (such confusion is annoying when grepp'ing for
>> code that might touch a given functionality, for example).
>
> Please please please!

OK. Who can resist so much pleading. ;-)

Best regards,
Stefan