[Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-19 Thread Edward Walter
Hello All, We're doing a fresh Lustre 1.8.4 install using Sun StorageTek 2540 arrays for our OST targets. We've configured these as RAID6 with no spares which means we have the equivalent of 10 data disks and 2 parity disks in play on each OST. We configured the "Segment Size" on these arrays

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-19 Thread Paul Nowoczynski
Ed, Does 'segment size' refer to the amount of data written to each disk before proceeding to the next disk (e.g. stride)? This is my guess since these values are usually powers of two and therefore 52KB [512KB/(10 data disks)] is probably not the stride size. In any event I think you'll get

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-19 Thread Dennis Nelson
Segment size should be 128. 128 KB * 8 data drives = 1 MB.
On 10/19/10 3:42 PM, "Edward Walter" wrote:
> Hello All,
>
> We're doing a fresh Lustre 1.8.4 install using Sun StorageTek 2540
> arrays for our OST targets. We've configured these as RAID6 with no
> spares which means we have the eq

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-19 Thread Andreas Dilger
On 2010-10-19, at 14:42, Edward Walter wrote:
> We're doing a fresh Lustre 1.8.4 install using Sun StorageTek 2540
> arrays for our OST targets. We've configured these as RAID6 with no
> spares which means we have the equivalent of 10 data disks and 2 parity
> disks in play on each OST.
As Pau

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-19 Thread Edward Walter
Hi Dennis, That seems to validate how I'm interpreting the parameters. We have 10 data disks and 2 parity disks per array so it looks like we need to be at 64 KB or less. I'm guessing I'll just need to run some tests to see how performance changes as I adjust the segment size. Thanks, -Ed

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Brian J. Murrell
On Tue, 2010-10-19 at 21:00 -0400, Edward Walter wrote:
> Ed,
> That seems to validate how I'm interpreting the parameters. We have 10 data
> disks and 2 parity disks per array so it looks like we need to be at 64 KB or
> less.
I think you have been missing everyone's point in this thread.

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Edward Walter
Hi Brian, Thanks for the clarification. It didn't click that the optimal data size is exactly 1MB... Everything you're saying makes sense though. Obviously, with 12-disk arrays, there's tension between maximizing space and maximizing performance. I was hoping/trying to get the best of both.
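
The arithmetic behind the 1MB recommendation can be sketched in a few lines of Python. This is only an illustration: the function name segment_size_kb and the 1024 KB constant are assumptions, taking the 1MB figure from the thread as the target size of one full stripe across the data disks. Parity disks are left out because they do not hold data in the stripe.

```python
# Assumed target: one full stripe across the data disks should be 1 MB,
# written here as 1024 KB (figure taken from the thread).
FULL_STRIPE_KB = 1024


def segment_size_kb(data_disks, full_stripe_kb=FULL_STRIPE_KB):
    """Return the per-disk segment size in KB that makes one full stripe
    equal full_stripe_kb, or None if the data disk count does not divide
    the stripe evenly."""
    if data_disks <= 0:
        raise ValueError("data_disks must be positive")
    if full_stripe_kb % data_disks != 0:
        return None
    return full_stripe_kb // data_disks


if __name__ == "__main__":
    # RAID6 layouts written as data+parity; only the data disks matter here.
    for layout, data_disks in [("4+2", 4), ("8+2", 8), ("10+2", 10)]:
        size = segment_size_kb(data_disks)
        if size is None:
            print(f"{layout}: no whole-KB segment size gives a 1 MB stripe")
        else:
            print(f"{layout}: segment size {size} KB")
```

With 8 data disks this gives the 128 KB segment size quoted earlier in the thread, while 10 data disks has no segment size that lands on exactly 1MB.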

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Charland, Denis
Brian J. Murrell wrote: > On Tue, 2010-10-19 at 21:00 -0400, Edward Walter wrote: > > > This is why the recommendations in this thread have continued to be > using a number of data disks that divides evenly into 1MB (i.e. powers > of 2: 2, 4, 8, etc.). So for RAID6: 4+2 or 8+2, etc. > What

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Edward Walter
Hi Denis, Changing the number of parity disks (RAID5 = 1, RAID6 = 2) doesn't change the math on the data disks and data segment size. You still need a power-of-2 number of data disks to ensure that the product of the RAID chunk size and the number of data disks is 1MB. Aside from that, I would

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Bernd Schubert
On Wednesday, October 20, 2010, Charland, Denis wrote: > Brian J. Murrell wrote: > > On Tue, 2010-10-19 at 21:00 -0400, Edward Walter wrote: > > > > > > This is why the recommendations in this thread have continued to be > > using a number of data disks that divides evenly into 1MB (i.e. powers >

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Wojciech Turek
Hi Edward, As Andreas mentioned earlier, the max OST size is 16TB if one uses ext4-based ldiskfs. So creating a RAID group bigger than that will definitely hurt your performance, because you would have to split the large array into smaller logical disks and that randomises IOs on the raid controlle