Re: [zfs-discuss] repost - high read iops

2009-12-30 Thread Bob Friesenhahn
On Wed, 30 Dec 2009, Richard Elling wrote: … are written because you also assume that you can later read either side. For ZFS, if only one side of the mirror is written, you know which side is bad because of the checksum. The checksum is owned by the parent, which is an important design decision …
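
A minimal illustration of what that design buys you (pool and device names below are placeholders, and the status output is hypothetical): because a block's checksum lives in its parent block pointer rather than next to the data, a scrub can tell which side of a mirror holds the stale copy and repair it.

    # After one side of a mirror missed some writes, scrub the pool;
    # every block is verified against the checksum held by its parent,
    # and the stale side accumulates repaired checksum errors.
    zpool scrub tank
    zpool status tank
    #   mirror-0  ONLINE       0     0     0
    #     c1t1d0  ONLINE       0     0     0
    #     c1t2d0  ONLINE       0     0    37   <- stale side, found via checksums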

Re: [zfs-discuss] repost - high read iops

2009-12-30 Thread Richard Elling
On Dec 30, 2009, at 9:35 AM, Bob Friesenhahn wrote: On Tue, 29 Dec 2009, Ross Walker wrote: Some important points to consider are that every write to a raidz vdev must be synchronous. In other words, the write needs to complete on all the drives in the stripe before the write may return as complete. …

Re: [zfs-discuss] repost - high read iops

2009-12-30 Thread Ross Walker
On Wed, Dec 30, 2009 at 12:35 PM, Bob Friesenhahn wrote: > On Tue, 29 Dec 2009, Ross Walker wrote: >> Some important points to consider are that every write to a raidz vdev must be synchronous. In other words, the write needs to complete on all the drives in the stripe before the write may return as complete. …

Re: [zfs-discuss] repost - high read iops

2009-12-30 Thread Bob Friesenhahn
On Tue, 29 Dec 2009, Ross Walker wrote: Some important points to consider are that every write to a raidz vdev must be synchronous. In other words, the write needs to complete on all the drives in the stripe before the write may return as complete. This is also true of "RAID 1" (mirrors), which …
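
One way to watch this behaviour on a live system (the pool name is a placeholder): under a steady write load, per-disk write operations on a raidz vdev move in lockstep, because every stripe write must land on all members before it completes.

    # Print per-vdev and per-disk I/O once a second; on a raidz vdev the
    # member disks show nearly identical write-op counts.
    zpool iostat -v tank 1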

Re: [zfs-discuss] repost - high read iops

2009-12-30 Thread Toby Thain
On 29-Dec-09, at 11:53 PM, Ross Walker wrote: On Dec 29, 2009, at 12:36 PM, Bob Friesenhahn wrote: ... However, zfs does not implement "RAID 1" either. This is easily demonstrated since you can unplug one side of the mirror and the writes to the zfs mirror will still succeed, catching …

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Ross Walker
On Dec 29, 2009, at 12:36 PM, Bob Friesenhahn wrote: On Tue, 29 Dec 2009, Ross Walker wrote: A mirrored raidz provides redundancy at a steep cost to performance and, might I add, a high monetary cost. I am not sure what a "mirrored raidz" is. I have never heard of such a thing before. …

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Eric D. Mudama
On Tue, Dec 29 at 11:14, Erik Trimble wrote: Eric D. Mudama wrote: On Tue, Dec 29 at 9:16, Brad wrote: The disk cost of a raidz pool of mirrors is identical to the disk cost of raid10. ZFS can't do a raidz of mirrors or a mirror of raidz. Members of a mirror or raidz[123] must be fundamental devices (i.e., files or drives). …
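
A quick sketch of that restriction (device names are placeholders, and the exact error text varies by release):

    # Rejected: nesting one vdev type inside another is not a legal spec;
    # zpool fails with an invalid vdev specification error.
    zpool create tank raidz mirror c1t1d0 c1t2d0 mirror c1t3d0 c1t4d0

    # Legal: mirror and raidz vdevs are built from bare devices (or files).
    zpool create tank raidz c1t1d0 c1t2d0 c1t3d0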

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Richard Elling
On Dec 29, 2009, at 11:26 AM, Brad wrote: @relling "For small, random read IOPS the performance of a single, top-level vdev is: performance = performance of a disk * (N / (N - P)) = 133 * 12/(12-1) = 133 * 12/11, where N = number of disks in the vdev and P = number of parity devices in the vdev" …

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Brad
@relling "For small, random read IOPS the performance of a single, top-level vdev is: performance = performance of a disk * (N / (N - P)) = 133 * 12/(12-1) = 133 * 12/11, where N = number of disks in the vdev and P = number of parity devices in the vdev" performance of a disk …
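
A worked instance of the quoted model, using the thread's own numbers (133 IOPS per disk, a 12-disk raidz1, so N = 12 and P = 1); the final figure of roughly 145 is arithmetic added here, not quoted from the list:

    \mathrm{IOPS}_{vdev} = \mathrm{IOPS}_{disk} \times \frac{N}{N - P}
                         = 133 \times \frac{12}{12 - 1}
                         = 133 \times \frac{12}{11}
                         \approx 145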

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Erik Trimble
Eric D. Mudama wrote: On Tue, Dec 29 at 9:16, Brad wrote: The disk cost of a raidz pool of mirrors is identical to the disk cost of raid10. ZFS can't do a raidz of mirrors or a mirror of raidz. Members of a mirror or raidz[123] must be fundamental devices (i.e., files or drives). …

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Tim Cook
On Tue, Dec 29, 2009 at 12:07 PM, Richard Elling wrote: > On Dec 29, 2009, at 9:16 AM, Brad wrote: >> @eric >> "As a general rule of thumb, each vdev has roughly the same random performance as a single member of that vdev. Having six RAIDZ vdevs in a pool should give roughly the same performance as a stripe of six bare drives, for random IO." …

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Richard Elling
On Dec 29, 2009, at 9:16 AM, Brad wrote: @eric "As a general rule of thumb, each vdev has roughly the same random performance as a single member of that vdev. Having six RAIDZ vdevs in a pool should give roughly the same performance as a stripe of six bare drives, for random IO." This model …

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Eric D. Mudama
On Tue, Dec 29 at 9:16, Brad wrote: @eric "As a general rule of thumb, each vdev has roughly the same random performance as a single member of that vdev. Having six RAIDZ vdevs in a pool should give roughly the same performance as a stripe of six bare drives, for random IO." It sounds like we'll need 16 vdevs striped in a pool to …

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Mattias Pantzare
On Tue, Dec 29, 2009 at 18:16, Brad wrote: > @eric > "As a general rule of thumb, each vdev has roughly the same random performance as a single member of that vdev. Having six RAIDZ vdevs in a pool should give roughly the same performance as a stripe of six bare drives, for random IO." …

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Bob Friesenhahn
On Tue, 29 Dec 2009, Ross Walker wrote: A mirrored raidz provides redundancy at a steep cost to performance and, might I add, a high monetary cost. I am not sure what a "mirrored raidz" is. I have never heard of such a thing before. With raid10 each mirrored pair has the IOPS of a single drive …
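
The raid10-style layout under discussion, sketched with placeholder device names: a stripe of two-way mirrors, where each pair contributes roughly one disk of random write IOPS while reads can be serviced by either side.

    zpool create tank \
      mirror c1t1d0 c1t2d0 \
      mirror c1t3d0 c1t4d0 \
      mirror c1t5d0 c1t6d0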

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Brad
@eric "As a general rule of thumb, each vdev has roughly the same random performance as a single member of that vdev. Having six RAIDZ vdevs in a pool should give roughly the same performance as a stripe of six bare drives, for random IO." It sounds like we'll need 16 vdevs striped in a pool to …
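
Spelling out the arithmetic behind that estimate (assuming the rule of thumb holds linearly and reusing the 133 IOPS per-disk figure from earlier in the thread; Brad's actual target is cut off in the archive):

    16 \text{ vdevs} \times 133~\mathrm{IOPS/vdev} \approx 2128 \text{ random read IOPS}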

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Brad
@ross "Because each write of a raidz is striped across the disks, the effective IOPS of the vdev is equal to that of a single disk. This can be improved by utilizing multiple (smaller) raidz vdevs which are striped, but not by mirroring them." So with random reads, would it perform better on a …
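
For example (placeholder device names): the same twelve disks laid out as four striped 3-disk raidz1 vdevs would, by the rule of thumb above, deliver roughly four disks' worth of random IOPS instead of one.

    zpool create tank \
      raidz c1t1d0 c1t2d0 c1t3d0 \
      raidz c1t4d0 c1t5d0 c1t6d0 \
      raidz c1t7d0 c1t8d0 c1t9d0 \
      raidz c1t10d0 c1t11d0 c1t12d0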

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Eric D. Mudama
On Tue, Dec 29 at 4:55, Brad wrote: Thanks for the suggestion! I have heard mirrored vdev configurations are preferred for Oracle, but what's the difference between a raidz mirrored vdev and a raid10 setup? We have tested a zfs stripe configuration before with 15 disks and our tester was extremely happy with the performance. …

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Ross Walker
On Dec 29, 2009, at 7:55 AM, Brad wrote: Thanks for the suggestion! I have heard mirrored vdev configurations are preferred for Oracle, but what's the difference between a raidz mirrored vdev and a raid10 setup? A mirrored raidz provides redundancy at a steep cost to performance and, might I add, a high monetary cost. …

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Brad
Thanks for the suggestion! I have heard mirrored vdev configurations are preferred for Oracle, but what's the difference between a raidz mirrored vdev and a raid10 setup? We have tested a zfs stripe configuration before with 15 disks and our tester was extremely happy with the performance. After …

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread przemolicc
On Mon, Dec 28, 2009 at 01:40:03PM -0800, Brad wrote: > "This doesn't make sense to me. You've got 32 GB, why not use it? Artificially limiting the memory use to 20 GB seems like a waste of good money." > I'm having a hard time convincing the DBAs to increase the size of the SGA to 20GB because …

Re: [zfs-discuss] repost - high read iops

2009-12-28 Thread Richard Elling
On Dec 28, 2009, at 1:40 PM, Brad wrote: "This doesn't make sense to me. You've got 32 GB, why not use it? Artificially limiting the memory use to 20 GB seems like a waste of good money." I'm having a hard time convincing the DBAs to increase the size of the SGA to 20GB because their philosophy is …

Re: [zfs-discuss] repost - high read iops

2009-12-28 Thread Bob Friesenhahn
On Mon, 28 Dec 2009, Brad wrote: I'm having a hard time convincing the DBAs to increase the size of the SGA to 20GB because their philosophy is that, no matter what, you'll eventually have to hit disk to pick up data that's not stored in cache (ARC or L2ARC). The typical database server in our environment …

Re: [zfs-discuss] repost - high read iops

2009-12-28 Thread Brad
"This doesn't make sense to me. You've got 32 GB, why not use it? Artificially limiting the memory use to 20 GB seems like a waste of good money." I'm having a hard time convincing the dbas to increase the size of the SGA to 20GB because their philosophy is, no matter what eventually you'll have

Re: [zfs-discuss] repost - high read iops

2009-12-28 Thread Richard Elling
On Dec 28, 2009, at 12:40 PM, Brad wrote: "Try an SGA more like 20-25 GB. Remember, the database can cache more effectively than any file system underneath. The best I/O is the I/O you don't have to make." We'll be turning up the SGA size from 4GB to 16GB. The ARC size will be set from 8GB to 4GB. …

Re: [zfs-discuss] repost - high read iops

2009-12-28 Thread Brad
"Try an SGA more like 20-25 GB. Remember, the database can cache more effectively than any file system underneath. The best I/O is the I/O you don't have to make." We'll be turning up the SGA size from 4GB to 16GB. The arc size will be set from 8GB to 4GB. "This can be a red herring. Judging by t

Re: [zfs-discuss] repost - high read iops

2009-12-28 Thread Richard Elling
Hi Brad, comments below... On Dec 27, 2009, at 10:24 PM, Brad wrote: Richard - the l2arc is c1t13d0. What tools can be used to show the l2arc stats?

                capacity     operations    bandwidth
    pool      alloc   free   read  write   read  write
    raidz1    2.68T   580G    543    453  4.22M  3.70M
      c1t1d0      -      -    258    102   689K   358K
      c1t2d0      -      -    256    103   684K   354K
    …

Re: [zfs-discuss] repost - high read iops

2009-12-27 Thread Brad
Richard - the l2arc is c1t13d0. What tools can be used to show the l2arc stats?

                capacity     operations    bandwidth
    pool      alloc   free   read  write   read  write
    raidz1    2.68T   580G    543    453  4.22M  3.70M
      c1t1d0      -      -    258    102   689K   358K
      c1t2d0      -      -    256    103   684K   354K
      c1t3d0      -      -    258    102   690K   359K
    …
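
A couple of ways to pull L2ARC numbers on Solaris (hedged: kstat counter names vary a little between releases): the zfs arcstats kstat carries l2_* counters, and Neelakanth Nadgir's arcstat.pl script can summarize ARC/L2ARC hit rates over time.

    # Dump the L2ARC counters (l2_hits, l2_misses, l2_size, ...)
    kstat -m zfs -n arcstats | grep l2_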

Re: [zfs-discuss] repost - high read iops

2009-12-27 Thread Richard Elling
OK, I'll take a stab at it... On Dec 26, 2009, at 9:52 PM, Brad wrote: repost - Sorry for CCing the other forums. I'm running into an issue where there seems to be a high number of read iops hitting the disks while physical free memory fluctuates between 200MB -> 450MB out of 16GB total. We have the l2arc configured on a 32GB Intel X25-E SSD …

[zfs-discuss] repost - high read iops

2009-12-26 Thread Brad
repost - Sorry for CCing the other forums. I'm running into an issue where there seems to be a high number of read iops hitting the disks while physical free memory fluctuates between 200MB -> 450MB out of 16GB total. We have the l2arc configured on a 32GB Intel X25-E SSD and the slog on another 32GB …
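
A hedged starting point for chasing the fluctuating free memory: on Solaris, mdb can show where RAM is actually going, which typically reveals the ARC growing into "free" memory and being trimmed back under pressure.

    # Break physical memory down by consumer (kernel, ZFS data, anon, free)
    echo "::memstat" | mdb -k
    # Dump the current ARC size and targets
    echo "::arc" | mdb -k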