Re: [zfs-discuss] Re: ZFS and SE 3511
Anton Rang wrote:
> On Dec 19, 2006, at 7:14 AM, Mike Seda wrote:
>> Anton B. Rang wrote:
>>>> I have a Sun SE 3511 array with 5 x 500 GB SATA-I disks in a RAID 5.
>>>> This 2 TB logical drive is partitioned into 10 x 200 GB slices. I
>>>> gave 4 of these slices to a Solaris 10 U2 machine and added each of
>>>> them to a concat (non-raid) zpool as listed below:
>>> This is certainly a supportable configuration. However, it's not an
>>> optimal one.
>> What would be the optimal configuration that you recommend?
>
> If you don't need ZFS redundancy, I would recommend taking a single
> slice for your ZFS file system (e.g. 6 x 200 GB for other file systems,
> and 1 x 800 GB for the ZFS pool). There would still be contention
> between the various file systems, but at least ZFS would be working
> with a single contiguous block of space on the array.
>
> Because of the implicit striping in ZFS, what you have right now is
> analogous to taking a single disk, partitioning it into several
> partitions, then striping across those partitions -- it works, you can
> use all of the space, but there's a rearrangement which means that
> logically contiguous blocks on disk are no longer physically
> contiguous, hurting performance substantially.

Hmm... But how is my current configuration (one striped zpool consisting
of 4 x 200 GB LUNs from a hardware RAID 5 logical drive) analogous to
taking a single disk, partitioning it into several partitions, then
striping across those partitions, if each 200 GB LUN is presented to
Solaris as a whole disk?

Current partition table (original):
Total disk sectors available: 390479838 + 16384 (reserved sectors)

Part         Tag  Flag  First Sector      Size  Last Sector
   0         usr  wm              34  186.20GB    390479838
   1  unassigned  wm               0         0            0
   2  unassigned  wm               0         0            0
   3  unassigned  wm               0         0            0
   4  unassigned  wm               0         0            0
   5  unassigned  wm               0         0            0
   6  unassigned  wm               0         0            0
   8    reserved  wm       390479839    8.00MB    390496222

Why is my current configuration not analogous to taking 4 disks and
striping across those 4 disks?

>> Yes, I am worried about the lack of redundancy. And, I have some new
>> disks on order, at least one of which will be a hot spare.
>
> Glad to hear it.
>
> Anton
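For concreteness, a minimal sketch of the two layouts under discussion;
the pool name and device names are hypothetical, not taken from the
original post:

    # Current layout: four 200 GB LUNs carved from the same RAID-5
    # logical drive; ZFS dynamically stripes across all four, so the
    # five underlying spindles serve four separate seek regions.
    zpool create tank c2t40d0 c2t40d1 c2t40d2 c2t40d3

    # Suggested layout: present a single 800 GB LUN instead, so the
    # pool occupies one contiguous block of space on the array.
    zpool create tank c2t40d0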
Re: [zfs-discuss] Re: ZFS and SE 3511
On 19-Dec-06, at 2:42 PM, Jason J. W. Williams wrote:
>> I do see this note in the 3511 documentation:
>> Note - Do not use a Sun StorEdge 3511 SATA array to store single
>> instances of data. It is more suitable for use in configurations
>> where the array has a backup or archival role.
>
> My understanding of this particular scare-tactic wording (it's also in
> the SANnet II OEM version manual almost verbatim) is that it has
> mostly to do with the relative unreliability of SATA firmware versus
> SCSI/FC firmware.

That's such a sad sentence to have to read. Either prices are
unrealistically low, or the revenues aren't being invested properly?

--Toby

> It's possible that the disks are lower-quality SATA disks too, but
> that was not what was relayed to us when we looked at buying the 3511
> from Sun or the DotHill version (SANnet II).
>
> Best Regards,
> Jason
Re: [zfs-discuss] Re: ZFS and SE 3511
Hi Toby,

My understanding on the subject of SATA firmware reliability vs. FC/SCSI
is that it's mostly related to SATA firmware being a lot younger. The
FC/SCSI firmware that's out there has been debugged for 10 years or so,
so it has a lot fewer hiccoughs. Pillar Data Systems told us once that
they found most of their failed SATA disks were just fine when examined,
so their policy is to issue a RESET to the drive when a SATA error is
detected, then retry the write/read and keep trucking. If they continue
to get SATA errors, then they'll fail the drive. Looking at the latest
Engenio SATA products, I believe they do the same thing. It's probably
unfair to expect defect rates out of SATA firmware equivalent to
firmware that's been around for a long time... particularly with the
price pressures on SATA. SAS may suffer the same issue, though SAS
drives seem to carry 1,000,000-hour MTBF ratings like their traditional
FC/SCSI counterparts.

On a side note, we experienced a path failure to a drive in our SATA
Engenio array (an older model); simply popping the drive out and back in
fixed the issue, and we haven't had any notifications since. A RESET and
retry would have been nicer behavior to have, since popping and
reinserting the drive triggered a rebuild of the drive.

Best Regards,
Jason

On 12/19/06, Toby Thain [EMAIL PROTECTED] wrote:
> On 19-Dec-06, at 2:42 PM, Jason J. W. Williams wrote:
>>> I do see this note in the 3511 documentation:
>>> Note - Do not use a Sun StorEdge 3511 SATA array to store single
>>> instances of data. It is more suitable for use in configurations
>>> where the array has a backup or archival role.
>>
>> My understanding of this particular scare-tactic wording (it's also
>> in the SANnet II OEM version manual almost verbatim) is that it has
>> mostly to do with the relative unreliability of SATA firmware versus
>> SCSI/FC firmware.
>
> That's such a sad sentence to have to read. Either prices are
> unrealistically low, or the revenues aren't being invested properly?
>
> --Toby
>
>> It's possible that the disks are lower-quality SATA disks too, but
>> that was not what was relayed to us when we looked at buying the 3511
>> from Sun or the DotHill version (SANnet II).
>>
>> Best Regards,
>> Jason
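As a rough illustration of that retry-before-fail policy -- just a
sketch in shell, not anything a real array runs in firmware; the device
path and retry count are made up:

    #!/bin/sh
    # Retry a failed read a few times before declaring the drive bad.
    # Real firmware would also issue a device RESET between attempts.
    DEV=/dev/rdsk/c2t40d0s0   # hypothetical device path
    RETRIES=3
    i=0
    while [ $i -lt $RETRIES ]; do
        if dd if=$DEV of=/dev/null bs=128k count=8 2>/dev/null; then
            echo "read succeeded after $i retries"
            exit 0
        fi
        i=`expr $i + 1`
    done
    echo "still failing after $RETRIES attempts: fail the drive"
    exit 1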
Re: [zfs-discuss] Re: ZFS and SE 3511
Anton B. Rang wrote:
>> I have a Sun SE 3511 array with 5 x 500 GB SATA-I disks in a RAID 5.
>> This 2 TB logical drive is partitioned into 10 x 200 GB slices. I
>> gave 4 of these slices to a Solaris 10 U2 machine and added each of
>> them to a concat (non-raid) zpool as listed below:
> This is certainly a supportable configuration. However, it's not an
> optimal one.

What would be the optimal configuration that you recommend?

> You think that you have a 'concat' structure, but it's actually
> striped/RAID-0, because ZFS implicitly stripes across all of its
> top-level structures (your slices, in this case). This means that ZFS
> will constantly be writing data to addresses around 0, 50 GB, 100 GB,
> and 150 GB of each disk (presuming the first four slices are those
> you used). This will keep the disk arms constantly in motion, which
> isn't good for performance.
>
>> do you think my zfs configuration caused the drive failure?
> I doubt it. I haven't investigated which disks ship in the 3511, but
> I would presume they are enterprise-class ATA drives, which can
> handle this type of head motion. (Standard ATA disks can overheat
> under a load which is heavy in seeks.) Then again, the 3511 is
> marketed as a near-line rather than on-line array... that may be
> simply because the SATA drives don't perform as well as FC. I do see
> this note in the 3511 documentation:
> Note - Do not use a Sun StorEdge 3511 SATA array to store single
> instances of data. It is more suitable for use in configurations
> where the array has a backup or archival role.
>
> (I too am curious -- why do you consider yourself down? You've got a
> RAID 5, one disk is down; are you just worried about your current
> lack of redundancy? [I would be.] Will you be adding a hot spare?)

Yes, I am worried about the lack of redundancy. And, I have some new
disks on order, at least one of which will be a hot spare.

> Anton
Re: [zfs-discuss] Re: ZFS and SE 3511
On Dec 19, 2006, at 7:14 AM, Mike Seda wrote:
> Anton B. Rang wrote:
>>> I have a Sun SE 3511 array with 5 x 500 GB SATA-I disks in a RAID 5.
>>> This 2 TB logical drive is partitioned into 10 x 200 GB slices. I
>>> gave 4 of these slices to a Solaris 10 U2 machine and added each of
>>> them to a concat (non-raid) zpool as listed below:
>> This is certainly a supportable configuration. However, it's not an
>> optimal one.
> What would be the optimal configuration that you recommend?

If you don't need ZFS redundancy, I would recommend taking a single
slice for your ZFS file system (e.g. 6 x 200 GB for other file systems,
and 1 x 800 GB for the ZFS pool). There would still be contention
between the various file systems, but at least ZFS would be working with
a single contiguous block of space on the array.

Because of the implicit striping in ZFS, what you have right now is
analogous to taking a single disk, partitioning it into several
partitions, then striping across those partitions -- it works, you can
use all of the space, but there's a rearrangement which means that
logically contiguous blocks on disk are no longer physically contiguous,
hurting performance substantially.

> Yes, I am worried about the lack of redundancy. And, I have some new
> disks on order, at least one of which will be a hot spare.

Glad to hear it.

Anton
Re: [zfs-discuss] Re: ZFS and SE 3511
> I do see this note in the 3511 documentation:
> Note - Do not use a Sun StorEdge 3511 SATA array to store single
> instances of data. It is more suitable for use in configurations where
> the array has a backup or archival role.

My understanding of this particular scare-tactic wording (it's also in
the SANnet II OEM version manual almost verbatim) is that it has mostly
to do with the relative unreliability of SATA firmware versus SCSI/FC
firmware. It's possible that the disks are lower-quality SATA disks too,
but that was not what was relayed to us when we looked at buying the
3511 from Sun or the DotHill version (SANnet II).

Best Regards,
Jason
[zfs-discuss] Re: ZFS and SE 3511
> I have a Sun SE 3511 array with 5 x 500 GB SATA-I disks in a RAID 5.
> This 2 TB logical drive is partitioned into 10 x 200 GB slices. I gave
> 4 of these slices to a Solaris 10 U2 machine and added each of them to
> a concat (non-raid) zpool as listed below:

This is certainly a supportable configuration. However, it's not an
optimal one.

You think that you have a 'concat' structure, but it's actually
striped/RAID-0, because ZFS implicitly stripes across all of its
top-level structures (your slices, in this case). This means that ZFS
will constantly be writing data to addresses around 0, 50 GB, 100 GB,
and 150 GB of each disk (presuming the first four slices are those you
used). This will keep the disk arms constantly in motion, which isn't
good for performance.

> do you think my zfs configuration caused the drive failure?

I doubt it. I haven't investigated which disks ship in the 3511, but I
would presume they are enterprise-class ATA drives, which can handle
this type of head motion. (Standard ATA disks can overheat under a load
which is heavy in seeks.) Then again, the 3511 is marketed as a
near-line rather than on-line array... that may be simply because the
SATA drives don't perform as well as FC. I do see this note in the 3511
documentation:

Note - Do not use a Sun StorEdge 3511 SATA array to store single
instances of data. It is more suitable for use in configurations where
the array has a backup or archival role.

(I too am curious -- why do you consider yourself down? You've got a
RAID 5, one disk is down; are you just worried about your current lack
of redundancy? [I would be.] Will you be adding a hot spare?)

Anton
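If you want to watch the implicit striping happen, per-vdev I/O
statistics make it visible; the pool name here is hypothetical:

    # With a dynamic stripe of four slices, a single sequential writer
    # shows activity on all four LUNs at once -- i.e., four separate
    # seek regions on the same five spindles behind the RAID-5 LUN.
    zpool iostat -v tank 5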