Re: [zfs-discuss] Re: ZFS and SE 3511

2006-12-23 Thread Mike Seda

Anton Rang wrote:

On Dec 19, 2006, at 7:14 AM, Mike Seda wrote:


Anton B. Rang wrote:
I have a Sun SE 3511 array with 5 x 500 GB SATA-I disks in a RAID 5.
This 2 TB logical drive is partitioned into 10 x 200 GB slices. I gave
4 of these slices to a Solaris 10 U2 machine and added each of them
to a concat (non-raid) zpool as listed below:




This is certainly a supportable configuration.  However, it's not an 
optimal one.



What would be the optimal configuration that you recommend?


If you don't need ZFS redundancy, I would recommend taking a single 
slice for your ZFS file system (e.g. 6 x 200 GB for other file 
systems, and 1 x 800 GB for the ZFS pool).  There would still be 
contention between the various file systems, but at least ZFS would be 
working with a single contiguous block of space on the array.


Because of the implicit striping in ZFS, what you have right now is 
analogous to taking a single disk, partitioning it into several 
partitions, then striping across those partitions -- it works, you can 
use all of the space, but there's a rearrangement which means that 
logically contiguous blocks on disk are no longer physically 
contiguous, hurting performance substantially.
Hmm... but how is my current configuration (one striped zpool consisting
of 4 x 200 GB LUNs from a hardware RAID 5 logical drive) analogous to
taking a single disk, partitioning it into several partitions, and then
striping across those partitions, if each 200 GB LUN is presented to
Solaris as a whole disk:

Current partition table (original):
Total disk sectors available: 390479838 + 16384 (reserved sectors)
Part      Tag    Flag     First Sector        Size        Last Sector
  0        usr    wm               34    186.20GB          390479838
  1 unassigned    wm                0           0                  0
  2 unassigned    wm                0           0                  0
  3 unassigned    wm                0           0                  0
  4 unassigned    wm                0           0                  0
  5 unassigned    wm                0           0                  0
  6 unassigned    wm                0           0                  0
  8   reserved    wm        390479839      8.00MB          390496222
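
(The 186.20 GB shown for slice 0 is just the 200 GB (decimal) LUN
expressed in binary gigabytes; the usr slice spans the whole device,
from sector 34 through 390479838, with slice 8 being the standard EFI
reserved area.)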

Why is my current configuration not analogous to taking 4 disks and 
striping across those 4 disks?


Yes, I am worried about the lack of redundancy. And, I have some new 
disks on order, at least one of which will be a hot spare.


Glad to hear it.

Anton




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS and SE 3511

2006-12-20 Thread Toby Thain


On 19-Dec-06, at 2:42 PM, Jason J. W. Williams wrote:

I do see this note in the 3511 documentation: "Note - Do not use a
Sun StorEdge 3511 SATA array to store single instances of data. It
is more suitable for use in configurations where the array has a
backup or archival role."


My understanding of this particular scare-tactic wording (it's also in
the SANnet II OEM version manual almost verbatim) is that it has
mostly to do with the relative unreliability of SATA firmware versus
SCSI/FC firmware.


That's such a sad sentence to have to read.

Either prices are unrealistically low, or the revenues aren't being  
invested properly?


--Toby


It's possible that the disks are lower-quality SATA
disks too, but that was not what was relayed to us when we looked at
buying the 3511 from Sun or the DotHill version (SANnet II).


Best Regards,
Jason


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS and SE 3511

2006-12-20 Thread Jason J. W. Williams

Hi Toby,

My understanding on the subject of SATA firmware reliability vs.
FC/SCSI is that it's mostly related to SATA firmware being a lot
younger. The FC/SCSI firmware that's out there has been debugged for
10 years or so, so it has a lot fewer hiccoughs. Pillar Data Systems
told us once that they found most of their failed SATA disks were
just fine when examined, so their policy is to issue a RESET to the
drive when a SATA error is detected, then retry the write/read and
keep trucking. If they continue to get SATA errors, then they'll fail
the drive.

Looking at the latest Engenio SATA products, I believe they do the
same thing. It's probably unfair to expect defect rates out of SATA
firmware equivalent to firmware that's been around for a long
time... particularly with the price pressures on SATA. SAS may suffer
the same issue, though SAS drives seem to have 1,000,000-hour MTBF
ratings like their traditional FC/SCSI counterparts. On a side note,
we experienced a path failure to a drive in our SATA Engenio array
(older model); simply popping the drive out and back in fixed the
issue... we haven't had any notifications since. A RESET and RETRY
would have been nice behavior to have, since popping and reinserting
triggered a rebuild of the drive.

Best Regards,
Jason

On 12/19/06, Toby Thain [EMAIL PROTECTED] wrote:


On 19-Dec-06, at 2:42 PM, Jason J. W. Williams wrote:

 I do see this note in the 3511 documentation: "Note - Do not use a
 Sun StorEdge 3511 SATA array to store single instances of data. It
 is more suitable for use in configurations where the array has a
 backup or archival role."

 My understanding of this particular scare-tactic wording (it's also in
 the SANnet II OEM version manual almost verbatim) is that it has
 mostly to do with the relative unreliability of SATA firmware versus
 SCSI/FC firmware.

That's such a sad sentence to have to read.

Either prices are unrealistically low, or the revenues aren't being
invested properly?

--Toby

 It's possible that the disks are lower-quality SATA
 disks too, but that was not what was relayed to us when we looked at
 buying the 3511 from Sun or the DotHill version (SANnet II).


 Best Regards,
 Jason


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS and SE 3511

2006-12-19 Thread Mike Seda

Anton B. Rang wrote:

I have a Sun SE 3511 array with 5 x 500 GB SATA-I disks in a RAID 5. This
2 TB logical drive is partitioned into 10 x 200GB slices. I gave 4 of these slices to a 
Solaris 10 U2 machine and added each of them to a concat (non-raid) zpool as listed below:



This is certainly a supportable configuration.  However, it's not an optimal 
one.
  

What would be the optimal configuration that you recommend?

You think that you have a 'concat' structure, but it's actually striped/RAID-0, 
because ZFS implicitly stripes across all of its top-level structures (your 
slices, in this case). This means that ZFS will constantly be writing data to 
addresses around 0, 50 GB, 100 GB, and 150 GB of each disk (presuming the first 
four slices are those you used). This will keep the disk arms constantly in 
motion, which isn't good for performance.

  

Do you think my ZFS configuration caused the drive failure?



I doubt it. I haven't investigated which disks ship in the 3511, but I would presume they are 
enterprise-class ATA drives, which can handle this type of head motion. (Standard ATA disks can 
overheat under a load which is heavy in seeks.)  Then again, the 3511 is marketed as a near-line 
rather than on-line array ... that may be simply because the SATA drives don't perform as well as 
FC.

I do see this note in the 3511 documentation: "Note - Do not use a Sun StorEdge 3511
SATA array to store single instances of data. It is more suitable for use in
configurations where the array has a backup or archival role."

(I too am curious -- why do you consider yourself down? You've got a RAID 5, 
one disk is down, are you just worried about your current lack of redundancy? 
[I would be.] Will you be adding a hot spare?)
  
Yes, I am worried about the lack of redundancy. And, I have some new 
disks on order, at least one of which will be a hot spare.

Anton
 
 
This message posted from opensolaris.org



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS and SE 3511

2006-12-19 Thread Anton Rang

On Dec 19, 2006, at 7:14 AM, Mike Seda wrote:


Anton B. Rang wrote:
I have a Sun SE 3511 array with 5 x 500 GB SATA-I disks in a RAID 5.
This 2 TB logical drive is partitioned into 10 x 200 GB slices. I gave
4 of these slices to a Solaris 10 U2 machine and added each of them
to a concat (non-raid) zpool as listed below:




This is certainly a supportable configuration.  However, it's not  
an optimal one.



What would be the optimal configuration that you recommend?


If you don't need ZFS redundancy, I would recommend taking a single  
slice for your ZFS file system (e.g. 6 x 200 GB for other file  
systems, and 1 x 800 GB for the ZFS pool).  There would still be  
contention between the various file systems, but at least ZFS would  
be working with a single contiguous block of space on the array.
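
(Purely as an illustration, with a made-up device and pool name for the
800 GB LUN, that layout ends up as a pool built on a single device:

   # one ~800 GB partition of the RAID-5 logical drive, mapped as one LUN
   zpool create tank c4t40d4
   zpool list tank

so ZFS sees one contiguous region of the array rather than four
separate slices of it.)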


Because of the implicit striping in ZFS, what you have right now is  
analogous to taking a single disk, partitioning it into several  
partitions, then striping across those partitions -- it works, you  
can use all of the space, but there's a rearrangement which means  
that logically contiguous blocks on disk are no longer physically  
contiguous, hurting performance substantially.


Yes, I am worried about the lack of redundancy. And, I have some  
new disks on order, at least one of which will be a hot spare.


Glad to hear it.

Anton


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS and SE 3511

2006-12-19 Thread Jason J. W. Williams

I do see this note in the 3511 documentation: "Note - Do not use a Sun StorEdge 3511
SATA array to store single instances of data. It is more suitable for use in
configurations where the array has a backup or archival role."


My understanding of this particular scare-tactic wording (it's also in
the SANnet II OEM version manual almost verbatim) is that it has
mostly to do with the relative unreliability of SATA firmware versus
SCSI/FC firmware. It's possible that the disks are lower-quality SATA
disks too, but that was not what was relayed to us when we looked at
buying the 3511 from Sun or the DotHill version (SANnet II).


Best Regards,
Jason
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS and SE 3511

2006-12-18 Thread Anton B. Rang
 I have a Sun SE 3511 array with 5 x 500 GB SATA-I disks in a RAID 5. This
 2 TB logical drive is partitioned into 10 x 200 GB slices. I gave 4 of these
 slices to a Solaris 10 U2 machine and added each of them to a concat
 (non-raid) zpool as listed below:

This is certainly a supportable configuration.  However, it's not an optimal 
one.

You think that you have a 'concat' structure, but it's actually striped/RAID-0, 
because ZFS implicitly stripes across all of its top-level structures (your 
slices, in this case). This means that ZFS will constantly be writing data to 
addresses around 0, 50 GB, 100 GB, and 150 GB of each disk (presuming the first 
four slices are those you used). This will keep the disk arms constantly in 
motion, which isn't good for performance.
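
(For what it's worth -- the device names below are made up -- a pool
assembled from four such LUNs with no redundancy keyword, e.g.

   zpool create tank c4t40d0 c4t40d1 c4t40d2 c4t40d3

makes each LUN its own top-level vdev, and ZFS dynamically stripes new
writes across all of them; there is no concat-style "fill the first
device, then move to the next" behavior.)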

 Do you think my ZFS configuration caused the drive failure?

I doubt it. I haven't investigated which disks ship in the 3511, but I would 
presume they are enterprise-class ATA drives, which can handle this type of 
head motion. (Standard ATA disks can overheat under a load which is heavy in 
seeks.)  Then again, the 3511 is marketed as a near-line rather than on-line 
array ... that may be simply because the SATA drives don't perform as well as 
FC.

I do see this note in the 3511 documentation: "Note - Do not use a Sun StorEdge
3511 SATA array to store single instances of data. It is more suitable for use
in configurations where the array has a backup or archival role."

(I too am curious -- why do you consider yourself down? You've got a RAID 5, 
one disk is down, are you just worried about your current lack of redundancy? 
[I would be.] Will you be adding a hot spare?)
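
(A spare for the RAID 5 itself would be assigned through the 3511's own
management interface. For comparison only -- hypothetical device name,
and this needs the ZFS hot-spare support that arrived in a Solaris 10
update after U2 -- a pool-level ZFS spare is added with:

   zpool add tank spare c4t40d5

though with a non-redundant pool like this one a ZFS spare wouldn't
help; the redundancy here comes from the array's RAID 5.)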

Anton
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss