Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-10 Thread Joerg Schilling
Andrew Gabriel andrew.gabr...@oracle.com wrote:

 If you go back to the late 1970's before tracks had embedded servo data, 
 on multi-platter disks you had one surface which contained the head 
 positioning servo data, and the drive relied on accurate vertical 
 alignment between heads/surfaces to keep on track (and drives could 
 head-switch instantly). Around 1980, tracks got too close together for 
 this to work anymore, and the servo positioning data was embedded into 
 each track itself. The very first drives of this type scanned all the 

The first drive I am aware of that used embedded servo was the Siemens MegaFile 
drive series in 1986, and while it could increase the data density, it slowed 
down the head switch time. I was forced to write my own disk formatting 
program in order to be able to apply a track skew value != 0 to compensate 
for this problem.

Fortunately, I did this together with introducing a SCSI generic driver so I 
was able to format disks from a running OS and was not forced to boot the Sun 
standalone disk formatting program anymore ;-)

Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
   j...@cs.tu-berlin.de(uni)  
   joerg.schill...@fokus.fraunhofer.de (work) Blog: 
http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-06 Thread Brandon High
On Sat, Feb 5, 2011 at 3:34 PM, Roy Sigurd Karlsbakk r...@karlsbakk.net wrote:
 so as not to exceed the channel bandwidth. When they need to get higher disk
 capacity, they add more platters.

 Might this mean those drives are more reliable, since leakage between 
 sectors is less likely at the lower density?

More platters lead to more heat and higher power consumption. Most
drives have 3 or 4 platters, though Hitachi usually manufactures
5-platter drives as well.

-B

-- 
Brandon High : bh...@freaks.com


Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-05 Thread Roy Sigurd Karlsbakk
 One characteristic people often overlook is: When you get a disk with
 higher capacity (say, 2T versus 600G) then you get more empty space
 and hence typically lower fragmentation in the drive. Also, the
 platter density is typically higher, so if the two drives have equal
 RPM's, typically the higher capacity drive can perform faster
 sustained sequential operations.

10k and 15k drives don't have true 3.5" platters; they are closer to 2.5", 
even though the casing is the standard 3.5" size (open one if you doubt this). 
Usually, these drives have a similar density to their respective 7k2 drives, 
and thus higher speed because of the spin rate.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum is presented intelligibly. 
It is an elementary imperative for all pedagogues to avoid excessive use of 
idioms of foreign origin. In most cases adequate and relevant synonyms exist 
in Norwegian.


Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-05 Thread Roy Sigurd Karlsbakk
 Nope. Most HDDs today have a single read channel, and they select
 which head uses that channel at any point in time. They cannot use
 multiple heads at the same time, because the heads do not travel the
 same path on their respective surfaces at the same time. There's no
 real vertical alignment of the tracks between surfaces, and every
 surface has its own embedded position information that is used when
 that surface's head is active. There were attempts at multi-actuator
 designs with separate servo arms and multiple channels, but
 mechanically they're too difficult to manufacture at high yields as I
 understood it.

Perhaps a stupid question, but why don't they read from all platters in 
parallel?

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/


Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-05 Thread Andrew Gabriel

Roy Sigurd Karlsbakk wrote:

Nope. Most HDDs today have a single read channel, and they select
which head uses that channel at any point in time. They cannot use
multiple heads at the same time, because the heads do not travel the
same path on their respective surfaces at the same time. There's no
real vertical alignment of the tracks between surfaces, and every
surface has its own embedded position information that is used when
that surface's head is active. There were attempts at multi-actuator
designs with separate servo arms and multiple channels, but
mechanically they're too difficult to manufacture at high yields as I
understood it.



Perhaps a stupid question, but why don't they read from all platters in 
parallel?
  


The answer is in the text you quoted above.

There are drives now with two-level actuators.
The primary actuator is the standard actuator you are familiar with 
which moves all the arms.
The secondary actuator is a piezo crystal towards the head end of the 
arm which can move the head a few tracks very quickly without having to 
move the arm, and these are one per head. In theory, this might allow 
multiple heads to lock on to their respective tracks at the same time 
for parallel reads, but I haven't heard that they are used in this way.


If you go back to the late 1970's before tracks had embedded servo data, 
on multi-platter disks you had one surface which contained the head 
positioning servo data, and the drive relied on accurate vertical 
alignment between heads/surfaces to keep on track (and drives could 
head-switch instantly). Around 1980, tracks got too close together for 
this to work anymore, and the servo positioning data was embedded into 
each track itself. The very first drives of this type scanned all the 
surfaces on startup to build up an internal table of the relative 
misalignment of tracks across the surfaces, but this rapidly became 
unviable as drive capacity increased and this scan would take an 
unreasonable length of time. It may be that modern drives learn this as 
they go - I don't know.


--
Andrew Gabriel


Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-05 Thread Roy Sigurd Karlsbakk
 For the data sheet I referenced, all the drive sizes have the same sustained
 data rate OD, 125 MB/s. Eric posted an explanation for this, which
 seems entirely believable: The data rate is not being limited by the density
 of magnetic material on the platter or the rotational speed, but by the
 head or channel bandwidth to each platter itself. When they run the disks at a
 higher RPM, they need to stretch the bits longer on the disk surface
 so as not to exceed the channel bandwidth. When they need to get higher disk
 capacity, they add more platters.

Might this mean those drives are more reliable, since leakage between 
sectors is less likely at the lower density?

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/


Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-03 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of taemun
 
 Uhm. Higher RPM = higher linear speed of the head above the platter =
 higher throughput. If the bit pitch (ie the size of each bit on the
 platter) is the

Nope.  That's what I originally said, but I was proven wrong.  

For the data sheet I referenced, all the drive sizes have the same sustained
data rate OD, 125 MB/s.  Eric posted an explanation for this, which seems
entirely believable:  The data rate is not being limited by the density of
magnetic material on the platter or the rotational speed, but by the head or
channel bandwidth to each platter itself.  When they run the disks at a
higher RPM, they need to stretch the bits longer on the disk surface so as
not to exceed the channel bandwidth.  When they need to get higher disk
capacity, they add more platters.

The logical conclusion is that you can get a higher maximum disk capacity
at a lower rotational speed.  In fact, I currently see up to 3T available in
7.2krpm drives ... I see a max 800G in 15krpm... 

Yes, the higher rpm drives have lower latency.  No, they don't have higher
sustained throughput.

If anyone wants to look up more drive specs...  Here's how to find it on
seagate.com:  Go to support, Knowledgebase.  Under Support go to Document
Library.  Click the drive in question.  And then you can find the Data
Sheet.

The couple of things that are really clear by looking over a bunch of data
sheets are:
* Higher rpm's means lower latency.  (duh.)
* Higher rpm's is loosely correlated with higher throughput, but it's not a
linear correlation, and not always present.
* If you go to a different drive type (SATA vs SAS vs FC) then you can get
higher throughput...  In no case is the throughput even remotely close to
the bus speed, so the improved performance is not *because* of the
interface.  Presumably the more expensive drive type has a more expensive
head or whatever internally.
* Larger disk size does not improve sustained throughput at all.  Zero. 

All of this supports what Eric said.  The throughput of a drive is not
determined by the platter density or rotation speed.  It's limited by the
head or something else in the data channel accessing the disk.



Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread James
Edward,
Thanks for the reply.  

Good point on platter density.  I'd considered the benefit of lower 
fragmentation but not the possible increase in sequential IOPS due to density.

I assume while a 2TB 7200rpm drive may have better sequential IOPS than a 
500GB, it will not be double and therefore, if the 500GB's are half price the 
double spindle count would lead to better overall sequential IOPS (assuming 
still enough excess space to remove fragmentation benefit and increased usable 
GB's).  Agree?

Our IO is random small(4kB) read/write but I expect ZFS to convert the writes 
to sequential and the L2ARC to intercept a lot of the reads. (assumes low 
latency, high iop ZIL SLOG device).   It basically seems to come down to 
whether the random reads that miss the cache need lower individual latency or 
not.  

Thanks for the pointer on terminology.  I had thought that the RAM write cache 
was also called ARC (in addition to the RAM read cache).
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of James
 
 I assume while a 2TB 7200rpm drive may have better sequential IOPS than a
 500GB, it will not be double and therefore, 

Don't know why you'd assume that.  I would assume a 2TB drive would be
precisely double the sequential throughput of a 500G.  I think if you double
the surface density in two dimensions (a flat surface) you end up with 4x
the storage capacity.  Hence, a 2T drive should have 2x the 1-dimensional
track density, and should be 2x faster sequential throughput than a 500G
drive, with all other things being equal.
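
The scaling argument above can be sketched in a few lines. This is only an
illustration of the reasoning as stated: it assumes the same number of
platters, the same RPM, density growing equally along and across the tracks,
and a read channel that is not the limit. The function name is made up for
the example.

```python
import math


def scale_for_capacity(capacity_ratio: float) -> float:
    """Per-dimension density factor if track density and bit density grow equally.

    Capacity scales with (tracks per mm) x (bits per mm of track), so a
    capacity ratio of k needs a factor of sqrt(k) in each dimension.
    """
    return math.sqrt(capacity_ratio)


if __name__ == "__main__":
    ratio = 2000 / 500  # 2T versus 500G
    s = scale_for_capacity(ratio)
    # Sequential throughput follows only the along-track (bit) density.
    print(f"capacity x{ratio:.0f} -> linear density x{s:.0f} -> throughput x{s:.0f}")
```

Whether real drives behave this way depends on those assumptions holding,
which is why checking the data sheets matters.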


 Our IO is random small(4kB) read/write but I expect ZFS to convert the writes
 to sequential and the L2ARC to intercept a lot of the reads. (assumes low
 latency, high iop ZIL SLOG device).   

If you truly have random reads, then your L2ARC can't help you much.  Your
L2ARC can only help if you have a lot of repeated reads clustered in hot
areas of the storage pool.  ARC is based on Frequently Used and Recently Used



Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread Richard Elling
On Feb 2, 2011, at 6:10 AM, Edward Ned Harvey wrote:

 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of James
 
 I assume while a 2TB 7200rpm drive may have better sequential IOPS than a
 500GB, it will not be double and therefore, 
 
 Don't know why you'd assume that.  I would assume a 2TB drive would be
 precisely double the sequential throughput of a 500G.  I think if you double
 the surface density in two dimensions (a flat surface) you end up with 4x
 the storage capacity.  Hence, a 2T drive should have 2x the 1-dimensional
 track density, and should be 2x faster sequential throughput than a 500G
 drive, with all other things being equal.

They aren't.  Check the datasheets, the max media bandwidth is almost always
published.
 -- richard



Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread Edward Ned Harvey
 From: Richard Elling [mailto:richard.ell...@gmail.com]
 
 They aren't.  Check the datasheets, the max media bandwidth is almost always
 published.

I looked for said data sheets before posting.  Care to drop any pointers?  I
didn't see any drives publishing figures for throughput to/from platter
today.

I know this information exists for some drives.  I've seen it before.  But
it's apparently not for most of the readily available drives on the market
now.  Not in the dozen drives that I tried to look up.

The goal is to compare the throughput for two drives, where size of one
drive is 4x greater than the size of the other, and all other things are
equal.



Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread Brandon High
On Wed, Feb 2, 2011 at 6:10 AM, Edward Ned Harvey
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:
 Don't know why you'd assume that.  I would assume a 2TB drive would be
 precisely double the sequential throughput of a 500G.  I think if you double

That's assuming that the drives have the same number of platters. 500G
drives are generally one platter, and 2T drives are generally 4
platters. Same size platters, same density. The 500G drive could be
expected to have slightly higher random iops due to lower mass in the
heads, but it's probably not statistically significant.

I think the current batch of 3TB drives are 7200 RPM with 5 platters
and 667GB per platter or 5400 RPM with 4 platters at 750GB/platter.

-B

-- 
Brandon High : bh...@freaks.com


Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread James
Thanks Richard & Edward for the additional contributions.

I had assumed that maximum sequential transfer rates on datasheets (btw - 
those are the same for differing capacity seagate's) were based on large block 
sizes and a ZFS 4kB recordsize* would mean much lower IOPS.  e.g. Seagate 
Constellations are around 75-141MB/s(inner-outer) and  75MB/s is 18750 4kB 
IOPS!   However I've just tested** a slow 1TB 7200 drive and got over 6000 4kB 
seq write IOPS which is a lot more than the 500 I was working on.
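
As a rough check of the arithmetic above, the conversion between a sustained
transfer rate and small-block IOPS can be sketched as follows. It assumes
decimal units (1 MB = 1000 kB), which is what the 18750 figure implies, and
it ignores per-command overhead; the function names are made up for the
example.

```python
def iops_from_throughput(mb_per_s: float, io_size_kb: float = 4.0) -> float:
    """Convert a sustained transfer rate into IOs per second (decimal units)."""
    return mb_per_s * 1000.0 / io_size_kb


def throughput_from_iops(iops: float, io_size_kb: float = 4.0) -> float:
    """Convert an IOPS figure back into MB/s (decimal units)."""
    return iops * io_size_kb / 1000.0


if __name__ == "__main__":
    for rate in (75.0, 141.0):  # inner and outer zone rates quoted above
        print(f"{rate:6.1f} MB/s -> {iops_from_throughput(rate):8.0f} 4kB IOPS")
    # The measured 6000 sequential 4kB writes per second, expressed as MB/s.
    print(f"6000 4kB IOPS -> {throughput_from_iops(6000):.1f} MB/s")
```

Read this way, 6000 sequential 4kB writes per second is well below the
data sheet's inner-zone rate, so small sequential IO is not media-bandwidth
limited in this test.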

If this is correct then even with hardly any spindles, if there's enough free 
space for ZFS to do sequential writes (no interrupting reads etc),  you should 
easily get 6000 4k write IOPS from the disks (ie SLOG IOPS/latency will become 
limiting factor for sync writes) 

However, is this what people are actually seeing?  (Links to any other good 
reference builds with benchmarks welcome.)

The benchmarks I've found so far are:
1) http://www.zfsbuild.com/2010/10/09/nexenta-core-platform-benchmarks/  (maxed 
around 4000 write IOPS but iSCSI(sync) to X25-E so could be limited by that - 
rated 3300 IOPS)
2) http://www.opensolaris.org/jive/thread.jspa?messageID=507090#507090 One 
response quoted achieving 550-600 write IOPS on 15k drives (actually just 
realised this is your response Edward) (if your traffic is large blocks this 
may explain lower iops if bandwidth limited?).

ps. Regarding workload and random reads/L2ARC:   The VMs are Windows Servers 
(web servers, databases etc.) so the overall mix is random, but I'm expecting 
the L2ARC to end up holding frequently read blocks like those behind regularly 
read database blocks and, especially if we get dedupe working, the operating 
system blocks.  I also theorise that with thick vmdk files and defrag'd guest 
OS and applications, ZFS read-ahead should start to kick in when sequential 
reads are made to files within the OS.

* VMWare over NFS scenario (similar to local database - 4/8kB read/writes to 
large file)
** IOMeter, 4kB 100% seq, 1 & 64 outstanding, Win7 partition 4kB alloc unit, 
Drive write cache enabled (I believe ZFS uses drive write cache) on WD10EACS



Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread Edward Ned Harvey
 From: Brandon High [mailto:bh...@freaks.com]
 
 That's assuming that the drives have the same number of platters. 500G
 drives are generally one platter, and 2T drives are generally 4
 platters. Same size platters, same density. The 500G drive could be

Wouldn't multiple platters of the same density still produce a throughput
that's a multiple of what it would have been with a single platter?  I'm
assuming the heads on the multiple platters are all able to operate
simultaneously.

Anyway, here's a data point:
http://www.seagate.com/docs/pdf/datasheet/disc/ds_barracuda_7200_12.pdf

All the disks from 160G up to 1T have the same sustained data rate, which is
125 MB/s



Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of James
 
 block sizes and a ZFS 4kB recordsize* would mean much lower IOPS.  e.g.
 Seagate Constellations are around 75-141MB/s(inner-outer) and  75MB/s is
 18750 4kB IOPS!   However I've just tested** a slow 1TB 7200 drive and got
 over 6000 4kB seq write IOPS which is a lot more than the 500 I was working

For sustained throughput, I don't measure in IOPS.  I measure in MB/s, or
Mbit/s.  For a slow hard drive, 500Mbit/s.  For a fast one, 1 Gbit/s or
higher.  I was surprised by the specs of the seagate disks I just emailed a
moment ago.  1Gbit out of a 7.2krpm drive...  That's what I normally expect
out of a 15krpm drive.

I know people sometimes (often) use IOPS even when talking about sequential
operations, but I only say IOPS for random operations.



Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread Eric D. Mudama

On Wed, Feb  2 at 20:40, Edward Ned Harvey wrote:

Wouldn't multiple platters of the same density still produce a throughput
that's a multiple of what it would have been with a single platter?  I'm
assuming the heads on the multiple platters are all able to operate
simultaneously.


Nope.  Most HDDs today have a single read channel, and they select
which head uses that channel at any point in time.  They cannot use
multiple heads at the same time, because the heads do not travel the
same path on their respective surfaces at the same time.  There's no
real vertical alignment of the tracks between surfaces, and every
surface has its own embedded position information that is used when
that surface's head is active.  There were attempts at multi-actuator
designs with separate servo arms and multiple channels, but
mechanically they're too difficult to manufacture at high yields as I
understood it.

http://www.tomshardware.com/news/seagate-hdd-harddrive,8279.html



--
Eric D. Mudama
edmud...@bounceswoosh.org



Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread Eric D. Mudama

On Wed, Feb  2 at 20:45, Edward Ned Harvey wrote:

For sustained throughput, I don't measure in IOPS.  I measure in MB/s, or
Mbit/s.  For a slow hard drive, 500Mbit/s.  For a fast one, 1 Gbit/s or
higher.  I was surprised by the specs of the seagate disks I just emailed a
moment ago.  1Gbit out of a 7.2krpm drive...  That's what I normally expect
out of a 15krpm drive.


It used to be that enterprise grade, higher RPM devices used more
expensive electronics, but that's not really the case anymore.  It
seems most vendors are trying to use common electronics across their
product lines, which generally makes great business sense.

These days I think most HDD companies get their channel working at a
certain max bitrate, and format their drive zones to match that
bitrate at the max radius where velocity is the highest.  This is a
bit of a simplification, but it's the general idea.

When the drive is spinning the media less quickly, in a 7200 RPM
device, they can pack the bits in more tightly, which lowers overall
cost because they need fewer heads and platters to achieve a target
capacity.  It just so happens that the max bits/second flying under
the read head is a constant pegged to the channel design.  All other
things being equal, the 15k and the 7200 drive, which share
electronics, will have the same max transfer rate at the OD.
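
A small sketch of that idea: if the channel caps the bit rate, the linear bit
density that saturates it at the outer diameter falls as RPM rises. The 1
Gbit/s channel rate is taken loosely from the 125 MB/s data sheet figure in
this thread, and the 46 mm outer radius is a hypothetical value used for both
drives only to keep "all other things equal"; neither is a vendor number.

```python
import math


def linear_bit_density(channel_bits_per_s: float, rpm: float, radius_mm: float) -> float:
    """Bits per millimetre of track that saturate the channel at a given radius."""
    # Speed of the media under the head at this radius, in mm per second.
    track_speed_mm_per_s = 2.0 * math.pi * radius_mm * rpm / 60.0
    return channel_bits_per_s / track_speed_mm_per_s


CHANNEL_BITS_PER_S = 1e9  # assumption: roughly 125 MB/s
OD_RADIUS_MM = 46.0       # assumption: hypothetical outer radius, same for both

if __name__ == "__main__":
    d7200 = linear_bit_density(CHANNEL_BITS_PER_S, 7200, OD_RADIUS_MM)
    d15k = linear_bit_density(CHANNEL_BITS_PER_S, 15000, OD_RADIUS_MM)
    print(f"7200 rpm: {d7200:,.0f} bits/mm at the OD")
    print(f"15k rpm:  {d15k:,.0f} bits/mm at the OD")
    # Same channel, same max transfer rate, but the slower drive packs
    # more bits into each millimetre of track.
    print(f"density ratio 7200/15k: {d7200 / d15k:.2f}")
```

Under these assumptions both drives deliver the same bits per second at the
OD, and the density ratio is simply the inverse of the RPM ratio.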


I know people sometimes (often) use IOPS even when talking about sequential
operations, but I only say IOPS for random operations.


Me too, though not everyone realizes how much overhead there can be in
small operations, even sequential ones.

--eric


--
Eric D. Mudama
edmud...@bounceswoosh.org



Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread Mark Sandrock

On Feb 2, 2011, at 8:10 PM, Eric D. Mudama wrote:

  All other
 things being equal, the 15k and the 7200 drive, which share
 electronics, will have the same max transfer rate at the OD.

Is that true? So the only difference is in the access time?

Mark


Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread taemun
Uhm. Higher RPM = higher linear speed of the head above the platter = higher
throughput. If the bit pitch (ie the size of each bit on the platter) is the
same, then surely a higher linear speed corresponds with a larger number of
bits per second?

So if all other things being equal includes the bit density, and radius to
the edge of the media, then ... surely higher rpm = higher throughput?

Cheers,

On 3 February 2011 14:10, Mark Sandrock mark.sandr...@oracle.com wrote:


 On Feb 2, 2011, at 8:10 PM, Eric D. Mudama wrote:

   All other
  things being equal, the 15k and the 7200 drive, which share
  electronics, will have the same max transfer rate at the OD.

 Is that true? So the only difference is in the access time?

 Mark



Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread Eric D. Mudama

On Thu, Feb  3 at 14:18, taemun wrote:

  Uhm. Higher RPM = higher linear speed of the head above the platter =
  higher throughput. If the bit pitch (ie the size of each bit on the
  platter) is the same, then surely a higher linear speed corresponds with a
  larger number of bits per second?
  So if all other things being equal includes the bit density, and radius
  to the edge of the media, then ... surely higher rpm = higher throughput?
  Cheers,


Point being that they have to lower the bit density on high RPM drives
to fit within the bandwidth constraints of the channel.

If they could just get their channel working at 3GHz instead of 2GHz
or whatever, they'd use that capability to pack even more bits into
the consumer drives to lower costs.


--
Eric D. Mudama
edmud...@bounceswoosh.org



Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-01 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of James
 
 I’m trying to select the appropriate disk spindle speed for a proposal and
 would welcome any experience and opinions  (e.g. has anyone actively
 chosen 10k/15k drives for a new ZFS build and, if so, why?).

There is nothing special about ZFS in relation to spindle speed.  If you get 
higher rpm's, then you get higher iops, and the same is true for EXT3, NTFS, 
HFS+, ZFS, etc.

One characteristic people often overlook is:  When you get a disk with higher 
capacity (say, 2T versus 600G) then you get more empty space and hence 
typically lower fragmentation in the drive.  Also, the platter density is 
typically higher, so if the two drives have equal RPM's, typically the higher 
capacity drive can perform faster sustained sequential operations.

Even if you use slow drives, assuming you have them in some sort of raid 
configuration, they quickly add up sequential speed to reach the bus speed.  So 
if you expect to do large sequential operations, go for the lower rpm disks.  
But if you expect to do lots of small operations, then twice the rpm's 
literally means twice the performance.  So for small random operations, go for 
the higher rpm disks.


 ** My understanding is that ZFS will adjust the amount of data accepted into
 each “transaction” (TXG) to ensure it can be written to disk in 5s.  Async
 data will stay in ARC, Sync data will also go to ZIL or, if over threshold,
 will go to disk and pointer to ZIL (on low latency SLOG) – ie. all writes
 apart from sync writes

ZFS will aggregate small random writes into larger sequential writes.  So you 
don't have to worry too much about rpm's and iops during writes.  But of course 
there's nothing you can do about the random reads.  So if you do random reads, 
you do indeed want higher rpm's.

Your understanding (or terminology) of ARC is not correct.  ARC and L2ARC are 
read cache.  The terminology for the context you're describing would be the 
write buffer.  Async writes will be stored in the RAM write buffer and 
optimized for sequential disk blocks before writing to disk.  Whenever there 
are sync writes, they will be written to the ZIL (hopefully you have a 
dedicated ZIL log device) immediately, and then they will join the write buffer 
with all the other async writes.



[zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-01-31 Thread James
G'day All. 

I’m trying to select the appropriate disk spindle speed for a proposal and 
would welcome any experience and opinions  (e.g. has anyone actively chosen 
10k/15k drives for a new ZFS build and, if so, why?).

This is for ZFS over NFS for VMWare storage ie. primarily random 4kB read/sync 
writes (SLOG) + some general CIFS file serving. About 40/60 read/write 
ratio.

The primary drive options I’m trying to compare are 48 x HP SAS 500GB 7.2k 
(avg 8ms seek - approx 80 random IOPS/drive) or 24 x HP SAS 450GB or 600GB 10k 
drives (avg 4ms seek - approx 138 random IOPS/drive), which work out pretty 
close in price.
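
For reference, per-drive random IOPS figures like these can be approximated
from average seek time plus average rotational latency (half a revolution).
The sketch below is a simple model under that assumption; it ignores transfer
time, queueing, caching and pool layout, so the totals are only a naive
upper bound for independent drives. The function names are made up for the
example.

```python
def rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: half a revolution, in milliseconds."""
    return 0.5 * 60_000.0 / rpm


def random_iops(avg_seek_ms: float, rpm: float) -> float:
    """Approximate random IOPS for one drive, ignoring transfer and queueing."""
    service_time_ms = avg_seek_ms + rotational_latency_ms(rpm)
    return 1000.0 / service_time_ms


if __name__ == "__main__":
    # (label, number of drives, average seek in ms, rpm) for the two options
    options = [
        ("48 x 7.2k", 48, 8.0, 7200),
        ("24 x 10k", 24, 4.0, 10000),
    ]
    for name, drives, seek_ms, rpm in options:
        per_drive = random_iops(seek_ms, rpm)
        print(f"{name}: {per_drive:.0f} IOPS/drive, {drives * per_drive:.0f} IOPS total")
```

The model lands close to the 80 and 138 IOPS/drive figures quoted above, and
its service times are close to the average read latencies quoted for the two
drive types.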

Ok, first theory.  Assuming sequential writes, the 7200 drives should be up to 
75% (at worst 80/138, i.e. about 58%) of the IOPS of the 10k, and with twice 
the number of spindles and a low latency ZIL SLOG that should give much better 
write performance**.  Correct?  What IOPS are people seeing from 7200 (approx 
8ms avg seek) drives under mainly write loads?

Random read IOPS are about the same on both options in terms of £ per random 
IO, so the only problem is higher latency for reads that miss the ARC/L2ARC and 
are serviced by the 7200s (avg 12.3ms, max 25.8ms), which is slower than the 
10k would be (avg 7ms, max 14ms).  I’m currently planning 2x240GB L2ARC so 
hopefully we’ll be able to get a lot of the active read data into cache and 
keep the latencies low.  Any suggestions on how to identify the size of the 
“working dataset” on windows/netapp etc? 

I note ZFSBuild said they’d do their next build with 15k SAS but I couldn’t 
follow their logic.   Anything else I’m missing?

** My understanding is that ZFS will adjust the amount of data accepted into 
each “transaction” (TXG) to ensure it can be written to disk in 5s.  Async 
data will stay in ARC; sync data will also go to the ZIL or, if over threshold, 
will go to disk with a pointer in the ZIL (on a low latency SLOG) – ie. all 
writes apart from sync writes over threshold will be unaffected by disk write 
latency from a client perspective.  Therefore if, for the same budget, 7200rpm 
can give you a higher iops, high latency disk whereas 10k gives you lower 
latency but lower iops, the 7200rpm system would end up providing highest write 
iops at lowest latency (due to SLOG).