On 1/20/2012 1:50 AM, Michael Tokarev wrote:

> Please excuse me for the somewhat harsh words, but except of the
> alignment issues which should be solved for once when partitioning
> and creating filesystem, the rest is a complete bullshit collected
> from various forums where people does not understand what they're
> doing and blame bad drives.

Since I am apparently the target of "does not understand what they're
doing and blame bad drives" let me give you some education that isn't
"bullshit" from "various forums", but knowledge gained from working with
hard disk drive technology for over 25 years, designing RAID storage
systems for a few of those, and cutting my teeth when drives were still
called "Winchesters".  I will explain in detail why these 'green' drives
aren't suitable for mail queue and other high random IOPS workloads.

> These drives are excellent when set up and used properly (the
> misalignment issue you mentioned is real indeed, and MUST be
> taken into account: everything should be aligned to 4Kb, lots
> of especially old tools don't do that or even don't LET you to
> do that - eg cfdisk in linux).  Very reliable, fast and
> predictable.

As it turns out the OP has Seagate LP drives which are not Advanced
Format 512/4096 drives.  No alignment issue there.  And he's using EXT4,
not XFS, so there's not really a possibility of filesystem misalignment
on the RAID stripe, as EXT4 isn't advanced enough to align to the
underlying RAID stripe.

> I'll give just one note about speed, which may look completely
> wrong at first.  The reason they're speedy is that at their
> low rotational speed, they also have much more data density, --
> ie, basically, they can transfer much more data during single
> rotate.  This way, their linear speed (sequentional read or
> write) often goes FASTER than enterprize-class 15KRPM drives
> which are of much less volumes (300-600Gb as opposed to 1 or
> 2Tb or more for these "green" drives).

This argument is flawed.  Lower spindle speed doesn't create higher
areal density--there is no relationship between the two.  And higher
areal density doesn't dictate a lower spindle speed.  This is simply a
design tradeoff/decision.  Higher areal density will yield a higher
sequential data rate.  But lower spindle speed will ALWAYS yield a lower
random transaction rate.  The former is irrelevant to mail server
(queue) performance.  The latter is key to mail server (queue) performance.
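
To put rough numbers on the spindle speed point: average rotational
latency is half a revolution, so it depends only on RPM.  The sketch
below is illustrative Python, not a benchmark.  The 5400 RPM figure for
a 'green' drive and the 4.0 ms average seek used for both drives are
assumptions picked for the example, not measurements of any particular
model.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the target sector: half of one revolution, in ms."""
    return 0.5 * 60_000.0 / rpm


def max_random_iops(rpm, avg_seek_ms):
    """Rough ceiling on random IOPS: one seek plus half a revolution per IO."""
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms(rpm))


# Assumed values: the same 4.0 ms seek for both drives, so that only the
# spindle speed differs between the two rows.
ASSUMED_SEEK_MS = 4.0

for rpm in (5400, 15000):
    latency = avg_rotational_latency_ms(rpm)
    iops = max_random_iops(rpm, ASSUMED_SEEK_MS)
    print(f"{rpm:>6} RPM: {latency:.2f} ms rotational latency, "
          f"about {iops:.0f} random IOPS at best")
```

Even with identical seek times, the slower spindle loses on every
random IO.  Areal density does not appear anywhere in the calculation.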

> Yes, due to slower rotational speed, they take more time to
> position platter to the right place.  But that's, again, not
> whole story: the seek time is about the same as for their
> "elder" brothers.  Now, use just first 300 or 600Gb out of
> this 2Tb drive, to have more fair comparison with 15KRPM
> drives, and you realize that the seek time improves greatly,
> since we now have to seek less!

Flawed again.  For 4 reasons:

1.  The "seek time" isn't anywhere close to the same.  The
track-to-track seek latency will be similar because the actuator motors
are very similar.  But the full stroke seek latency will be over twice
as high for the 'green' drives.  This is critically important because
green drives, especially the WD models, aggressively park their heads
every few seconds to save power.  When the head is not parked there must
always be current in the voice coil to maintain head position.  Parking
the heads aggressively saves power as the coil is not energized when the
head is parked.  Thus, on the next access, the heads must move all the
way from the parking ramp to the target track.  This un-parking latency
isn't included in the "average latency" figures advertised.

Contrast this with many enterprise SAS drives which hover the heads over
the center of the platter when idle.  Doing this allows for the shortest
average travel distance to any potential track.  Thus, even though
"advertised average latency", which is the track-to-track figure, is
similar between green consumer drives and enterprise SAS drives, the
total latency for actual real world IO is MUCH lower for the enterprise
drives, by a factor of 3 or more.

2.  Nobody will go to the trouble of short stroking one of these 'green'
drives in an effort to approach the random seek performance of a 15K
drive, which is impossible anyway (see 1 above and 3,4 below).

3.  15K drives have a 1.5" platter diameter.  The high cap green drives
have a 2.5" platter diameter.  Thus the heads on the green drives must
travel 67% further, and thus have a much higher seek latency for random
IO, regardless of spindle speed.

4.  The firmware on the green drives is optimized to weight acoustical
management over raw seek performance.  Thus, the firmware will attempt
to buffer as many writes into the drive cache as possible, then reorder
them in a manner that most MINIMIZES head movement, in order to decrease
noise output.  This alone is likely more detrimental to fsync
performance than slow spindle speed.  This is the single biggest reason
to avoid such drives for random IO workloads, and it's one of the two
main reasons consumers love them:  cheap and quiet.

Enterprise drives do the opposite.  The firmware is optimized for
absolute minimum latency and maximum IOPS.  Noise management isn't a
consideration.
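
A toy model of point 4, to show why reordering for minimum head
movement hurts the write that arrived first.  This is only a sketch:
the track numbers are hypothetical, the greedy nearest-track policy is
my stand-in for whatever the firmware really does, and "tracks crossed"
is a crude proxy for time.

```python
def head_travel(start, order):
    """Total tracks crossed when servicing requests in the given order."""
    pos, total = start, 0
    for track in order:
        total += abs(track - pos)
        pos = track
    return total


def travel_until(start, order, target):
    """Tracks crossed before the request for `target` has been serviced."""
    return head_travel(start, order[: order.index(target) + 1])


def shortest_seek_first(start, pending):
    """Greedy reordering: always service the nearest pending track next."""
    pos, remaining, order = start, list(pending), []
    while remaining:
        nxt = min(remaining, key=lambda t: abs(t - pos))
        remaining.remove(nxt)
        order.append(nxt)
        pos = nxt
    return order


start = 500                          # hypothetical head position
fifo = [900, 450, 400, 350, 300]     # arrival order; 900 is the fsync'd write
reordered = shortest_seek_first(start, fifo)

print("total travel, FIFO:     ", head_travel(start, fifo))
print("total travel, reordered:", head_travel(start, reordered))
print("travel before 900, FIFO:     ", travel_until(start, fifo, 900))
print("travel before 900, reordered:", travel_until(start, reordered, 900))
```

The reordered schedule moves the heads less overall, but the request
for track 900 goes from first serviced to last.  That is the tradeoff a
mail queue waiting on fsync pays.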

> That's about speed.  As you can see, the picture is FAR from
> definitive.

Now that I've taken the brush from your hand and painted an accurate
picture, we see that the green drives on the market have horrible random
IOPS performance, by design, just as I already stated.

> I never tried running mailserver tests on an array of such drives
> because I don't have many of them.  But a single 2Tb WD20EARS
> drive outperforms single 500Gb Hitachi enterprise-class _sata_
> (not sas) 7200RPM (both are of the same "generation", ie, bought
> at about the same time and both were current) for about 10% for
> postfix smtp-sink workload.  Go figure.

The EARS should have better sequential performance.  But that's
irrelevant here.  The Hitachi will best the WD EARS in random IOPS
performance.  Which is relevant here.

BTW, smtp-sink doesn't test queue performance.  smtp-sink is for testing
the network throughput of a sending host.  You're thinking of
smtp-source, which is what I used to generate the numbers I posted
previously in this thread.

-- 
Stan
