On 20.01.2012 16:01, Stan Hoeppner wrote:
> On 1/20/2012 1:50 AM, Michael Tokarev wrote:
> 
>> Please excuse me for the somewhat harsh words, but except of the
>> alignment issues which should be solved for once when partitioning
>> and creating filesystem, the rest is a complete bullshit collected
>> from various forums where people does not understand what they're
>> doing and blame bad drives.
> 
> Since I am apparently the target of "does not understand what they're
> doing and blame bad drives" let me give you some education that isn't
> "bullshit" from "various forums", but knowledge gained from working with
> hard disk drive technology for over 25 years, designing RAID storage
> systems for a few of those, and cutting my teeth when drives were still
> called "Winchesters".  I will explain in detail why these 'green' drives
> aren't suitable for mail queue and other high random IOPS workloads.

It looks like we're coming from the same place/background/time
but apparently do not share the conclusions.

I referred to forums because lots of people really don't
understand what's going on and just blame the drives --
this is something which does exist.

>> These drives are excellent when set up and used properly (the
>> misalignment issue you mentioned is real indeed, and MUST be
>> taken into account: everything should be aligned to 4Kb, lots
>> of especially old tools don't do that or even don't LET you to
>> do that - eg cfdisk in linux).  Very reliable, fast and
>> predictable.
> 
> As it turns out the OP has Seagate LP drives which are not Advanced
> Format 512/4096 drives.  No alignment issue there.

I have never had or used these, so I don't know.  Apparently the LP
drives are more optimized for sequential operations, since they're
positioned for various media tasks (movies, photos etc) - but
again it is difficult to say what "optimization" means here.

But that's not the point, I commented on your general
conclusion not the particular drive model.

>     And he's using EXT4,
> not XFS, so there's not really a possibility of filesystem misalignment
> on the RAID stripe, as EXT4 isn't advanced enough to align to the
> underlying RAID stripe.

This is, again, somewhat of an overstatement.  But again, that is
not the point here.

>> I'll give just one note about speed, which may look completely
>> wrong at first.  The reason they're speedy is that at their
>> low rotational speed, they also have much more data density, --
>> ie, basically, they can transfer much more data during a single
>> rotation.  This way, their linear speed (sequential read or
>> write) is often FASTER than that of enterprise-class 15KRPM drives
>> which are of much smaller capacity (300-600Gb as opposed to 1 or
>> 2Tb or more for these "green" drives).
> 
> This argument is flawed.  Lower spindle speed doesn't create higher
> areal density--there is no relationship between the two.  And higher
> areal density doesn't dictate a lower spindle speed.  This is simply a
> design tradeoff/decision.

I never said higher density is BECAUSE of lower rotational speed or
vice versa.  But there _is_ a very strong relationship between
the two: you can't have _both_ because at higher speed you can NOT
read/write high-density data due to the very high requirements this
places on the mechanical parts.
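
To put rough numbers on this relationship, here is a minimal Python
sketch.  The per-track capacities are made-up illustration values, not
measurements of any real drive; only the arithmetic is standard (half a
revolution of average rotational latency, one track passing under the
head per revolution).

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for a sector to rotate under the head: half a revolution."""
    return 0.5 * 60_000.0 / rpm


def sequential_rate_mb_s(rpm: float, mb_per_track: float) -> float:
    """Data passing under one head per second: one track per revolution."""
    return mb_per_track * rpm / 60.0


if __name__ == "__main__":
    # Hypothetical drives: (label, rpm, MB per outer track).
    # The MB-per-track figures are assumptions chosen only for illustration.
    for label, rpm, mb_per_track in [("green", 5400, 1.5), ("15K", 15000, 0.5)]:
        latency = avg_rotational_latency_ms(rpm)
        rate = sequential_rate_mb_s(rpm, mb_per_track)
        print(f"{label}: {latency:.2f} ms rotational latency, {rate:.0f} MB/s")
```

With these assumed inputs the slower but denser drive comes out ahead
on the sequential rate and behind on rotational latency, which is the
tradeoff being argued about.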

>   Higher areal density will yield a higher
> sequential data rate. But lower spindle speed will ALWAYS yield a lower
> random transaction rate.  The former is irrelevant to mail server
> (queue) performance.  The latter is key to mail server (queue) performance.

Both true, and I never tried to state the opposite, except for one
detail questioning the "ALWAYS" part, here:

>> Yes, due to slower rotational speed, they take more time to
>> position platter to the right place.  But that's, again, not
>> whole story: the seek time is about the same as for their
>> "elder" brothers.  Now, use just first 300 or 600Gb out of
>> this 2Tb drive, to have more fair comparison with 15KRPM
>> drives, and you realize that the seek time improves greatly,
>> since we now have to seek less!
> 
> Flawed again.  For 4 reasons:
> 
> 1.  The "seek time" isn't anywhere close to the same.  The
> track-to-track seek latency will be similar because the actuator motors
> are very similar.  But the full stroke seek latency will be over twice
> as high for the 'green' drives.  This is critically important because
> green drives, especially the WD models, aggressively park their heads
> every few seconds to save power.  When the head is not parked there must
> always be current in the voice coil to maintain head position.  Parking
> the heads aggressively saves power as the coil is not energized when the
> head is parked.  Thus, on the next access, the heads must move all the
> way from the parking ramp to the target track.  This un-parking latency
> isn't included in the "average latency" figures advertised.

That's lots of details in one statement.  Parking isn't
mandatory and can be turned off, and if the drive is in
constant use there's no time to park.

> Contrast this with many enterprise SAS drives which hover the heads over
> the center of the platter when idle.  Doing this allows for the shortest
> average travel distance to any potential track.  Thus, even though
> "advertised average latency", which is the track-to-track figure, is
> similar between green consumer drives and enterprise SAS drives, the
> total latency for actual real world IO is MUCH lower for the enterprise
> drives, like a factor of 3 or more.

That's one of the good differences between consumer and enterprise
drives (some of the latter anyway) -- head positioning while
at idle.

> 2.  Nobody will go to the trouble of short stroking one of these 'green'
> drives in an effort to approach the random seek performance of a 15K
> drive, which is impossible anyway (see 1 above and 3,4 below).


> 3.  15K drives have a 1.5" platter diameter.

Heh.  Now this 100% contradicts the reality I've seen. ;)
I disassembled several dead drives (150 and 300gig sas range),
all had large platters inside a 3.5" case.  Maybe the ones in
a 2.5" case -- these, obviously, have smaller platters.  Sure,
quite a few 2.5" enterprise drives have appeared recently.

>   The high cap green drives
> have a 2.5" platter diameter.  Thus the heads on the green drives must
> travel 67% further, and thus have a much higher seek latency for random
> IO, regardless of spindle speed.

This contradicts what I said above: if you compare native
(large) size of a green and (small) size of "high-end" drive, --
in that case you're right.  But I referred to a "more fair"
comparison when you compare equal sizes, and there, green
will have to seek less not more.
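
A rough way to see how much less: a minimal Python sketch of how far
the heads must travel to cover only the first part of a drive.  It
assumes data is laid out from the outer edge inward at uniform areal
density, and the platter radii are hypothetical values, not taken from
any datasheet.

```python
import math


def stroke_fraction(capacity_fraction: float,
                    outer_radius_mm: float = 46.0,   # assumed, illustrative
                    inner_radius_mm: float = 20.0    # assumed, illustrative
                    ) -> float:
    """Fraction of the full head stroke needed to reach the first
    `capacity_fraction` of the drive, filling from the outer edge inward.

    Under uniform areal density, capacity is proportional to platter area,
    so the used region is the annulus between the outer radius and the
    boundary radius computed below.
    """
    if not 0.0 <= capacity_fraction <= 1.0:
        raise ValueError("capacity_fraction must be between 0 and 1")
    outer_sq = outer_radius_mm ** 2
    inner_sq = inner_radius_mm ** 2
    boundary = math.sqrt(outer_sq - capacity_fraction * (outer_sq - inner_sq))
    return (outer_radius_mm - boundary) / (outer_radius_mm - inner_radius_mm)


if __name__ == "__main__":
    # Using the first 300Gb of a 2Tb drive, as in the comparison above.
    print(f"{stroke_fraction(300 / 2000):.3f} of the full stroke")
```

Note that a shorter stroke does not cut seek time in the same
proportion, since settle time and rotational latency stay the same.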

> 4.  The firmware on the green drives is optimized to weight acoustical
> management over raw seek performance.  Thus, the firmware will attempt
> to buffer as many writes into the drive cache as possible, then reorder
> them in a manner that most MINIMIZES head movement, in order to decrease
> noise output.  This alone is likely more detrimental to fsync
> performance than slow spindle speed.  This is the single biggest reason
> to avoid such drives for random IO workloads, and it's one of the two
> main reasons consumers love them:  cheap and quiet.
> 
> Enterprise drives do the opposite.  The firmware is optimized for
> absolute minimum latency and maximum IOPS.  Noise management isn't a
> consideration.

When minimizing head movement, we also maximize overall
performance, since we'll have to seek less and spend less
time waiting.

>> That's about speed.  As you can see, the picture is FAR from
>> definitive.
> 
> Now that I've taken the brush from your hand and painted an accurate
> picture, we see that the green drives on the market have horrible random
> IOPS performance, by design, just as I already stated.

But that contradicts real life again...

I'm not arguing against the "whole drive" case IOPS here.  I'm
not arguing about "by design" either - sure thing, small (in both
amount of data which can be stored and size of platters) 15K
drives with heads ready to fly should perform better than any
green out there.

There is even more to it: suppose you want to have a large mail
spool - you'll need MUCH more small 15KRPM drives than large greens,
so you'll have many more spindles all ready to serve you,
so the overall speed will be a lot faster again.

But get the same number of greens, use the same size from each
(say, 300Gb), -- ie, give them a fair comparison.  And it will
not be orders of magnitude worse.  Yes it will be slower, but
not MUCH slower.
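
As a back-of-the-envelope illustration of that comparison, here is a
minimal Python sketch using the usual approximation that one random IO
costs one average seek plus half a revolution.  The seek times and the
drive count are assumed values for illustration only, not measurements.

```python
def random_iops(avg_seek_ms: float, rpm: float) -> float:
    """Approximate random IOPS of one drive: 1 / (seek + rotational latency)."""
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)


def array_iops(drives: int, avg_seek_ms: float, rpm: float) -> float:
    """Idealized aggregate IOPS: every spindle serves requests independently."""
    return drives * random_iops(avg_seek_ms, rpm)


if __name__ == "__main__":
    # Same number of spindles on both sides; seek times are assumptions.
    spindles = 4
    fast = array_iops(spindles, avg_seek_ms=3.5, rpm=15000)
    green = array_iops(spindles, avg_seek_ms=8.0, rpm=5400)
    print(f"15K: {fast:.0f} IOPS, green: {green:.0f} IOPS, "
          f"ratio {fast / green:.1f}x")
```

This ignores head parking, firmware reordering and caching, so it only
bounds the mechanical part of the difference.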

And yes, sure enough, IOPS is what plays the most significant
role for a mail server load -- I never said the opposite.

>> I never tried running mailserver tests on an array of such drives
>> because I don't have many of them.  But a single 2Tb WD20EARS
>> drive outperforms single 500Gb Hitachi enterprise-class _sata_
>> (not sas) 7200RPM (both are of the same "generation", ie, bought
>> at about the same time and both were current) for about 10% for
>> postfix smtp-sink workload.  Go figure.
> 
> The EARS should have better sequential performance.  But that's
> irrelevant here.  The Hitachi will best the WD EARS in random IOPS
> performance.  Which is relevant here.
> 
> BTW, smtp-sink doesn't test queue performance.  smtp-sink is for testing
> the network throughput of a sending host.  You're thinking of
> smtp-source, which is what I used to generate the numbers I posted
> previously in this thread.

My mistake about smtp-sink vs smtp-source, indeed you're right
here.  What I meant to say is that - just out of curiosity -
I measured postfix speed on my home machine when I bought one
of these WD greens - because I really was curious whether all
the speed issues which were mentioned on lots of forums are true.
It was in the beginning of 2010, ie, about a year ago.

So I benchmarked postfix _queue_ performance (from accepting from
smtp-source to delivering to /dev/null).  In messages per second,
the wd green was about 10% faster than the hitachi (more like 9%).
I don't remember the details - it was a quick test, I never really
tried to perform good testing, it wasn't my intention.  I can try
measuring it again to get some real numbers, but I don't have
drives handy to do it (they're in use).

I'm not saying that I have huge experience with the various greens
out there, maybe some of them really perform badly for some reason.
But I can say for sure about the EARS from WD: a very large percentage
of the information about them in various forums is wrong, and most
problems come from misalignment (which is not easy to fix when the OS
or application does not support 4Kb-sized transactions all over).
BTW, I never used any greens for anything serious - I had several
at home and played with several at my friends' homes, that's
all.
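
For the alignment part, the check itself is simple arithmetic.  A
minimal Python sketch, assuming the partition start is given in
512-byte logical sectors as partitioning tools usually report it (the
function name is mine, for illustration only):

```python
def is_4k_aligned(start_sector: int, logical_sector_size: int = 512) -> bool:
    """True if the partition's starting byte offset is a multiple of 4096."""
    return (start_sector * logical_sector_size) % 4096 == 0


if __name__ == "__main__":
    # 63 is the classic DOS-style first-partition start; 2048 is the 1MiB
    # boundary that newer partitioning tools use by default.
    for start in (63, 64, 2048):
        state = "aligned" if is_4k_aligned(start) else "MISALIGNED"
        print(f"start sector {start}: {state}")
```

This only covers the partition start; the filesystem block size and
any RAID/LVM layers in between have to keep the alignment too.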

But what I'm saying is that you're wrong in your "too strong"
conclusion.  In short: never say "never", life is not all
black and white.

And having said all that, I think it'd be interesting to
understand why the OP has this slow system.  It should not
be this slow, with non-AF drives (misalignment for AF drives
can explain this slowness, and that's the only thing I can
think of right now).  Yes there are several layers of storage,
but that should not be _that_ bad.  Maybe that's the LP drives,
I dunno.

/mjt
