On Mon, 2009-02-02 at 20:06 -0800, jason hadoop wrote:
> This can be made significantly worse by your underlying host file
> system and the disks that support it.

Oh, yes, we know...  It was a late-realized mistake just yesterday that
we weren't using noatime on that cluster's slaves.

The attached graph is instructive.  We have our nightly-rotated logs for
DataNode all the way back to when this test cluster was created in
November.  This morning on one node, I sampled the first 10 BlockReport
scan lines from each day's log, up through the current hour today, and
handed them to gnuplot to graph.  The seriously erratic behavior that
begins around the 900K-1M point is very disturbing.
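
For anyone who wants to repeat the sampling, here is a minimal sketch in
Python.  It assumes the DataNode log lines look like "BlockReport of N
blocks got processed in M msecs"; check that pattern against your own
logs before relying on it.  The function names are hypothetical, not
part of Hadoop.

```python
import re
from itertools import islice

# Assumed log line shape (verify against your own DataNode logs):
#   "... BlockReport of 912345 blocks got processed in 48211 msecs"
BLOCK_REPORT = re.compile(
    r"BlockReport of (\d+) blocks got processed in (\d+) msecs"
)


def sample_block_reports(lines, limit=10):
    """Return (blocks, msecs) pairs for the first `limit` BlockReport lines."""
    matches = (BLOCK_REPORT.search(line) for line in lines)
    hits = (m for m in matches if m is not None)
    return [(int(m.group(1)), int(m.group(2))) for m in islice(hits, limit)]


def write_gnuplot_data(samples, out):
    """Write one 'blocks msecs' pair per line, a layout gnuplot can plot directly."""
    for blocks, msecs in samples:
        out.write(f"{blocks} {msecs}\n")
```

Run sample_block_reports over each day's log file in turn and append the
results to a single data file; gnuplot can then plot column 1 against
column 2.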

Immediate solutions for us include noatime, nodiratime, a firmware (BIOS)
upgrade on the disks, and eliminating enough small files (blocks) in DFS
to get the total count below 400K.
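
To catch a missing noatime across a set of slaves, something like the
following sketch could be run against the contents of /proc/mounts on
each node.  The function name and the default filesystem type (ext3) are
assumptions for illustration; adjust them to whatever the data disks
actually use.

```python
def mounts_without_noatime(mounts_text, fstypes=("ext3",)):
    """Return mount points of the given filesystem types lacking noatime.

    `mounts_text` is expected in /proc/mounts layout:
        device mount_point fstype options dump pass
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fstype, options = fields[:4]
        if fstype in fstypes and "noatime" not in options.split(","):
            missing.append(mount_point)
    return missing
```

On a slave this would be called as
mounts_without_noatime(open("/proc/mounts").read()); an empty list means
every matching mount already has noatime set.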
