Bryan Duxbury wrote:
We use XFS for our data drives, and we've had somewhat mixed results.


Thanks for that. I've just created a wiki page to put some of these notes up; extensions and some hard data would be welcome.

http://wiki.apache.org/hadoop/DiskSetup

One problem we have with hard data is that we need some different benchmarks for MR jobs. Terasort is good for measuring IO and MR framework performance, but for more CPU-intensive algorithms, or things that need to seek around a bit more, you can't be sure that terasort numbers are a good predictor of what's right for you in terms of hardware, filesystem, etc.

Contributions in this area would be welcome.

I'd like to measure the power consumed on a run too, which is actually possible as far as my laptop is concerned, because you can ask its battery what happened.
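
As a rough sketch of the battery idea, assuming a Linux laptop that exposes its battery under /sys/class/power_supply (the BAT0 name, the presence of an energy_now file, and the helper names below are all assumptions; some machines report charge_now in microamp-hours instead), something like this reads the remaining energy before and after a command and reports the difference:

```python
import subprocess
from pathlib import Path

# Assumed sysfs location. The battery name (BAT0) varies by machine, and
# energy_now, where present, is reported in microwatt-hours.
BATTERY_ENERGY = Path("/sys/class/power_supply/BAT0/energy_now")


def read_energy_uwh(path=BATTERY_ENERGY):
    """Return the battery's remaining energy in microwatt-hours."""
    return int(Path(path).read_text().strip())


def energy_used_wh(before_uwh, after_uwh):
    """Convert a drop in remaining energy (microwatt-hours) to watt-hours."""
    return (before_uwh - after_uwh) / 1_000_000


def measure(command, path=BATTERY_ENERGY):
    """Run command (a list of arguments) and return the watt-hours drained meanwhile."""
    before = read_energy_uwh(path)
    subprocess.run(command, check=True)
    after = read_energy_uwh(path)
    return energy_used_wh(before, after)
```

You'd call it as measure(["sleep", "60"]) with a real job in place of the sleep. It only means anything with the mains unplugged, and it counts everything the laptop did during the run, not just the job, so short runs will be lost in the noise.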

-steve
