On Nov 26, 2008, at 12:08 PM, Doug Cutting wrote:
Dennis Kubes wrote:
2) Besides possible slight degradation in performance, is there a
reason why the BlocksMap shouldn't or couldn't be stored on disk?
I think the assumption is that it would be considerably more than
slight degradation. I've seen the namenode benchmarked at over
50,000 opens per second. If file data is on disk, and the namespace
is considerably bigger than RAM, then a seek would be required per
access. At 10 ms/seek, that would give only 100 opens per second, or
500x slower. Flash storage today peaks at around 5k seeks/second.
For smaller clusters the namenode might not need to be able to
perform 50k opens/second, but for larger clusters we do not want the
namenode to become a bottleneck.
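
The arithmetic above can be checked with a rough sketch. The function and constant names are hypothetical, chosen only for illustration; the figures are the ones quoted in this message, and the model assumes exactly one seek per open with no caching.

```python
# Back-of-the-envelope check of the throughput figures quoted above.
# Names are illustrative; only the numbers come from the message.

def seek_bound_ops_per_second(seek_time_ms: float) -> float:
    """Upper bound on operations/second when every operation costs one seek."""
    return 1000.0 / seek_time_ms

IN_MEMORY_OPENS_PER_SEC = 50_000  # namenode benchmark figure cited above
DISK_SEEK_MS = 10.0               # disk seek time assumed above
FLASH_SEEKS_PER_SEC = 5_000       # flash peak cited above

disk_opens = seek_bound_ops_per_second(DISK_SEEK_MS)
disk_slowdown = IN_MEMORY_OPENS_PER_SEC / disk_opens
flash_slowdown = IN_MEMORY_OPENS_PER_SEC / FLASH_SEEKS_PER_SEC

print(f"disk:  {disk_opens:.0f} opens/s, {disk_slowdown:.0f}x slower than RAM")
print(f"flash: {FLASH_SEEKS_PER_SEC} opens/s, {flash_slowdown:.0f}x slower than RAM")
```

Under these assumptions, flash would still be about an order of magnitude slower than the in-memory figure.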
:)
Do you have any graphs showing 50k opens/second that you can share
(publicly or privately)? The more external benchmarking data I
have, the more I can encourage adoption amongst my university...
Brian