I will preface this with a couple of statements: a) it's almost 6am, and I've been up all night; b) I'm drugged up from an allergic reaction, so I may not be firing on all 64 bits.
Do I correctly understand the HDFS architecture in that the namenode is a network bottleneck into the system? I.e., no matter how many Ethernet interfaces I roll into my data nodes, will I always be limited in how much traffic I can drive to the HDFS pool by the network capacity of the namenode?

I am trying to move a -lot- of data, and I'd like to not throttle the namenode (especially in the old cluster, where I cannot just bond up more interfaces). If there's a way to spread the inbound network traffic (for block writes), I'd love to hear it.

Thanks!
-j