Hi, I just want to throw out a discussion topic on federation.
Reading *The Definitive Guide* on HDFS, it sounds like under federation every distinct namespace needs its own namenode machine. So if a company wanted three namespaces, say retail, commercial, and government, it would need a host machine (or a machine pair for high availability) for each one, so 3 namenode hosts (or 3 pairs)?

What if a company were hosting client data? Say it had 20 clients accessing a cluster. A minimum of 20 namespaces would mean 20 servers just for namenodes?

At what point in this situation would it become practical to start virtualizing namenodes on a high-powered virtualization cluster? I think some calculation would go into that, weighing the expected size of each namespace partition against block density and memory. There would also be the obvious question of resource contention and the overall system drag it causes.

What do other community members think?

*Devin Suiter*
Jr. Data Solutions Software Engineer
100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212
Google Voice: 412-256-8556 | www.rdx.com
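
P.S. To make the sizing question a bit more concrete, here is a rough back-of-the-envelope sketch of the kind of calculation I have in mind. It assumes the commonly quoted rule of thumb of about 1,000 MB of namenode heap per million blocks, full-size blocks, and a fixed fraction of host RAM held back as headroom. The function names, the default block size and replication factor, and the headroom figure are all my own placeholders, not anything taken from Hadoop itself.

```python
import math

# Assumption: roughly 1,000 MB of namenode heap per million blocks.
# This is a conservative rule of thumb, not a measured figure.
HEAP_MB_PER_MILLION_BLOCKS = 1000


def namenode_heap_mb(raw_storage_tb, block_size_mb=128, replication=3):
    """Estimate namenode heap (MB) for one namespace.

    raw_storage_tb is the raw disk capacity backing the namespace.
    Assumes every block is full, so small files will push the real
    block count (and heap requirement) higher than this estimate.
    """
    raw_storage_mb = raw_storage_tb * 1_000_000
    blocks = raw_storage_mb / (block_size_mb * replication)
    return blocks / 1_000_000 * HEAP_MB_PER_MILLION_BLOCKS


def namenodes_per_host(host_ram_gb, heap_mb, headroom=0.25):
    """How many namenodes of a given heap size fit on one host by memory alone.

    headroom is the (hypothetical) fraction of RAM reserved for the
    hypervisor, OS, and JVM overhead beyond the heap itself.
    """
    usable_mb = host_ram_gb * 1000 * (1 - headroom)
    return math.floor(usable_mb / heap_mb)


if __name__ == "__main__":
    # Hypothetical tenant: 4,800 TB of raw storage behind one namespace.
    heap = namenode_heap_mb(4800)
    print(f"estimated heap per namenode: {heap:.0f} MB")
    print(f"namenodes per 64 GB host: {namenodes_per_host(64, heap)}")
```

This only covers memory. It says nothing about RPC load, garbage-collection pauses, or I/O contention between co-located namenodes, which is the "system drag" part of the question and would have to be measured rather than estimated.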