Hello,

I was wondering what a minimal setup, in terms of number of servers, might be 
for HBase.  Here is what I think is needed:


1 or 2 HBase master servers   -- 1 or 2 dedicated boxes?

1 or more RegionServers        -- 1 or more dedicated boxes?

1 or more Zookeepers            -- 1 or more dedicated boxes?


If running on HDFS, add:
1 or 2 NameNodes                 -- can this run on same box(es) as HBase 
master?

1 or more DataNodes             -- can DNs be on same box(es) as RegionServers?


If you want to run MR jobs on data in HBase, add:
1 or more JobTrackers           -- can this run on the same box as HBase master 
and NN?

1 or more TaskTrackers         -- can this run on the same box as RegionServer 
+ DN?

So, my main questions are:

* Is it OK for HBase Master and NameNode (+JobTracker) to run on the same 
server? NN needs memory.  What does HBase Master need the most?

* Is it OK for RegionServer and DataNode (+TaskTracker) to run on the same 
server? (I think this is actually advised, so data is local?)  I believe 
RegionServer is a memory-hungry process (b/c of Memcache)?  I believe DNs need 
the CPU to run the MR jobs, and disk I/O, of course.

* Finally, is the following correct?


Non-HA system, with local disk:
1 HB master/NN/JT + 1 RegionServer/TT/DN + 1 ZK   =  3 boxes

HA HBase cluster with HDFS:
2 HB masters/NNs/JTs + 2 RegionServers/TTs/DNs + 2 ZKs  =  6 boxes
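
To make the arithmetic above explicit, here is a rough Python sketch of how I 
am counting boxes. The group names and the count_boxes helper are just my own 
labels for the colocated daemons (nothing from HBase or Hadoop itself), and it 
assumes each group of colocated daemons takes exactly one box per instance:

```python
# Hypothetical helper: tallies boxes for a layout where each key is a group
# of daemons assumed to share one box, and each value is how many such boxes.
def count_boxes(layout):
    """Sum the number of boxes over all colocated daemon groups."""
    return sum(layout.values())

# Non-HA layout from above: one box per colocated group.
non_ha = {"HB master/NN/JT": 1, "RegionServer/TT/DN": 1, "ZK": 1}

# HA layout from above: two boxes per colocated group.
ha = {"HB masters/NNs/JTs": 2, "RegionServers/TTs/DNs": 2, "ZKs": 2}

print(count_boxes(non_ha))  # 3
print(count_boxes(ha))      # 6
```

If ZK could share a box with one of the other groups, that entry would simply 
drop out of the dictionary and the total would shrink accordingly.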

Thanks,
Otis
