Thanks for your response.

Now, if 5 servers are enough, how should I install and configure my nodes? If I need 3 replicas to guard against data loss, I need at least 3 datanodes. On top of that there are the namenode, region server, HMaster and ZooKeeper nodes, so some of them must share a machine. The datanode seems to be disk-I/O sensitive while the region server is memory sensitive; can I install them on the same machine? Any suggestions for a deployment plan?
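
One way to reason about the layout question above is to write a candidate role placement down and check it against the constraints mentioned: at least 3 datanodes for 3 replicas, and datanode and region server sharing a machine. The sketch below is only an illustration. The host names and this particular placement are assumptions, not a recommendation taken from this thread.

```python
# Hypothetical 5-machine layout; host names and placement are assumptions.
LAYOUT = {
    "node1": {"NameNode", "HMaster", "ZooKeeper"},
    "node2": {"DataNode", "RegionServer", "ZooKeeper"},
    "node3": {"DataNode", "RegionServer", "ZooKeeper"},
    "node4": {"DataNode", "RegionServer"},
    "node5": {"DataNode", "RegionServer"},
}


def count_role(layout, role):
    """Return how many hosts in the layout run the given role."""
    return sum(1 for roles in layout.values() if role in roles)


def check_layout(layout, replication=3):
    """Return a list of problems found; an empty list means none found."""
    problems = []
    if count_role(layout, "DataNode") < replication:
        problems.append("fewer DataNodes than the replication factor")
    if count_role(layout, "ZooKeeper") % 2 == 0:
        problems.append("ZooKeeper ensemble should have an odd number of members")
    for host, roles in layout.items():
        # A region server without a local datanode loses data locality.
        if "RegionServer" in roles and "DataNode" not in roles:
            problems.append(f"{host}: RegionServer without a local DataNode")
    return problems


if __name__ == "__main__":
    print(check_layout(LAYOUT) or "no problems found")
```

Co-locating the datanode and region server on each worker keeps reads local to the machine that serves the region, which is why the check flags a region server that has no datanode beside it.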

My workload is write-heavy (write:read is about 7:3). Another concern is that one of my fields holds 8~15 KB of data, and I am not sure whether that will cause problems when HBase runs compactions and region splits.
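
To get a feel for what 8~15 KB values mean for flushes and splits, a rough back-of-the-envelope calculation can help. The sketch below assumes a 128 MB memstore flush size and a 10 GB maximum region size; both numbers are assumptions to be checked against the hbase-default.xml of the version in use, and the calculation ignores row keys, per-cell overhead and compression.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

# Assumed settings; verify against your own configuration.
FLUSH_SIZE = 128 * MB        # hbase.hregion.memstore.flush.size (assumed)
MAX_REGION_SIZE = 10 * GB    # hbase.hregion.max.filesize (assumed)


def values_per(limit_bytes, value_bytes):
    """Return how many values of value_bytes fit into limit_bytes."""
    return limit_bytes // value_bytes


if __name__ == "__main__":
    for size_kb in (8, 15):
        per_flush = values_per(FLUSH_SIZE, size_kb * KB)
        per_region = values_per(MAX_REGION_SIZE, size_kb * KB)
        print(f"{size_kb} KB values: ~{per_flush} per flush, "
              f"~{per_region} before a region reaches the split size")
```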
Oh, you already have heavyweight's input :).

Thanks JM.

Warm Regards,
Tariq
cloudfront.blogspot.com


On Sat, Jun 22, 2013 at 8:05 PM, Mohammad Tariq <donta...@gmail.com> wrote:

Hello there,

         IMHO, 5-8 servers are sufficient to start with. But it's
all relative to the data you have and the intensity of your reads/writes.
You should have different strategies, though, depending on whether the
workload is read-heavy or write-heavy. You can't really define 'big' in
absolute terms. My cluster might be big for me, but for someone else it
might still not be big enough, and for another it might be very big. Long
story short, it depends on your needs. If you are able to achieve your goal
with 5-8 RSs, then having more machines would be a waste, I think.

But you should always keep in mind that HBase is kinda greedy when it
comes to memory. For a decent load 4G is sufficient, IMHO. But it again
depends on operations you are gonna perform. If you have large clusters
where you are planning to run MR jobs frequently you are better off with
additional 2G.
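
As a rough illustration of the numbers above, the per-node memory budget could be written as a tiny helper. This is only a sketch of the rule of thumb (4G for HBase, plus 2G when MR jobs run frequently); the function and parameter names are made up for the example, and the OS and datanode need memory on top of this.

```python
def hbase_node_memory_gb(runs_mapreduce, hbase_gb=4, mr_extra_gb=2):
    """Rule-of-thumb memory for HBase on one node, in GB.

    Hypothetical helper: 4G for a decent load, plus an additional 2G
    when MapReduce jobs run frequently on the cluster.
    """
    return hbase_gb + (mr_extra_gb if runs_mapreduce else 0)


if __name__ == "__main__":
    print(hbase_node_memory_gb(runs_mapreduce=False))
    print(hbase_node_memory_gb(runs_mapreduce=True))
```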


Warm Regards,
Tariq
cloudfront.blogspot.com


On Sat, Jun 22, 2013 at 7:51 PM, myhbase <myhb...@126.com> wrote:

Hello All,

I have learned HBase mostly from papers and books. According to my
understanding, HBase is the kind of architecture that is better suited
to a big cluster: we should have many HDFS nodes and many HBase (region
server) nodes. If we only have a few servers (5-8), it seems HBase is
not a good choice; please correct me if I am wrong. In addition, at how
many nodes can we usually start to consider HBase, and what about the
physical memory size and other hardware resources of each node? Are
there any reference documents or case studies? Thanks.

--Ning



