Hey,

So there are 2 major problems here:
- The setup is way off. For example, there is no actual data
duplication: every write goes to 1 machine, and when that machine
fails, so goes your data.
- These machines don't have enough RAM. They must have at least
1 GB/core, ideally 2 GB/core or more. With 8 cores, that means they
should have at least 8 GB of RAM. See crucial.com.

A better setup would be:
- 1 "master" node, running: HMaster, 1 x ZooKeeper, NameNode
- 5 nodes, each running a DataNode and a RegionServer

The key to performance here is to spread your workload over more
machines. That is how clustered software works, in a nutshell. Using
only 1/3 of your machines for regionservers and 1/6 for data
storage (datanode) is non-ideal.
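
To make those fractions concrete, here is a rough Python sketch that
compares the two layouts. The host names are made up, and the role
lists are only one reading of the setups described in this thread (in
particular, it assumes each of the 5 worker nodes runs both a datanode
and a regionserver).

```python
# Hypothetical host names; roles follow the two layouts discussed in this thread.
original = {
    "box1": ["regionserver"],
    "box2": ["regionserver"],
    "box3": ["zookeeper"],
    "box4": ["datanode"],
    "box5": ["namenode"],
    "box6": ["namenode"],
}

# Assumed reading of the suggested layout: 1 master plus 5 combined workers.
proposed = {"master": ["hmaster", "zookeeper", "namenode"]}
for i in range(1, 6):
    proposed[f"worker{i}"] = ["datanode", "regionserver"]


def share(layout, role):
    """Fraction of machines in the layout that run the given role."""
    return sum(role in roles for roles in layout.values()) / len(layout)


for name, layout in (("original", original), ("proposed", proposed)):
    print(f"{name}: regionserver {share(layout, 'regionserver'):.2f}, "
          f"datanode {share(layout, 'datanode'):.2f}")
```

With the suggested layout, 5 of the 6 machines take part in both
serving regions and storing data, instead of 2 and 1.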

You really need to up the RAM.  I run:
- dual quad-core i7s with hyper-threading, which gives 16 cores to the OS
- 24 GB RAM
- 4 x 1 TB disk

My low-end machines are:
- dual quad-core Xeons, 8 cores to the OS
- 16 GB RAM
- 2 x 1 TB disk

For performance you really don't want less than 1-2 GB of RAM per
core. Without a lot of RAM, you don't get effective disk caching, you
can't run map-reduces on the same nodes, you may run into swap issues,
etc.  4 GB of DDR3 RAM is about $150 USD.
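
As a back-of-the-envelope check, the sizing rule and the price above
can be put into a few lines of Python. This is only a sketch: the
function names are invented for illustration, and it assumes the boxes
have free slots for extra 4 GB sticks.

```python
def ram_range_gb(cores, low_per_core=1, high_per_core=2):
    """Return (minimum, preferred) RAM in GB for a machine with `cores` cores."""
    return cores * low_per_core, cores * high_per_core


def upgrade_cost_usd(current_gb, target_gb, stick_gb=4, stick_price=150):
    """Cost of adding 4 GB sticks until the machine reaches target_gb."""
    missing = max(0, target_gb - current_gb)
    sticks = -(-missing // stick_gb)  # ceiling division
    return sticks * stick_price


minimum, preferred = ram_range_gb(8)     # the 8-core boxes from this thread
per_box = upgrade_cost_usd(4, minimum)   # they have 4 GB today
print(minimum, preferred, per_box, per_box * 6)  # 8 16 150 900
```

So getting all 6 of the 8-core machines to the 8 GB minimum comes to
roughly $900 at that price.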

But given a reasonable set of machines, doing 50k inserts/sec sustained
over long periods of time is totally doable. You will need more than 6
machines though! Don't forget your spares: you really want to be
able to operate on N-1 or N-2 machines so failures don't cripple you.
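
A quick sketch of why the spares matter, assuming writes spread evenly
across the regionservers (real load depends on your row keys, so treat
the numbers as a best case). The function name is made up for
illustration.

```python
def per_node_rate(total_per_sec, nodes, failed=0):
    """Inserts/sec each surviving node must absorb, assuming an even spread."""
    alive = nodes - failed
    if alive <= 0:
        raise ValueError("no nodes left to take writes")
    return total_per_sec / alive


# 50k inserts/sec over 5 regionservers, with 0, 1 and 2 of them down.
for failed in (0, 1, 2):
    print(failed, round(per_node_rate(50_000, 5, failed)))
```

With only 5 regionservers, losing 2 of them pushes each survivor from
10,000 to about 16,667 inserts/sec, which is why you size for N-1 or
N-2 rather than N.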



On Mon, Jan 18, 2010 at 2:55 AM, Gaurav Vashishth <vashgau...@gmail.com> wrote:
>
> Right now I am using 6 machines, each with 8 cores and 4 GB RAM, to
> set up the scenario.
>
> 2 region servers
> 1 ZooKeeper
> 1 Data Node
> 2 Name Node
>
>
>
> Ryan Rawson wrote:
>>
>> How many machines do you have? I'd try at least 20+ late model boxes.
>>
>> On Jan 18, 2010 2:14 AM, "Gaurav Vashishth" <vashgau...@gmail.com> wrote:
>>
>>
>> I need to store live data at about 40-50K records/sec. I evaluated MySQL
>> and am now trying HBase.
>>
>> I just read on docstoc that HBase insert performance, for a few thousand
>> rows and 10 columns with 1 MB values, is 68 ms/row. My scenario is
>> similar: we need under 10k rows and 10-20 columns, which can have
>> thousands of versions with values no greater than 300 bytes. Initially
>> I thought HBase could serve the purpose, but reading the docstoc article
>> has put doubt in my mind.
>>
>> Can we get 40-50k records/sec insertion speed in HBase? Also, there will
>> be thousands of users reading the database; can HBase maintain that
>> much speed?
>>
>> Thanks
>> Gaurav
>> --
>> View this message in context:
>> http://old.nabble.com/HBase-Insert-Performance-tp27208387p27208387.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>>
>>
>
>
>
