Hey,

So there are 2 major problems here:

- The setup is way off. There is no actual data duplication, for example: you will put every write on 1 machine, and when it fails, so goes your data.
- These machines don't have enough RAM. They must have at least 1 GB/core, ideally 2 GB/core or more. This means they should have at least 8 GB of RAM each. See crucial.com for RAM.
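To put numbers on the RAM point, here is a rough Python sketch of the 1-2 GB/core rule of thumb applied to the 8-core, 4 GB boxes from the question. The function name `ram_check` and the result fields are made up for illustration; the only inputs are the figures already mentioned in this thread.

```python
def ram_check(cores, ram_gb, min_gb_per_core=1.0, ideal_gb_per_core=2.0):
    """Compare a machine's RAM against the 1-2 GB/core rule of thumb.

    Hypothetical helper: the thresholds are the rule of thumb from this
    thread, not anything HBase itself enforces.
    """
    min_ram_gb = cores * min_gb_per_core
    ideal_ram_gb = cores * ideal_gb_per_core
    return {
        "gb_per_core": ram_gb / cores,
        "min_ram_gb": min_ram_gb,
        "ideal_ram_gb": ideal_ram_gb,
        "meets_min": ram_gb >= min_ram_gb,
    }


# The machines described in the question: 8 cores, 4 GB RAM each.
current = ram_check(cores=8, ram_gb=4)
print(current)
```

For those boxes this works out to 0.5 GB/core, against a floor of 8 GB and a more comfortable 16 GB per machine.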
A better setup would be:

- 1 "master" node, running: HMaster, 1x ZooKeeper, NameNode
- 5 datanode/regionserver nodes

The key to performance here is to spread your workload over more machines. This is how clustered software works, in a nutshell. Using only 1/3 of your machines for regionservers and 1/6 for data storage (datanode) is non-ideal.

You really need to up the RAM. I run:

- dual quad i7s with hyper-threading, which gives 16 cores to the OS
- 24 GB RAM
- 4 x 1 TB disk

My small-end machines are:

- dual quad Xeons, 8 cores to the OS
- 16 GB RAM
- 2 x 1 TB disk

For performance you really don't want less than 1-2 GB of RAM per core. Without a lot of RAM you don't get effective disk caching, you can't run map-reduces on the same nodes, you may run into swap issues, etc. 4 GB of DDR3 RAM is about $150 USD.

But given a reasonable machine set, doing 50k inserts/sec sustained over long periods of time is totally doable. You will need more than 6 machines though! Don't forget your spares, since you really want to be able to operate on N-{1,2} machines so failures don't cripple you.

On Mon, Jan 18, 2010 at 2:55 AM, Gaurav Vashishth <vashgau...@gmail.com> wrote:
>
> Using 6 machines, 8 core with 4 GB Ram, right now for setting up the
> scenario.
>
> 2 region servers
> 1 ZooKeeper
> 1 Data Node
> 2 Name Node
>
>
> Ryan Rawson wrote:
>>
>> How many machines do you have? I'd try at least 20+ late model boxes.
>>
>> On Jan 18, 2010 2:14 AM, "Gaurav Vashishth" <vashgau...@gmail.com> wrote:
>>
>> I need to store live data which is about 40-50K records /sec, evaluated
>> MYSql and now trying HBase.
>>
>> Just read in docstoc that HBase insert performance, for few 1000 rows and
>> 10 columns with 1 MB values, is 68ms/row. My scenario is similar, we need
>> under 10k rows, 10-20 columns and which can have thousands of version with
>> values not greater than 300 bytes.
>> Initially, I thought HBase can solve the purpose but reading docstoc
>> article have put doubt in my mind.
>>
>> Can we get 40-50k records/sec insertion speed in HBase?? Also, there would
>> be thousand of users who will be reading the database also, can HBase
>> maintain that much of speed?
>>
>> Thanks
>> Gaurav
>> --
>> View this message in context:
>> http://old.nabble.com/HBase-Insert-Performance-tp27208387p27208387.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>>
>
> --
> View this message in context:
> http://old.nabble.com/HBase-Insert-Performance-tp27208387p27208828.html
> Sent from the HBase User mailing list archive at Nabble.com.