Re: Loading data, hbase slower than Hive?

2013-01-19 Thread Mohammad Tariq
Hello Austin, I am sorry for the late response. Asaf has made a very valid point. Rowkey design is crucial, especially if the data is going to be sequential (a timeseries kind of thing). You may end up with a hotspotting problem. Use pre-split tables or hash the keys to avoid that. It'll
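
The two suggestions above (hash the keys, pre-split the table) can be sketched roughly as follows. This is only an illustration in plain Python, not HBase client code: the bucket count, the `|` separator, and the helper names `salted_key` and `split_points` are all assumptions made for the example, not anything taken from the thread.

```python
import hashlib
from typing import List

NUM_BUCKETS = 16  # hypothetical bucket count; must stay <= 100 for a 2-digit prefix


def salted_key(row_key: str, buckets: int = NUM_BUCKETS) -> bytes:
    """Prefix a sequential row key with a hash-derived bucket id.

    Keys that arrive in order (e.g. timestamps) get scattered across
    `buckets` key ranges instead of all landing in one region.
    """
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    bucket = digest[0] % buckets  # deterministic: same key, same bucket
    return f"{bucket:02d}|{row_key}".encode("utf-8")


def split_points(buckets: int = NUM_BUCKETS) -> List[bytes]:
    """Region boundaries matching the salt prefixes (one region per bucket)."""
    return [f"{b:02d}|".encode("utf-8") for b in range(1, buckets)]


if __name__ == "__main__":
    for ts in ("20130119T000001", "20130119T000002", "20130119T000003"):
        print(salted_key(ts))
    print(split_points(4))
```

The split points would be handed to the table-creation call so that each bucket starts out in its own region. The trade-off is on the read side: a scan over a time range has to fan out over every bucket prefix and merge the results.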

Re: Loading data, hbase slower than Hive?

2013-01-19 Thread Asaf Mesika
Start by telling us your row key design. Check whether your table's regions are pre-split. I managed to get 25 MB/sec write throughput in HBase using 1 region server. If your data is evenly spread you can get around 7 times that in a 10 region server environment. Should mean that 1 gig should take 4 s
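
A back-of-the-envelope way to turn those throughput figures into a load-time estimate. The 25 MB/sec and 7x figures are the ones quoted above; treating 1 gig as 1000 MB and the helper name `load_seconds` are assumptions for this sketch.

```python
def load_seconds(size_mb: float, per_server_mb_s: float = 25.0, speedup: float = 1.0) -> float:
    """Estimated load time, assuming sustained, evenly spread write throughput."""
    return size_mb / (per_server_mb_s * speedup)


if __name__ == "__main__":
    print(load_seconds(1000))             # single region server
    print(load_seconds(1000, speedup=7))  # 10 region servers at ~7x
```

Real numbers will be lower if the row keys hotspot a single region, since the speedup factor only holds when writes are evenly spread.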

Re: Reagrding HBase Hadoop multiple scan objects issue

2013-01-19 Thread Ted Yu
Have you subscribed to the user mailing list? Please do not mix email for user@ and subscription. Some email systems would regard messages from amazon.com as unverifiable and put them in the Spam folder. What HBase version are you using? bq. it's inefficient to have one scan object to scan everything

Re: Custom Filter and SEEK_NEXT_USING_HINT issue

2013-01-19 Thread Ted
In your original email you said the first key looked like the start key of a region, can you verify that? Thanks On Jan 19, 2013, at 1:36 AM, Eugeny Morozov wrote: > Ted, > > that is correct. > HBase 0.92.x and we use part of the patch 6509. > > I use the filter as a custom filter, it lives in

Re: Custom Filter and SEEK_NEXT_USING_HINT issue

2013-01-19 Thread Eugeny Morozov
Ted, that is correct. HBase 0.92.x and we use part of the patch 6509. I use the filter as a custom filter, it lives in a separate jar file and goes on HBase's classpath. I did not patch HBase. Moreover I do not use the protobuf descriptions that come with the filter in the patch. Only two classes I have