Re: Please help me overcome HBase's weaknesses

2010-09-04 Thread Samuru Jackson
Hi! I just want to add my personal opinion to this point: > (1) Ease of use > Cassandra does not require any other software. All nodes of Cassandra have > the same role. Pretty easy. > On the other hand, HBase requires HDFS and ZooKeeper. Users have to > manipulate and manage HDFS and ZooKeeper. T

Re: HBase secondary index performance

2010-09-04 Thread Samuru Jackson
Hi, > where key will be [value:key] and insert rows every time, when we insert > our values. We will got 30k rows/s/node. Could you specify on what kind of hardware you did this? How did you design your indexer? Is it multithreaded? /SJ --- http://uncinuscloud.blogspot.com/

Re: HBase secondary index performance

2010-09-04 Thread Samuru Jackson
uch that daily updates are around 10 million records > where > most of it are just updates and we want it to be real time (or NRT). Any > suggestions are appreciated. > > Thanks, > Murali Krishna > > > > > > From: Samuru Jackson >

Re: HBase secondary index performance

2010-09-03 Thread Samuru Jackson
Hi, I wrote my own indexer and I actually get pretty good performance. However, there are still known places where I could gain even more performance (I just don't have the time right now). What is important is to use bulk loads when you are indexing something. I posted this one before, but m

Re: Scanning half a key or value in HBase

2010-08-23 Thread Samuru Jackson
One thing I forgot: You will need to add the separator to your search value (in your case, an underscore): searchValue = "123" + "_"; Otherwise you might get unwanted extra results! /SJ On Mon, Aug 23, 2010 at 10:32 AM, Samuru Jackson < samurujack...@googlemail.com>
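The effect of the separator can be sketched without a cluster. The following is a minimal illustration, not HBase code: a sorted in-memory map stands in for a table's row keys, and the class name, the `prefixScan` helper, and the sample keys are all hypothetical. It only shows why a prefix of "123" also matches keys such as "1234_a", while "123_" does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class PrefixScanSketch {

    // Hypothetical stand-in for a prefix-filtered scan: walk the sorted keys
    // starting at the prefix and stop at the first key that no longer matches.
    public static List<String> prefixScan(NavigableMap<String, String> table, String prefix) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : table.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // keys are sorted, so no later key can match
            }
            hits.add(e.getKey());
        }
        return hits;
    }

    // Sample keys of the assumed form <id>_<suffix>.
    public static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("123_a", "v1");
        table.put("123_b", "v2");
        table.put("1234_a", "v3");
        table.put("124_a", "v4");
        return table;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = sampleTable();
        // Without the separator, id 1234 is matched as well.
        System.out.println(prefixScan(table, "123"));  // [1234_a, 123_a, 123_b]
        // With the separator, only id 123 is matched.
        System.out.println(prefixScan(table, "123_")); // [123_a, 123_b]
    }
}
```

The same reasoning applies to any delimiter, as long as the delimiter cannot occur inside the id itself.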

Re: Scanning half a key or value in HBase

2010-08-23 Thread Samuru Jackson
Hi, I do it this way: The variable searchValue is my prefix; in your case, for 123, it would be: searchValue = "123"; PrefixFilter prefixFilter = new PrefixFilter(Bytes.toBytes(searchValue)); Scan scan = new Scan(); scan.addFamily(Bytes.toBytes(this.REF_FAM)); scan.setFilter(prefixFilter); ResultSca

Questions about Columns and Families

2010-08-16 Thread Samuru Jackson
Hi, I have two questions regarding the usage of columns within a family: 1. Is there any limit on the number of columns inside a column family associated with a row (i.e. a maximum number of columns)? 2. Is there any pagination possible for columns? If I have a huge number of columns for a row

Re: Extending RegionServer for Indexing or using the Client?

2010-07-28 Thread Samuru Jackson
Hi, Is there any special advantage to implementing this on the region side, in the sense that communication overhead can be lowered or, for instance, some performance improvements can be gained? I finally got the GitHub version of the indexer running with the current trunk of HBase (except the transactio

Extending RegionServer for Indexing or using the Client?

2010-07-28 Thread Samuru Jackson
Hi, I'm currently looking intensively into indexing for HBase. The indexer maintained at http://github.com/hbase-trx/hbase-transactional-tableindexed extends the RegionServer, so the client just defines the index and then sends one Put with the record to HBase. The rest is taken care on t

Re: Secondary indexes in 0.89

2010-07-26 Thread Samuru Jackson
Hi, Maybe it has been discussed already, but I don't understand why this feature is only located in the contrib. I think that this is a really important feature that should be included and maintained. I guess in most use cases you will need to index your columns. /SJ On Mon, Jul 26, 2010 at 4

How does HBase ensure strong consistency?

2010-07-23 Thread Samuru Jackson
Hi, I'm looking for a document or something similar that describes how HBase ensures strong consistency. I understand that in theory this is achieved by replicating data. Does that really mean that whenever I perform a put(), the call returns only after the replication has completed? I'm

Re: Secondary indexes in 0.89

2010-07-23 Thread Samuru Jackson
Hi, I'm wondering what your design of the key for your index looks like. My own initial implementation for an inverted index is to create a separate table for each distinct column. Example (key - Col : Col : Col) A - Id1 : Id2 : Id3 B - Id4 : Id5 : Id6 C - Id7 : Id8 : Id9 This is convenient, ho
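The layout in that example can be sketched in memory, under the assumption (the snippet is cut off, so this is an interpretation) that each row of an index table is keyed by a column value and its columns hold the ids of the rows containing that value. The class and method names below are hypothetical, and one instance stands in for the index table of a single column; no HBase API is involved.

```java
import java.util.Collections;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class InvertedIndexSketch {

    // Index key (an indexed value) -> ids of the rows that contain it.
    private final Map<String, SortedSet<String>> index = new TreeMap<>();

    // Record that the row with this id holds the given value.
    public void add(String value, String rowId) {
        index.computeIfAbsent(value, v -> new TreeSet<>()).add(rowId);
    }

    // Return the row ids stored under a value, or an empty set if there are none.
    public SortedSet<String> lookup(String value) {
        SortedSet<String> ids = index.get(value);
        return ids == null
                ? Collections.<String>emptySortedSet()
                : Collections.unmodifiableSortedSet(ids);
    }

    public static void main(String[] args) {
        InvertedIndexSketch idx = new InvertedIndexSketch();
        idx.add("A", "Id1");
        idx.add("A", "Id2");
        idx.add("A", "Id3");
        idx.add("B", "Id4");
        System.out.println(idx.lookup("A")); // [Id1, Id2, Id3]
    }
}
```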

Re: HBase performance & bulk load

2010-07-23 Thread Samuru Jackson
Hi, For testing purposes I have to do some bulk loads as well. What I do is insert the data in batches (for instance, 10,000 rows at a time). I create a list of Puts from those records: List<Put> pList = new ArrayList<Put>(); where each Put has WriteToWAL set to false: put.setWriteToWAL(false); pList.
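The batching step on its own can be sketched as follows. This is a generic illustration under stated assumptions, not the author's loader: the class name, the `loadInBatches` helper, and the `sink` callback are hypothetical, and in an HBase client the sink would presumably be a call that submits a list of Puts. Note that skipping the write-ahead log trades durability for speed, so it is best kept to test loads like the one described.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchLoaderSketch {

    // Collect records into batches of at most batchSize and hand each full
    // batch to the sink. Returns the number of batches handed over.
    public static <T> int loadInBatches(Iterable<T> records, int batchSize,
                                        Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        int flushed = 0;
        for (T record : records) {
            batch.add(record);
            if (batch.size() == batchSize) {
                sink.accept(new ArrayList<>(batch)); // pass a copy, then reuse the buffer
                batch.clear();
                flushed++;
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(new ArrayList<>(batch)); // final partial batch
            flushed++;
        }
        return flushed;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 25000; i++) {
            rows.add(i);
        }
        // The sink here only reports the batch size; a real loader would write the batch.
        int batches = loadInBatches(rows, 10000,
                b -> System.out.println("flushing " + b.size() + " rows"));
        System.out.println(batches + " batches");
    }
}
```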

Zookeeper exceptions while starting up region

2010-07-06 Thread Samuru Jackson
Hi, First of all, I'm fairly new to HBase and have set up a small deployment of Hadoop and HBase (0.20.4) on two servers, to begin with, in fully distributed mode. HBase works fine on one server (client operations work perfectly); however, starting the second RegionServer throws exceptions which