Hi!
I just want to add my personal opinion to this point:
> (1) Ease of use
> Cassandra does not require any other software. All nodes of Cassandra have
> the same role. Pretty easy.
> On the other hand, HBase requires HDFS and ZooKeeper. Users have to
> manipulate and manage HDFS and ZooKeeper. T
Hi,
> where the key will be [value:key], and we insert a row every time we insert
> our values. We got 30k rows/s/node.
Could you specify on what kind of hardware you did this? How did you
design your indexer? Is it multithreaded?
/SJ
---
http://uncinuscloud.blogspot.com/
uch that daily updates are around 10 million records
> where most of them are just updates, and we want it to be real time (or NRT).
> Any suggestions are appreciated.
>
> Thanks,
> Murali Krishna
>
> From: Samuru Jackson
>
Hi,
I wrote my own Indexer and actually I have a pretty good performance.
However, there are still known places where I could gain even more
performance (just not having the time right now).
What is important is to create bulk loads when you are indexing something. I
posted this one before, but m
One thing I forgot:
You will need to add the separator to your search value (in your case, an
underscore):
searchValue = "123" + "_";
Otherwise you might get unwanted multiple results!
/SJ
On Mon, Aug 23, 2010 at 10:32 AM, Samuru Jackson <
samurujack...@googlemail.com>
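To illustrate why the trailing separator matters, here is a minimal sketch that models the sorted row keys with a plain TreeMap instead of a real HBase table. The "<value>_<recordId>" key layout, the class name and the sample rows are assumptions made up for this illustration; only the idea of appending the separator to the search value comes from the message above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// In-memory stand-in for an index table with sorted row keys (not the HBase API).
// Row keys use a hypothetical "<value>_<recordId>" layout.
final class PrefixSeparatorDemo {

    // Returns every row key that starts with the given prefix, in sorted order.
    static List<String> prefixScan(TreeMap<String, String> table, String prefix) {
        List<String> hits = new ArrayList<>();
        // tailMap starts at the first key >= prefix; keys sharing a prefix are
        // contiguous, so we can stop at the first key that does not match.
        for (String rowKey : table.tailMap(prefix, true).keySet()) {
            if (!rowKey.startsWith(prefix)) {
                break;
            }
            hits.add(rowKey);
        }
        return hits;
    }

    // Sample rows (hypothetical): values 123, 1234 and 12, each with one record id.
    static TreeMap<String, String> sampleTable() {
        TreeMap<String, String> table = new TreeMap<>();
        table.put("123_rowA", "");
        table.put("1234_rowB", "");
        table.put("12_rowC", "");
        return table;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = sampleTable();
        // Without the separator, the row for value 1234 matches as well.
        System.out.println(prefixScan(table, "123"));
        // With the separator, only the row for value 123 matches.
        System.out.println(prefixScan(table, "123" + "_"));
    }
}
```

In this model the first scan returns two rows (for the values 1234 and 123) and the second returns only the row for 123, which is the "unwanted multiple results" effect described above.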
Hi,
I do it this way:
The variable searchValue is my prefix; in your case, for 123, it would be:
searchValue = "123";
PrefixFilter prefixFilter = new PrefixFilter(Bytes.toBytes(searchValue));
Scan scan = new Scan();
scan.addFamily(Bytes.toBytes(this.REF_FAM));
scan.setFilter(prefixFilter);
ResultSca
Hi,
I have two questions regarding the usage of columns within a family:
1. Is there any limit on the number of columns inside a column
family associated with a row (i.e. a maximum number of columns)?
2. Is there any pagination possible for columns? If I have a huge
number of columns for a row
Hi,
Is there any special advantage to implementing this on the region side, in
the sense that communication overhead can be lowered or, for instance,
some performance improvements can be gained?
I finally got the GitHub version of the indexer running with the
current trunk of HBase (except the transactio
Hi,
I'm currently looking intensively into indexing for HBase. The Indexer
maintained on http://github.com/hbase-trx/hbase-transactional-tableindexed
extends the RegionServer, so the client just defines the index
and then sends one Put with the record to HBase. The rest is taken
care of on t
Hi,
Maybe it has been discussed already, but I don't understand why this
feature is only located in contrib. I think that this is a really
important feature that should be included and maintained.
I guess in most use-cases you will need to index your columns
/SJ
On Mon, Jul 26, 2010 at 4
Hi,
I'm looking for a document, or anything similar, that describes how
HBase ensures strong consistency.
I understand that in theory this is achieved by replicating data. Does
that really mean that whenever I perform a put() the return comes only
after the replication has been conducted?
I'm
Hi,
I'm wondering what the design of the key for your index looks like.
My own initial implementation of an inverted index creates a separate
table for each distinct column.
Example (key - Col : Col : Col)
A - Id1 : Id2 : Id3
B - Id4 : Id5 : Id6
C - Id7 : Id8 : Id9
This is convenient, ho
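The row layout in the example above (one row per distinct value, one column per matching record id) can be modelled roughly as follows. This is only an in-memory sketch using Java collections, not HBase code; the class and method names are hypothetical and chosen for illustration.

```java
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// In-memory model of one index "table" for a single indexed column.
// Row key = distinct column value; columns = ids of the records holding that value.
final class InvertedIndexSketch {
    private final TreeMap<String, TreeSet<String>> rows = new TreeMap<>();

    // Adds recordId as a further column of the row keyed by value.
    void add(String value, String recordId) {
        rows.computeIfAbsent(value, k -> new TreeSet<>()).add(recordId);
    }

    // Returns the record ids stored under value, or an empty set if the value is unknown.
    SortedSet<String> lookup(String value) {
        TreeSet<String> ids = rows.get(value);
        if (ids == null) {
            return Collections.<String>emptySortedSet();
        }
        return Collections.unmodifiableSortedSet(ids);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add("A", "Id1");
        index.add("A", "Id2");
        index.add("A", "Id3");
        index.add("B", "Id4");
        System.out.println(index.lookup("A")); // the ids stored in row "A"
    }
}
```

A lookup is then a single row read per value, at the price of rows that grow with the number of records sharing that value.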
Hi,
For testing purposes I have to make some bulk loads as well.
What I do is insert the data in batches (for instance, 10,000 rows at a time).
I create a Put List out of those records:
List<Put> pList = new ArrayList<Put>();
where each Put has WriteToWAL set to false;
put.setWriteToWAL(false);
pList.
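The batching idea described above can be sketched independently of HBase. In the following minimal example the sink is a hypothetical stand-in for whatever call actually writes a list of Puts to the table, and the class name and the demo figures are assumptions. The WAL setting mentioned above is not modelled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Collects records and hands them to a sink in fixed-size batches.
final class BatchWriterSketch<T> {
    private final int batchSize;
    // Hypothetical stand-in for the call that writes one batch to the table.
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();

    BatchWriterSketch(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Buffers one record and flushes automatically once the batch is full.
    void add(T record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Sends the buffered records (if any) to the sink and empties the buffer.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer)); // pass a copy so the sink may keep it
        buffer.clear();
    }

    public static void main(String[] args) {
        List<Integer> batchSizes = new ArrayList<>();
        BatchWriterSketch<String> writer =
                new BatchWriterSketch<>(10_000, batch -> batchSizes.add(batch.size()));
        for (int i = 0; i < 25_000; i++) {
            writer.add("row-" + i);
        }
        writer.flush(); // do not forget the last, partially filled batch
        System.out.println(batchSizes); // prints [10000, 10000, 5000]
    }
}
```

The final flush() matters: without it the last partial batch would never be written.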
Hi,
First of all, I'm fairly new to HBase and have set up a small
deployment of Hadoop and HBase (0.20.4) on two servers, to begin
with, in fully distributed mode. HBase works fine on one server
(client operations work perfectly), however starting the second
RegionServer throws exceptions which