Hi,
Thanks for the detailed explanation. I liked the idea of the timestamp
check; this will be good enough for us, and I can add a periodic MR cleaner.
However, I need some help understanding the 30K number that was claimed. With
the IndexedTable approach, I got only 1200 rows/s (60 rows/s X
On Sun, Sep 5, 2010 at 12:27 AM, phil young wrote:
> I'm interested in doing joins in Hive between HBase tables and between HBase
> and Hive tables.
>
> Can someone suggest an appropriate stack to do that? i.e.
> Is it possible to use HBase 0.89
> If I use HBase 0.20.6, do I still need to apply HBASE-2473
On Sun, Sep 5, 2010 at 12:07 AM, Jonathan Gray wrote:
>> > But your boss seems rather to be criticizing the fact that our system
>> > is made of components. In software engineering, this is usually
>> > considered a strength. As to 'roles', one of the bigtable authors
>> > argues that a cluster of master and slaves makes for simpler systems [1].
I'm interested in doing joins in Hive between HBase tables and between HBase
and Hive tables.
Can someone suggest an appropriate stack to do that? i.e.
Is it possible to use HBase 0.89
If I use HBase 0.20.6, do I still need to apply HBASE-2473
Should I go with the trunk versions of any of these (e
The tool Stack mentioned is hbck. If you want to port it to 0.20, see email
thread entitled:
compiling HBaseFsck.java for 0.20.5

You should try reducing the number of
tables in your system, possibly through HBASE-2473
Cheers
On Thu, Sep 2, 2010 at 11:41 AM, Sharma, Avani wrote:
>
>
>
> -Orig
> > But your boss seems rather to be criticizing the fact that our system
> > is made of components. In software engineering, this is usually
> > considered a strength. As to 'roles', one of the bigtable authors
> > argues that a cluster of master and slaves makes for simpler systems
> > [1].
>
MauMau:
public void createTable(HTableDescriptor desc, byte [] startKey,
byte [] endKey, int numRegions)
If you choose HBase 0.20.6, please be aware that you need to apply
HBASE-2473 yourself so that you can use the above API.
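To make the pre-split concrete, here is a rough sketch (my own illustration, not HBase source code) of the arithmetic that a `createTable(desc, startKey, endKey, numRegions)` call has to perform: computing `numRegions - 1` evenly spaced split points between the two boundary keys, treating the keys as unsigned big-endian integers in the spirit of HBase's `Bytes.split`.

```java
import java.math.BigInteger;
import java.util.Arrays;

public class RegionSplits {
    // Compute numRegions - 1 split points evenly spaced between start
    // and end, comparing keys as unsigned big-endian integers.
    static byte[][] splitKeys(byte[] start, byte[] end, int numRegions) {
        int len = Math.max(start.length, end.length);
        // right-pad the shorter key with 0x00 so both have equal width
        BigInteger lo = new BigInteger(1, Arrays.copyOf(start, len));
        BigInteger hi = new BigInteger(1, Arrays.copyOf(end, len));
        BigInteger range = hi.subtract(lo);
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            BigInteger s = lo.add(range.multiply(BigInteger.valueOf(i))
                                       .divide(BigInteger.valueOf(numRegions)));
            splits[i - 1] = toFixedWidth(s, len);
        }
        return splits;
    }

    // Render a value as exactly len big-endian bytes.
    static byte[] toFixedWidth(BigInteger v, int len) {
        byte[] raw = v.toByteArray();  // may carry a leading 0x00 sign byte
        byte[] out = new byte[len];
        int n = Math.min(raw.length, len);
        System.arraycopy(raw, raw.length - n, out, len - n, n);
        return out;
    }
}
```

For example, splitting the range {0x00}..{0x10} into 4 regions yields the split points {0x04}, {0x08}, {0x0C}.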
On Sat, Sep 4, 2010 at 6:49 PM, MauMau wrote:
> Hello, Stack,
>
Hello, Samuru,
Thank you for your opinion. I love HBase's API, too. Cassandra's API and its
data model (supercolumns) seem complicated to us.
- Original Message -
From: "Samuru Jackson"
I evaluated Cassandra and HBase for a particular problem domain and
found that Cassandra is a hu
Hello, Stack,
Thank you for giving me advice.
But your boss seems rather to be criticizing the fact that our system
is made of components. In software engineering, this is usually
considered a strength. As to 'roles', one of the bigtable authors
argues that a cluster of master and slaves makes for simpler systems [1].
Hello, Jonathan,
Thank you. I understood the situation.
If you have a strong requirement that data never be unavailable for more
than one second, I think Cassandra would be a clear winner here. Is this a
requirement just for reads, for writes, or both?
Perhaps just for reads, bu
Hi!
I just want to add my personal opinion to this point:
> (1) Ease of use
> Cassandra does not require any other software. All nodes of Cassandra have
> the same role. Pretty easy.
> On the other hand, HBase requires HDFS and ZooKeeper. Users have to
> manipulate and manage HDFS and ZooKeeper. T
Hi,
> where the key will be [value:key] and insert rows every time we insert
> our values. We will get 30k rows/s/node.
Could you specify on what kind of hardware you did this? How did you
design your indexer? Is it multithreaded?
/SJ
---
http://uncinuscloud.blogspot.com/
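For readers following along, the "[value:key]" scheme quoted above can be sketched as follows (my own illustration, not code from the original mail): the indexed value comes first so the index table sorts, and can be scanned, by value, and the base-table row key is appended to keep entries unique. The 0x00 separator byte is an assumption; any byte that cannot appear inside the value works.

```java
public class IndexKey {
    // Build an index row key of the form value + 0x00 + baseKey, so
    // index rows sort by the indexed value and remain unique per base row.
    static byte[] indexRowKey(byte[] value, byte[] baseKey) {
        byte[] out = new byte[value.length + 1 + baseKey.length];
        System.arraycopy(value, 0, out, 0, value.length);
        out[value.length] = 0x00;  // separator (assumed; must not occur in value)
        System.arraycopy(baseKey, 0, out, value.length + 1, baseKey.length);
        return out;
    }
}
```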
2010/9/4 MauMau :
> However, my boss points out the following as the weaknesses of HBase and
> insists that we choose Cassandra. I prefer HBase because HBase has stronger
> potential, thanks to its active community and rich ecosystem backed by the
> membership of Hadoop family. Are there any good e
2010/9/3 Murali Krishna. P :
> * custom indexing is good, but our data keeps changing every day. So,
> probably
> indextable is the best option for us
In the case of custom indexing you can use timestamps to check that an
index record is still valid
(or even simply recheck the existence of the value).
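A minimal sketch of that timestamp check, with plain Java maps standing in for the HBase tables (names are illustrative, not an HBase API): each index entry records the timestamp of the base cell it was built from, and a reader keeps only entries whose base cell still carries that timestamp. A periodic MapReduce job could delete the rest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class StaleIndexFilter {
    // An index entry remembers which base row it points at and the
    // timestamp of the base cell at the time the entry was written.
    static class IndexEntry {
        final String baseRowKey;
        final long indexedTs;
        IndexEntry(String baseRowKey, long indexedTs) {
            this.baseRowKey = baseRowKey;
            this.indexedTs = indexedTs;
        }
    }

    // Return the base rows whose index entries are still valid.
    static List<String> liveRows(Map<String, Long> baseCellTs,
                                 List<IndexEntry> index) {
        List<String> out = new ArrayList<>();
        for (IndexEntry e : index) {
            Long ts = baseCellTs.get(e.baseRowKey);
            // stale if the base row is gone or was rewritten since indexing
            if (ts != null && ts == e.indexedTs) out.add(e.baseRowKey);
        }
        return out;
    }
}
```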
Answers inline.
> -Original Message-
> From: MauMau [mailto:maumau...@gmail.com]
> Sent: Saturday, September 04, 2010 9:31 AM
> To: user@hbase.apache.org
> Subject: Please help me overcome HBase's weaknesses
>
> Hello,
>
> We are considering which of HBase or Cassandra to choose for our
On Fri, Sep 3, 2010 at 7:57 AM, Michael Segel wrote:
>
>
>
> > Date: Fri, 3 Sep 2010 18:00:42 +0530
> > From: muralikpb...@yahoo.com
> > Subject: Re: HBase secondary index performance
> > To: user@hbase.apache.org
> >
> > Thanks Andrey,
> >
> > * Setting the autoflush to false and increasing
Hi,
I'm not sure if I understand your problems completely, but relating to your
update issue:
HBase keeps versions of your columns. If you have an index on something that
needs to be updated you just overwrite the value in the index. There is no
need to remove things.
I also organize my indexes
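A toy illustration of the point above (a TreeMap stands in for the index table, names are mine): on an update you just write the new index entry, and a write to an existing value:key row simply overwrites it. An entry left behind under the *old* value becomes garbage, which is what the timestamp check discussed elsewhere in this thread filters out.

```java
import java.util.TreeMap;

public class IndexUpdate {
    // Record that baseKey now carries newValue: write the new index
    // entry; writing the same (value:key) row again just overwrites it,
    // so no explicit delete is needed on the write path.
    static void onValueChange(TreeMap<String, String> index,
                              String newValue, String baseKey) {
        index.put(newValue + ":" + baseKey, baseKey);
    }
}
```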
Hello,
We are considering which of HBase or Cassandra to choose for our future
projects. I'm recommending HBase to my boss and coworkers, because HBase is
good both for analysis (MapReduce) and for OLTP (get/put provides relatively
fast response). Cassandra is superior in get/put response time
Thanks Samuru,
I was reading about custom indexing in HBase, and just wanted to know how we
handle updates in the case of custom indexing. Probably if the original data
doesn't change, it might be a good solution. Say, if one of the column values
gets changed in the original table, we need