it should be possible... the bottlenecks will become things like log
splitting, region management, and the master/regionserver communication
channel. These are all up for fixing in 0.21.
How big are your values? If they are fairly large, it might make more
sense to store the raw data on HDFS, a
Thanks J-G and Ryan, we are trying to use 0.20.0 now. It is time-consuming,
since there is no documentation yet, but we can continue the work. :-)
And another general question:
Do you think it is possible to store and serve 200TB of data (uncompressed,
maybe 50TB after compression) in a 20-node cluste
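The sizing question above reduces to simple per-node division; a quick sketch using only the figures stated in the thread (200 TB raw, ~50 TB compressed, 20 nodes):

```python
# Per-node load implied by the question above, using the thread's
# own figures. Nothing here accounts for HDFS replication overhead.
raw_tb, compressed_tb, nodes = 200, 50, 20

raw_per_node = raw_tb / nodes          # 10.0 TB of raw data per node
disk_per_node = compressed_tb / nodes  # 2.5 TB on disk per node

print(raw_per_node, disk_per_node)
```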
Schubert,
Sounds like you know what you're doing.
There are two different LRU implementations in current trunk, but they
do more than you need them to. Both act on objects that implement
HeapSize, so they are a heapsize-bound LRU. May not be that bad of an
idea. The algorithm is universal
J-G,
Thanks for your reply. You really understand my question.
I am planning to experiment with partitioned tables (day-by-day partitions).
Yes, it will make our query application more complex. But there is a reward:
it makes it easy to delete the old data, and easy to run MapReduce jobs
which only
You're right about the conflict between randomized keys or time-ordered
keys. Getting the best load distribution vs isolating regions being
written to.
There's a number of different ways you could deal with this, some being
fairly complex (you could partition tables by time). You could add a
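The message above is cut off, so we do not know which technique was about to be suggested. One common workaround for the write-hotspotting it describes is "salting" time-ordered row keys with a hash-derived bucket prefix; the sketch below is hypothetical (the bucket count and key format are my assumptions, not from the thread):

```python
import zlib

# Hypothetical sketch: prefix each time-ordered row key with a small,
# deterministic bucket number so writes spread across regions instead
# of all landing on the newest region.
NUM_BUCKETS = 16  # assumption; often set near the number of regionservers

def salted_key(time_key: str) -> str:
    # crc32 is used (rather than hash()) so the bucket is stable
    # across processes and restarts.
    bucket = zlib.crc32(time_key.encode()) % NUM_BUCKETS
    return "%02d-%s" % (bucket, time_key)
```

The trade-off is exactly the one discussed above: point gets still work (the same key always maps to the same bucket), but a time-range scan now has to fan out one scan per bucket.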
Ryan,
Thanks, maybe my previous email did not describe it clearly. We do know
that the 'memcache'/'memstore' is a write buffer. Read operations will not
need such a cache. :-)
So, according to your answer that the "strict limiting factor is the index size",
I am considering using 'SoftReference' t
I think you might be misunderstanding what the 'memcache' is; we are
calling it 'memstore' now. It is a write buffer, not a cache. It is
also memory sensitive, so as you insert more data, HBase will flush
the 'memcache' to HDFS. By default the memcache is limited to 64MB per
store and 40% of Xmx globally, and we a
Ryan,
I know we can store more than 250GB in one region server. But how about 3TB,
or even 10TB?
Besides the memory used by indexes, there may be other factors, such as
the memcache.
If there are 5000 regions open, the total memcache heap will be very
large.
So, I am thinking about two things:
1. What is
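The memcache concern above can be put into numbers. A worst-case sketch, assuming the 0.20 default flush size of 64 MB and one store per region (both assumptions; real usage is far lower because regions flush continuously):

```python
# Worst-case memstore (write buffer) heap if every one of 5000 open
# regions filled its buffer at once. 64 MB is the default per-store
# flush size; one store per region keeps the estimate simple.
regions_open = 5000
flush_size_mb = 64

worst_case_gb = regions_open * flush_size_mb / 1024
print(worst_case_gb)  # 312.5
```

This worst case is why a global cap on total memstore usage (the 40% of Xmx mentioned elsewhere in the thread) matters: the per-region buffers cannot all be full at once.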
By dedicating more RAM to the situation you can achieve more regions
under a single regionserver. I have noticed in my own region
servers that 200-600MB of region data = 1-2MB of index. This value,
however, is dependent on the size of your keys and values. I have very
small keys and values. You can also tune
Ryan,
Yes. you are right.
But my question is that, even with 1000 regions (250MB each) per
regionserver, each regionserver can only support 250GB of storage.
Please also check this thread "Help needed - Adding HBase to architecture",
Stack and Andrew have put some talk there.
Schubert
On Fri, Jul 1
That size is not memory-resident, so the total data size is not an
issue. The index size is what limits you with RAM, and it's about 1 MB
per region (for a 256MB region).
-ryan
On Thu, Jul 9, 2009 at 9:51 PM, zsongbo wrote:
> Hi Ryan,
>
> Thanks.
>
> If your region size is about 250MB, then 400 regions
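Ryan's figure of ~1 MB of index per 256 MB region makes the index-RAM cost of Schubert's 3TB and 10TB targets easy to estimate. A sketch, assuming the scaling is linear (an assumption; key/value sizes change the ratio, as noted elsewhere in the thread):

```python
# Back-of-envelope: RAM consumed by block indexes alone, at
# ~1 MB of index per 256 MB region (Ryan's observed figure).
region_size_mb = 256
index_mb_per_region = 1

def index_ram_mb(data_tb):
    regions = data_tb * 1024 * 1024 // region_size_mb
    return regions * index_mb_per_region

print(index_ram_mb(3))   # 12288 -> ~12 GB of index for 3 TB per server
print(index_ram_mb(10))  # 40960 -> ~40 GB of index for 10 TB per server
```

This is why the thread keeps circling back to region size and RAM: at these targets, the always-resident index alone exceeds typical 0.20-era heap sizes.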
Hi Ryan,
Thanks.
If your region size is about 250MB, then 400 regions can store 100GB of data
on each regionserver.
Now, if you have 100TB data, then you need 1000 regionservers.
We are not google or yahoo who have so many nodes.
Schubert
On Fri, Jul 10, 2009 at 12:29 PM, Ryan Rawson wrote:
> re:
re: #2: in fact we don't know that... I know that I can run 200-400
regions on a regionserver with a heap size of 4-5GB. More, even. I
bet I could have 1000 regions open on 4GB of RAM. Each region is ~1MB
of always-resident index data, so there we go.
As for compactions, they are fairly fast, 0-30s or so d
Hi all,
1. In this configuration property:

  name: hbase.hstore.compactionThreshold
  value: 3
  description: If more than this number of HStoreFiles in any one HStore
  (one HStoreFile is written per flush of memcache) then a compaction
  is run to rewrite all HStoreFiles files as one. Larger numbers
Xin,
Comments inline.
Regards,
J-D
On Tue, Jul 22, 2008 at 2:28 AM, Xin Jing <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am a new user of HBase, and I am curious about the insert process of HBase.
> Could you please explain it in detail?
>
> The question is: when I created a table (only one column, to
Hi,
I am a new user of HBase, and I am curious about the insert process of HBase.
Could you please explain it in detail?
The question is: when I created a table (only one column, to make it easy to
describe), and insert a huge amount of data into the table. I know it is a
B-Tree like storage struc