Shen,
You are right. Currently the default flush size is 64MB, the
compactionThreshold is 3, and the splitSize/max.filesize is 256MB. So we end
up compacting into a 192MB file when filling an empty region.
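The arithmetic above can be sketched with a toy model (plain Python, not HBase code; it assumes every compaction naively rewrites all current store files, which is the behavior HBASE-2375 improves on):

```python
# Illustrative sketch only, using the default sizes discussed above.
FLUSH_SIZE_MB = 64            # default memstore flush size
COMPACTION_THRESHOLD = 3      # hbase.hstore.compactionThreshold
MAX_FILESIZE_MB = 256         # splitSize / hbase.hregion.max.filesize

def fill_region(writes_mb):
    """Simulate memstore flushes and naive full compactions in one region."""
    store_files = []
    compacted_sizes = []
    for _ in range(writes_mb // FLUSH_SIZE_MB):
        store_files.append(FLUSH_SIZE_MB)          # flush -> new StoreFile
        if len(store_files) >= COMPACTION_THRESHOLD:
            store_files = [sum(store_files)]       # compact all files into one
            compacted_sizes.append(store_files[0])
    return store_files, compacted_sizes

files, compactions = fill_region(256)
print(compactions)   # first compaction produces a 3 * 64 = 192 MB file
print(sum(files))    # 256 MB total -> region is now at the split threshold
```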
Take a look at HBASE-2375 (https://issues.apache.org/jira/browse/HBASE-2375).
It does incremental compacting since you don't want to spend too much
time doing the compactions, and you don't want to compact very large
store files with much smaller ones (that would result in rewriting the
same data x times per day). Looking at Store.compact, you can see this
comment:
Hi,
I gather that when the memstore reaches a configurable size (64MB), it's
flushed to HDFS, creating a new StoreFile; when there are more than 3 such
StoreFiles, they are compacted into a single StoreFile. But if the
default hbase.hstore.compactionThreshold is 3, does it mean that a compa
Hi Jonathan,
Thanks for your reply. Please find my replies inline.
On Wed, Apr 7, 2010 at 4:04 AM, Jonathan Gray wrote:
> Or if you have a budget in mind, we can help you determine what would be the
> best way to allocate those dollars.
>
That would be just great. Budget provisioned for the wh
hi, Jonathan,
On Wed, Apr 7, 2010 at 6:15 AM, Jonathan Gray wrote:
> Can you explain more about what information you are trying to find out?
>
> You had an existing HDFS and you want to measure the additional impact of
> adding HBase? Is that in terms of reads/writes/iops or data size?
>
Also need to do configuration in hbase/conf/hadoop-metrics.xml (yes,
thats hadoop-metrics, not hbase-metrics) which I believe is only read
on restart.
So double-no.
St.Ack
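For reference, JMX has to be enabled with JVM system properties at startup; it cannot be switched on in a running JVM. A sketch of what that might look like in hbase-env.sh (the standard Sun JVM management properties are real; the exact HBASE_OPTS hook and port are assumptions you'd adapt to your install):

```shell
# Sketch: enable remote JMX at JVM startup (cannot be turned on later).
# Port 10101 and the HBASE_OPTS variable are illustrative assumptions.
export HBASE_OPTS="$HBASE_OPTS \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=10101 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false"
```

Disabling authentication and SSL as above is only reasonable on a trusted network.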
On Tue, Apr 6, 2010 at 4:18 PM, Jean-Daniel Cryans wrote:
> This boils down to the question: can you enable JMX while the J
This boils down to the question: can you enable JMX while the JVM is
running? The answer is no (afaik).
More doc here
http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
J-D
On Tue, Apr 6, 2010 at 4:12 PM, Igor Ranitovic wrote:
> Is it possible to enable the hbase metrics
Is it possible to enable the hbase metrics without a restart? Thanks.
i.
Or if you have a budget in mind, we can help you determine what would be the
best way to allocate those dollars.
> -Original Message-
> From: Jonathan Gray [mailto:jg...@facebook.com]
> Sent: Tuesday, April 06, 2010 3:11 PM
> To: hbase-user@hadoop.apache.org
> Subject: RE: About test/prod
Can you explain more about what information you are trying to find out?
You had an existing HDFS and you want to measure the additional impact of adding
HBase? Is that in terms of reads/writes/iops or data size?
If you have a steady-state set of metrics for HDFS w/o HBase, can you not just
mon
Imran,
Have you run Solr atop HDFS? I doubt this will be performant.
Also, to properly scope your cluster, you need to come up with actual number
targets if you want to be able to accurately provision hardware. "not much"
data now, but "lots" of data later could mean anything. Decide what yo
hi, thanks for your inputs,
I was asking with respect to doing SPARQL queries over HBase tables.
I have read that Yahoo and others use HBase or Bigtable for their
search results, and so I'm thinking of how to apply a SPARQL query
- which is nothing other than a normal query - to HBase.
openrdf's sail-
Or put it in MySQL, or in S3, or...or... so my point was that you need
a recipient that transcends the JVMs ;)
So it is doable and pretty normal to output in tables the result of
MRs that map other tables, we have dozens of those here at
StumbleUpon. But if it fits in a single HashMap in a single
From DataXceiver's javadoc:
/**
* Thread for processing incoming/outgoing data stream.
*/
So it's a bit different from the handlers AFAIK.
J-D
On Mon, Apr 5, 2010 at 10:57 PM, steven zhuang wrote:
> thanks, J.D.
> my cluster has the first problem. BTW, dfs.datanode.max.xcievers
> mean
hi, there,
I have a problem measuring the impact HBase has on
HDFS.
I have a Hadoop cluster of 30+ datanodes, and an HBase
cluster on top of it, with 18 regionservers residing on 18 datanodes.
We have observed that HDFS IO has increased a l
J-D,
There's an alternative...
He could write an M/R job that takes its input from a scan(), does something,
reduces, and then outputs the reduced set back to HBase in the form of a temp
table
(even an in-memory temp table), and then at the end pulls the data out into a
hash table.
In theory this
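Stripped of the HBase specifics, the pattern being described is scan, map, reduce into a temp table, then pull the result into a hash. A minimal self-contained sketch (all names here are illustrative stand-ins, not HBase API):

```python
from collections import defaultdict

def scan(table):
    """Toy stand-in for a table scan: yields (row_key, value) pairs."""
    yield from table.items()

def map_phase(records):
    # Emit (value, 1) per record, like a counting mapper.
    for row_key, value in records:
        yield value, 1

def reduce_phase(pairs):
    # Aggregate counts per key into a "temp table" (here just a dict).
    temp_table = defaultdict(int)
    for key, count in pairs:
        temp_table[key] += count
    return dict(temp_table)

source = {"row1": "a", "row2": "b", "row3": "a"}
temp = reduce_phase(map_phase(scan(source)))
print(temp)   # {'a': 2, 'b': 1}
```

In a real job the map and reduce phases run distributed across the cluster, which is the point of J-D's reply: the result needs a home that transcends any one JVM.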
Thanks, Stack.
But I think I want to know the difference between
SequenceFile.Writer.sync() and syncFs() (HDFS-200)?
Could someone tell me which one the HLog syncs with?
Shen
On Sat, Apr 3, 2010 at 3:13 AM, Stack wrote:
> On Fri, Apr 2, 2010 at 10:59 AM, ChingShen wrote:
> >