Re: HBase scan performance decreases over time.

Michael Segel Mon, 05 Nov 2012 05:05:37 -0800

There's an HDFS bandwidth setting which is set to 10MB/s. 

Way too low for even 1GbE.
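
For a rough sense of scale, here is a back-of-the-envelope sketch. It assumes the setting in question is the HDFS balancer bandwidth cap (dfs.datanode.balance.bandwidthPerSec in Hadoop 2.x, dfs.balance.bandwidthPerSec in 1.x), decimal units, and the theoretical 1 GbE line rate with no protocol overhead. It uses the ~30 GB daily push from the thread purely as a convenient size; the volume actually moved between nodes may differ.

```python
# Back-of-the-envelope comparison; decimal units throughout (1 MB = 10**6 bytes).
CAP_MB_PER_S = 10          # the bandwidth cap quoted above
GBE_MB_PER_S = 1000 / 8    # theoretical 1 GbE line rate: 1000 Mbit/s = 125 MB/s
DAILY_PUSH_MB = 30 * 1000  # illustrative size: the ~30 GB daily push


def transfer_seconds(size_mb, rate_mb_per_s):
    """Seconds to move size_mb at a sustained rate, ignoring overhead."""
    return size_mb / rate_mb_per_s


cap_fraction = CAP_MB_PER_S / GBE_MB_PER_S
capped = transfer_seconds(DAILY_PUSH_MB, CAP_MB_PER_S)
line_rate = transfer_seconds(DAILY_PUSH_MB, GBE_MB_PER_S)

print(f"cap is {cap_fraction:.0%} of 1 GbE line rate")
print(f"30 GB at the cap:    {capped / 60:.0f} min")
print(f"30 GB at line rate:  {line_rate / 60:.0f} min")
```

Under these assumptions the cap is under a tenth of what the link can carry, so block movement that could finish in minutes stretches toward an hour per 30 GB.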


Have you modified this setting yet? 

-Mike

On Nov 3, 2012, at 2:50 PM, David Koch <ogd...@googlemail.com> wrote:

> Hello Ted,
> 
> We never initiate major compaction manually. I have not looked at I/O
> balance between nodes in detail. We have noticed that after running for a
> couple of weeks HBase seems to spend hours pushing blocks between nodes in
> order to optimize things. We add data daily in one ~30 GB push to several
> tables. Sometimes nodes get added to the running system.
> 
> Where can I get more information on how to carry out performance related
> HBase administrative tasks?
> 
> Thank you,
> 
> /David
> 
> 
> On Sat, Nov 3, 2012 at 4:42 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> 
>> Can you tell us how often you run major compaction after the import?
>> Have you noticed imbalanced read/write requests in the cluster? Meaning a
>> subset of region servers receives the bulk of the writes.
>> 
>> We do some manual movement of regions when the above happens.
>> 
>> Cheers
>> 
>> On Sat, Nov 3, 2012 at 8:12 AM, David Koch <ogd...@googlemail.com> wrote:
>> 
>>> Hello,
>>> 
>>> Every now and then we need to flatten our cluster and re-import all data
>>> from log files (changes in data format, etc.). Afterwards we notice a
>>> significant increase in scan performance. As data is added and shuffled
>>> around between region servers, performance goes down again over time (say a
>>> couple of weeks). Are there any routine operations that one should run
>>> manually, or settings to activate in the HBase configuration to keep the
>>> data well distributed? We use HBase 0.92 as part of a Cloudera4 cluster.
>>> 
>>> Thank you,
>>> 
>>> /David
>>> 
>>
