[ https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257085#comment-13257085 ]

Enis Soztutar commented on HBASE-5349:
--------------------------------------

I have been thinking about this, and I think we can take a shot at a simple 
implementation. Let me summarize what I have in mind before starting: 
Goals: 
 - Provide min/max heap percentages for the block cache (the memstore already 
has something like this). We should keep min/max sanity bounds, and if they are 
equal, disable auto-tuning (a minimal sketch follows after the non-goals). 
 - Enable optimizing the available memory for changing workloads (mostly writes 
during the day, a lot of reads once an MR job starts, etc.). For example, when a 
large write job starts, region servers should tune themselves for the write 
workload within ~10 minutes. 
Non-goals: 
 - find the optimal memory-utilization algorithm
 - introduce a bunch of new parameters just to get rid of the current ones
 - make it so experimental that nobody enables it in production 
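
To make the first goal concrete, something like this is what I mean (the min/max 
config keys and the class below are made up for illustration, not an actual patch; 
the 0.25 fallback is also just illustrative):

{code}
import org.apache.hadoop.conf.Configuration;

// Sketch only: the min/max config keys below do not exist today.
public class BlockCacheBounds {
  static final String BLOCK_CACHE_MIN_KEY = "hfile.block.cache.size.min";
  static final String BLOCK_CACHE_MAX_KEY = "hfile.block.cache.size.max";

  final float min;
  final float max;

  BlockCacheBounds(Configuration conf) {
    // Fall back to the existing single knob when no range is configured.
    float current = conf.getFloat("hfile.block.cache.size", 0.25f);
    this.min = conf.getFloat(BLOCK_CACHE_MIN_KEY, current);
    this.max = conf.getFloat(BLOCK_CACHE_MAX_KEY, current);
  }

  /** min == max means the operator pinned the size, so auto-tuning is off. */
  boolean autoTuningEnabled() {
    return max > min;
  }

  /** Keep any tuned value inside the sanity bounds. */
  float clamp(float proposed) {
    return Math.max(min, Math.min(max, proposed));
  }
}
{code}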

Ideally, to optimize the use of the available memory, we would predict the 
future workload (possibly from the past workload) and devise a model capturing all 
the costs associated with block cache hits/misses, flushes, compactions, etc. 
But such a model would be very complex to get right.

I have checked Hypertable's implementation, and it seems they decide whether the 
load is read- or write-heavy by comparing a few counters against hard-coded 
values, and then increment/decrement the memory limits accordingly, much like what 
Zhihong proposes above. I want to start with something similar. 
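
Roughly, the heuristic I have in mind looks like this (the counter names and 
threshold values are placeholders, not Hypertable's actual numbers or our metric 
names):

{code}
// Illustrative only: thresholds and counters are placeholders.
enum Workload { READ_HEAVY, WRITE_HEAVY, NEUTRAL }

class WorkloadClassifier {
  // Hard-coded thresholds per sampling period, in the spirit of Hypertable's approach.
  static final long CACHE_MISS_THRESHOLD = 10000;
  static final long FLUSH_THRESHOLD = 50;

  /** Both arguments are deltas over the sampling period, taken from metrics. */
  Workload classify(long blockCacheMisses, long memstoreFlushes) {
    boolean readHeavy = blockCacheMisses > CACHE_MISS_THRESHOLD;
    boolean writeHeavy = memstoreFlushes > FLUSH_THRESHOLD;
    if (readHeavy && !writeHeavy) return Workload.READ_HEAVY;
    if (writeHeavy && !readHeavy) return Workload.WRITE_HEAVY;
    return Workload.NEUTRAL;
  }
}
{code}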

Implementation layer: 
 - Currently the global memstore limit is a soft limit; we may have to make it a 
hard limit (blocking writes).
 - We should enable incrementing/decrementing and setting the global memstore and 
block cache maximum limits at runtime. We do not have live configuration changes, 
but regardless of auto-tuning, we should be able to set those online manually. 
 - Periodically we should check the past workload (e.g. the past 10 minutes) and, 
depending on whether it is write-heavy or read-heavy (from metrics), adjust the 
memory limits in small steps (see the sketch below). 
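
To give an idea of the periodic part, here is a rough sketch of such a tuner. It 
uses a plain ScheduledExecutorService for illustration; in a real patch it would 
presumably run as a chore on the region server, and the metric hooks and the 
online setters below are hypothetical:

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch only: metric hooks and fraction setters are hypothetical.
public class HeapTunerSketch {
  static final float STEP = 0.02f;           // tune in small increments
  static final float MEMSTORE_MIN = 0.25f;   // sanity bounds, illustrative values
  static final float MEMSTORE_MAX = 0.45f;
  static final float BLOCK_CACHE_MIN = 0.15f;
  static final float BLOCK_CACHE_MAX = 0.45f;

  private volatile float memstoreFraction = 0.40f;
  private volatile float blockCacheFraction = 0.20f;

  public void start() {
    ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
    // Look at the past 10 minutes of metrics, then nudge the limits.
    pool.scheduleAtFixedRate(new Runnable() {
      @Override
      public void run() {
        tune();
      }
    }, 10, 10, TimeUnit.MINUTES);
  }

  void tune() {
    // Classification could use the threshold heuristic sketched earlier.
    boolean writeHeavy = isWriteHeavy();
    boolean readHeavy = isReadHeavy();
    if (writeHeavy && !readHeavy) {
      memstoreFraction = clamp(memstoreFraction + STEP, MEMSTORE_MIN, MEMSTORE_MAX);
      blockCacheFraction = clamp(blockCacheFraction - STEP, BLOCK_CACHE_MIN, BLOCK_CACHE_MAX);
    } else if (readHeavy && !writeHeavy) {
      memstoreFraction = clamp(memstoreFraction - STEP, MEMSTORE_MIN, MEMSTORE_MAX);
      blockCacheFraction = clamp(blockCacheFraction + STEP, BLOCK_CACHE_MIN, BLOCK_CACHE_MAX);
    }
    // Apply online; these setters do not exist yet and would have to be added:
    // setGlobalMemstoreFraction(memstoreFraction);
    // setBlockCacheFraction(blockCacheFraction);
  }

  // Placeholders for reading deltas from region server metrics over the window.
  boolean isWriteHeavy() { return false; }
  boolean isReadHeavy() { return false; }

  static float clamp(float v, float min, float max) {
    return Math.max(min, Math.min(max, v));
  }
}
{code}

The step size and the min/max bounds would be the only new knobs, and defaulting 
to min == max would keep today's behavior.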

What do you guys think? Still worth pursuing?
                
> Automagically tweak global memstore and block cache sizes based on workload
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-5349
>                 URL: https://issues.apache.org/jira/browse/HBASE-5349
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.92.0
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.96.0
>
>
> Hypertable does a neat thing where it changes the size given to the CellCache 
> (our MemStores) and Block Cache based on the workload. If you need an image, 
> scroll down to the bottom of this link: 
> http://www.hypertable.com/documentation/architecture/
> That'd be one less thing to configure.
