[ https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333224#comment-15333224 ]
Tianying Chang commented on HBASE-16030:
----------------------------------------

[~enis] thanks for reviewing the patch. Yes, 5 minutes is not enough; we would like to see the flushes uniformly distributed across the one-hour range in our online-facing production cluster. I am fine with making this value configurable, so that it can be set larger than 5 minutes. Would it be a problem if a flush request is queued and delayed for up to 1 hour?

BTW, I attached a new graph showing the impact of the hourly spike on network/disk/CPU on our new 1.2RC test cluster.

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike
> --------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16030
>                 URL: https://issues.apache.org/jira/browse/HBASE-16030
>             Project: HBase
>          Issue Type: Improvement
>   Affects Versions: 1.2.1
>            Reporter: Tianying Chang
>            Assignee: Tianying Chang
>             Fix For: 2.0.0, 1.4.0, 1.3.1, 1.2.3
>
>         Attachments: Screen Shot 2016-06-15 at 11.35.42 PM.png, hbase-16030.patch
>
>
> In our production cluster, we observed a memstore flush spike every hour
> across all regions/RS. (We use the default memstore periodic flush time of
> 1 hour.)
> This happens when two conditions are met:
> 1. the memstore does not accumulate enough data to be flushed before the
> 1 hour limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at
> the same time when starting a cluster).
> When both conditions hold, all the regions are flushed around the same
> time, at startTime+1hour-delay, again and again.
> We added a flush jittering time to randomize the flush time of each region,
> so that they don't get flushed at around the same time. We have had this
> feature running in our 94.7 and 94.26 clusters. Recently we upgraded to 1.2
> and found that this issue is still there in 1.2, so we are porting this
> into the 1.2 branch.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
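
To illustrate the flush jittering described in the issue, here is a minimal simulation sketch. It is not the code in hbase-16030.patch: the function names, the 1000-region count, the 10-second open window, and the uniform jitter distribution are all assumptions made for illustration. Only the 1-hour flush interval and the 5-minute versus 1-hour jitter ranges come from the discussion above.

```python
import random
from collections import Counter

# Default periodic flush interval mentioned in the issue (1 hour), in seconds.
FLUSH_INTERVAL_S = 3600


def first_flush_times(open_times, jitter_range_s, rng):
    """Return each region's first periodic flush time, in seconds.

    Hypothetical model: a region is flushed FLUSH_INTERVAL_S after it was
    opened, plus a random delay drawn uniformly from [0, jitter_range_s].
    """
    return [t + FLUSH_INTERVAL_S + rng.uniform(0, jitter_range_s)
            for t in open_times]


def peak_flushes_per_minute(flush_times):
    """Count flushes per one-minute bucket and return the largest count."""
    buckets = Counter(int(t // 60) for t in flush_times)
    return max(buckets.values())


rng = random.Random(42)

# Assumption: 1000 regions, all opened within 10 seconds of cluster start.
open_times = [rng.uniform(0, 10) for _ in range(1000)]

no_jitter = first_flush_times(open_times, 0, rng)      # no randomization
five_min = first_flush_times(open_times, 300, rng)     # 5-minute jitter range
one_hour = first_flush_times(open_times, 3600, rng)    # 1-hour jitter range

print("peak flushes/minute, no jitter:   ", peak_flushes_per_minute(no_jitter))
print("peak flushes/minute, 5 min jitter:", peak_flushes_per_minute(five_min))
print("peak flushes/minute, 1 h jitter:  ", peak_flushes_per_minute(one_hour))
```

In this model, a 5-minute range still concentrates every flush into a window of about five minutes, whereas a 1-hour range spreads the flushes across the whole interval, which is the behavior requested in the comment above.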