[jira] [Commented] (HBASE-14069) Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm
[ https://issues.apache.org/jira/browse/HBASE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641090#comment-14641090 ] Abhilash commented on HBASE-14069:
--
Made changes according to Elliott's reviews. I won't be able to work on this for some time. If there are further code reviews I will pick it up later. Thanks a lot :)
> Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm
> --
> Key: HBASE-14069
> URL: https://issues.apache.org/jira/browse/HBASE-14069
> Project: HBase
> Issue Type: New Feature
> Components: hbase, util
> Reporter: Elliott Clark
> Assignee: Abhilash
> Attachments: 0001-Improve-RegionSplitter-v1.patch, 0001-Improve-RegionSplitter.patch
>
> RegionSplitter is the utility that can rolling split regions. It would be nice to be able to split regions and have the normal split points computed for me, so that I'm not reliant on knowing the data distribution.
> Tested manually in standalone mode for various test cases.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14069) Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm
[ https://issues.apache.org/jira/browse/HBASE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14069:
- Attachment: 0001-Improve-RegionSplitter-v1.patch
[jira] [Updated] (HBASE-14069) Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm
[ https://issues.apache.org/jira/browse/HBASE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14069:
- Attachment: (was: 0001-Improve-RegionSplitter.patch)
[jira] [Updated] (HBASE-14069) Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm
[ https://issues.apache.org/jira/browse/HBASE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14069:
- Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-14069) Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm
[ https://issues.apache.org/jira/browse/HBASE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14069:
- Attachment: 0001-Improve-RegionSplitter.patch
[jira] [Updated] (HBASE-14069) Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm
[ https://issues.apache.org/jira/browse/HBASE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14069:
- Component/s: util, hbase
[jira] [Updated] (HBASE-14069) Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm
[ https://issues.apache.org/jira/browse/HBASE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14069:
- Description: RegionSplitter is the utility that can rolling split regions. It would be nice to be able to split regions and have the normal split points computed for me, so that I'm not reliant on knowing the data distribution. Tested manually in standalone mode for various test cases.
  was: RegionSplitter is the utility that can rolling split regions. It would be nice to be able to split regions and have the normal split points computed for me, so that I'm not reliant on knowing the data distribution.
[jira] [Updated] (HBASE-14069) Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm
[ https://issues.apache.org/jira/browse/HBASE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14069:
- Attachment: 0001-Improve-RegionSplitter.patch
[jira] [Updated] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14058:
- Attachment: 0001-Stabilizing-default-heap-memory-tuner.patch
> Stabilizing default heap memory tuner
> --
> Key: HBASE-14058
> URL: https://issues.apache.org/jira/browse/HBASE-14058
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Affects Versions: 2.0.0, 1.2.0, 1.3.0
> Reporter: Abhilash
> Assignee: Abhilash
> Attachments: 0001-Stabilizing-default-heap-memory-tuner.patch, HBASE-14058-v1.patch, HBASE-14058.patch, after_modifications.png, before_modifications.png
>
> The memory tuner works well in general, but when the workload is both read heavy and write heavy the tuner does too many tuning operations. We should try to control the number of tuner operations and stabilize it. The main problem was that the tuner thought it was in steady state even after seeing just one neutral tuner period, so it did too many tuning operations and too many reverts, and with large step sizes (the step size was set to maximum even after a single neutral period). To stop this I propose these steps:
> 1) The band created by μ + δ/2 and μ - δ/2 is too narrow. Statistically ~62% of periods will lie outside this range, which means 62% of the data points are considered either high or low, which is too much. Use μ + δ*0.8 and μ - δ*0.8 instead. In expectation this decreases the number of tuner operations per 100 periods from 19 to just 10: with δ/2, 31% of data values are considered high and 31% low (2*0.31*0.31 = 0.19); with δ*0.8, 22% are low and 22% high (2*0.22*0.22 ~ 0.10).
> 2) Define a proper steady state by looking at the past few periods (equal to hbase.regionserver.heapmemory.autotuner.lookup.periods) rather than just the last tuner operation. We say the tuner is in steady state when the last few tuner periods were NEUTRAL. We keep decreasing the step size until it is extremely low, then leave the system in that state for some time.
> 3) Rather than decreasing the step size only while reverting, decrease its magnitude whenever we are trying to revert tuning done over the last few periods (sum the changes of the last few periods and compare to the current step), rather than looking only at the last period. When the magnitude gets too low, make tuner steps NEUTRAL (no operation). This causes the step size to continuously decrease until we reach steady state, after which the tuning process restarts (the step size resets when we reach steady state).
> 4) The tuning done over the last few periods is tracked as a decaying sum of past tuner steps, with sign: positive for an increase in memstore and negative for an increase in block cache. We use this rather than an arithmetic mean to give more weight to recent tuner steps.
> Please see the attachments. One shows the size of the memstore (green) and the block cache (blue) as adjusted by the tuner without these modifications, the other with them. The x-axis is time and the y-axis is the fraction of heap memory available to memstore and block cache at that time (it always sums to 80%). I configured the min/max range for both components to 0.1 and 0.7 respectively (so in the plots the y-axis min and max are 0.1 and 0.7). In both cases the tuner tries to distribute memory by giving ~15% to memstore and ~65% to block cache, but the modified one does it much more smoothly.
> I got these results from a YCSB test doing approximately 5000 inserts and 500 reads per second (for one region server). The results can be further fine-tuned and the number of tuner operations reduced with these configuration changes:
> For finer tuning:
> a) lower the max step size (suggested: 4%)
> b) lower the min step size (the default is also fine)
> To further decrease the frequency of tuning operations:
> c) increase the number of lookup periods (in the tests it was just 10, default is 60)
> d) increase the tuner period (in the tests it was just 20 secs, default is 60 secs)
> I used a smaller tuner period / number of lookup periods to get more data points.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
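The tail fractions in point 1 can be sanity-checked numerically. A minimal sketch, assuming the per-period metric is roughly normally distributed (that assumption is mine, not stated in the issue):

```python
import math

def upper_tail(k):
    """P(X > mu + k*delta) for a normal variable with mean mu, stddev delta."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

for k in (0.5, 0.8):
    p = upper_tail(k)            # fraction of periods flagged "high" (same fraction "low")
    outside = 2.0 * p            # fraction of periods outside the band
    ops_per_100 = 200.0 * p * p  # expected tuner operations per 100 periods (2*p^2, as in the issue)
    print(f"k={k}: one-sided tail {p:.2f}, outside band {outside:.2f}, ops/100 ~ {ops_per_100:.1f}")
```

This reproduces the ~31% / ~62% / 19 figures for the δ/2 band; for δ*0.8 it gives ~21% per side and ~9 operations per 100 periods, matching the issue's rounded 22% / ~10.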
[jira] [Commented] (HBASE-14069) Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm
[ https://issues.apache.org/jira/browse/HBASE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629976#comment-14629976 ] Abhilash commented on HBASE-14069:
--
I am using org.apache.hadoop.hbase.client.Admin.splitRegion(). Any specific concerns?
[jira] [Commented] (HBASE-14069) Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm
[ https://issues.apache.org/jira/browse/HBASE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629117#comment-14629117 ] Abhilash commented on HBASE-14069:
--
No. It won't use any of the algorithms defined in RegionSplitter. It will simply call splitRegion(regionName) for each region, one by one (the same function that is called when manually splitting a region). It keeps splitting regions until we have the given number of regions or all regions are smaller than a given size (a BFS-like order).
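The loop described in the comment above can be sketched as a small simulation. This is my reading of the comment, not the actual patch: `split_region` stands in for `Admin.splitRegion()`, and the queue gives the BFS-like order with the two stopping rules (enough regions, or all regions small enough):

```python
from collections import deque

def rolling_split(regions, target_count, max_size, split_region):
    """Split regions breadth-first until there are at least target_count
    regions or every region is at most max_size bytes.
    regions: list of (name, size_bytes) tuples."""
    queue = deque(regions)
    while True:
        # Stopping rules from the comment: enough regions, or all small enough.
        if len(queue) >= target_count or all(s <= max_size for _, s in queue):
            return list(queue)
        name, size = queue.popleft()
        if size <= max_size:
            queue.append((name, size))  # already small enough; rotate past it
            continue
        left, right = split_region(name, size)  # daughters go to the back (BFS)
        queue.append(left)
        queue.append(right)
```

In the real utility, calling `Admin.splitRegion()` without an explicit split point lets the region server compute the midpoint, which is what removes the dependence on a SplitAlgorithm.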
[jira] [Commented] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626568#comment-14626568 ] Abhilash commented on HBASE-14058:
--
Then let's get this patch in, as there are no other reviews for it.
[jira] [Assigned] (HBASE-14069) Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm
[ https://issues.apache.org/jira/browse/HBASE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash reassigned HBASE-14069:
- Assignee: Abhilash
[jira] [Commented] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625006#comment-14625006 ] Abhilash commented on HBASE-14058:
--
Thanks for the suggestions. We already maintain a moving average and deviation of the metrics over past periods (the number of past periods to consider is configurable), then compare the mean and deviation against the current metric value to decide the tuning. Did you mean to use the above value instead of the current metric value in that comparison, or to use it in the stats calculation itself? IMO it won't be a good idea to use it in the stats calculation, as it would distort the actual trend and stats. It would be very interesting/challenging to try to resize the step size based on variation, but I feel it may lead to the same problem of over-tuning. Instead of doing this resize every time, how about we do it only when we reset the step size? Instead of resetting the step size to maximum we can set it to some moderate value depending on the current variation, and the tuner will try to settle things down in the following periods and then go to steady state.
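The decaying sum of signed tuner steps (point 4 of the issue) can be sketched as follows; the 0.5 decay factor is an illustrative choice of mine, not the value used in the patch:

```python
def decaying_step_sum(steps, decay=0.5):
    """Signed, decaying sum of past tuner steps, oldest first.
    Positive steps grew the memstore, negative steps grew the block cache;
    older steps are repeatedly attenuated, so recent steps dominate."""
    total = 0.0
    for step in steps:
        total = decay * total + step
    return total
```

For example, steps of [+0.04, -0.04, -0.04] yield -0.05: despite one step in each direction recently, the sum says recent tuning favored the block cache, so a new step back toward memstore would count as a revert and get its magnitude reduced.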
[jira] [Updated] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14058:
- Attachment: HBASE-14058-v1.patch
[jira] [Commented] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624029#comment-14624029 ] Abhilash commented on HBASE-14058: -- Thanks a lot for the reviews Ted. I think decayingTunerStepSizeSum will be correct name for it. I saw that percentage mistake yesterday, I was just waiting for further reviews. Will commit soon if there are no further reviews. > Stabilizing default heap memory tuner > - > > Key: HBASE-14058 > URL: https://issues.apache.org/jira/browse/HBASE-14058 > Project: HBase > Issue Type: Improvement > Components: regionserver >Affects Versions: 2.0.0, 1.2.0, 1.3.0 >Reporter: Abhilash >Assignee: Abhilash > Attachments: HBASE-14058.patch, after_modifications.png, > before_modifications.png > > > The memory tuner works well in general cases but when we have a work load > that is both read heavy as well as write heavy the tuner does too many > tuning. We should try to control the number of tuner operation and stabilize > it. The main problem was that the tuner thinks it is in steady state even if > it sees just one neutral tuner period thus does too many tuning operations > and too many reverts that too with large step sizes(step size was set to > maximum even after one neutral period). So to stop this I have thought of > these steps: > 1) The division created by μ + δ/2 and μ - δ/2 is too small. Statistically > ~62% periods will lie outside this range, which means 62% of the data points > are considered either high or low which is too much. Use μ + δ*0.8 and μ - > δ*0.8 instead. On expectations it will decrease number of tuner operations > per 100 periods from 19 to just 10. If we use δ/2 then 31% of data values > will be considered to be high and 31% will be considered to be low (2*0.31 * > 0.31 = 0.19), on the other hand if we use δ*0.8 then 22% will be low and 22% > will be high(2*0.22*0.22 ~ 0.10). 
> 2) Defining proper steady state by looking at past few periods(it is equal to > hbase.regionserver.heapmemory.autotuner.lookup.periods) rather than just last > tuner operation. We say tuner is in steady state when last few tuner periods > were NEUTRAL. We keep decreasing step size unless it is extremely low. Then > leave system in that state for some time. > 3) Rather then decreasing step size only while reverting, decrease the > magnitude of step size whenever we are trying to revert tuning done in last > few periods(sum the changes of last few periods and compare to current step) > rather than just looking at last period. When its magnitude gets too low then > make tuner steps NEUTRAL(no operation). This will cause step size to > continuously decrease unless we reach steady state. After that tuning process > will restart (tuner step size rests again when we reach steady state). > 4) The tuning done in last few periods will be decaying sum of past tuner > steps with sign. This parameter will be positive for increase in memstore and > negative for increase in block cache. Rather than using arithmetic mean we > use this to give more priority to recent tuner steps. > Please see the attachments. One represents the size of memstore(green) and > size of block cache(blue) adjusted by tuner without these modification and > other with the above modifications. The x-axis is time axis and y-axis is the > fraction of heap memory available to memstore and block cache at that time(it > always sums up to 80%). I configured min/max ranges for both components to > 0.1 and 0.7 respectively(so in the plots the y-axis min and max is 0.1 and > 0.7). In both cases the tuner tries to distribute memory by giving ~15% to > memstore and ~65% to block cache. But the modified one does it much more > smoothly. > I got these results from YCSB test. The test was doing approximately 5000 > inserts and 500 reads per second (for one region server). 
The results can be > further fine-tuned and the number of tuner operations reduced with these > configuration changes. > For finer tuning: > a) lower the max step size (suggested: 4%) > b) lower the min step size (the default is also fine) > To further decrease the frequency of tuning operations: > c) increase the number of lookup periods (in the tests it was just 10; > the default is 60) > d) increase the tuner period (in the tests it was just 20 secs; the default is > 60 secs) > I used a smaller tuner period / number of lookup periods to get more data > points. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
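The expected-operation arithmetic in point 1) can be sanity-checked with a short script, assuming the per-period statistic is roughly normally distributed (as the μ/δ notation suggests) and using the comment's 2*p*p model for tuner operations:

```python
from math import erf, sqrt

def upper_tail(k):
    """P(X > mu + k*delta) for a normal distribution, via the standard normal CDF."""
    return 1.0 - 0.5 * (1.0 + erf(k / sqrt(2.0)))

p_half = upper_tail(0.5)  # threshold at mu +/- delta/2, ~0.31 per side
p_08 = upper_tail(0.8)    # threshold at mu +/- 0.8*delta, ~0.21 per side

# Expected tuner operations per 100 periods under the 2*p*p model
# used in the description above.
ops_half = 2 * p_half ** 2 * 100
ops_08 = 2 * p_08 ** 2 * 100

print(f"delta/2:   {p_half:.2f} per side, ~{ops_half:.0f} ops per 100 periods")
print(f"0.8*delta: {p_08:.2f} per side, ~{ops_08:.0f} ops per 100 periods")
```

This reproduces the rounded figures in the description: roughly 31% per side and ~19 operations for δ/2, versus ~21-22% per side and ~9-10 operations for 0.8δ.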
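Points 2)-4) can be sketched as a small state update. This is an illustrative sketch only, not the actual DefaultHeapMemoryTuner code; the decay factor, minimum step, and all names here are assumptions:

```python
# Illustrative sketch of points 2)-4) above; DECAY, MIN_STEP, and the
# sign convention (positive grows memstore, negative grows block cache)
# are assumptions, not values from the actual patch.
DECAY = 0.5       # weight given to the previous decaying sum (assumed)
MIN_STEP = 0.005  # below this magnitude, emit NEUTRAL (assumed)

class TunerSketch:
    def __init__(self, lookup_periods=60):
        self.lookup_periods = lookup_periods
        self.neutral_streak = 0
        self.decaying_sum = 0.0  # point 4: signed decaying sum of past steps

    def next_step(self, proposed_step):
        """proposed_step > 0 grows memstore, < 0 grows block cache, 0 is NEUTRAL."""
        if proposed_step == 0:
            self.neutral_streak += 1
        else:
            self.neutral_streak = 0
            # Point 3: a step that reverts recent tuning gets its magnitude damped.
            if proposed_step * self.decaying_sum < 0:
                proposed_step *= 0.5
            if abs(proposed_step) < MIN_STEP:
                proposed_step = 0.0  # too small: treat as NEUTRAL
        # Point 4: recent steps weigh more than old ones in the decaying sum.
        self.decaying_sum = proposed_step + DECAY * self.decaying_sum
        # Point 2: steady state = a full lookup window of NEUTRAL periods.
        steady = self.neutral_streak >= self.lookup_periods
        if steady:
            self.decaying_sum = 0.0  # step size resets once steady state is reached
        return proposed_step, steady
```

The key behavior this captures is that oscillating proposals (grow memstore, then grow block cache) shrink toward NEUTRAL, while a sustained run of NEUTRAL periods declares steady state and resets the tuning process.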
[jira] [Commented] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623670#comment-14623670 ] Abhilash commented on HBASE-14058: -- Yes, you are absolutely right. In a read/write-heavy scenario, evictions due to compaction are a problem. We should try to distinguish between normal evictions and evictions caused by compactions. But we are also using cache misses (for operations that were set to be cached while read), which gives a clearer indication of whether we need to increase the size of the block cache. It is more helpful in both cases that you mentioned, i.e. ignoring evictions caused by compaction, and caching on writes. Should we completely remove the use of evictions from tuner decisions and just use cache misses?
[jira] [Updated] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14058: - Description: (edited; the full description is quoted in the first comment above)
[jira] [Updated] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14058: - Attachment: HBASE-14058.patch
[jira] [Updated] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14058: - Attachment: before_modifications.png after_modifications.png
[jira] [Updated] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14058: - Description: (edited; the full description is quoted in the first comment above)
[jira] [Updated] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14058: - Affects Version/s: (was: 1.1.1) (was: 1.1.0) (was: 1.0.1) (was: 2.0.0)
[jira] [Updated] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14058: - Affects Version/s: 2.0.0 1.0.1 1.1.0 1.1.1
[jira] [Updated] (HBASE-14058) Stabilizing default heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14058: - Summary: Stabilizing default heap memory tuner (was: Stabilizing heap memory tuner)
[jira] [Updated] (HBASE-14058) Stabilizing heap memory tuner
[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-14058: - Summary: Stabilizing heap memory tuner (was: Stabilize heap memory tuner) > Stabilizing heap memory tuner > - > > Key: HBASE-14058 > URL: https://issues.apache.org/jira/browse/HBASE-14058 > Project: HBase > Issue Type: Improvement > Components: regionserver >Reporter: Abhilash >Assignee: Abhilash > > The memory tuner works well in general cases but when we have a work load > that is both read heavy as well as write heavy the tuner does too many > tuning. We should try to control the number of tuner operation and stabilize > it. The main problem was that the tuner thinks it is in steady state even if > it sees just one neutral tuner period thus does too many tuning operations > and too many reverts that too with large step sizes(step size was set to > maximum even after one neutral period). So to stop this I have thought of > these steps: > 1) The division created by μ + δ/2 and μ - δ/2 is too small. Statistically > ~62% periods will lie outside this range, which means 62% of the data points > are considered either high or low which is too much. Use μ + δ*0.8 and μ - > δ*0.8 instead. On expectations it will decrease number of tuner operations > per 100 periods from 19 to just 10. If we use δ/2 then 31% of data values > will be considered to be high and 31% will be considered to be low (2*0.31 * > 0.31 = 0.19), on the other hand if we use δ*0.8 then 22% will be low and 22% > will be high(2*0.22*0.22 ~ 0.10). > 2) Defining proper steady state by looking at past few periods(it is equal to > hbase.regionserver.heapmemory.autotuner.lookup.periods) rather than just last > tuner operation. We say tuner is in steady state when last few tuner periods > were NEUTRAL. We keep decreasing step size unless it is extremely low. Then > leave system in that state for some time. 
> 3) Rather than decreasing the step size only while reverting, decrease the > magnitude of the step size whenever we are trying to revert tuning done in the last > few periods (sum the changes of the last few periods and compare to the current step) > rather than just looking at the last period. When its magnitude gets too low, > make tuner steps NEUTRAL (no operation). This will cause the step size to > continuously decrease until we reach steady state, after which the tuning process > will restart (the tuner step size resets again when we reach steady state). > 4) The tuning done in the last few periods will be a decaying sum of past tuner > steps with sign. This parameter will be positive for an increase in memstore and > negative for an increase in block cache. Rather than using an arithmetic mean, we > use this to give more weight to recent tuner steps. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
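Point 4 above can be sketched as a small exponentially decaying accumulator. The class, the decay factor, and the method names below are illustrative assumptions for this note, not the actual DefaultHeapMemoryTuner fields:

```java
// Sketch of point 4: a decaying sum of recent tuner steps, signed so that
// memstore increases are positive and block cache increases are negative.
// The decay factor of 0.5 is an illustrative assumption.
public class DecayingStepSum {
    private final double decay;  // fraction of weight older steps keep each period
    private double sum = 0.0;

    public DecayingStepSum(double decay) {
        this.decay = decay;
    }

    /** step > 0: memstore grew; step < 0: block cache grew; 0: NEUTRAL. */
    public void record(double step) {
        sum = sum * decay + step;
    }

    /** Net recent tuning; compare its sign and magnitude with a proposed
     *  step to detect that the step would revert recent tuning. */
    public double recentNetTuning() {
        return sum;
    }

    public static void main(String[] args) {
        DecayingStepSum s = new DecayingStepSum(0.5);
        s.record(0.04);   // +4% to memstore
        s.record(0.04);   // +4% again
        s.record(-0.02);  // a step back toward block cache
        // Recent steps dominate: 0.04*0.25 + 0.04*0.5 - 0.02 = 0.01
        System.out.println(s.recentNetTuning() > 0);
    }
}
```

Unlike an arithmetic mean over the lookup window, the decaying sum lets the most recent steps dominate, which is the "more priority to recent tuner steps" behaviour the description asks for.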
[jira] [Created] (HBASE-14058) Stabilize heap memory tuner
Abhilash created HBASE-14058: Summary: Stabilize heap memory tuner Key: HBASE-14058 URL: https://issues.apache.org/jira/browse/HBASE-14058 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Abhilash Assignee: Abhilash The memory tuner works well in general cases, but when we have a workload that is both read heavy and write heavy the tuner performs too many tuning operations. We should try to control the number of tuner operations and stabilize it. The main problem was that the tuner thinks it is in steady state even if it sees just one neutral tuner period, and thus does too many tuning operations and too many reverts, with large step sizes (step size was set to maximum even after one neutral period). To stop this I propose these steps: 1) The band created by μ + δ/2 and μ - δ/2 is too narrow. Statistically ~62% of periods will lie outside this range, which means 62% of the data points are considered either high or low, which is too many. Use μ + δ*0.8 and μ - δ*0.8 instead. In expectation this will decrease the number of tuner operations per 100 periods from 19 to just 10. If we use δ/2 then 31% of data values will be considered high and 31% will be considered low (2 * 0.31 * 0.31 = 0.19); on the other hand, if we use δ*0.8 then 22% will be low and 22% will be high (2 * 0.22 * 0.22 ~ 0.10). 2) Define a proper steady state by looking at the past few periods (the count equals hbase.regionserver.heapmemory.autotuner.lookup.periods) rather than just the last tuner operation. We say the tuner is in steady state when the last few tuner periods were NEUTRAL. We keep decreasing the step size until it is extremely low, then leave the system in that state for some time. 3) Rather than decreasing the step size only while reverting, decrease the magnitude of the step size whenever we are trying to revert tuning done in the last few periods (sum the changes of the last few periods and compare to the current step) rather than just looking at the last period. 
When its magnitude gets too low, make tuner steps NEUTRAL (no operation). This will cause the step size to continuously decrease until we reach steady state, after which the tuning process will restart (the tuner step size resets again when we reach steady state). 4) The tuning done in the last few periods will be a decaying sum of past tuner steps with sign. This parameter will be positive for an increase in memstore and negative for an increase in block cache. Rather than using an arithmetic mean, we use this to give more weight to recent tuner steps. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
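The arithmetic behind point 1 can be double-checked with standard normal tail probabilities. The tail values below are rounded table values, and the 2·p·p estimate follows the description; this assumes the per-period statistic is roughly normally distributed:

```java
// Back-of-envelope check of point 1. With cutoffs at mu +/- k*delta, a
// fraction p of periods is flagged "high" and p flagged "low"; the
// description estimates tuner operations per period as 2*p*p.
public class CutoffEstimate {
    static double tuningOpsPer100Periods(double tailProb) {
        return 2 * tailProb * tailProb * 100;
    }

    public static void main(String[] args) {
        double pAtHalfSigma = 0.31;  // P(Z > 0.5): ~31% high, ~31% low
        double pAt08Sigma = 0.22;    // P(Z > 0.8): ~22% high, ~22% low
        System.out.printf("delta/2 cutoff:   ~%.0f ops per 100 periods%n",
                tuningOpsPer100Periods(pAtHalfSigma));  // ~19
        System.out.printf("delta*0.8 cutoff: ~%.0f ops per 100 periods%n",
                tuningOpsPer100Periods(pAt08Sigma));    // ~10
    }
}
```

This reproduces the 19-versus-10 figure from the description: widening the neutral band from ±δ/2 to ±δ·0.8 roughly halves the expected number of tuner operations.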
[jira] [Updated] (HBASE-13980) Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory
[ https://issues.apache.org/jira/browse/HBASE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13980: - Attachment: HBASE-13980-v1.patch > Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory > > > Key: HBASE-13980 > URL: https://issues.apache.org/jira/browse/HBASE-13980 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Abhilash >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-13980-v1.patch, HBASE-13980-v1.patch, > HBASE-13980.patch, HBASE-13980.patch > > > Currently DefaultHeapMemoryTuner doesn't distinguish blockedFlushCount vs > unblockedFlushCount. > In its tune() method: > {code} > long totalFlushCount = blockedFlushCount+unblockedFlushCount; > rollingStatsForCacheMisses.insertDataValue(cacheMissCount); > rollingStatsForFlushes.insertDataValue(totalFlushCount); > {code} > Occurrence of blocked flush indicates that upper limit for memstore is not > sufficient. > We should either give blockedFlushCount more weight or, take tuning action > based on blockedFlushCount directly. > See discussion from tail of HBASE-13876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
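One of the two directions the description suggests, giving blockedFlushCount more weight before it enters the rolling statistic, might look like the sketch below. The weight value and helper name are assumptions for illustration, not the actual tune() code:

```java
// Hypothetical weighting of blocked vs unblocked flushes before feeding
// the rolling flush statistic; the weight constant is an assumption.
public class FlushWeighting {
    static final long BLOCKED_FLUSH_WEIGHT = 4;

    static long weightedFlushCount(long blockedFlushCount, long unblockedFlushCount) {
        // A blocked flush signals memstore pressure far more strongly than an
        // ordinary flush, so count each one several times over.
        return blockedFlushCount * BLOCKED_FLUSH_WEIGHT + unblockedFlushCount;
    }

    public static void main(String[] args) {
        // 2 blocked + 3 unblocked flushes -> 2*4 + 3 = 11
        System.out.println(weightedFlushCount(2, 3));
    }
}
```

The weighted total would replace the plain `blockedFlushCount+unblockedFlushCount` sum before the `rollingStatsForFlushes.insertDataValue(...)` call quoted above.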
[jira] [Updated] (HBASE-13980) Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory
[ https://issues.apache.org/jira/browse/HBASE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13980: - Attachment: HBASE-13980-v1.patch > Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory > > > Key: HBASE-13980 > URL: https://issues.apache.org/jira/browse/HBASE-13980 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Abhilash >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-13980-v1.patch, HBASE-13980.patch, > HBASE-13980.patch > > > Currently DefaultHeapMemoryTuner doesn't distinguish blockedFlushCount vs > unblockedFlushCount. > In its tune() method: > {code} > long totalFlushCount = blockedFlushCount+unblockedFlushCount; > rollingStatsForCacheMisses.insertDataValue(cacheMissCount); > rollingStatsForFlushes.insertDataValue(totalFlushCount); > {code} > Occurrence of blocked flush indicates that upper limit for memstore is not > sufficient. > We should either give blockedFlushCount more weight or, take tuning action > based on blockedFlushCount directly. > See discussion from tail of HBASE-13876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13980) Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory
[ https://issues.apache.org/jira/browse/HBASE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13980: - Attachment: (was: HBASE-13980-v1.patch) > Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory > > > Key: HBASE-13980 > URL: https://issues.apache.org/jira/browse/HBASE-13980 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Abhilash >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-13980.patch, HBASE-13980.patch > > > Currently DefaultHeapMemoryTuner doesn't distinguish blockedFlushCount vs > unblockedFlushCount. > In its tune() method: > {code} > long totalFlushCount = blockedFlushCount+unblockedFlushCount; > rollingStatsForCacheMisses.insertDataValue(cacheMissCount); > rollingStatsForFlushes.insertDataValue(totalFlushCount); > {code} > Occurrence of blocked flush indicates that upper limit for memstore is not > sufficient. > We should either give blockedFlushCount more weight or, take tuning action > based on blockedFlushCount directly. > See discussion from tail of HBASE-13876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13980) Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory
[ https://issues.apache.org/jira/browse/HBASE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13980: - Attachment: HBASE-13980-v1.patch > Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory > > > Key: HBASE-13980 > URL: https://issues.apache.org/jira/browse/HBASE-13980 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Abhilash >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-13980-v1.patch, HBASE-13980.patch, > HBASE-13980.patch > > > Currently DefaultHeapMemoryTuner doesn't distinguish blockedFlushCount vs > unblockedFlushCount. > In its tune() method: > {code} > long totalFlushCount = blockedFlushCount+unblockedFlushCount; > rollingStatsForCacheMisses.insertDataValue(cacheMissCount); > rollingStatsForFlushes.insertDataValue(totalFlushCount); > {code} > Occurrence of blocked flush indicates that upper limit for memstore is not > sufficient. > We should either give blockedFlushCount more weight or, take tuning action > based on blockedFlushCount directly. > See discussion from tail of HBASE-13876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13980) Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory
[ https://issues.apache.org/jira/browse/HBASE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13980: - Attachment: HBASE-13980.patch > Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory > > > Key: HBASE-13980 > URL: https://issues.apache.org/jira/browse/HBASE-13980 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13980.patch, HBASE-13980.patch > > > Currently DefaultHeapMemoryTuner doesn't distinguish blockedFlushCount vs > unblockedFlushCount. > In its tune() method: > {code} > long totalFlushCount = blockedFlushCount+unblockedFlushCount; > rollingStatsForCacheMisses.insertDataValue(cacheMissCount); > rollingStatsForFlushes.insertDataValue(totalFlushCount); > {code} > Occurrence of blocked flush indicates that upper limit for memstore is not > sufficient. > We should either give blockedFlushCount more weight or, take tuning action > based on blockedFlushCount directly. > See discussion from tail of HBASE-13876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13980) Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory
[ https://issues.apache.org/jira/browse/HBASE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13980: - Attachment: HBASE-13980.patch > Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory > > > Key: HBASE-13980 > URL: https://issues.apache.org/jira/browse/HBASE-13980 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13980.patch > > > Currently DefaultHeapMemoryTuner doesn't distinguish blockedFlushCount vs > unblockedFlushCount. > In its tune() method: > {code} > long totalFlushCount = blockedFlushCount+unblockedFlushCount; > rollingStatsForCacheMisses.insertDataValue(cacheMissCount); > rollingStatsForFlushes.insertDataValue(totalFlushCount); > {code} > Occurrence of blocked flush indicates that upper limit for memstore is not > sufficient. > We should either give blockedFlushCount more weight or, take tuning action > based on blockedFlushCount directly. > See discussion from tail of HBASE-13876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13980) Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory
[ https://issues.apache.org/jira/browse/HBASE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13980: - Attachment: (was: HBASE-13980.patch) > Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory > > > Key: HBASE-13980 > URL: https://issues.apache.org/jira/browse/HBASE-13980 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Abhilash >Priority: Minor > > Currently DefaultHeapMemoryTuner doesn't distinguish blockedFlushCount vs > unblockedFlushCount. > In its tune() method: > {code} > long totalFlushCount = blockedFlushCount+unblockedFlushCount; > rollingStatsForCacheMisses.insertDataValue(cacheMissCount); > rollingStatsForFlushes.insertDataValue(totalFlushCount); > {code} > Occurrence of blocked flush indicates that upper limit for memstore is not > sufficient. > We should either give blockedFlushCount more weight or, take tuning action > based on blockedFlushCount directly. > See discussion from tail of HBASE-13876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13980) Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory
[ https://issues.apache.org/jira/browse/HBASE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13980: - Attachment: HBASE-13980.patch > Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory > > > Key: HBASE-13980 > URL: https://issues.apache.org/jira/browse/HBASE-13980 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13980.patch > > > Currently DefaultHeapMemoryTuner doesn't distinguish blockedFlushCount vs > unblockedFlushCount. > In its tune() method: > {code} > long totalFlushCount = blockedFlushCount+unblockedFlushCount; > rollingStatsForCacheMisses.insertDataValue(cacheMissCount); > rollingStatsForFlushes.insertDataValue(totalFlushCount); > {code} > Occurrence of blocked flush indicates that upper limit for memstore is not > sufficient. > We should either give blockedFlushCount more weight or, take tuning action > based on blockedFlushCount directly. > See discussion from tail of HBASE-13876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13980) Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory
[ https://issues.apache.org/jira/browse/HBASE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606774#comment-14606774 ] Abhilash commented on HBASE-13980: -- Currently we ignore a few initial periods (while the system is restarting and the cache is warming up) and just update our statistics. For the first tuning operation we assume that the last tuner operation was neutral. I guess it's a fair estimate, as the current stats are in steady state (at least for the current memory distribution). I guess we are out of recursion now :P > Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory > > > Key: HBASE-13980 > URL: https://issues.apache.org/jira/browse/HBASE-13980 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Abhilash >Priority: Minor > > Currently DefaultHeapMemoryTuner doesn't distinguish blockedFlushCount vs > unblockedFlushCount. > In its tune() method: > {code} > long totalFlushCount = blockedFlushCount+unblockedFlushCount; > rollingStatsForCacheMisses.insertDataValue(cacheMissCount); > rollingStatsForFlushes.insertDataValue(totalFlushCount); > {code} > Occurrence of blocked flush indicates that upper limit for memstore is not > sufficient. > We should either give blockedFlushCount more weight or, take tuning action > based on blockedFlushCount directly. > See discussion from tail of HBASE-13876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
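The warm-up behaviour the comment describes (skip tuning for a few initial periods while still collecting statistics) reduces to a tiny gate; the number of ignored periods below is an illustrative assumption:

```java
// Sketch of skipping tuning during warm-up while statistics still update.
// The ignored-period count is an assumption, not an HBase default.
public class WarmupGate {
    static final int IGNORED_INITIAL_PERIODS = 5;
    private int periodsSeen = 0;

    /** Returns true once enough periods have passed to trust the stats. */
    boolean readyToTune() {
        periodsSeen++;
        return periodsSeen > IGNORED_INITIAL_PERIODS;
    }

    public static void main(String[] args) {
        WarmupGate gate = new WarmupGate();
        for (int period = 1; period <= 7; period++) {
            // Rolling statistics would be updated here every period regardless.
            System.out.println("period " + period + " tune=" + gate.readyToTune());
        }
    }
}
```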
[jira] [Commented] (HBASE-13980) Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory
[ https://issues.apache.org/jira/browse/HBASE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606692#comment-14606692 ] Abhilash commented on HBASE-13980: -- The simplest implementation would be to say we are in steady state if the last (or past few) tuning operations were neutral. If there is a sudden increase in blockedFlushCount and the last tuner step was to decrease memstore, then with high probability we can say that the sudden increase was a side effect of the tuner. > Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory > > > Key: HBASE-13980 > URL: https://issues.apache.org/jira/browse/HBASE-13980 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Abhilash >Priority: Minor > > Currently DefaultHeapMemoryTuner doesn't distinguish blockedFlushCount vs > unblockedFlushCount. > In its tune() method: > {code} > long totalFlushCount = blockedFlushCount+unblockedFlushCount; > rollingStatsForCacheMisses.insertDataValue(cacheMissCount); > rollingStatsForFlushes.insertDataValue(totalFlushCount); > {code} > Occurrence of blocked flush indicates that upper limit for memstore is not > sufficient. > We should either give blockedFlushCount more weight or, take tuning action > based on blockedFlushCount directly. > See discussion from tail of HBASE-13876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
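That "last few tuning operations were neutral" check could be as simple as the sketch below; the enum, window size, and names are illustrative, not the tuner's real types:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sketch of "steady state = last few tuner periods were NEUTRAL".
// Window size and names are assumptions for illustration.
public class SteadyStateDetector {
    enum StepDirection { INCREASE_MEMSTORE, INCREASE_BLOCK_CACHE, NEUTRAL }

    private final int lookupPeriods;
    private final Deque<StepDirection> recent = new ArrayDeque<>();

    SteadyStateDetector(int lookupPeriods) {
        this.lookupPeriods = lookupPeriods;
    }

    void record(StepDirection step) {
        recent.addLast(step);
        if (recent.size() > lookupPeriods) {
            recent.removeFirst();  // keep only the lookup window
        }
    }

    boolean inSteadyState() {
        // Steady only once a full window of NEUTRAL periods has been seen.
        return recent.size() == lookupPeriods
                && recent.stream().allMatch(s -> s == StepDirection.NEUTRAL);
    }

    public static void main(String[] args) {
        SteadyStateDetector d = new SteadyStateDetector(3);
        d.record(StepDirection.INCREASE_MEMSTORE);
        d.record(StepDirection.NEUTRAL);
        d.record(StepDirection.NEUTRAL);
        System.out.println(d.inSteadyState());  // window still holds a non-NEUTRAL step
        d.record(StepDirection.NEUTRAL);
        System.out.println(d.inSteadyState());  // three NEUTRALs in a row now
    }
}
```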
[jira] [Commented] (HBASE-13980) Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory
[ https://issues.apache.org/jira/browse/HBASE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606617#comment-14606617 ] Abhilash commented on HBASE-13980: -- In my opinion, rather than giving more weight to blockedFlushCount (it would be difficult in itself to reason about the relative weights), we should directly take a tuning decision if we observe blocked flushes in steady state, because they are rare and very undesirable. One complication is that when the memory tuner increases block cache size and decreases memstore size, it might force many blocked flushes, but that does not mean we have to abruptly increase memstore size. So we should do that only in steady state. If after reading this you wonder how the tuner would then work: suppose in a certain period the tuner increased block cache by 8% and decreased memstore by the same amount (1st tuning). Then in the next step (if the server is not completely read heavy; if it is read heavy then we are good anyway) we will observe many blocked / unblocked flushes. If this number is really high compared to the past trend, we revert our tuning, but not completely: we partially revert the previous tuning operation (2nd tuning). Overall we have increased block cache by 4%, a tempered version of what the tuner wanted to do in the 1st tuning. > Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory > > > Key: HBASE-13980 > URL: https://issues.apache.org/jira/browse/HBASE-13980 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Abhilash >Priority: Minor > > Currently DefaultHeapMemoryTuner doesn't distinguish blockedFlushCount vs > unblockedFlushCount. > In its tune() method: > {code} > long totalFlushCount = blockedFlushCount+unblockedFlushCount; > rollingStatsForCacheMisses.insertDataValue(cacheMissCount); > rollingStatsForFlushes.insertDataValue(totalFlushCount); > {code} > Occurrence of blocked flush indicates that upper limit for memstore is not > sufficient. 
> We should either give blockedFlushCount more weight or, take tuning action > based on blockedFlushCount directly. > See discussion from tail of HBASE-13876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
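The worked example in the comment above (an 8% shift toward block cache, a flush spike, then a half revert) can be traced numerically. The starting heap fractions below are arbitrary illustrative values, not HBase defaults:

```java
// Numeric trace of the partial-revert example: shift 8% of heap from
// memstore to block cache, observe a flush spike, then give half back.
public class PartialRevertTrace {
    double blockCachePct = 0.30;
    double memstorePct = 0.40;

    /** delta > 0 moves heap share from memstore to block cache. */
    void tune(double delta) {
        blockCachePct += delta;
        memstorePct -= delta;
    }

    public static void main(String[] args) {
        PartialRevertTrace t = new PartialRevertTrace();
        t.tune(0.08);   // 1st tuning: block cache +8%, memstore -8%
        // Blocked/unblocked flushes spike far above the past trend...
        t.tune(-0.04);  // 2nd tuning: partial revert, not a full one
        // Net effect: block cache up 4%, a gentler version of the 1st step.
        System.out.printf("%.2f %.2f%n", t.blockCachePct, t.memstorePct);
    }
}
```

The point of reverting only partially is that the flush spike was the tuner's own side effect, so throwing away the whole 8% move would discard a direction the statistics genuinely favoured.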
[jira] [Commented] (HBASE-13989) Threshold for combined MemStore and BlockCache percentages is not checked
[ https://issues.apache.org/jira/browse/HBASE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606292#comment-14606292 ] Abhilash commented on HBASE-13989: -- Great catch. Looks good to me. > Threshold for combined MemStore and BlockCache percentages is not checked > - > > Key: HBASE-13989 > URL: https://issues.apache.org/jira/browse/HBASE-13989 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 13989-v1.txt, 13989-v2.txt > > > In HeapMemoryManager#doInit(): > {code} > globalMemStorePercentMinRange = conf.getFloat(MEMSTORE_SIZE_MIN_RANGE_KEY, > globalMemStorePercent); > globalMemStorePercentMaxRange = conf.getFloat(MEMSTORE_SIZE_MAX_RANGE_KEY, > globalMemStorePercent); > ... > if (globalMemStorePercent == globalMemStorePercentMinRange > && globalMemStorePercent == globalMemStorePercentMaxRange) { > return false; > } > {code} > If memory tuning is not specified, globalMemStorePercentMinRange and > globalMemStorePercentMaxRange would carry the value of globalMemStorePercent. > This would make doInit() exit before checking the threshold for combined > MemStore and BlockCache percentages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
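The essence of the HBASE-13989 fix is ordering: run the combined-percentage check before the early return that fires when no tuning range is configured. The sketch below is illustrative, not the actual patch; the 0.8 limit mirrors HBase's documented combined threshold but is hard-coded here as an assumption:

```java
// Sketch of the fix: validate memstore + block cache against the cluster
// limit *before* bailing out when memory tuning is not configured.
public class CombinedThresholdCheck {
    static final float COMBINED_LIMIT = 0.8f;  // assumed stand-in for HBase's constant

    static boolean doInit(float memstorePct, float blockCachePct,
                          boolean tuningConfigured) {
        // This check now runs unconditionally, even when tuning is disabled.
        if (memstorePct + blockCachePct > COMBINED_LIMIT) {
            throw new RuntimeException(
                "Combined MemStore and BlockCache exceeds " + COMBINED_LIMIT);
        }
        if (!tuningConfigured) {
            return false;  // the buggy code returned here before checking
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(doInit(0.40f, 0.30f, false));  // no tuning, but validated
        try {
            doInit(0.50f, 0.40f, false);  // 0.9 over the limit: now rejected
        } catch (RuntimeException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```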
[jira] [Assigned] (HBASE-13980) Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory
[ https://issues.apache.org/jira/browse/HBASE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash reassigned HBASE-13980: Assignee: Abhilash > Distinguish blockedFlushCount vs unblockedFlushCount when tuning heap memory > > > Key: HBASE-13980 > URL: https://issues.apache.org/jira/browse/HBASE-13980 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Abhilash >Priority: Minor > > Currently DefaultHeapMemoryTuner doesn't distinguish blockedFlushCount vs > unblockedFlushCount. > In its tune() method: > {code} > long totalFlushCount = blockedFlushCount+unblockedFlushCount; > rollingStatsForCacheMisses.insertDataValue(cacheMissCount); > rollingStatsForFlushes.insertDataValue(totalFlushCount); > {code} > Occurrence of blocked flush indicates that upper limit for memstore is not > sufficient. > We should either give blockedFlushCount more weight or, take tuning action > based on blockedFlushCount directly. > See discussion from tail of HBASE-13876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603789#comment-14603789 ] Abhilash commented on HBASE-13876: -- I was thinking about that too. Rather than giving more weight to blockedFlushCount, can we just directly increase (if favorable) the memstore size when we observe blocked flushes? Even a single blocked flush very strongly indicates that the current upper limit for memstore is not sufficient, and it is highly undesirable. > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-13876-v2.patch, HBASE-13876-v3.patch, > HBASE-13876-v4.patch, HBASE-13876-v5.patch, HBASE-13876-v6.patch, > HBASE-13876-v7.patch, HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The conditions under which the > DefaultHeapMemoryTuner currently acts are very rare, so I am trying to weaken these > checks to improve its performance. > We check the current memstore size and current block cache size. For example, if we are > using less than 50% of the currently available block cache size, we say the block > cache is sufficient, and the same for memstore. This check will be very effective > when the server is either read heavy or write heavy. The earlier version just waited > for the number of evictions / flushes to be zero, which is very rare. > Otherwise, based on the percent change in the number of cache misses and the number of > flushes, we increase / decrease the memory provided for caching / memstore. After > doing so, on the next call of HeapMemoryTuner we verify that the last change has > indeed decreased the number of evictions / flushes, whichever it was expected > to decrease. 
We also check that it does not make the other (evictions / flushes) > increase much. I do this analysis by comparing the percent change (which is > basically a normalized derivative) of the number of evictions and the > number of flushes over the last two periods. The main motive is > that if we have random reads then we will see a lot of cache misses, > but even after increasing the block cache we won't be able to decrease the number of > cache misses, so we will revert and eventually not waste memory > on the block cache. This will also help us ignore random short-term spikes in > reads / writes. I have also tried to take care not to tune memory if we do > not have enough hints, as unnecessary tuning may slow down the system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
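The "normalized derivative" comparison the description relies on is just a percent change between two consecutive tuner periods. A minimal sketch, with illustrative names:

```java
// Percent change of a counter (evictions, flushes, cache misses) between
// the last two tuner periods, used to judge whether the previous step helped.
public class PercentChange {
    static double percentChange(long previous, long current) {
        if (previous == 0) {
            // Avoid division by zero; treat any growth from zero as a large jump.
            return current == 0 ? 0.0 : Double.POSITIVE_INFINITY;
        }
        return (double) (current - previous) / previous;
    }

    public static void main(String[] args) {
        // Block cache was grown last period, yet cache misses barely moved:
        // the change did not pay off, so the tuner would lean toward reverting.
        System.out.println(percentChange(1000, 990));
    }
}
```

Comparing these normalized values across periods, rather than raw counts, is what lets the tuner ignore short-term spikes: a spike changes one period's derivative sharply but does not establish a trend.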
[jira] [Commented] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602246#comment-14602246 ] Abhilash commented on HBASE-13863: -- Any chances of getting +1 on this :) ? > Multi-wal feature breaks reported number and size of HLogs > -- > > Key: HBASE-13863 > URL: https://issues.apache.org/jira/browse/HBASE-13863 > Project: HBase > Issue Type: Bug > Components: regionserver, UI >Affects Versions: 1.0.0, 1.1.0 >Reporter: Elliott Clark >Assignee: Abhilash > Fix For: 1.0.0, 1.1.0 > > Attachments: HBASE-13863-v1.patch, HBASE-13863-v1.patch, > HBASE-13863-v1.patch, HBASE-13863-v1.patch, HBASE-13863.patch > > > When multi-wal is enabled the number and size of retained HLogs is always > reported as zero. > We should fix this so that the numbers are the sum of all retained logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13971) Flushes stuck since 6 hours on a regionserver.
[ https://issues.apache.org/jira/browse/HBASE-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602239#comment-14602239 ] Abhilash commented on HBASE-13971: -- Sorry, somehow I don't have the region server logs. > Flushes stuck since 6 hours on a regionserver. > -- > > Key: HBASE-13971 > URL: https://issues.apache.org/jira/browse/HBASE-13971 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 1.3.0 > Environment: Caused while running IntegrationTestLoadAndVerify for 20 > M rows on a cluster with 32 region servers, each with a max heap size of 24 GB. >Reporter: Abhilash >Priority: Critical > Attachments: rsDebugDump.txt, screenshot-1.png > > > One region server is stuck while flushing (possible deadlock). It has been trying to > flush two regions for the last 6 hours (see the screenshot). > Caused while running IntegrationTestLoadAndVerify for 20 M rows with 600 > mapper jobs and 100 back references. ~37 million writes on each regionserver > so far, but no writes are happening on any regionserver for the past 6 hours and > their memstore size is zero (I don't know if this is related). But this > particular regionserver has had a memstore size of 9 GB for the past 6 hours. > Relevant snaps from the debug dump: > Tasks: > === > Task: Flushing > IntegrationTestLoadAndVerify,R\x9B\x1B\xBF\xAE\x08\xD1\xA2,1435179555993.8e2d075f94ce7699f416ec4ced9873cd. > Status: RUNNING:Preparing to flush by snapshotting stores in > 8e2d075f94ce7699f416ec4ced9873cd > Running for 22034s > Task: Flushing > IntegrationTestLoadAndVerify,\x93\xA385\x81Z\x11\xE6,1435179555993.9f8d0e01a40405b835bf6e5a22a86390. > Status: RUNNING:Preparing to flush by snapshotting stores in > 9f8d0e01a40405b835bf6e5a22a86390 > Running for 22033s > Executors: > === > ... 
> Thread 139 (MemStoreFlusher.1): > State: WAITING > Blocked count: 139711 > Waited count: 239212 > Waiting on java.util.concurrent.CountDownLatch$Sync@b9c094a > Stack: > sun.misc.Unsafe.park(Native Method) > java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) > org.apache.hadoop.hbase.wal.WALKey.getSequenceId(WALKey.java:305) > > org.apache.hadoop.hbase.regionserver.HRegion.getNextSequenceId(HRegion.java:2422) > > org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2168) > > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2047) > > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2011) > org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1902) > org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1828) > > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510) > > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471) > > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:75) > > org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259) > java.lang.Thread.run(Thread.java:745) > Thread 137 (MemStoreFlusher.0): > State: WAITING > Blocked count: 138931 > Waited count: 237448 > Waiting on java.util.concurrent.CountDownLatch$Sync@53f41f76 > Stack: > sun.misc.Unsafe.park(Native Method) > java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > > 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) > org.apache.hadoop.hbase.wal.WALKey.getSequenceId(WALKey.java:305) > > org.apache.hadoop.hbase.regionserver.HRegion.getNextSequenceId(HRegion.java:2422) > > org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2168) > > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2047) >
[jira] [Updated] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13863: - Affects Version/s: 1.0.0 1.1.0 > Multi-wal feature breaks reported number and size of HLogs > -- > > Key: HBASE-13863 > URL: https://issues.apache.org/jira/browse/HBASE-13863 > Project: HBase > Issue Type: Bug > Components: regionserver, UI >Affects Versions: 1.0.0, 1.1.0 >Reporter: Elliott Clark >Assignee: Abhilash > Fix For: 1.0.0, 1.1.0 > > Attachments: HBASE-13863-v1.patch, HBASE-13863-v1.patch, > HBASE-13863-v1.patch, HBASE-13863-v1.patch, HBASE-13863.patch > > > When multi-wal is enabled the number and size of retained HLogs is always > reported as zero. > We should fix this so that the numbers are the sum of all retained logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13863: - Fix Version/s: 1.0.0 1.1.0 > Multi-wal feature breaks reported number and size of HLogs > -- > > Key: HBASE-13863 > URL: https://issues.apache.org/jira/browse/HBASE-13863 > Project: HBase > Issue Type: Bug > Components: regionserver, UI >Affects Versions: 1.0.0, 1.1.0 >Reporter: Elliott Clark >Assignee: Abhilash > Fix For: 1.0.0, 1.1.0 > > Attachments: HBASE-13863-v1.patch, HBASE-13863-v1.patch, > HBASE-13863-v1.patch, HBASE-13863-v1.patch, HBASE-13863.patch > > > When multi-wal is enabled the number and size of retained HLogs is always > reported as zero. > We should fix this so that the numbers are the sum of all retained logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13971) Flushes stuck since 6 hours on a regionserver.
[ https://issues.apache.org/jira/browse/HBASE-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13971: - Attachment: rsDebugDump.txt > Flushes stuck since 6 hours on a regionserver. > -- > > Key: HBASE-13971 > URL: https://issues.apache.org/jira/browse/HBASE-13971 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 1.3.0 > Environment: Caused while running IntegrationTestLoadAndVerify for 20 > M rows on a cluster with 32 region servers, each with a max heap size of 24 GB. >Reporter: Abhilash >Priority: Critical > Attachments: rsDebugDump.txt, screenshot-1.png > > > One region server is stuck while flushing (possible deadlock). It has been trying to > flush two regions for the last 6 hours (see the screenshot). > Caused while running IntegrationTestLoadAndVerify for 20 M rows with 600 > mapper jobs and 100 back references. ~37 million writes on each regionserver > so far, but no writes are happening on any regionserver for the past 6 hours and > their memstore size is zero (I don't know if this is related). But this > particular regionserver has had a memstore size of 9 GB for the past 6 hours. > Relevant snaps from the debug dump: > Tasks: > === > Task: Flushing > IntegrationTestLoadAndVerify,R\x9B\x1B\xBF\xAE\x08\xD1\xA2,1435179555993.8e2d075f94ce7699f416ec4ced9873cd. > Status: RUNNING:Preparing to flush by snapshotting stores in > 8e2d075f94ce7699f416ec4ced9873cd > Running for 22034s > Task: Flushing > IntegrationTestLoadAndVerify,\x93\xA385\x81Z\x11\xE6,1435179555993.9f8d0e01a40405b835bf6e5a22a86390. > Status: RUNNING:Preparing to flush by snapshotting stores in > 9f8d0e01a40405b835bf6e5a22a86390 > Running for 22033s > Executors: > === > ... 
> Thread 139 (MemStoreFlusher.1):
> State: WAITING
> Blocked count: 139711
> Waited count: 239212
> Waiting on java.util.concurrent.CountDownLatch$Sync@b9c094a
> Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> org.apache.hadoop.hbase.wal.WALKey.getSequenceId(WALKey.java:305)
> org.apache.hadoop.hbase.regionserver.HRegion.getNextSequenceId(HRegion.java:2422)
> org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2168)
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2047)
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2011)
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1902)
> org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1828)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:75)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> java.lang.Thread.run(Thread.java:745)
> Thread 137 (MemStoreFlusher.0):
> State: WAITING
> Blocked count: 138931
> Waited count: 237448
> Waiting on java.util.concurrent.CountDownLatch$Sync@53f41f76
> Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> org.apache.hadoop.hbase.wal.WALKey.getSequenceId(WALKey.java:305)
> org.apache.hadoop.hbase.regionserver.HRegion.getNextSequenceId(HRegion.java:2422)
> org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2168)
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2047)
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2011)
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1902)
> org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1828)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:75)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> java.lang.Thread.run(Thread.java:745)
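Both flusher threads above are parked inside CountDownLatch.await(), reached from WALKey.getSequenceId(): the flush cannot proceed until the WAL assigns a sequence id, and if that latch is never counted down the thread stays parked forever, which is exactly the WAITING state in the dump. A minimal standalone sketch of that wait pattern (not HBase code; the thread name and timings are illustrative, and the demo uses a bounded await so it terminates):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchWaitDemo {
    /** Starts a "flusher" that blocks on a latch nobody releases and reports its state. */
    static Thread.State stateWhileAwaiting() {
        CountDownLatch seqIdAssigned = new CountDownLatch(1); // never counted down
        Thread flusher = new Thread(() -> {
            try {
                // Bounded wait so the demo terminates; the real code path waits indefinitely.
                seqIdAssigned.await(500, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "MemStoreFlusher.demo");
        flusher.start();
        try {
            Thread.sleep(100);                   // give it time to park inside await()
            Thread.State s = flusher.getState(); // TIMED_WAITING while parked on the latch
            flusher.join();
            return s;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("state while waiting on latch: " + stateWhileAwaiting());
    }
}
```

An unbounded `await()` (as in the dump) would show plain WAITING instead of TIMED_WAITING; either way, every flush handler parked here means no memstore can be flushed on that regionserver.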
[jira] [Updated] (HBASE-13971) Flushes stuck since 6 hours on a regionserver.
[ https://issues.apache.org/jira/browse/HBASE-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13971:
- Description: edited (the updated text and stack traces are the same as those quoted above)
[jira] [Updated] (HBASE-13971) Flushes stuck since 6 hours on a regionserver.
[ https://issues.apache.org/jira/browse/HBASE-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13971:
- Attachment: screenshot-1.png
[jira] [Updated] (HBASE-13971) Flushes stuck since 6 hours on a regionserver.
[ https://issues.apache.org/jira/browse/HBASE-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13971:
- Priority: Critical (was: Major)
[jira] [Updated] (HBASE-13971) Flushes stuck since 6 hours on a regionserver.
[ https://issues.apache.org/jira/browse/HBASE-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13971:
- Description: edited (the updated text and stack traces are the same as those quoted above)
[jira] [Created] (HBASE-13971) Flushes stuck since 6 hours on a regionserver.
Abhilash created HBASE-13971:
Summary: Flushes stuck since 6 hours on a regionserver.
Key: HBASE-13971
URL: https://issues.apache.org/jira/browse/HBASE-13971
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 1.3.0
Environment: Caused while running IntegrationTestLoadAndVerify for 20 M rows on a cluster with 32 region servers, each with a max heap size of 24 GB.
Reporter: Abhilash
(The original description and debug-dump stack traces are the same as those quoted above.)
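The "Running for 22034s" task lines in the debug dump are what make the hang visible: a flush that has been "preparing" for hours is a strong deadlock signal. A small hypothetical sketch of scanning such a task snapshot for long-runners; the task-name strings and the threshold are illustrative, and this is not an HBase API:

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;

public class StuckFlushDetector {
    /**
     * Given a snapshot of task name -> elapsed running time (mimicking the
     * "Running for NNNNNs" lines in a debug dump), returns the tasks that
     * have been running longer than the threshold.
     */
    static List<String> stuck(Map<String, Duration> tasks, Duration threshold) {
        return tasks.entrySet().stream()
                .filter(e -> e.getValue().compareTo(threshold) > 0)
                .map(Map.Entry::getKey)
                .sorted()
                .toList();
    }

    public static void main(String[] args) {
        Map<String, Duration> tasks = Map.of(
                "Flushing ...8e2d075f94ce7699f416ec4ced9873cd", Duration.ofSeconds(22034),
                "Compacting some-healthy-region", Duration.ofSeconds(40));
        // Anything flushing for longer than, say, 30 minutes deserves a thread dump.
        System.out.println(stuck(tasks, Duration.ofMinutes(30)));
    }
}
```

Such a check could run against periodic debug dumps so an operator is alerted hours earlier than a manual inspection would catch it.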
[jira] [Commented] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600351#comment-14600351 ] Abhilash commented on HBASE-13863:
-- The failed test does not look related to the patch (the patch passes that test on my local machine). Trying a re-run.
[jira] [Updated] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13863:
- Attachment: HBASE-13863-v1.patch
[jira] [Updated] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13863:
- Component/s: UI, regionserver
[jira] [Commented] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599863#comment-14599863 ] Abhilash commented on HBASE-13863:
-- Reattached to check whether this compiles now that HBASE-13948 has been reverted.
[jira] [Updated] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13863:
- Attachment: HBASE-13863-v1.patch
[jira] [Updated] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13863:
- Attachment: HBASE-13863-v1.patch
[jira] [Commented] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598717#comment-14598717 ] Abhilash commented on HBASE-13863:
-- Done. Thanks a lot ^_^
[jira] [Updated] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13863: - Attachment: HBASE-13863-v1.patch > Multi-wal feature breaks reported number and size of HLogs > -- > > Key: HBASE-13863 > URL: https://issues.apache.org/jira/browse/HBASE-13863 > Project: HBase > Issue Type: Bug >Reporter: Elliott Clark >Assignee: Abhilash > Attachments: HBASE-13863-v1.patch, HBASE-13863.patch > > > When multi-wal is enabled the number and size of retained HLogs is always > reported as zero. > We should fix this so that the numbers are the sum of all retained logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13863: - Attachment: HBASE-13863.patch > Multi-wal feature breaks reported number and size of HLogs > -- > > Key: HBASE-13863 > URL: https://issues.apache.org/jira/browse/HBASE-13863 > Project: HBase > Issue Type: Bug >Reporter: Elliott Clark >Assignee: Abhilash > Attachments: HBASE-13863.patch > > > When multi-wal is enabled the number and size of retained HLogs is always > reported as zero. > We should fix this so that the numbers are the sum of all retained logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13863: - Attachment: (was: HBASE-13863.patch) > Multi-wal feature breaks reported number and size of HLogs > -- > > Key: HBASE-13863 > URL: https://issues.apache.org/jira/browse/HBASE-13863 > Project: HBase > Issue Type: Bug >Reporter: Elliott Clark >Assignee: Abhilash > > When multi-wal is enabled the number and size of retained HLogs is always > reported as zero. > We should fix this so that the numbers are the sum of all retained logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13863: - Attachment: HBASE-13863.patch > Multi-wal feature breaks reported number and size of HLogs > -- > > Key: HBASE-13863 > URL: https://issues.apache.org/jira/browse/HBASE-13863 > Project: HBase > Issue Type: Bug >Reporter: Elliott Clark >Assignee: Abhilash > Attachments: HBASE-13863.patch > > > When multi-wal is enabled the number and size of retained HLogs is always > reported as zero. > We should fix this so that the numbers are the sum of all retained logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-13863) Multi-wal feature breaks reported number and size of HLogs
[ https://issues.apache.org/jira/browse/HBASE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash reassigned HBASE-13863: Assignee: Abhilash > Multi-wal feature breaks reported number and size of HLogs > -- > > Key: HBASE-13863 > URL: https://issues.apache.org/jira/browse/HBASE-13863 > Project: HBase > Issue Type: Bug >Reporter: Elliott Clark >Assignee: Abhilash > > When multi-wal is enabled the number and size of retained HLogs is always > reported as zero. > We should fix this so that the numbers are the sum of all retained logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876: - Attachment: (was: HBASE-13876-v5.patch) > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13876-v2.patch, HBASE-13876-v3.patch, > HBASE-13876-v4.patch, HBASE-13876-v5.patch, HBASE-13876-v6.patch, > HBASE-13876-v7.patch, HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The conditions under which the current > DefaultHeapMemoryTuner acts are very rare, so I am trying to weaken these > checks to improve its performance. > Check the current memstore size and current block cache size. For example, if we are > using less than 50% of the currently available block cache size, we say the block > cache is sufficient, and the same for the memstore. This check will be very effective > when the server is either load heavy or write heavy. The earlier version just waited > for the number of evictions / number of flushes to be zero, which is very rare. > Otherwise, based on the percent change in the number of cache misses and number of > flushes, we increase / decrease the memory provided for caching / memstore. After > doing so, on the next call of HeapMemoryTuner we verify that the last change has > indeed decreased the number of evictions / flushes, whichever it was expected > to decrease. We also check that it does not make the other (evictions / flushes) > increase much. I am doing this analysis by comparing the percent change (which is > essentially a normalized derivative) of the number of evictions and > number of flushes during the last two periods. 
The main motive for doing this was > that if we have random reads then we will have a lot of cache misses. > But even after increasing the block cache we won't be able to decrease the number of > cache misses, so we will revert the change and eventually will not waste memory > on block caches. This will also help us ignore random short-term spikes in > reads / writes. I have also tried to take care not to tune memory if we do > not have enough hints, as unnecessary tuning may slow down the system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
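The verification loop the description sketches, comparing the normalized derivative of evictions and flushes over the last two periods and then growing whichever side is under pressure, could look like the following. This is an illustrative reimplementation, not the patch's actual DefaultHeapMemoryTuner code; the tolerance value, method names, and step enum are assumptions:

```java
public class TunerSketch {
    enum Step { INCREASE_BLOCK_CACHE, INCREASE_MEMSTORE, NEUTRAL }

    // Percent change between two consecutive periods: a normalized derivative,
    // as the issue description puts it.
    static double percentChange(long prev, long curr) {
        if (prev == 0) return curr == 0 ? 0.0 : 1.0;
        return (double) (curr - prev) / prev;
    }

    // Grow whichever side got relatively worse; stay neutral when both signals
    // moved together (e.g. a short-term spike in both reads and writes).
    static Step decide(long prevEvictions, long currEvictions,
                       long prevFlushes, long currFlushes) {
        double evictionDelta = percentChange(prevEvictions, currEvictions);
        double flushDelta = percentChange(prevFlushes, currFlushes);
        double tolerance = 0.05; // assumed tolerance level, not HBase's value
        if (evictionDelta - flushDelta > tolerance) return Step.INCREASE_BLOCK_CACHE;
        if (flushDelta - evictionDelta > tolerance) return Step.INCREASE_MEMSTORE;
        return Step.NEUTRAL;
    }

    public static void main(String[] args) {
        // Evictions rose 50% while flushes stayed flat: give the cache more heap.
        System.out.println(decide(100, 150, 40, 40)); // prints INCREASE_BLOCK_CACHE
    }
}
```

When both deltas are similar the tuner does nothing, which is how random short-term spikes in reads / writes get ignored rather than acted on.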
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876: - Attachment: HBASE-13876-v7.patch > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13876-v2.patch, HBASE-13876-v3.patch, > HBASE-13876-v4.patch, HBASE-13876-v5.patch, HBASE-13876-v5.patch, > HBASE-13876-v6.patch, HBASE-13876-v7.patch, HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The current checks under which the > DefaultHeapMemoryTuner works are very rare so I am trying to weaken these > checks to improve its performance. > Check current memstore size and current block cache size. For say if we are > using less than 50% of currently available block cache size we say block > cache is sufficient and same for memstore. This check will be very effective > when server is either load heavy or write heavy. Earlier version just waited > for number of evictions / number of flushes to be zero which are very rare. > Otherwise based on percent change in number of cache misses and number of > flushes we increase / decrease memory provided for caching / memstore. After > doing so, on next call of HeapMemoryTuner we verify that last change has > indeed decreased number of evictions / flush either of which it was expected > to do. We also check that it does not make the other (evictions / flush) > increase much. I am doing this analysis by comparing percent change (which is > basically nothing but normalized derivative) of number of evictions and > number of flushes during last two periods. 
The main motive for doing this was > that if we have random reads then we will have a lot of cache misses. > But even after increasing the block cache we won't be able to decrease the number of > cache misses, so we will revert the change and eventually will not waste memory > on block caches. This will also help us ignore random short-term spikes in > reads / writes. I have also tried to take care not to tune memory if we do > not have enough hints, as unnecessary tuning may slow down the system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582778#comment-14582778 ] Abhilash commented on HBASE-13876: -- Using stats from a few past lookup periods (configurable) to decide the tuner step. > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13876-v2.patch, HBASE-13876-v3.patch, > HBASE-13876-v4.patch, HBASE-13876-v5.patch, HBASE-13876-v5.patch, > HBASE-13876-v6.patch, HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The conditions under which the current > DefaultHeapMemoryTuner acts are very rare, so I am trying to weaken these > checks to improve its performance. > Check the current memstore size and current block cache size. For example, if we are > using less than 50% of the currently available block cache size, we say the block > cache is sufficient, and the same for the memstore. This check will be very effective > when the server is either load heavy or write heavy. The earlier version just waited > for the number of evictions / number of flushes to be zero, which is very rare. > Otherwise, based on the percent change in the number of cache misses and number of > flushes, we increase / decrease the memory provided for caching / memstore. After > doing so, on the next call of HeapMemoryTuner we verify that the last change has > indeed decreased the number of evictions / flushes, whichever it was expected > to decrease. We also check that it does not make the other (evictions / flushes) > increase much. I am doing this analysis by comparing the percent change (which is > essentially a normalized derivative) of the number of evictions and > number of flushes during the last two periods. 
The main motive for doing this was > that if we have random reads then we will have a lot of cache misses. > But even after increasing the block cache we won't be able to decrease the number of > cache misses, so we will revert the change and eventually will not waste memory > on block caches. This will also help us ignore random short-term spikes in > reads / writes. I have also tried to take care not to tune memory if we do > not have enough hints, as unnecessary tuning may slow down the system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
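The "stats from a few past lookup periods (configurable)" idea in the comment above can be sketched as a fixed-size rolling window whose average damps single-period noise. The class name and window size here are illustrative, not taken from the patch:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class RollingStats {
    private final int maxPeriods;                  // configurable lookup window
    private final Deque<Long> counts = new ArrayDeque<>();

    RollingStats(int maxPeriods) { this.maxPeriods = maxPeriods; }

    // Record one period's count, evicting the oldest once the window is full.
    void record(long count) {
        if (counts.size() == maxPeriods) counts.removeFirst();
        counts.addLast(count);
    }

    // Average over the retained periods; one noisy period moves this far less
    // than it moves the raw last-period value, so the tuner step is steadier.
    double average() {
        if (counts.isEmpty()) return 0.0;
        long sum = 0;
        for (long c : counts) sum += c;
        return (double) sum / counts.size();
    }

    public static void main(String[] args) {
        RollingStats flushes = new RollingStats(3);
        flushes.record(10);
        flushes.record(12);
        flushes.record(200); // short-term spike
        flushes.record(11);  // evicts the oldest period (10)
        System.out.println(flushes.average());
    }
}
```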
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876: - Attachment: HBASE-13876-v6.patch > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13876-v2.patch, HBASE-13876-v3.patch, > HBASE-13876-v4.patch, HBASE-13876-v5.patch, HBASE-13876-v5.patch, > HBASE-13876-v6.patch, HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The current checks under which the > DefaultHeapMemoryTuner works are very rare so I am trying to weaken these > checks to improve its performance. > Check current memstore size and current block cache size. For say if we are > using less than 50% of currently available block cache size we say block > cache is sufficient and same for memstore. This check will be very effective > when server is either load heavy or write heavy. Earlier version just waited > for number of evictions / number of flushes to be zero which are very rare. > Otherwise based on percent change in number of cache misses and number of > flushes we increase / decrease memory provided for caching / memstore. After > doing so, on next call of HeapMemoryTuner we verify that last change has > indeed decreased number of evictions / flush either of which it was expected > to do. We also check that it does not make the other (evictions / flush) > increase much. I am doing this analysis by comparing percent change (which is > basically nothing but normalized derivative) of number of evictions and > number of flushes during last two periods. The main motive for doing this was > that if we have random reads then we will be having a lot of cache misses. 
> But even after increasing the block cache we won't be able to decrease the number of > cache misses, so we will revert the change and eventually will not waste memory > on block caches. This will also help us ignore random short-term spikes in > reads / writes. I have also tried to take care not to tune memory if we do > not have enough hints, as unnecessary tuning may slow down the system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876: - Description: I am trying to improve the performance of DefaultHeapMemoryTuner by introducing some more checks. The conditions under which the current DefaultHeapMemoryTuner acts are very rare, so I am trying to weaken these checks to improve its performance. Check the current memstore size and current block cache size. For example, if we are using less than 50% of the currently available block cache size, we say the block cache is sufficient, and the same for the memstore. This check will be very effective when the server is either load heavy or write heavy. The earlier version just waited for the number of evictions / number of flushes to be zero, which is very rare. Otherwise, based on the percent change in the number of cache misses and number of flushes, we increase / decrease the memory provided for caching / memstore. After doing so, on the next call of HeapMemoryTuner we verify that the last change has indeed decreased the number of evictions / flushes, whichever it was expected to decrease. We also check that it does not make the other (evictions / flushes) increase much. I am doing this analysis by comparing the percent change (which is essentially a normalized derivative) of the number of evictions and number of flushes during the last two periods. The main motive for doing this was that if we have random reads then we will have a lot of cache misses. But even after increasing the block cache we won't be able to decrease the number of cache misses, so we will revert the change and eventually will not waste memory on block caches. This will also help us ignore random short-term spikes in reads / writes. I have also tried to take care not to tune memory if we do not have enough hints, as unnecessary tuning may slow down the system. was: I am trying to improve the performance of DefaultHeapMemoryTuner by introducing some more checks. 
The conditions under which the current DefaultHeapMemoryTuner acts are very rare, so I am trying to weaken these checks to improve its performance. Check the current memstore size and current block cache size. If we are using less than 50% of the currently available block cache size, we say the block cache is sufficient, and the same for the memstore. This check will be very effective when the server is either load heavy or write heavy. The earlier version just waited for the number of evictions / number of flushes to be zero, which is very rare. Otherwise, based on the percent change in the number of cache misses and number of flushes, we increase / decrease the memory provided for caching / memstore. After doing so, on the next call of HeapMemoryTuner we verify that the last change has indeed decreased the combined number of evictions / flushes. I am doing this analysis by comparing the percent change (which is essentially a normalized derivative) of the number of evictions and number of flushes during the last two periods. The main motive for doing this was that if we have random reads then we will have a lot of cache misses. But even after increasing the block cache we won't be able to decrease the number of cache misses, so we will revert the change and eventually will not waste memory on block caches. This will also help us ignore random short-term spikes in reads / writes. > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13876-v2.patch, HBASE-13876-v3.patch, > HBASE-13876-v4.patch, HBASE-13876-v5.patch, HBASE-13876-v5.patch, > HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. 
The conditions under which the current > DefaultHeapMemoryTuner acts are very rare, so I am trying to weaken these > checks to improve its performance. > Check the current memstore size and current block cache size. For example, if we are > using less than 50% of the currently available block cache size, we say the block > cache is sufficient, and the same for the memstore. This check will be very effective > when the server is either load heavy or write heavy. The earlier version just waited > for the number of evictions / number of flushes to be zero, which is very rare. > Otherwise, based on the percent change in the number of cache misses and number of > flushes, we increase / decrease the memory provided for caching / memstore. After > doing so, on the next call of HeapMemoryTuner we verify that the last change has > indeed decreased the number of evictions / flushes, whichever it was expected > to decrease. We also check that it does not make the other (evictions / fl
[jira] [Commented] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581503#comment-14581503 ] Abhilash commented on HBASE-13876: -- Reattached the patch because it passes that failed test on my machine. > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13876-v2.patch, HBASE-13876-v3.patch, > HBASE-13876-v4.patch, HBASE-13876-v5.patch, HBASE-13876-v5.patch, > HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The current checks under which the > DefaultHeapMemoryTuner works are very rare so I am trying to weaken these > checks to improve its performance. > Check current memstore size and current block cache size. If we are using > less than 50% of currently available block cache size we say block cache is > sufficient and same for memstore. This check will be very effective when > server is either load heavy or write heavy. Earlier version just waited for > number of evictions / number of flushes to be zero which are very rare. > Otherwise based on percent change in number of cache misses and number of > flushes we increase / decrease memory provided for caching / memstore. After > doing so, on next call of HeapMemoryTuner we verify that last change has > indeed decreased number of evictions / flush ( combined). I am doing this > analysis by comparing percent change (which is basically nothing but > normalized derivative) of number of evictions and number of flushes during > last two periods. The main motive for doing this was that if we have random > reads then we will be having a lot of cache misses. 
But even after increasing the > block cache we won't be able to decrease the number of cache misses, so we will > revert the change and eventually will not waste memory on block caches. This > will also help us ignore random short-term spikes in reads / writes. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876: - Attachment: HBASE-13876-v5.patch > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13876-v2.patch, HBASE-13876-v3.patch, > HBASE-13876-v4.patch, HBASE-13876-v5.patch, HBASE-13876-v5.patch, > HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The current checks under which the > DefaultHeapMemoryTuner works are very rare so I am trying to weaken these > checks to improve its performance. > Check current memstore size and current block cache size. If we are using > less than 50% of currently available block cache size we say block cache is > sufficient and same for memstore. This check will be very effective when > server is either load heavy or write heavy. Earlier version just waited for > number of evictions / number of flushes to be zero which are very rare. > Otherwise based on percent change in number of cache misses and number of > flushes we increase / decrease memory provided for caching / memstore. After > doing so, on next call of HeapMemoryTuner we verify that last change has > indeed decreased number of evictions / flush ( combined). I am doing this > analysis by comparing percent change (which is basically nothing but > normalized derivative) of number of evictions and number of flushes during > last two periods. The main motive for doing this was that if we have random > reads then we will be having a lot of cache misses. 
But even after increasing the > block cache we won't be able to decrease the number of cache misses, so we will > revert the change and eventually will not waste memory on block caches. This > will also help us ignore random short-term spikes in reads / writes. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876: - Attachment: HBASE-13876-v5.patch > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13876-v2.patch, HBASE-13876-v3.patch, > HBASE-13876-v4.patch, HBASE-13876-v5.patch, HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The current checks under which the > DefaultHeapMemoryTuner works are very rare so I am trying to weaken these > checks to improve its performance. > Check current memstore size and current block cache size. If we are using > less than 50% of currently available block cache size we say block cache is > sufficient and same for memstore. This check will be very effective when > server is either load heavy or write heavy. Earlier version just waited for > number of evictions / number of flushes to be zero which are very rare. > Otherwise based on percent change in number of cache misses and number of > flushes we increase / decrease memory provided for caching / memstore. After > doing so, on next call of HeapMemoryTuner we verify that last change has > indeed decreased number of evictions / flush ( combined). I am doing this > analysis by comparing percent change (which is basically nothing but > normalized derivative) of number of evictions and number of flushes during > last two periods. The main motive for doing this was that if we have random > reads then we will be having a lot of cache misses. 
But even after increasing the > block cache we won't be able to decrease the number of cache misses, so we will > revert the change and eventually will not waste memory on block caches. This > will also help us ignore random short-term spikes in reads / writes. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581228#comment-14581228 ] Abhilash commented on HBASE-13876: -- I am trying to use evictCount to decide whether I should revert or not, because that is a strong indicator that the cache size is insufficient, and I am using missRatio to do the performance tuning, as missRatio is not a direct indicator that the cache size is insufficient, but it is a parameter that we would like to improve for better performance. > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13876-v2.patch, HBASE-13876-v3.patch, > HBASE-13876-v4.patch, HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The conditions under which the current > DefaultHeapMemoryTuner acts are very rare, so I am trying to weaken these > checks to improve its performance. > Check the current memstore size and current block cache size. If we are using > less than 50% of the currently available block cache size, we say the block cache is > sufficient, and the same for the memstore. This check will be very effective when the > server is either load heavy or write heavy. The earlier version just waited for the > number of evictions / number of flushes to be zero, which is very rare. > Otherwise, based on the percent change in the number of cache misses and number of > flushes, we increase / decrease the memory provided for caching / memstore. After > doing so, on the next call of HeapMemoryTuner we verify that the last change has > indeed decreased the combined number of evictions / flushes. 
I am doing this > analysis by comparing the percent change (which is essentially a > normalized derivative) of the number of evictions and number of flushes during > the last two periods. The main motive for doing this was that if we have random > reads then we will have a lot of cache misses. But even after increasing the > block cache we won't be able to decrease the number of cache misses, so we will > revert the change and eventually will not waste memory on block caches. This > will also help us ignore random short-term spikes in reads / writes. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
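The split the comment above describes, reverting on the hard signal (evictCount) while steering ordinary tuning by the softer signal (missRatio), might look like this sketch. The method names and the simple greater-than revert rule are assumptions for illustration, not the patch's actual code:

```java
public class TunerSignals {
    // Revert only on the hard evidence: if the previous tuning step increased
    // evictions, the cache really is too small for the new split, so undo it.
    static boolean shouldRevert(long evictionsBefore, long evictionsAfter) {
        return evictionsAfter > evictionsBefore;
    }

    // Miss ratio is not proof the cache is undersized (random reads miss no
    // matter what), but it is the quantity ordinary tuning tries to drive down.
    static double missRatio(long misses, long lookups) {
        return lookups == 0 ? 0.0 : (double) misses / lookups;
    }

    public static void main(String[] args) {
        System.out.println(shouldRevert(100, 140)); // prints true: step made evictions worse
        System.out.println(missRatio(25, 100));     // prints 0.25
    }
}
```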
[jira] [Commented] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581227#comment-14581227 ] Abhilash commented on HBASE-13876: -- I am trying to use evictCount to decide whether I should revert or not, because that is a strong indicator that the cache size is insufficient, and I am using missRatio to do the performance tuning, as missRatio is not a direct indicator that the cache size is insufficient, but it is a parameter that we would like to improve for better performance. > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13876-v2.patch, HBASE-13876-v3.patch, > HBASE-13876-v4.patch, HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The conditions under which the current > DefaultHeapMemoryTuner acts are very rare, so I am trying to weaken these > checks to improve its performance. > Check the current memstore size and current block cache size. If we are using > less than 50% of the currently available block cache size, we say the block cache is > sufficient, and the same for the memstore. This check will be very effective when the > server is either load heavy or write heavy. The earlier version just waited for the > number of evictions / number of flushes to be zero, which is very rare. > Otherwise, based on the percent change in the number of cache misses and number of > flushes, we increase / decrease the memory provided for caching / memstore. After > doing so, on the next call of HeapMemoryTuner we verify that the last change has > indeed decreased the combined number of evictions / flushes. 
I am doing this > analysis by comparing the percent change (which is essentially a > normalized derivative) of the number of evictions and number of flushes during > the last two periods. The main motive for doing this was that if we have random > reads then we will have a lot of cache misses. But even after increasing the > block cache we won't be able to decrease the number of cache misses, so we will > revert the change and eventually will not waste memory on block caches. This > will also help us ignore random short-term spikes in reads / writes. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581207#comment-14581207 ] Abhilash commented on HBASE-13876: -- Thanks a lot for the reviews. The point of adding up the percent changes was that in most cases, when we try to decrease eviction counts by increasing the block cache size, the number of flushes increases, so summing them helps create a balance. For example, suppose that after increasing the block cache size we have a significant increase in the number of flushes (more than the tolerance level) but at the same time a very large decrease in cache misses (maybe we were using a cache size just below the working-set size); treating them separately would lead to no tuner operation in such cases. Trying to implement the other suggestions. > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13876-v2.patch, HBASE-13876-v3.patch, > HBASE-13876-v4.patch, HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The conditions under which the current > DefaultHeapMemoryTuner acts are very rare, so I am trying to weaken these > checks to improve its performance. > Check the current memstore size and current block cache size. If we are using > less than 50% of the currently available block cache size, we say the block cache is > sufficient, and the same for the memstore. This check will be very effective when the > server is either load heavy or write heavy. The earlier version just waited for the > number of evictions / number of flushes to be zero, which is very rare. > Otherwise, based on the percent change in the number of cache misses and number of > flushes, we increase / decrease the memory provided for caching / memstore. 
After > doing so, on next call of HeapMemoryTuner we verify that last change has > indeed decreased number of evictions / flush ( combined). I am doing this > analysis by comparing percent change (which is basically nothing but > normalized derivative) of number of evictions and number of flushes during > last two periods. The main motive for doing this was that if we have random > reads then we will be having a lot of cache misses. But even after increasing > block cache we wont be able to decrease number of cache misses and we will > revert back and eventually we will not waste memory on block caches. This > will also help us ignore random short term spikes in reads / writes. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
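The percent-change comparison described in the issue can be sketched roughly as follows. This is an illustrative sketch only, written under stated assumptions: the class name, method names, `Step` values, and the `TOLERANCE` threshold are all hypothetical, not the actual DefaultHeapMemoryTuner API from the patch.

```java
// Hypothetical sketch of the tuning decision described above; all names and
// thresholds are illustrative, not the actual DefaultHeapMemoryTuner code.
public class HeapTunerSketch {

    enum Step { INCREASE_BLOCK_CACHE, INCREASE_MEMSTORE, NEUTRAL }

    // Ignore percent changes smaller than this (the "tolerance" level).
    static final double TOLERANCE = 0.05;

    /** Percent change of a counter over one period: a normalized derivative. */
    static double percentChange(long previous, long current) {
        if (previous == 0) {
            return current == 0 ? 0.0 : 1.0; // growth from zero treated as +100%
        }
        return (double) (current - previous) / previous;
    }

    /**
     * Pick the next tuning step from the percent change in cache misses and
     * in flushes since the last period.
     */
    static Step tune(double cacheMissDelta, double flushDelta) {
        if (cacheMissDelta > flushDelta + TOLERANCE) {
            return Step.INCREASE_BLOCK_CACHE; // reads are suffering more
        }
        if (flushDelta > cacheMissDelta + TOLERANCE) {
            return Step.INCREASE_MEMSTORE;    // writes are suffering more
        }
        return Step.NEUTRAL;                  // within tolerance: do nothing
    }

    /**
     * Verify the last step helped: the summed percent change of evictions and
     * flushes should not have gone up. Summing the two lets a large drop on
     * one side outweigh a small (within-tolerance) rise on the other.
     */
    static boolean shouldRevert(double evictionDelta, double flushDelta) {
        return evictionDelta + flushDelta > TOLERANCE;
    }
}
```

Summing the deltas in `shouldRevert` mirrors the comment above: a huge decrease in cache misses outweighs a modest increase in flushes, so the tuner keeps the change instead of reverting.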
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876:
- Attachment: HBASE-13876-v4.patch

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580846#comment-14580846 ] Abhilash commented on HBASE-13876:
--
Thanks a lot for the reviews. Going to use an enum and introduce a NEUTRAL step to remove unnecessary tuning, which makes better use of the final variable "tolerance". Thinking about how to remove constants like 50%; possibly provide a static final member to adjust the range within which we say the memstore / block cache is sufficient or not. I was using the current values of the eviction count and the number of flushes because they are non-zero if the program enters that block, whereas the previous eviction count and flush count may be zero. Thinking about how to use the previous values without introducing lots of checks.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
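The 50% sufficiency check and the adjustable constant discussed in this comment could look roughly like this. All names here are illustrative assumptions, not the actual patch code.

```java
// Hypothetical sketch; field and method names are illustrative only.
public class SufficiencyCheck {
    // Static final member so the 50% constant is adjustable in one place,
    // as suggested in the comment above.
    static final double SUFFICIENT_FRACTION = 0.5;

    /** The block cache (or memstore) is considered "sufficient" when the
     *  workload uses less than half of what is currently allotted to it. */
    static boolean isSufficient(long usedBytes, long allottedBytes) {
        return usedBytes < SUFFICIENT_FRACTION * allottedBytes;
    }
}
```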
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876:
- Attachment: HBASE-13876-v3.patch

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876:
- Attachment: (was: HBASE-13876-v1.patch)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876:
- Attachment: HBASE-13876-v2.patch

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876:
- Attachment: HBASE-13876-v1.patch

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876:
- Description: (edited)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876:
- Description: (edited)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876:
- Description: (edited)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876: - Description: I am trying to improve the performance of DefaultHeapMemoryTuner by introducing some more checks. The conditions under which the current DefaultHeapMemoryTuner acts are very rarely met, so I am trying to weaken these checks to make it more effective. Check the current memstore size and the current block cache size: if we are using less than 50% of the currently available block cache size, we say the block cache is sufficient, and the same for the memstore. This check will be very effective when the server is either load heavy or write heavy. The earlier version just waited for the number of evictions / number of flushes to be zero, which very rarely happens. Otherwise, based on the percent change in the number of cache misses and the number of flushes, we increase / decrease the memory provided for caching / the memstore. After doing so, on the next call of HeapMemoryTuner we verify that the last change has indeed decreased the number of evictions / flushes (combined). I do this analysis by comparing the percent change (which is essentially a normalized derivative) of the number of evictions and the number of flushes during the last two periods. The main motive is that if we have random reads, then even after increasing the block cache we won't be able to decrease the number of cache misses, and so we will not waste memory on the block cache.

was: I am trying to improve the performance of DefaultHeapMemoryTuner by introducing some more checks. The conditions under which the current DefaultHeapMemoryTuner acts are very rarely met, so I am trying to weaken these checks to make it more effective. Check the current memstore size and the current block cache size: if we are using less than 50% of the currently available block cache size, we say the block cache is sufficient, and the same for the memstore. This check will be very effective when the server is either load heavy or write heavy. The earlier version just waited for the number of evictions / number of flushes to be zero, which very rarely happens. Otherwise, based on the percent change in the number of cache misses and the number of flushes, we increase / decrease the memory provided for caching / the memstore. After doing so, on the next call of HeapMemoryTuner we verify that the last change has indeed decreased the number of evictions / flushes (combined). I do this analysis by comparing the percent change (which is essentially a normalized derivative) of the number of evictions and the number of flushes in the last two periods. The main motive is that if we have random reads, then even after increasing the block cache we won't be able to decrease the number of cache misses, and so we will not waste memory on the block cache.

> Improving performance of HeapMemoryManager
> --
>
> Key: HBASE-13876
> URL: https://issues.apache.org/jira/browse/HBASE-13876
> Project: HBase
> Issue Type: Improvement
> Components: hbase, regionserver
> Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1
> Reporter: Abhilash
> Assignee: Abhilash
> Priority: Minor
> Attachments: HBASE-13876.patch
>
> I am trying to improve the performance of DefaultHeapMemoryTuner by introducing some more checks. The conditions under which the current DefaultHeapMemoryTuner acts are very rarely met, so I am trying to weaken these checks to make it more effective.
> Check the current memstore size and the current block cache size: if we are using less than 50% of the currently available block cache size, we say the block cache is sufficient, and the same for the memstore. This check will be very effective when the server is either load heavy or write heavy. The earlier version just waited for the number of evictions / number of flushes to be zero, which very rarely happens.
> Otherwise, based on the percent change in the number of cache misses and the number of flushes, we increase / decrease the memory provided for caching / the memstore. After doing so, on the next call of HeapMemoryTuner we verify that the last change has indeed decreased the number of evictions / flushes (combined). I do this analysis by comparing the percent change (which is essentially a normalized derivative) of the number of evictions and the number of flushes during the last two periods. The main motive is that if we have random reads, then even after increasing the block cache we won't be able to decrease the number of cache misses, and so we will not waste memory on the block cache.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
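The tuning loop in the description above can be sketched in a few lines. The following is an illustrative Python sketch of the heuristic (the 50%-utilization sufficiency check, then a comparison of percent changes as a normalized derivative over two periods); every name here (`percent_change`, `suggest_step`, the `step` size) is invented for illustration and is not the actual HBase DefaultHeapMemoryTuner API.

```python
# Hypothetical sketch of the tuning heuristic described above.
# Names and the step size are illustrative, not HBase's API.

def percent_change(prev, curr):
    """Normalized derivative: relative change between two periods."""
    if prev == 0:
        return 0.0 if curr == 0 else 1.0
    return (curr - prev) / float(prev)

def suggest_step(memstore_used, memstore_size,
                 cache_used, cache_size,
                 prev_misses, curr_misses,
                 prev_flushes, curr_flushes,
                 step=0.02):
    """Fraction of heap to move: positive grows the block cache
    (shrinking the memstore), negative grows the memstore."""
    # Sufficiency check: under 50% utilization means that side
    # already has more memory than it currently needs.
    memstore_sufficient = memstore_used < 0.5 * memstore_size
    cache_sufficient = cache_used < 0.5 * cache_size
    if memstore_sufficient and cache_sufficient:
        return 0.0      # nothing to tune this round
    if memstore_sufficient:
        return +step    # give spare memstore memory to the cache
    if cache_sufficient:
        return -step    # give spare cache memory to the memstore

    # Both sides under pressure: compare normalized derivatives.
    miss_delta = percent_change(prev_misses, curr_misses)
    flush_delta = percent_change(prev_flushes, curr_flushes)
    if miss_delta > flush_delta:
        return +step    # misses growing faster: grow the cache
    if flush_delta > miss_delta:
        return -step    # flushes growing faster: grow the memstore
    return 0.0
```

On the next invocation, the tuner as described would compare the new percent changes against the previous period and revert the step if the combined evictions/flushes did not actually improve, which is the verification step the description mentions.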
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876: - Attachment: HBASE-13876.patch > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > Attachments: HBASE-13876.patch > > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The current checks under which the > DefaultHeapMemoryTuner works are very rare so I am trying to weaken these > checks to improve its performance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13876) Improving performance of HeapMemoryManager
[ https://issues.apache.org/jira/browse/HBASE-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13876: - Summary: Improving performance of HeapMemoryManager (was: Improving performance of HeapMemoryTunerManager) > Improving performance of HeapMemoryManager > -- > > Key: HBASE-13876 > URL: https://issues.apache.org/jira/browse/HBASE-13876 > Project: HBase > Issue Type: Improvement > Components: hbase, regionserver >Affects Versions: 2.0.0, 1.0.1, 1.1.0, 1.1.1 >Reporter: Abhilash >Assignee: Abhilash >Priority: Minor > > I am trying to improve the performance of DefaultHeapMemoryTuner by > introducing some more checks. The current checks under which the > DefaultHeapMemoryTuner works are very rare so I am trying to weaken these > checks to improve its performance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13876) Improving performance of HeapMemoryTunerManager
Abhilash created HBASE-13876: Summary: Improving performance of HeapMemoryTunerManager Key: HBASE-13876 URL: https://issues.apache.org/jira/browse/HBASE-13876 Project: HBase Issue Type: Improvement Components: hbase, regionserver Affects Versions: 1.1.0, 1.0.1, 2.0.0, 1.1.1 Reporter: Abhilash Assignee: Abhilash Priority: Minor I am trying to improve the performance of DefaultHeapMemoryTuner by introducing some more checks. The current checks under which the DefaultHeapMemoryTuner works are very rare so I am trying to weaken these checks to improve its performance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13847) getWriteRequestCount function in HRegionServer uses int variable to return the count.
[ https://issues.apache.org/jira/browse/HBASE-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578262#comment-14578262 ] Abhilash commented on HBASE-13847: -- Thanks [~tedyu] for review. Thanks [~stack] and [~anoop.hbase] for helping me generate the patch. > getWriteRequestCount function in HRegionServer uses int variable to return > the count. > - > > Key: HBASE-13847 > URL: https://issues.apache.org/jira/browse/HBASE-13847 > Project: HBase > Issue Type: Bug > Components: hbase, regionserver >Affects Versions: 1.0.0 >Reporter: Abhilash >Assignee: Abhilash > Labels: easyfix > Fix For: 2.0.0, 1.0.1, 1.2.0, 1.1.1 > > Attachments: HBASE-13847.patch, HBASE-13847.patch, HBASE-13847.patch, > HBASE-13847.patch, screenshot-1.png > > > Variable used to return the value of getWriteRequestCount is int, must be > long. I think it causes cluster UI to show negative Write Request Count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
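The bug class behind this issue is easy to reproduce: a cumulative write-request count fits comfortably in a 64-bit long, but narrowing it to a 32-bit int wraps it to a negative number once it exceeds 2^31 - 1, which is what makes the cluster UI show a negative Write Request Count. A minimal illustration, using Python's ctypes to emulate Java's 32-bit signed int (the count value is made up):

```python
import ctypes

def to_java_int(value):
    """Emulate narrowing a Java long to a 32-bit signed int."""
    return ctypes.c_int32(value & 0xFFFFFFFF).value

# A busy cluster can easily exceed 2**31 - 1 total write requests.
write_request_count = 3_000_000_000  # fits in a long, not in an int

narrowed = to_java_int(write_request_count)
print(narrowed)  # prints -1294967296: the count wrapped negative
```

Keeping the counter as a long end to end, as the patch does, avoids the narrowing entirely.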
[jira] [Commented] (HBASE-13847) getWriteRequestCount function in HRegionServer uses int variable to return the count.
[ https://issues.apache.org/jira/browse/HBASE-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14577972#comment-14577972 ] Abhilash commented on HBASE-13847: -- Can anybody help me understand how this patch can fail the tests? The error looks unrelated to the patch to me. > getWriteRequestCount function in HRegionServer uses int variable to return > the count. > - > > Key: HBASE-13847 > URL: https://issues.apache.org/jira/browse/HBASE-13847 > Project: HBase > Issue Type: Bug > Components: hbase, regionserver >Affects Versions: 1.0.0 >Reporter: Abhilash >Assignee: Abhilash > Labels: easyfix > Fix For: 2.0.0, 1.0.1, 1.2.0, 1.1.1 > > Attachments: HBASE-13847.patch, HBASE-13847.patch, HBASE-13847.patch, > screenshot-1.png > > > Variable used to return the value of getWriteRequestCount is int, must be > long. I think it causes cluster UI to show negative Write Request Count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13847) getWriteRequestCount function in HRegionServer uses int variable to return the count.
[ https://issues.apache.org/jira/browse/HBASE-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13847: - Attachment: (was: HBASE-13847-v1.patch) > getWriteRequestCount function in HRegionServer uses int variable to return > the count. > - > > Key: HBASE-13847 > URL: https://issues.apache.org/jira/browse/HBASE-13847 > Project: HBase > Issue Type: Bug > Components: hbase, regionserver >Affects Versions: 1.0.0 >Reporter: Abhilash >Assignee: Abhilash > Labels: easyfix > Fix For: 2.0.0, 1.0.1, 1.2.0, 1.1.1 > > Attachments: HBASE-13847.patch, HBASE-13847.patch, HBASE-13847.patch, > screenshot-1.png > > > Variable used to return the value of getWriteRequestCount is int, must be > long. I think it causes cluster UI to show negative Write Request Count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13847) getWriteRequestCount function in HRegionServer uses int variable to return the count.
[ https://issues.apache.org/jira/browse/HBASE-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13847: - Attachment: HBASE-13847-v1.patch > getWriteRequestCount function in HRegionServer uses int variable to return > the count. > - > > Key: HBASE-13847 > URL: https://issues.apache.org/jira/browse/HBASE-13847 > Project: HBase > Issue Type: Bug > Components: hbase, regionserver >Affects Versions: 1.0.0 >Reporter: Abhilash >Assignee: Abhilash > Labels: easyfix > Fix For: 2.0.0, 1.0.1, 1.2.0, 1.1.1 > > Attachments: HBASE-13847-v1.patch, HBASE-13847.patch, > HBASE-13847.patch, HBASE-13847.patch, screenshot-1.png > > > Variable used to return the value of getWriteRequestCount is int, must be > long. I think it causes cluster UI to show negative Write Request Count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13847) getWriteRequestCount function in HRegionServer uses int variable to return the count.
[ https://issues.apache.org/jira/browse/HBASE-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14577562#comment-14577562 ] Abhilash commented on HBASE-13847: -- Sorry for that. The patch was being generated compared to my local commit. > getWriteRequestCount function in HRegionServer uses int variable to return > the count. > - > > Key: HBASE-13847 > URL: https://issues.apache.org/jira/browse/HBASE-13847 > Project: HBase > Issue Type: Bug > Components: hbase, regionserver >Affects Versions: 1.0.0 >Reporter: Abhilash >Assignee: Abhilash > Labels: easyfix > Fix For: 2.0.0, 1.0.1, 1.2.0, 1.1.1 > > Attachments: HBASE-13847.patch, HBASE-13847.patch, HBASE-13847.patch, > screenshot-1.png > > > Variable used to return the value of getWriteRequestCount is int, must be > long. I think it causes cluster UI to show negative Write Request Count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13847) getWriteRequestCount function in HRegionServer uses int variable to return the count.
[ https://issues.apache.org/jira/browse/HBASE-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13847: - Attachment: HBASE-13847.patch > getWriteRequestCount function in HRegionServer uses int variable to return > the count. > - > > Key: HBASE-13847 > URL: https://issues.apache.org/jira/browse/HBASE-13847 > Project: HBase > Issue Type: Bug > Components: hbase, regionserver >Affects Versions: 1.0.0 >Reporter: Abhilash >Assignee: Abhilash > Labels: easyfix > Fix For: 2.0.0, 1.0.1, 1.2.0, 1.1.1 > > Attachments: HBASE-13847.patch, HBASE-13847.patch, HBASE-13847.patch, > screenshot-1.png > > > Variable used to return the value of getWriteRequestCount is int, must be > long. I think it causes cluster UI to show negative Write Request Count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13847) getWriteRequestCount function in HRegionServer uses int variable to return the count.
[ https://issues.apache.org/jira/browse/HBASE-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14577448#comment-14577448 ] Abhilash commented on HBASE-13847: -- Thanks. I did not know I need to do that. > getWriteRequestCount function in HRegionServer uses int variable to return > the count. > - > > Key: HBASE-13847 > URL: https://issues.apache.org/jira/browse/HBASE-13847 > Project: HBase > Issue Type: Bug > Components: hbase, regionserver >Affects Versions: 1.0.0 >Reporter: Abhilash >Assignee: Abhilash > Labels: easyfix > Fix For: 2.0.0, 1.0.1, 1.2.0, 1.1.1 > > Attachments: HBASE-13847.patch, HBASE-13847.patch, screenshot-1.png > > > Variable used to return the value of getWriteRequestCount is int, must be > long. I think it causes cluster UI to show negative Write Request Count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13834) Evict count not properly passed to HeapMemoryTuner.
[ https://issues.apache.org/jira/browse/HBASE-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575549#comment-14575549 ] Abhilash commented on HBASE-13834: -- Thanks a lot [~eclark] for introducing me to HBase. Thanks to [~ted_yu] and [~anoop.hbase] for your reviews. Really excited to contribute to HBase further ^_^ . > Evict count not properly passed to HeapMemoryTuner. > --- > > Key: HBASE-13834 > URL: https://issues.apache.org/jira/browse/HBASE-13834 > Project: HBase > Issue Type: Bug > Components: hbase, regionserver >Affects Versions: 1.0.0 >Reporter: Abhilash >Assignee: Abhilash > Labels: easyfix > Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.1 > > Attachments: HBASE-13834-v1.patch, HBASE-13834.patch > > > Evict count calculated inside the HeapMemoryManager class in the tune function > that is passed to HeapMemoryTuner via TunerContext is miscalculated. It is > supposed to be the evict count between two intervals, but it's not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
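The fix pattern for this class of bug (passing a cumulative counter where a per-interval delta is expected) is to snapshot the previous total and subtract it each period. A small hedged sketch of that pattern, not the actual HeapMemoryManager code:

```python
class IntervalCounter:
    """Turn a monotonically increasing total into per-interval deltas."""

    def __init__(self):
        self._last = 0  # total observed at the end of the previous interval

    def delta(self, current_total):
        # The per-interval value is the difference from the last snapshot,
        # not the cumulative total itself.
        d = current_total - self._last
        self._last = current_total
        return d

evictions = IntervalCounter()
evictions.delta(100)  # first interval: 100 evictions so far
evictions.delta(130)  # second interval: 30 new evictions, not 130
```

Passing the raw cumulative total to the tuner, as the pre-patch code effectively did, makes every interval look worse than the last even when eviction activity is flat.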
[jira] [Updated] (HBASE-13847) getWriteRequestCount function in HRegionServer uses int variable to return the count.
[ https://issues.apache.org/jira/browse/HBASE-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13847: - Labels: easyfix (was: ) > getWriteRequestCount function in HRegionServer uses int variable to return > the count. > - > > Key: HBASE-13847 > URL: https://issues.apache.org/jira/browse/HBASE-13847 > Project: HBase > Issue Type: Bug > Components: hbase, regionserver >Affects Versions: 1.0.0 >Reporter: Abhilash >Assignee: Abhilash > Labels: easyfix > Fix For: 2.0.0, 1.0.1, 1.2.0, 1.1.1 > > Attachments: HBASE-13847.patch, screenshot-1.png > > > Variable used to return the value of getWriteRequestCount is int, must be > long. I think it causes cluster UI to show negative Write Request Count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13834) Evict count not properly passed to HeapMemoryTuner.
[ https://issues.apache.org/jira/browse/HBASE-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13834: - Attachment: HBASE-13834-v1.patch > Evict count not properly passed to HeapMemoryTuner. > --- > > Key: HBASE-13834 > URL: https://issues.apache.org/jira/browse/HBASE-13834 > Project: HBase > Issue Type: Bug > Components: hbase, regionserver >Affects Versions: 1.0.0 >Reporter: Abhilash >Assignee: Abhilash > Labels: easyfix > Fix For: 2.0.0, 1.0.2, 1.1.1 > > Attachments: HBASE-13834-v1.patch, HBASE-13834.patch > > > Evict count calculated inside the HeapMemoryManager class in tune function > that is passed to HeapMemoryTuner via TunerContext is miscalculated. It is > supposed to be Evict count between two intervals but its not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)