bharathv commented on a change in pull request #1257: URL: https://github.com/apache/hbase/pull/1257#discussion_r437181864
########## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java ########## @@ -947,7 +1031,94 @@ public void run() { } LruBlockCache cache = this.cache.get(); if (cache == null) break; - cache.evict(); + bytesFreed = cache.evict(); + /* + * Sometimes we are reading more data than can fit into BlockCache, + * which causes a high rate of evictions. + * This in turn leads to heavy garbage collection work. + * So a lot of blocks are put into BlockCache but never read, + * wasting a lot of CPU resources. + * Here we will analyze how many bytes were freed and decide + * whether the time has come to reduce the amount of cached blocks. + * It helps avoid putting too many blocks into BlockCache + * when evict() is very active, saving CPU for other jobs. + * More details: https://issues.apache.org/jira/browse/HBASE-23887 + */ + + // First of all we have to track how much time + // has passed since the previous evict() was launched. + // This should be almost the same interval (+/- 10s), + // because then we get comparable volumes of freed bytes each time. + // 10s because this is the default period between evict() runs (see this.wait above) + long stopTime = System.currentTimeMillis(); + if (stopTime - startTime <= 1000 * 10 - 1) { + mbFreedSum += bytesFreed/1024/1024; + } else { + // Here we have to determine what situation we are in. + // We have the limit "hbase.lru.cache.heavy.eviction.bytes.size.limit" + // and can calculate the overhead relative to it. + // We will use this information to decide + // how to change the percentage of cached blocks. 
+ freedDataOverheadPercent = + (int) (mbFreedSum * 100 / cache.heavyEvictionMbSizeLimit) - 100; + if (freedDataOverheadPercent > 0) { + // Now we are in the situation where we are above the limit, + // but maybe we should ignore it because it will end quite soon. + heavyEvictionCount++; + if (heavyEvictionCount > cache.heavyEvictionCountLimit) { + // It has been going on for a long time and we have to reduce caching + // now. So we calculate here how many blocks we want to skip. + // It depends on: + // 1. Overhead - if the overhead is big we can reduce + // the amount of cached blocks more aggressively. + // 2. How fast we want to get the result. If we know that + // heavy reading will last a long time, we don't want to wait and can + // increase the coefficient and get good performance quite soon. + // But if we are not sure, we can do it slowly, which could prevent + // premature exit from this mode. So, when the coefficient is + // higher we can get better performance when heavy reading is stable. + // But when reading is changing we can adjust to it and set + // the coefficient to a lower value. + int ch = (int) (freedDataOverheadPercent * cache.heavyEvictionOverheadCoefficient); + // But practice shows that a 15% reduction is quite enough. + // We are not greedy (being greedy could lead to premature exit). + ch = ch > 15 ? 15 : ch; Review comment: Well, what's the use of the coefficient if you are capping the max reduction at 15%? How often did you hit this in your benchmarks? ########## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java ########## @@ -947,7 +1031,94 @@ public void run() { } LruBlockCache cache = this.cache.get(); if (cache == null) break; - cache.evict(); + bytesFreed = cache.evict(); + /* + * Sometimes we are reading more data than can fit into BlockCache, + * which causes a high rate of evictions. + * This in turn leads to heavy garbage collection work. 
+ * So a lot of blocks are put into BlockCache but never read, + * wasting a lot of CPU resources. + * Here we will analyze how many bytes were freed and decide + * whether the time has come to reduce the amount of cached blocks. + * It helps avoid putting too many blocks into BlockCache + * when evict() is very active, saving CPU for other jobs. + * More details: https://issues.apache.org/jira/browse/HBASE-23887 + */ + + // First of all we have to track how much time + // has passed since the previous evict() was launched. + // This should be almost the same interval (+/- 10s), + // because then we get comparable volumes of freed bytes each time. + // 10s because this is the default period between evict() runs (see this.wait above) + long stopTime = System.currentTimeMillis(); + if (stopTime - startTime <= 1000 * 10 - 1) { + mbFreedSum += bytesFreed/1024/1024; + } else { + // Here we have to determine what situation we are in. + // We have the limit "hbase.lru.cache.heavy.eviction.bytes.size.limit" + // and can calculate the overhead relative to it. + // We will use this information to decide + // how to change the percentage of cached blocks. + freedDataOverheadPercent = + (int) (mbFreedSum * 100 / cache.heavyEvictionMbSizeLimit) - 100; + if (freedDataOverheadPercent > 0) { + // Now we are in the situation where we are above the limit, + // but maybe we should ignore it because it will end quite soon. + heavyEvictionCount++; + if (heavyEvictionCount > cache.heavyEvictionCountLimit) { + // It has been going on for a long time and we have to reduce caching + // now. So we calculate here how many blocks we want to skip. + // It depends on: + // 1. Overhead - if the overhead is big we can reduce + // the amount of cached blocks more aggressively. + // 2. How fast we want to get the result. If we know that + // heavy reading will last a long time, we don't want to wait and can + // increase the coefficient and get good performance quite soon. 
+ // But if we are not sure, we can do it slowly, which could prevent + // premature exit from this mode. So, when the coefficient is + // higher we can get better performance when heavy reading is stable. + // But when reading is changing we can adjust to it and set + // the coefficient to a lower value. + int ch = (int) (freedDataOverheadPercent * cache.heavyEvictionOverheadCoefficient); + // But practice shows that a 15% reduction is quite enough. + // We are not greedy (being greedy could lead to premature exit). + ch = ch > 15 ? 15 : ch; + ch = ch < 0 ? 0 : ch; // I think this will never happen, but check to be sure + // So this is the key point: here we are reducing the % of cached blocks + cache.cacheDataBlockPercent -= ch; + // If we go down too deep we have to stop here; at least 1% should always remain. + cache.cacheDataBlockPercent = + cache.cacheDataBlockPercent < 1 ? 1 : cache.cacheDataBlockPercent; + } + } else { + // Well, we have got overshooting. + // Maybe it is just a short-term fluctuation and we can stay in this mode. + // It helps avoid premature exit during short-term fluctuations. + // If the overshooting is less than 90%, we will try to increase the percent of + // cached blocks and hope it is enough. + if (mbFreedSum >= cache.heavyEvictionMbSizeLimit * 0.1) { Review comment: How did you come up with 0.1? This is critical because this branch controls how the cache recovers to a normal state after heavy eviction is over. ########## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java ########## @@ -153,6 +153,22 @@ private static final String LRU_MAX_BLOCK_SIZE = "hbase.lru.max.block.size"; private static final long DEFAULT_MAX_BLOCK_SIZE = 16L * 1024L * 1024L; + private static final String LRU_CACHE_HEAVY_EVICTION_COUNT_LIMIT + = "hbase.lru.cache.heavy.eviction.count.limit"; + // The default value effectively disables the performance-boosting feature. 
+ // Because 2147483647 is about ~680 years (only after that would it start to work). + // We can set it to 0-10 to get the benefit right away. + // (see details https://issues.apache.org/jira/browse/HBASE-23887). + private static final int DEFAULT_LRU_CACHE_HEAVY_EVICTION_COUNT_LIMIT = 2147483647; Review comment: nit: Use Integer.MAX_VALUE? ########## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java ########## @@ -153,6 +153,18 @@ private static final String LRU_MAX_BLOCK_SIZE = "hbase.lru.max.block.size"; private static final long DEFAULT_MAX_BLOCK_SIZE = 16L * 1024L * 1024L; + private static final String LRU_CACHE_HEAVY_EVICTION_COUNT_LIMIT + = "hbase.lru.cache.heavy.eviction.count.limit"; + private static final int DEFAULT_LRU_CACHE_HEAVY_EVICTION_COUNT_LIMIT = 10; + + private static final String LRU_CACHE_HEAVY_EVICTION_MB_SIZE_LIMIT + = "hbase.lru.cache.heavy.eviction.mb.size.limit"; + private static final long DEFAULT_LRU_CACHE_HEAVY_EVICTION_MB_SIZE_LIMIT = 500; + + private static final String LRU_CACHE_HEAVY_EVICTION_OVERHEAD_COEFFICIENT + = "hbase.lru.cache.heavy.eviction.overhead.coefficient"; + private static final float DEFAULT_LRU_CACHE_HEAVY_EVICTION_OVERHEAD_COEFFICIENT = 0.01f; + /** Review comment: Of course, why wouldn't it :-D ########## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java ########## @@ -947,7 +1031,94 @@ public void run() { } LruBlockCache cache = this.cache.get(); if (cache == null) break; - cache.evict(); + bytesFreed = cache.evict(); + /* + * Sometimes we are reading more data than can fit into BlockCache, + * which causes a high rate of evictions. + * This in turn leads to heavy garbage collection work. + * So a lot of blocks are put into BlockCache but never read, + * wasting a lot of CPU resources. + * Here we will analyze how many bytes were freed and decide + * whether the time has come to reduce the amount of cached blocks. 
+ * It helps avoid putting too many blocks into BlockCache + * when evict() is very active, saving CPU for other jobs. + * More details: https://issues.apache.org/jira/browse/HBASE-23887 + */ + + // First of all we have to track how much time + // has passed since the previous evict() was launched. + // This should be almost the same interval (+/- 10s), + // because then we get comparable volumes of freed bytes each time. + // 10s because this is the default period between evict() runs (see this.wait above) + long stopTime = System.currentTimeMillis(); + if (stopTime - startTime <= 1000 * 10 - 1) { + mbFreedSum += bytesFreed/1024/1024; + } else { + // Here we have to determine what situation we are in. + // We have the limit "hbase.lru.cache.heavy.eviction.bytes.size.limit" + // and can calculate the overhead relative to it. + // We will use this information to decide + // how to change the percentage of cached blocks. + freedDataOverheadPercent = + (int) (mbFreedSum * 100 / cache.heavyEvictionMbSizeLimit) - 100; + if (freedDataOverheadPercent > 0) { + // Now we are in the situation where we are above the limit, + // but maybe we should ignore it because it will end quite soon. + heavyEvictionCount++; + if (heavyEvictionCount > cache.heavyEvictionCountLimit) { + // It has been going on for a long time and we have to reduce caching + // now. So we calculate here how many blocks we want to skip. + // It depends on: + // 1. Overhead - if the overhead is big we can reduce + // the amount of cached blocks more aggressively. + // 2. How fast we want to get the result. If we know that + // heavy reading will last a long time, we don't want to wait and can + // increase the coefficient and get good performance quite soon. + // But if we are not sure, we can do it slowly, which could prevent + // premature exit from this mode. So, when the coefficient is + // higher we can get better performance when heavy reading is stable. + // But when reading is changing we can adjust to it and set + // the coefficient to a lower value. 
+ int ch = (int) (freedDataOverheadPercent * cache.heavyEvictionOverheadCoefficient); + // But practice shows that a 15% reduction is quite enough. + // We are not greedy (being greedy could lead to premature exit). + ch = ch > 15 ? 15 : ch; + ch = ch < 0 ? 0 : ch; // I think this will never happen, but check to be sure + // So this is the key point: here we are reducing the % of cached blocks + cache.cacheDataBlockPercent -= ch; + // If we go down too deep we have to stop here; at least 1% should always remain. + cache.cacheDataBlockPercent = + cache.cacheDataBlockPercent < 1 ? 1 : cache.cacheDataBlockPercent; + } + } else { + // Well, we have got overshooting. + // Maybe it is just a short-term fluctuation and we can stay in this mode. + // It helps avoid premature exit during short-term fluctuations. + // If the overshooting is less than 90%, we will try to increase the percent of + // cached blocks and hope it is enough. + if (mbFreedSum >= cache.heavyEvictionMbSizeLimit * 0.1) { + // Simple logic: more overshooting - more cached blocks (backpressure) + int ch = (int) (-freedDataOverheadPercent * 0.1 + 1); Review comment: This is very cryptic and convoluted. Simplify? Also how did you come up with the math here? ########## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java ########## @@ -947,7 +1019,47 @@ public void run() { } LruBlockCache cache = this.cache.get(); if (cache == null) break; - cache.evict(); + bytesFreed = cache.evict(); + long stopTime = System.currentTimeMillis(); + // Control of BlockCache during heavy cleaning. + // It helps avoid putting too many blocks into BlockCache + // when evict() is very active. + if (stopTime - startTime <= 1000 * 10 - 1) { + mbFreedSum += bytesFreed/1024/1024; + } else { + freedDataOverheadPercent = Review comment: Well, I think this is convoluted. ``` ((x*100)/y) - 100 > 0 (x*100)/y > 100 x/y > 1 x > y ``` if else can be simple comparison `if (mbFreedSum > heavyEvictionMbSizeLimit)`, much more readable. 
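A quick standalone check of this algebra — class and method names here are hypothetical, not from the patch. Note that because the patch uses integer division, the two forms can disagree within 1% of the limit, so the simplification is safe only away from that boundary:

```java
// Hypothetical sketch comparing the patch's percent-based check with the
// reviewer's suggested direct comparison. Names are illustrative only.
public class OverheadCheck {

  // The patch's form: derive an overhead percent, then test it against 0.
  static boolean viaPercent(long mbFreedSum, long heavyEvictionMbSizeLimit) {
    int freedDataOverheadPercent =
        (int) (mbFreedSum * 100 / heavyEvictionMbSizeLimit) - 100;
    return freedDataOverheadPercent > 0;
  }

  // The reviewer's suggested form: a direct comparison.
  static boolean viaComparison(long mbFreedSum, long heavyEvictionMbSizeLimit) {
    return mbFreedSum > heavyEvictionMbSizeLimit;
  }

  public static void main(String[] args) {
    // Away from the boundary the two forms agree.
    System.out.println(viaPercent(600, 500) == viaComparison(600, 500)); // true
    System.out.println(viaPercent(400, 500) == viaComparison(400, 500)); // true
    // Right at the boundary, integer division makes them differ:
    // 501 * 100 / 500 == 100 (truncated), so viaPercent is false
    // while viaComparison is true.
    System.out.println(viaPercent(501, 500));    // false
    System.out.println(viaComparison(501, 500)); // true
  }
}
```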
You can compute the ratio x/y later if you need it. ########## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java ########## @@ -947,7 +1031,94 @@ public void run() { } LruBlockCache cache = this.cache.get(); if (cache == null) break; - cache.evict(); + bytesFreed = cache.evict(); + /* + * Sometimes we are reading more data than can fit into BlockCache, + * which causes a high rate of evictions. + * This in turn leads to heavy garbage collection work. + * So a lot of blocks are put into BlockCache but never read, + * wasting a lot of CPU resources. + * Here we will analyze how many bytes were freed and decide + * whether the time has come to reduce the amount of cached blocks. + * It helps avoid putting too many blocks into BlockCache + * when evict() is very active, saving CPU for other jobs. + * More details: https://issues.apache.org/jira/browse/HBASE-23887 + */ + + // First of all we have to track how much time + // has passed since the previous evict() was launched. + // This should be almost the same interval (+/- 10s), + // because then we get comparable volumes of freed bytes each time. + // 10s because this is the default period between evict() runs (see this.wait above) + long stopTime = System.currentTimeMillis(); + if (stopTime - startTime <= 1000 * 10 - 1) { + mbFreedSum += bytesFreed/1024/1024; + } else { + // Here we have to determine what situation we are in. + // We have the limit "hbase.lru.cache.heavy.eviction.bytes.size.limit" + // and can calculate the overhead relative to it. + // We will use this information to decide, + // how to change the percentage of cached blocks. 
+ freedDataOverheadPercent = + (int) (mbFreedSum * 100 / cache.heavyEvictionMbSizeLimit) - 100; + if (freedDataOverheadPercent > 0) { + // Now we are in the situation where we are above the limit, + // but maybe we should ignore it because it will end quite soon. + heavyEvictionCount++; + if (heavyEvictionCount > cache.heavyEvictionCountLimit) { + // It has been going on for a long time and we have to reduce caching + // now. So we calculate here how many blocks we want to skip. + // It depends on: + // 1. Overhead - if the overhead is big we can reduce + // the amount of cached blocks more aggressively. + // 2. How fast we want to get the result. If we know that + // heavy reading will last a long time, we don't want to wait and can + // increase the coefficient and get good performance quite soon. + // But if we are not sure, we can do it slowly, which could prevent + // premature exit from this mode. So, when the coefficient is + // higher we can get better performance when heavy reading is stable. + // But when reading is changing we can adjust to it and set + // the coefficient to a lower value. + int ch = (int) (freedDataOverheadPercent * cache.heavyEvictionOverheadCoefficient); + // But practice shows that a 15% reduction is quite enough. + // We are not greedy (being greedy could lead to premature exit). + ch = ch > 15 ? 15 : ch; + ch = ch < 0 ? 0 : ch; // I think this will never happen, but check to be sure + // So this is the key point: here we are reducing the % of cached blocks + cache.cacheDataBlockPercent -= ch; + // If we go down too deep we have to stop here; at least 1% should always remain. + cache.cacheDataBlockPercent = + cache.cacheDataBlockPercent < 1 ? 1 : cache.cacheDataBlockPercent; + } + } else { + // Well, we have got overshooting. + // Maybe it is just a short-term fluctuation and we can stay in this mode. + // It helps avoid premature exit during short-term fluctuations. 
+ // If the overshooting is less than 90%, we will try to increase the percent of + // cached blocks and hope it is enough. + if (mbFreedSum >= cache.heavyEvictionMbSizeLimit * 0.1) { + // Simple logic: more overshooting - more cached blocks (backpressure) + int ch = (int) (-freedDataOverheadPercent * 0.1 + 1); + cache.cacheDataBlockPercent += ch; + // But it can't be more than 100%, so check it. + cache.cacheDataBlockPercent = Review comment: Use Math.max() or Math.min() (multiple places with ternary operators). ########## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java ########## @@ -947,7 +1031,94 @@ public void run() { } LruBlockCache cache = this.cache.get(); if (cache == null) break; - cache.evict(); + bytesFreed = cache.evict(); + /* + * Sometimes we are reading more data than can fit into BlockCache, + * which causes a high rate of evictions. + * This in turn leads to heavy garbage collection work. + * So a lot of blocks are put into BlockCache but never read, + * wasting a lot of CPU resources. + * Here we will analyze how many bytes were freed and decide + * whether the time has come to reduce the amount of cached blocks. + * It helps avoid putting too many blocks into BlockCache + * when evict() is very active, saving CPU for other jobs. + * More details: https://issues.apache.org/jira/browse/HBASE-23887 + */ + + // First of all we have to track how much time + // has passed since the previous evict() was launched. + // This should be almost the same interval (+/- 10s), + // because then we get comparable volumes of freed bytes each time. + // 10s because this is the default period between evict() runs (see this.wait above) + long stopTime = System.currentTimeMillis(); + if (stopTime - startTime <= 1000 * 10 - 1) { + mbFreedSum += bytesFreed/1024/1024; + } else { + // Here we have to determine what situation we are in. + // We have the limit "hbase.lru.cache.heavy.eviction.bytes.size.limit" + // and can calculate the overhead relative to it. 
+ // We will use this information to decide + // how to change the percentage of cached blocks. + freedDataOverheadPercent = + (int) (mbFreedSum * 100 / cache.heavyEvictionMbSizeLimit) - 100; + if (freedDataOverheadPercent > 0) { + // Now we are in the situation where we are above the limit, + // but maybe we should ignore it because it will end quite soon. + heavyEvictionCount++; + if (heavyEvictionCount > cache.heavyEvictionCountLimit) { + // It has been going on for a long time and we have to reduce caching + // now. So we calculate here how many blocks we want to skip. + // It depends on: + // 1. Overhead - if the overhead is big we can reduce + // the amount of cached blocks more aggressively. + // 2. How fast we want to get the result. If we know that + // heavy reading will last a long time, we don't want to wait and can + // increase the coefficient and get good performance quite soon. + // But if we are not sure, we can do it slowly, which could prevent + // premature exit from this mode. So, when the coefficient is + // higher we can get better performance when heavy reading is stable. + // But when reading is changing we can adjust to it and set + // the coefficient to a lower value. + int ch = (int) (freedDataOverheadPercent * cache.heavyEvictionOverheadCoefficient); Review comment: nit: use a better variable name than ch? ########## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java ########## @@ -947,7 +1019,47 @@ public void run() { } LruBlockCache cache = this.cache.get(); if (cache == null) break; - cache.evict(); + bytesFreed = cache.evict(); + long stopTime = System.currentTimeMillis(); + // Control of BlockCache during heavy cleaning. + // It helps avoid putting too many blocks into BlockCache + // when evict() is very active. + if (stopTime - startTime <= 1000 * 10 - 1) { Review comment: Oh I see.. looks like you intend to compute every 10s, but the problem is evictions can happen out of sync (look for callers of evict). 
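One way the accounting could be made robust to such irregular intervals is to normalize the freed volume by the real elapsed time rather than assuming a fixed 10s window — a hedged sketch only; the `EvictionRateTracker` class and its API are hypothetical, not part of the patch:

```java
// Hypothetical sketch: track freed MB per real elapsed second, so windows of
// 10s, 12s or 13s remain comparable even when evict() fires out of sync.
public class EvictionRateTracker {
  private long windowStartMs;
  private long mbFreedInWindow = 0;

  public EvictionRateTracker(long startMs) {
    this.windowStartMs = startMs;
  }

  /**
   * Record MB freed by one evict() run at time nowMs. Once at least 10s of
   * real time has elapsed, return the freed-data rate in MB/s and reset the
   * window; until then return -1.
   */
  public double record(long mbFreed, long nowMs) {
    mbFreedInWindow += mbFreed;
    long elapsedMs = nowMs - windowStartMs;
    if (elapsedMs < 10_000) {
      return -1; // window still open
    }
    // Divide by the real elapsed time, not an assumed 10s.
    double mbPerSec = mbFreedInWindow * 1000.0 / elapsedMs;
    windowStartMs = nowMs;
    mbFreedInWindow = 0;
    return mbPerSec;
  }
}
```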
For example in a timeline, it can happen at 10s, 12s, 23s etc. So at the end of 23s the MBs computed correspond to a total of 13s (between the 1st and the last). ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org