[GitHub] [hbase] pustota2009 commented on a change in pull request #2934: HBASE-23887 AdaptiveLRU cache
pustota2009 commented on a change in pull request #2934:
URL: https://github.com/apache/hbase/pull/2934#discussion_r573208568

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruAdaptiveBlockCache.java

## @@ -0,0 +1,1442 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.io.hfile;
+
+import static java.util.Objects.requireNonNull;
+
+import java.lang.ref.WeakReference;
+import java.util.EnumMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.SortedSet;
+import java.util.TreeSet;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicLong;
+import java.util.concurrent.atomic.LongAdder;
+import java.util.concurrent.locks.ReentrantLock;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.io.HeapSize;
+import org.apache.hadoop.hbase.io.encoding.DataBlockEncoding;
+import org.apache.hadoop.hbase.util.ClassSize;
+import org.apache.hadoop.util.StringUtils;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hbase.thirdparty.com.google.common.base.MoreObjects;
+import org.apache.hbase.thirdparty.com.google.common.base.Objects;
+import org.apache.hbase.thirdparty.com.google.common.util.concurrent.ThreadFactoryBuilder;
+
+/**
+ * This implementation improves the performance of the classical LRU
+ * cache by up to 3 times by reducing GC load.
+ *
+ * The classical block cache implementation that is memory-aware using {@link HeapSize},
+ * memory-bound using an LRU eviction algorithm, and concurrent: backed by a
+ * {@link ConcurrentHashMap} and with a non-blocking eviction thread giving constant-time
+ * {@link #cacheBlock} and {@link #getBlock} operations.
+ *
+ * Contains three levels of block priority to allow for scan-resistance and in-memory families
+ * {@link org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder#setInMemory(boolean)} (an
+ * in-memory column family is a column family that should be served from memory if possible):
+ * single-access, multiple-access, and in-memory priority. A block is added with an in-memory
+ * priority flag if {@link org.apache.hadoop.hbase.client.ColumnFamilyDescriptor#isInMemory()};
+ * otherwise a block becomes single-access priority the first time it is read into this block
+ * cache. If a block is accessed again while in cache, it is marked as a multiple-access priority
+ * block. This delineation of blocks is used to prevent scans from thrashing the cache, adding a
+ * least-frequently-used element to the eviction algorithm.
+ *
+ * Each priority is given its own chunk of the total cache to ensure fairness during eviction. Each
+ * priority will retain close to its maximum size; however, if any priority is not using its entire
+ * chunk, the others are able to grow beyond their chunk size.
+ *
+ * Instantiated at a minimum with the total size and average block size. All sizes are in bytes. The
+ * block size is not especially important as this cache is fully dynamic in its sizing of blocks. It
+ * is only used for pre-allocating data structures and in initial heap estimation of the map.
+ *
+ * The detailed constructor defines the sizes for the three priorities (they should total to the
+ * maximum size defined). It also sets the levels that trigger and control the eviction thread.
+ *
+ * The acceptable size is the cache size level which triggers the eviction process to start. It
+ * evicts enough blocks to get the size below the minimum size specified.
+ *
+ * Eviction happens in a separate thread and involves a single full scan of the map. It determines
+ * how many bytes must be freed to reach the minimum size, and then while scanning determines the
+ * fewest least-recently-used blocks necessary from each of the three priorities (would be 3 times
+ * bytes to free). It then uses the priority chunk sizes to evict fairly according to the relative
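The three-priority eviction scheme the Javadoc describes can be sketched in plain Java. This is a minimal illustration, not the HBase code: the class and field names, the 25/50/25 chunk split, and the "evict from the most-overflowing bucket" heuristic are assumptions made for the sketch only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ThreePriorityLruSketch {

  /** One LRU bucket per priority; access-order iteration yields the LRU block first. */
  static final class Bucket {
    final long chunkSize;              // this priority's fair share, in bytes
    long bytes;                        // current occupancy, in bytes
    final LinkedHashMap<String, Long> blocks =
        new LinkedHashMap<>(16, 0.75f, true);   // true = access order

    Bucket(long chunkSize) { this.chunkSize = chunkSize; }

    void add(String key, long size) { blocks.put(key, size); bytes += size; }

    /** How far above its fair share this bucket is (> 1 means overflowing). */
    double overflowRatio() { return (double) bytes / chunkSize; }

    /** Evict the least-recently-used block and return its size. */
    long evictOne() {
      Map.Entry<String, Long> lru = blocks.entrySet().iterator().next();
      blocks.remove(lru.getKey());
      bytes -= lru.getValue();
      return lru.getValue();
    }
  }

  final Bucket single, multi, memory;

  ThreePriorityLruSketch(long totalSize) {
    // Assumed split; the real cache takes these fractions from configuration.
    single = new Bucket((long) (totalSize * 0.25));
    multi  = new Bucket((long) (totalSize * 0.50));
    memory = new Bucket((long) (totalSize * 0.25));
  }

  /** Free at least bytesToFree, always taking from the most-overflowing bucket. */
  long evict(long bytesToFree) {
    long freed = 0;
    while (freed < bytesToFree) {
      Bucket worst = null;
      for (Bucket b : new Bucket[] { single, multi, memory }) {
        if (!b.blocks.isEmpty()
            && (worst == null || b.overflowRatio() > worst.overflowRatio())) {
          worst = b;
        }
      }
      if (worst == null) break;        // nothing left to evict
      freed += worst.evictOne();
    }
    return freed;
  }

  public static void main(String[] args) {
    ThreePriorityLruSketch cache = new ThreePriorityLruSketch(1000);
    // Overfill the single-access bucket relative to its 250-byte fair share.
    cache.single.add("s1", 200);
    cache.single.add("s2", 200);
    cache.multi.add("m1", 100);
    long freed = cache.evict(300);
    // The single bucket overflows most, so its LRU blocks are taken first.
    System.out.println("freed=" + freed + " singleLeft=" + cache.single.bytes);
  }
}
```

The key property the sketch shows is the fairness rule from the Javadoc: a priority that stays under its chunk is left alone, while a priority that has grown beyond its chunk pays for the eviction first.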
[GitHub] [hbase] pustota2009 commented on a change in pull request #2934: HBASE-23887 AdaptiveLRU cache
pustota2009 commented on a change in pull request #2934:
URL: https://github.com/apache/hbase/pull/2934#discussion_r573208486

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruAdaptiveBlockCache.java

(quotes the same @@ -0,0 +1,1442 @@ hunk as the first comment above)
[GitHub] [hbase] pustota2009 commented on a change in pull request #2934: HBASE-23887 AdaptiveLRU cache
pustota2009 commented on a change in pull request #2934:
URL: https://github.com/apache/hbase/pull/2934#discussion_r573208223

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruAdaptiveBlockCache.java

(quotes the same @@ -0,0 +1,1442 @@ hunk as the first comment above)
[GitHub] [hbase] pustota2009 commented on a change in pull request #2934: HBASE-23887 AdaptiveLRU cache
pustota2009 commented on a change in pull request #2934:
URL: https://github.com/apache/hbase/pull/2934#discussion_r573208070

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruAdaptiveBlockCache.java

(quotes the same @@ -0,0 +1,1442 @@ hunk as the first comment above)
[GitHub] [hbase] pustota2009 commented on a change in pull request #2934: HBASE-23887 AdaptiveLRU cache
pustota2009 commented on a change in pull request #2934:
URL: https://github.com/apache/hbase/pull/2934#discussion_r573207460

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruAdaptiveBlockCache.java

(quotes the same @@ -0,0 +1,1442 @@ hunk as the first comment above)
[GitHub] [hbase] pustota2009 commented on a change in pull request #2934: HBASE-23887 AdaptiveLRU cache
pustota2009 commented on a change in pull request #2934:
URL: https://github.com/apache/hbase/pull/2934#discussion_r573207051

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruAdaptiveBlockCache.java

(quotes the same @@ -0,0 +1,1442 @@ hunk as the first comment above)
[GitHub] [hbase] pustota2009 commented on a change in pull request #2934: HBASE-23887 AdaptiveLRU cache
pustota2009 commented on a change in pull request #2934: URL: https://github.com/apache/hbase/pull/2934#discussion_r573206391 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruAdaptiveBlockCache.java ## @@ -0,0 +1,1442 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.hadoop.hbase.io.hfile; + +import static java.util.Objects.requireNonNull; + +import java.lang.ref.WeakReference; +import java.util.EnumMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.PriorityQueue; +import java.util.SortedSet; +import java.util.TreeSet; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.Executors; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.atomic.AtomicLong; +import java.util.concurrent.atomic.LongAdder; +import java.util.concurrent.locks.ReentrantLock; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.hbase.io.HeapSize; +import org.apache.hadoop.hbase.io.encoding.DataBlockEncoding; +import org.apache.hadoop.hbase.util.ClassSize; +import org.apache.hadoop.util.StringUtils; +import org.apache.yetus.audience.InterfaceAudience; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.hbase.thirdparty.com.google.common.base.MoreObjects; +import org.apache.hbase.thirdparty.com.google.common.base.Objects; +import org.apache.hbase.thirdparty.com.google.common.util.concurrent.ThreadFactoryBuilder; + +/** + * This realisation improve performance of classical LRU + * cache up to 3 times via reduce GC job. + * + * The classical block cache implementation that is memory-aware using {@link HeapSize}, + * memory-bound using an + * LRU eviction algorithm, and concurrent: backed by a {@link ConcurrentHashMap} and with a + * non-blocking eviction thread giving constant-time {@link #cacheBlock} and {@link #getBlock} + * operations. 
+ *
+ * Contains three levels of block priority to allow for scan-resistance and in-memory families
+ * {@link org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder#setInMemory(boolean)} (an
+ * in-memory column family is a column family that should be served from memory if possible):
+ * single-access, multiple-access, and in-memory priority. A block is added with an in-memory
+ * priority flag if {@link org.apache.hadoop.hbase.client.ColumnFamilyDescriptor#isInMemory()};
+ * otherwise a block is given single-access priority the first time it is read into this block
+ * cache. If a block is accessed again while in cache, it is marked as a multiple-access priority
+ * block. This delineation of blocks is used to prevent scans from thrashing the cache, by adding a
+ * least-frequently-used element to the eviction algorithm.
+ *
+ * Each priority is given its own chunk of the total cache to ensure fairness during eviction. Each
+ * priority will retain close to its maximum size; however, if any priority is not using its entire
+ * chunk, the others are able to grow beyond their chunk size.
+ *
+ * Instantiated at a minimum with the total size and average block size. All sizes are in bytes. The
+ * block size is not especially important, as this cache is fully dynamic in its sizing of blocks.
+ * It is only used for pre-allocating data structures and in the initial heap estimation of the map.
+ *
+ * The detailed constructor defines the sizes for the three priorities (they should total to the
+ * maximum size defined). It also sets the levels that trigger and control the eviction
+ * thread.
+ *
+ * The acceptable size is the cache size level that triggers the eviction process to
+ * start. It evicts enough blocks to get the size below the minimum size specified.
+ *
+ * Eviction happens in a separate thread and involves a single full scan of the map.
It determines
+ * how many bytes must be freed to reach the minimum size, and then while scanning determines the
+ * fewest least-recently-used blocks necessary from each of the three priorities (which could be up
+ * to 3 times the bytes to free). It then uses the priority chunk sizes to evict fairly according
+ * to the relative sizes and usage.
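The single/multi/in-memory priority scheme described in the javadoc above can be sketched as follows. This is a standalone toy, not HBase's actual classes; the names `BlockPriority` and `CachedBlock` mirror HBase's but the bodies here are illustrative only.

```java
// Hypothetical sketch of the three-level priority scheme (not HBase's real code).
enum BlockPriority { SINGLE, MULTI, MEMORY }

class CachedBlock {
  private BlockPriority priority;

  CachedBlock(boolean inMemoryFamily) {
    // Blocks from in-memory column families get MEMORY priority immediately;
    // everything else starts as single-access on first read into the cache.
    this.priority = inMemoryFamily ? BlockPriority.MEMORY : BlockPriority.SINGLE;
  }

  // On a repeat cache hit, a single-access block is promoted to multi-access.
  // MEMORY blocks keep their priority.
  void access() {
    if (priority == BlockPriority.SINGLE) {
      priority = BlockPriority.MULTI;
    }
  }

  BlockPriority getPriority() {
    return priority;
  }
}
```

The promotion on second access is what gives scan-resistance: a one-off scan fills only the single-access chunk and cannot displace blocks that have proven reuse.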
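The "evict fairly according to the relative sizes" step can be pictured as each priority bucket computing how far it overflows its configured chunk, with eviction repeatedly taking from whichever bucket currently overflows most. The sketch below is an assumption-laden toy (the `Bucket` fields, `step` granularity, and sizes are invented for illustration), not the actual HBase eviction code.

```java
import java.util.PriorityQueue;

// Illustrative bucket: one per priority, with a configured fair share (chunk).
class Bucket {
  final String name;
  final long chunkSize;   // this priority's fair share of the cache, in bytes
  long currentSize;       // bytes currently held by this priority

  Bucket(String name, long chunkSize, long currentSize) {
    this.name = name;
    this.chunkSize = chunkSize;
    this.currentSize = currentSize;
  }

  long overflow() {
    return currentSize - chunkSize;
  }
}

class FairEviction {
  // Frees up to bytesToFree in total, always taking the next step from the
  // bucket that most exceeds its chunk; returns the bytes actually freed.
  static long evict(Bucket[] buckets, long bytesToFree, long step) {
    PriorityQueue<Bucket> q =
        new PriorityQueue<>((a, b) -> Long.compare(b.overflow(), a.overflow()));
    for (Bucket b : buckets) {
      q.add(b);
    }
    long freed = 0;
    while (freed < bytesToFree && !q.isEmpty()) {
      Bucket b = q.poll();
      if (b.overflow() <= 0) {
        break; // no bucket exceeds its fair share any more
      }
      long take = Math.min(step, Math.min(b.overflow(), bytesToFree - freed));
      b.currentSize -= take;
      freed += take;
      q.add(b); // re-insert so its updated overflow is re-ranked
    }
    return freed;
  }
}
```

A bucket under its chunk is never touched, which is how an underused priority lets the others grow beyond their nominal shares without being penalized at eviction time.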
pustota2009 commented on a change in pull request #2934:
URL: https://github.com/apache/hbase/pull/2934#discussion_r573206341
pustota2009 commented on a change in pull request #2934:
URL: https://github.com/apache/hbase/pull/2934#discussion_r573206052
pustota2009 commented on a change in pull request #2934:
URL: https://github.com/apache/hbase/pull/2934#discussion_r573205983