[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008442#comment-14008442 ]

Hudson commented on HBASE-9857:
---

SUCCESS: Integrated in HBase-0.98 #314 (See [https://builds.apache.org/job/HBase-0.98/314/])
Amend HBASE-9857 Blockcache prefetch option; add missing license header to correct file this time (apurtell: rev a260c862a71c6433213e6619c4de004250468c83)
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/PrefetchExecutor.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java

Blockcache prefetch option
--

Key: HBASE-9857
URL: https://issues.apache.org/jira/browse/HBASE-9857
Project: HBase
Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
Fix For: 0.99.0, 0.98.3
Attachments: 9857.patch, 9857.patch, HBASE-9857-0.98.patch, HBASE-9857-trunk.patch, HBASE-9857-trunk.patch

Attached patch implements a prefetching function for HFile (v3) blocks, if indicated by a column family or regionserver property. The purpose of this change is to as rapidly after region open as reasonable warm the blockcache with all the data and index blocks of (presumably also in-memory) table data, without counting those block loads as cache misses. Great for fast reads and keeping the cache hit ratio high. Can tune the IO impact versus time until all data blocks are in cache. Works a bit like CompactSplitThread. Makes some effort not to stampede.

I have been using this for setting up various experiments and thought I'd polish it up a bit and throw it out there. If the data to be preloaded will not fit in blockcache, or if as a percentage of blockcache it is large, this is not a good idea, will just blow out the cache and trigger a lot of useless GC activity. Might be useful as an expert tuning option though. Or not.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
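The issue description gives two operational hints: prefetching is a bad idea when the data is large relative to the blockcache, and the prefetcher "makes some effort not to stampede". The following is a minimal sketch of both ideas, not the PrefetchExecutor committed for this issue. The class name, the method names, the 0.5 threshold, and the delay values are all hypothetical and chosen only for illustration.

```java
import java.util.Random;

/** Hypothetical sketch; not the PrefetchExecutor from HBASE-9857. */
final class PrefetchSketch {

    /**
     * Returns true only if the data to preload stays within maxFraction
     * of the block cache. The threshold is an assumption, not a value
     * taken from the patch.
     */
    static boolean worthPrefetching(long dataBytes, long cacheBytes, double maxFraction) {
        return cacheBytes > 0 && dataBytes <= (long) (cacheBytes * maxFraction);
    }

    /**
     * Base delay plus random jitter in [0, jitterMillis), so that files
     * opened at the same moment do not all start reading at once.
     */
    static long startDelayMillis(long baseMillis, long jitterMillis, Random rng) {
        return baseMillis + (long) (rng.nextDouble() * jitterMillis);
    }

    public static void main(String[] args) {
        long cacheBytes = 4L << 30; // assumed 4 GiB block cache
        System.out.println(worthPrefetching(1L << 30, cacheBytes, 0.5)); // 1 GiB of data
        System.out.println(worthPrefetching(3L << 30, cacheBytes, 0.5)); // 3 GiB of data
        System.out.println(startDelayMillis(1000, 1000, new Random()));
    }
}
```

With these assumed numbers the 1 GiB table passes the check and the 3 GiB table does not, and each file's prefetch would start between one and two seconds after open.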
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008453#comment-14008453 ]

Hudson commented on HBASE-9857:
---

SUCCESS: Integrated in HBase-TRUNK #5140 (See [https://builds.apache.org/job/HBase-TRUNK/5140/])
Amend HBASE-9857 Blockcache prefetch option; add missing license header to correct file this time (apurtell: rev be85f89cd4508de5337820a2694d5262c6d69092)
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/PrefetchExecutor.java
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008465#comment-14008465 ]

Hudson commented on HBASE-9857:
---

SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #295 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/295/])
Amend HBASE-9857 Blockcache prefetch option; add missing license header to correct file this time (apurtell: rev a260c862a71c6433213e6619c4de004250468c83)
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/PrefetchExecutor.java
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006247#comment-14006247 ]

Vladimir Rodionov commented on HBASE-9857:
---

[~apurtell], do you take into account that all new blocks are cached in young gen space, which is 25% of overall cache? If you do not read block immediately after write (into cache) it will never get promoted into multi-bucket (50% of a cache) and you will be trashing bottom 25% of a block cache?
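The concern in this comment can be put into rough numbers. The sketch below is a simplified model, not HBase code: it takes the 25% figure for the single-access ("young gen") area from the comment and assumes the cache is full and that share is enforced strictly, which is a stricter rule than a real LRU cache under light load would apply. The class and method names are hypothetical.

```java
/** Hypothetical model of the single-access area described in the comment. */
final class LruPartitionSketch {

    /** Share of the cache for blocks accessed once, per the comment (25%). */
    static final double SINGLE_FRACTION = 0.25;

    /** Bytes available to blocks that have been cached but not yet re-read. */
    static long singleAccessBytes(long cacheBytes) {
        return (long) (cacheBytes * SINGLE_FRACTION);
    }

    /**
     * Prefetched bytes that would be evicted before any re-read, assuming
     * prefetched blocks are never promoted and the share is strict.
     */
    static long evictedBeforeReuse(long prefetchedBytes, long cacheBytes) {
        return Math.max(0L, prefetchedBytes - singleAccessBytes(cacheBytes));
    }

    public static void main(String[] args) {
        long cacheMb = 1000;      // assumed cache size, in MB for readability
        long prefetchedMb = 400;  // assumed prefetched volume
        System.out.println(singleAccessBytes(cacheMb));                // prints 250
        System.out.println(evictedBeforeReuse(prefetchedMb, cacheMb)); // prints 150
    }
}
```

Under these assumptions, prefetching 400 MB into a 1000 MB cache would push 150 MB of freshly loaded blocks back out unless they are read again first.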
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006351#comment-14006351 ]

Andrew Purtell commented on HBASE-9857:
---

bq. If you do not read block immediately after write (into cache) it will never get promoted into multi-bucket (50% of a cache) and you will be trashing bottom 25% of a block cache

We have a separate already existing schema setting for cache-on-write. Otherwise, sure, there's no magic here. It's a tuning option.
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006372#comment-14006372 ]

Hudson commented on HBASE-9857:
---

FAILURE: Integrated in HBase-TRUNK #5137 (See [https://builds.apache.org/job/HBase-TRUNK/5137/])
HBASE-9857 Blockcache prefetch option (apurtell: rev 58818496daad0572843eacbeabfb95bc6af816ee)
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestPrefetch.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
* hbase-shell/src/main/ruby/hbase/admin.rb
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCacheOnWriteInSchema.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CombinedBlockCache.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHeapMemoryManager.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV3.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/PrefetchExecutor.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/bucket/TestBucketCache.java
Amend HBASE-9857 Blockcache prefetch option; add missing license header (apurtell: rev 264725d59274374d7b9c8ee2b47a86713ab1a6b8)
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006685#comment-14006685 ]

Hudson commented on HBASE-9857:
---

FAILURE: Integrated in HBase-0.98 #311 (See [https://builds.apache.org/job/HBase-0.98/311/])
HBASE-9857 Blockcache prefetch option (apurtell: rev 9b9f4df87a432778cc6f16becfeeea72e507e526)
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV3.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCacheOnWriteInSchema.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CombinedBlockCache.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/PrefetchExecutor.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/RandomSeek.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/bucket/TestBucketCache.java
* hbase-shell/src/main/ruby/hbase/admin.rb
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestPrefetch.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
Amend HBASE-9857 Blockcache prefetch option; add missing license header (apurtell: rev f87fc3cea43698e9f417b12aabb7bd4f7932155a)
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004641#comment-14004641 ]

Nick Dimiduk commented on HBASE-9857:
---

This is looking very nice. I think we'll want to add a similar prefetch feature for index and bloom blocks, but that can come in a later revision.

+1
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004770#comment-14004770 ]

Andrew Purtell commented on HBASE-9857:
---

The prefetch loads all blocks except meta blocks, so that includes indexes and blooms.
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004778#comment-14004778 ]

Nick Dimiduk commented on HBASE-9857:
---

Yes of course; I wasn't clear in my comment. I'm suggesting future work would enable only prefetching non-data blocks. Thus guaranteeing indices are always warm. One would want this when you know data size > cache size.
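The two behaviours discussed in this exchange (the committed one, which loads everything except meta blocks, and the proposed one, which would load only non-data blocks) differ only in which block kinds are selected. A minimal sketch of that selection follows. The enum values and the mode names are hypothetical and do not correspond to HBase's own block type enumeration or to any option in the patch.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of choosing which block kinds to prefetch. */
final class PrefetchFilterSketch {

    /** Simplified block kinds; not HBase's real block type list. */
    enum Kind { DATA, INDEX, BLOOM, META }

    /** ALL_BUT_META mirrors the thread; NON_DATA_ONLY is the proposed follow-up. */
    enum Mode { ALL_BUT_META, NON_DATA_ONLY }

    static Set<Kind> kindsToPrefetch(Mode mode) {
        // Start from every kind except META, as described for the patch.
        EnumSet<Kind> kinds = EnumSet.complementOf(EnumSet.of(Kind.META));
        if (mode == Mode.NON_DATA_ONLY) {
            // Proposed variant: keep indexes and blooms warm, skip data.
            kinds.remove(Kind.DATA);
        }
        return kinds;
    }

    public static void main(String[] args) {
        System.out.println(kindsToPrefetch(Mode.ALL_BUT_META));
        System.out.println(kindsToPrefetch(Mode.NON_DATA_ONLY));
    }
}
```

The second mode is the case where the data is known not to fit in the cache but index and bloom blocks are small enough to keep resident.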
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004787#comment-14004787 ]

Andrew Purtell commented on HBASE-9857:
---

Commit on hold until the SVN-GIT migration is finished.
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004790#comment-14004790 ]

Andrew Purtell commented on HBASE-9857:
---

Ah right, sounds good Nick.
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003338#comment-14003338 ]

Hadoop QA commented on HBASE-9857:
---

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12645671/HBASE-9857-trunk.patch
against trunk revision .
ATTACHMENT ID: 12645671

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 26 new or modified tests.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages.

{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).

{color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
+ public HFileReaderV3(final Path path, FixedFileTrailer trailer, final FSDataInputStreamWrapper fsdis,
+ family.setPrefetchBlocksOnOpen(JBoolean.valueOf(arg.delete(org.apache.hadoop.hbase.HColumnDescriptor::PREFETCH_BLOCKS_ON_OPEN))) if arg.include?(org.apache.hadoop.hbase.HColumnDescriptor::PREFETCH_BLOCKS_ON_OPEN)

{color:green}+1 site{color}. The mvn site goal succeeds with this patch.

{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.client.TestHCM

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/9547//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9547//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9547//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9547//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9547//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9547//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9547//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9547//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9547//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9547//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9547//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/9547//console

This message is automatically generated.
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003557#comment-14003557 ]

Andrew Purtell commented on HBASE-9857:
---

Looks like the new findbugs warning is from the TTL pretty printing patch: Result of integer multiplication cast to long in org.apache.hadoop.hbase.util.PrettyPrinter.humanReadableTTL(long)
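For readers unfamiliar with this Findbugs warning: it flags an expression where two int values are multiplied and only the result is widened to long, so the multiplication can overflow before the widening happens. The code of PrettyPrinter.humanReadableTTL is not shown in this thread, so the following is a generic illustration of the pattern with hypothetical method names, not the flagged HBase code.

```java
/** Generic illustration of "integer multiplication cast to long". */
final class MultiplyCastSketch {

    /** Multiplies in 32-bit int arithmetic, then widens: can overflow. */
    static long secondsToMillisBuggy(int seconds) {
        return seconds * 1000;
    }

    /** The long literal forces 64-bit multiplication: no overflow here. */
    static long secondsToMillisFixed(int seconds) {
        return seconds * 1000L;
    }

    public static void main(String[] args) {
        int thirtyDays = 30 * 24 * 60 * 60; // 2,592,000 seconds
        System.out.println(secondsToMillisBuggy(thirtyDays)); // prints -1702967296
        System.out.println(secondsToMillisFixed(thirtyDays)); // prints 2592000000
    }
}
```

Making one operand a long before multiplying is the usual way to clear the warning.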
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14003558#comment-14003558 ] Andrew Purtell commented on HBASE-9857: --- Going to commit this to trunk and 0.98 this evening unless there are objections.
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14003952#comment-14003952 ] Hadoop QA commented on HBASE-9857: --
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12645814/HBASE-9857-trunk.patch
  against trunk revision .
  ATTACHMENT ID: 12645814
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 27 new or modified tests.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages.
{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.
{color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).
{color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
  + public HFileReaderV3(final Path path, FixedFileTrailer trailer, final FSDataInputStreamWrapper fsdis,
  + family.setPrefetchBlocksOnOpen(JBoolean.valueOf(arg.delete(org.apache.hadoop.hbase.HColumnDescriptor::PREFETCH_BLOCKS_ON_OPEN))) if arg.include?(org.apache.hadoop.hbase.HColumnDescriptor::PREFETCH_BLOCKS_ON_OPEN)
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
  org.apache.hadoop.hbase.client.TestHCM
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/9556//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9556//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9556//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9556//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9556//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9556//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9556//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9556//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9556//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9556//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9556//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/9556//console
This message is automatically generated.
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004346#comment-14004346 ] ramkrishna.s.vasudevan commented on HBASE-9857: --- PrefetchExecutor is missing the license header. Maybe rename it to HfileBlockPrefetchExecutor? (Just a nit.) On an exception during prefetch, should cancel() be called in the catch block?
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004397#comment-14004397 ] Andrew Purtell commented on HBASE-9857: --- Thanks for noticing the missing header. Will add it. I'm on the phone so don't have the patch in front of me. I think cancel() is not needed because we remove the entry for the prefetch upon exception but will double check tomorrow.
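To make the cancel() question concrete, here is a rough sketch of the bookkeeping being discussed. It is not the PrefetchExecutor from the patch: the class PrefetchSketch, its methods, the String path keys, and the random start delay (one assumed way of "not stampeding") are all hypothetical. It only illustrates why removing the map entry when a task fails may make a separate cancel() in the catch block unnecessary: by the time the exception is caught, the task is already finishing.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and structure are assumptions, not HBase code.
public class PrefetchSketch {
    private final Map<String, Future<?>> pending = new ConcurrentHashMap<>();
    private final ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(2);
    private final long maxDelayMillis; // assumed to be >= 0

    public PrefetchSketch(long maxDelayMillis) {
        this.maxDelayMillis = maxDelayMillis;
        pool.setRemoveOnCancelPolicy(true); // drop cancelled tasks from the queue
    }

    // Schedule a prefetch after a random delay so that many files opened
    // at once do not all start reading at the same moment.
    public void request(String path, Runnable loadBlocks) {
        long delay = ThreadLocalRandom.current().nextLong(maxDelayMillis + 1);
        pending.computeIfAbsent(path, p -> pool.schedule(() -> {
            try {
                loadBlocks.run();
            } catch (RuntimeException e) {
                System.err.println("Prefetch failed for " + p + ": " + e);
            } finally {
                // Entry is removed on success and on failure alike, so a
                // failed task needs no extra cancel() call.
                pending.remove(p);
            }
        }, delay, TimeUnit.MILLISECONDS));
    }

    // Explicit cancellation, e.g. when the file is closed before prefetch ends.
    public void cancel(String path) {
        Future<?> f = pending.remove(path);
        if (f != null) {
            f.cancel(true);
        }
    }

    public boolean isPending(String path) {
        return pending.containsKey(path);
    }

    // Stop accepting work and wait for already scheduled tasks to finish.
    public boolean shutdownAndWait(long timeoutMillis) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        PrefetchSketch sketch = new PrefetchSketch(50);
        sketch.request("hfile-a", () -> { throw new IllegalStateException("simulated read error"); });
        sketch.shutdownAndWait(5_000);
        System.out.println("still pending after failure: " + sketch.isPending("hfile-a"));
    }
}
```

In this sketch cancel() is reserved for callers that want to stop a prefetch that has not failed, such as closing the file early. Whether the real patch behaves the same way is exactly what the comment above says still needs to be double checked.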
[jira] [Commented] (HBASE-9857) Blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14002561#comment-14002561 ] Andrew Purtell commented on HBASE-9857: --- Attached updated patches.