vinayakphegde commented on code in PR #5866: URL: https://github.com/apache/hbase/pull/5866#discussion_r1593422387
########## hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterImpl.java: ########## @@ -555,16 +563,33 @@ private void writeInlineBlocks(boolean closing) throws IOException { private void doCacheOnWrite(long offset) { cacheConf.getBlockCache().ifPresent(cache -> { HFileBlock cacheFormatBlock = blockWriter.getBlockForCaching(cacheConf); + BlockCacheKey key = buildBlockCacheKey(offset, cacheFormatBlock); + if (!shouldCacheBlock(cache, key)) { Review Comment: > which would go all the way to fetch the file metadata and decide if the file is hot or not. Yes, it will look up the Configuration which corresponds to the store to which this file belongs to from the map. However, it won't read any file-related metadata, since we are already passing the maxTimestamp here. > Would this be called per block? Yes, it is called for every block. > would it impact the performance adversely? Yes, I think it will add some performance overhead. > Can we somehow restrict these traversals to once per file instead of once per block. I couldn't think of any better solution at the moment. ########## hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterImpl.java: ########## @@ -555,16 +563,33 @@ private void writeInlineBlocks(boolean closing) throws IOException { private void doCacheOnWrite(long offset) { cacheConf.getBlockCache().ifPresent(cache -> { HFileBlock cacheFormatBlock = blockWriter.getBlockForCaching(cacheConf); + BlockCacheKey key = buildBlockCacheKey(offset, cacheFormatBlock); + if (!shouldCacheBlock(cache, key)) { Review Comment: > which would go all the way to fetch the file metadata and decide if the file is hot or not. Yes, it will look up the Configuration which corresponds to the store to which this file belongs to from the map. However, it won't read any file-related metadata, since we are already passing the maxTimestamp here. > Would this be called per block? Yes, it is called for every block. > would it impact the performance adversely? Yes, I think it will add some performance overhead. > Can we somehow restrict these traversals to once per file instead of once per block. I couldn't think of any better solution at the moment. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org