vinayakphegde commented on code in PR #5866:
URL: https://github.com/apache/hbase/pull/5866#discussion_r1593422387


##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterImpl.java:
##########
@@ -555,16 +563,33 @@ private void writeInlineBlocks(boolean closing) throws 
IOException {
   private void doCacheOnWrite(long offset) {
     cacheConf.getBlockCache().ifPresent(cache -> {
       HFileBlock cacheFormatBlock = blockWriter.getBlockForCaching(cacheConf);
+      BlockCacheKey key = buildBlockCacheKey(offset, cacheFormatBlock);
+      if (!shouldCacheBlock(cache, key)) {

Review Comment:
   > which would go all the way to fetch the file metadata and decide if the 
file is hot or not.
   
   Yes, it will look up the Configuration which corresponds to the store to 
which this file belongs to from the map. However, it won't read any 
file-related metadata, since we are already passing the maxTimestamp here.
   
   > Would this be called per block?
   Yes, it is called for every block.
   
   > would it impact the performance adversely?
   Yes, I think it will add some performance overhead.
   
   > Can we somehow restrict these traversals to once per file instead of once 
per block.
   I couldn't think of any better solution at the moment.



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterImpl.java:
##########
@@ -555,16 +563,33 @@ private void writeInlineBlocks(boolean closing) throws 
IOException {
   private void doCacheOnWrite(long offset) {
     cacheConf.getBlockCache().ifPresent(cache -> {
       HFileBlock cacheFormatBlock = blockWriter.getBlockForCaching(cacheConf);
+      BlockCacheKey key = buildBlockCacheKey(offset, cacheFormatBlock);
+      if (!shouldCacheBlock(cache, key)) {

Review Comment:
   > which would go all the way to fetch the file metadata and decide if the 
file is hot or not.
   
   Yes, it will look up the Configuration which corresponds to the store to 
which this file belongs to from the map. However, it won't read any 
file-related metadata, since we are already passing the maxTimestamp here.
   
   > Would this be called per block?
   
   Yes, it is called for every block.
   
   > would it impact the performance adversely?
   
   Yes, I think it will add some performance overhead.
   
   > Can we somehow restrict these traversals to once per file instead of once 
per block.
   
   I couldn't think of any better solution at the moment.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to