[ https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564311#comment-14564311 ]
Anoop Sam John commented on HBASE-12295:
----------------------------------------

bq. We'll have to dig in on why. You'd think w/ less intermediaries that it would be faster.

It should be the cost at the socket layer: we would need N transfers instead of one. The one-time transfer was looking better even though it needs a temp copy.

Regarding knowing whether it is L1 or L2 by looking at the key: whether a block came from L1 or L2 is actually a state of HFileBlock. We have added this as an enum, L1/L2/NOT_CACHED. Based on this type, we decide at the HFileScanner layer (on close) whether to call return on the BlockCache. Within the BlockCache impl we might also need to know the type; this is for CombinedBC. If it is L2, we call the BucketCache return, else we call the LRU cache return. So if we also add the L1/L2 info to BlockCacheKey, I am not sure whether that looks clean. BlockCacheKey is something we create while fetching the block from the BC. On return, we could just pass the info by setting it in BlockCacheKey; it would then act only as a carrier. Or maybe we can use the HFileBlock object alone in the return API? Using a key we got an object from the cache, and we return *that* object back to the cache. It is always possible to make the BlockCacheKey from the HFileBlock.

bq. You going to mark the object as from L2 or something

Yes. HFileBlock will contain state info on whether it is from L1, from L2, or a NOT_CACHED one. With CombinedBC, the HFileReader asks the cache for a block and gets back an HFileBlock, so we cannot tell whether it came from L1 or L2. So it is better to set this as state info in HFileBlock.

On carrying the cellBlock in Result, I am not sure. At the HRegion level, get() returns a Result but the scanner returns a List of Cells. Then at the RsRpcServer level we call it in a loop to build as many rows/results as the caching/max size limit allows. Even if we make the scan path return a Result as well, it adds the overhead of creating a smaller cellBlock buffer for each of the rows.
So in the end we would have to deal with many smaller block buffers. For the scan case it will be better to collect all the rows and then make a single cellBlock at once. Making sense?

Agree with your point of not passing RPC stuff even down to the HRegion level. We have to see what else we can do to return this payload.

I think I now get what you have in mind with finalize/close on Result and handling things that way. Right now, when we get a block from the BC, we increase its ref count by 1, meaning one scanner is working on it. With this suggestion, whenever we create a cell from this block we would have to increment the ref count again, something like Java-style ref counting. The only question is that Result/Cell is a client-side thing, and I am not sure how we can add the server-only BlockCache/HFileBlock to it... But this would have avoided the most copying.. Thinking more...

> Prevent block eviction under us if reads are in progress from the BBs
> ---------------------------------------------------------------------
>
>                 Key: HBASE-12295
>                 URL: https://issues.apache.org/jira/browse/HBASE-12295
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver, Scanners
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 2.0.0
>
>         Attachments: HBASE-12295.pdf, HBASE-12295_trunk.patch
>
>
> While we try to serve the reads from the BBs directly from the block cache,
> we need to ensure that the blocks does not get evicted under us while
> reading. This JIRA is to discuss and implement a strategy for the same.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
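
To make the "return the block itself" idea from the comment concrete, here is a minimal Java sketch. It is not the HBase code or the attached patch: `CacheTier`, `BlockSketch`, `TierSketch` and `CombinedCacheSketch` are hypothetical names, a plain `String` stands in for what a BlockCacheKey would carry, and the locking is deliberately simplistic. It shows only the two points discussed: the block carries its own L1/L2/NOT_CACHED state so the combined cache can route a return without extra info in the key, and a tier refuses to evict a block whose ref count is above zero.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the L1/L2/NOT_CACHED state described in the comment. */
enum CacheTier { L1, L2, NOT_CACHED }

/** Hypothetical block that carries its own cache tier and a reference count. */
final class BlockSketch {
    final String key;                       // assumption: stands in for the cache key
    CacheTier tier = CacheTier.NOT_CACHED;  // set by the tier that caches the block
    int refCount;                           // guarded by the owning tier's lock

    BlockSketch(String key) { this.key = key; }
}

/** One cache tier; it will not evict a block while readers still hold it. */
final class TierSketch {
    private final CacheTier tier;
    private final Map<String, BlockSketch> blocks = new HashMap<>();

    TierSketch(CacheTier tier) { this.tier = tier; }

    synchronized void cache(BlockSketch block) {
        block.tier = tier;
        blocks.put(block.key, block);
    }

    /** Hands out the block and records one more reader working on it. */
    synchronized BlockSketch get(String key) {
        BlockSketch block = blocks.get(key);
        if (block != null) {
            block.refCount++;
        }
        return block;
    }

    /** Called when a reader (for example a scanner on close) is done with the block. */
    synchronized void returnBlock(BlockSketch block) {
        if (block.refCount == 0) {
            throw new IllegalStateException("block returned more often than fetched");
        }
        block.refCount--;
    }

    /** Evicts only when no reader holds the block; returns whether it was removed. */
    synchronized boolean evict(String key) {
        BlockSketch block = blocks.get(key);
        if (block == null || block.refCount > 0) {
            return false;
        }
        blocks.remove(key);
        return true;
    }
}

/** Combined cache: the return call is routed using the block's own tier state. */
final class CombinedCacheSketch {
    final TierSketch l1 = new TierSketch(CacheTier.L1);
    final TierSketch l2 = new TierSketch(CacheTier.L2);

    BlockSketch get(String key) {
        BlockSketch block = l1.get(key);
        return block != null ? block : l2.get(key);
    }

    void returnBlock(BlockSketch block) {
        switch (block.tier) {
            case L1: l1.returnBlock(block); break;
            case L2: l2.returnBlock(block); break;
            default: break;  // NOT_CACHED: nothing to hand back
        }
    }
}
```

In this sketch the caller never needs to know which tier served the block; it fetches by key and hands the same object back. Per-cell ref counting, as floated for Result/Cell, would mean extra increments on the same counter and is left out here.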