Wow, this explanation is really detailed. That helps me a lot! I totally understand the read process now. Thanks a million.

Thanks,
Alex

2018-05-02 22:33 GMT-07:00 ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com>:

> Regarding the read flow, this is what happens:
>
> 1) Create a region-level scanner.
> 2) The region-level scanner can comprise more than one store scanner
>    (each store scanner works on one column family).
> 3) Every store scanner will comprise a memstore scanner and a set of HFile
>    scanners (based on the number of store files).
> 4) The scan tries to read data in lexicographical order.
>
> For example, for simplicity, say you have row1 to row5 and there is only one
> column family 'f1' and one column 'c1'. Assume row1 was already written and
> has been flushed to a store file. Row2 to row5 are in the memstore.
> When the scanner starts, it forms a heap with all these memstore scanners
> and store file (HFile) scanners. Internally, since row1 is smaller
> lexicographically, row1 from the store file is retrieved first. The first
> time, this row1 will be read from HDFS (and not from the block cache). The
> remaining rows are fetched from the memstore scanners. There is no block
> cache concept at the memstore level. The memstore is just a simple key-value map.
>
> When the same scan is issued the next time, we go through the above steps,
> but to fetch row1, the store file scanner that has row1 fetches the block
> that has row1 from the block cache (instead of HDFS) and returns the value
> from the block cache. The remaining rows are again fetched by the memstore
> scanners from the underlying memstore.
>
> Hope this helps.
>
> Regards
> Ram
>
> On Thu, May 3, 2018 at 9:17 AM, Xi Yang <alex.xi.y...@gmail.com> wrote:
>
> > Hi Tim,
> >
> > Thanks for confirming the question. That question confused me for a long
> > time. I really appreciate it.
> >
> > About the other question, I still don't know whether Model A is correct or
> > Model B is correct.
> > Still confused.
> >
> > Thanks,
> > Alex
> >
> > 2018-05-02 13:53 GMT-07:00 Tim Robertson <timrobertson...@gmail.com>:
> >
> > > Thanks Alex,
> > >
> > > Yes, looking at that code I believe you are correct - the memStore scanner
> > > is appended after the block scanners.
> > > The block scanners may or may not see hits in the block cache when they
> > > read. If they don't get a hit, they'll open the block from the underlying
> > > HFile(s).
> > >
> > > On Wed, May 2, 2018 at 10:41 PM, Xi Yang <alex.xi.y...@gmail.com> wrote:
> > >
> > > > Hi Tim,
> > > >
> > > > Thank you for the detailed explanation. Yes, that really helps me! I
> > > > really appreciate it!
> > > >
> > > > But I am still confused about the sequence.
> > > >
> > > > I've read this code in HStore.getScanners:
> > > >
> > > >     // TODO this used to get the store files in descending order,
> > > >     // but now we get them in ascending order, which I think is
> > > >     // actually more correct, since memstore get put at the end.
> > > >     List<StoreFileScanner> sfScanners =
> > > >         StoreFileScanner.getScannersForStoreFiles(storeFilesToScan,
> > > >             cacheBlocks, usePread, isCompaction, false, matcher, readPt);
> > > >     List<KeyValueScanner> scanners = new ArrayList<>(sfScanners.size() + 1);
> > > >     scanners.addAll(sfScanners);
> > > >     // Then the memstore scanners
> > > >     scanners.addAll(memStoreScanners);
> > > >
> > > > Does this mean that this step:
> > > >
> > > > *2) It looks in the memstore to see if there are any writes still in
> > > > memory ready to flush down to the HFiles that need to be merged with the
> > > > data read in 1)*
> > > >
> > > > comes after the following step?
> > > >
> > > > *c) the data is read from the opened block*
> > > >
> > > > Here is an explanation of the images I drew before, so that we don't
> > > > need the images:
> > > >
> > > > When a read request comes in
> > > >
> > > > Model A
> > > >
> > > > 1. Get the scanners (including StoreScanner and MemStoreScanner).
> > > >    MemStoreScanner is the last one
> > > > 2. Begin with the first StoreScanner
> > > > 3. Try to get the block from the BlockCache of the StoreScanner
> > > > 4. Try to get the block from the HFile of the StoreScanner
> > > > 5. Go to the next StoreScanner
> > > > 6. Loop #2 - #5 until all StoreScanners have been used
> > > > 7. Try to get the block from the memStore
> > > >
> > > > Model B
> > > >
> > > > 1. Try to get the block from the BlockCache; if that fails, go to #2
> > > > 2. Get the scanners (including StoreScanner and MemStoreScanner).
> > > >    MemStoreScanner is the last one
> > > > 3. Begin with the first StoreScanner
> > > > 4. Try to get the block from the HFile of the StoreScanner
> > > > 5. Go to the next StoreScanner
> > > > 6. Loop #4 - #5 until all StoreScanners have been used
> > > > 7. Try to get the block from the memStore
> > > >
> > > > Thanks,
> > > > Alex
> > > >
> > > > 2018-05-02 1:04 GMT-07:00 Tim Robertson <timrobertson...@gmail.com>:
> > > >
> > > > > Hi Alex,
> > > > >
> > > > > I'm not sure I fully follow your question without the images, but
> > > > > I'll try and help.
> > > > >
> > > > > When a read request comes in, my understanding of the order of
> > > > > execution is as follows (perhaps someone can verify this):
> > > > >
> > > > > 1) It looks in the block cache for the cells (this is a read-only
> > > > >    cache containing recently read data)
> > > > > 2) It looks in the memstore to see if there are any writes still in
> > > > >    memory, ready to flush down to the HFiles, that need to be merged
> > > > >    with the data read in 1)
> > > > > 3) Only if not found, it starts locating the data from HFiles (note,
> > > > >    there can be multiple files per region until major compaction runs,
> > > > >    which merges them into 1 per column family, discarding stale data
> > > > >    where possible)
> > > > >    a) It uses bloom filters and the block cache indexes to locate the
> > > > >       target blocks (these are part of the HFiles, but read into
> > > > >       memory when the region servers start)
> > > > >    b) Those target blocks are then opened and occupy space in the
> > > > >       block cache on the region server (possibly evicting other blocks)
> > > > >    c) The data is read from the opened block
> > > > >
> > > > > Does that help at all?
> > > > >
> > > > > Thanks,
> > > > > Tim
> > > > >
> > > > > On Wed, May 2, 2018 at 9:49 AM, Xi Yang <alex.xi.y...@gmail.com> wrote:
> > > > >
> > > > > > OK, I got it. I've understood Q2 with your help, thanks!
> > > > > >
> > > > > > Seems like I have to use some other way to draw my images. Here is
> > > > > > the updated version of Q1:
> > > > > >
> > > > > > Q1
> > > > > >
> > > > > > I found that HFileScannerImpl.getCachedBlock(...) gets a block from
> > > > > > the BlockCache. This CachedBlock is used by StoreFileScanner. Does
> > > > > > that mean the read model is like:
> > > > > >
> > > > > > *Model A*
> > > > > >
> > > > > > When a read request comes in
> > > > > >
> > > > > > 1. Read 1st Store:
> > > > > >    a. read BlockCache
> > > > > >    b. read HFile
> > > > > > 2. Read 2nd Store:
> > > > > >    a. read BlockCache
> > > > > >    b. read HFile
> > > > > > 3. ......
> > > > > > 4. Read Memstore
> > > > > >
> > > > > > Or is there only one BlockCache, with all read requests going
> > > > > > through it first, like:
> > > > > >
> > > > > > *Model B:*
> > > > > >
> > > > > > When a read request comes in
> > > > > >
> > > > > > 1. Read BlockCache
> > > > > > 2. Read 1st Store -> read HFile
> > > > > > 3. Read 2nd Store -> read HFile
> > > > > > 4. ....
> > > > > > 5. Read Memstore
> > > > > >
> > > > > > Thanks,
> > > > > > Alex
> > > > > >
> > > > > > 2018-05-01 20:04 GMT-07:00 Josh Elser <els...@apache.org>:
> > > > > >
> > > > > > > FYI, the mailing list strips images.
> > > > > > >
> > > > > > > There is only one BlockCache per RS. Not sure if that answers
> > > > > > > your Q1 in its entirety though.
> > > > > > >
> > > > > > > Q2. The "blocks" in "BlockCache" are the blocks that make up the
> > > > > > > HBase HFiles in HDFS. Data in the Memstore does not yet exist in
> > > > > > > HFiles on HDFS. Additionally, the Memstore is already in memory;
> > > > > > > no need to have a different cache to accomplish the same thing :)
> > > > > > >
> > > > > > > On 5/1/18 9:25 PM, Xi Yang wrote:
> > > > > > >
> > > > > > >> Sorry to bother you guys. May I ask 2 questions about HBase?
> > > > > > >>
> > > > > > >> Q1
> > > > > > >>
> > > > > > >> I found that |HFileScannerImpl.getCachedBlock(...)| gets a block
> > > > > > >> from the BlockCache. This CachedBlock is used by |StoreFileScanner|.
> > > > > > >> Does that mean the read model is like:
> > > > > > >>
> > > > > > >> *Model A*
> > > > > > >>
> > > > > > >> Or is there only one BlockCache, with all read requests going
> > > > > > >> through it first, like:
> > > > > > >>
> > > > > > >> *Model B:*
> > > > > > >>
> > > > > > >>
> > > > > > >> Q2
> > > > > > >> If data has been read from the Memstore, will it be put in the
> > > > > > >> BlockCache to accelerate the read process next time?
> > > > > > >>
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> Alex
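
The read flow Ram describes at the top of this thread (one heap over the store file scanners plus the memstore scanner, with the block cache consulted only by the store file scanners) can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not HBase code: `ReadFlowSketch`, `RowScanner`, `openStoreFile`, `BLOCK_CACHE` and `hdfsReads` are hypothetical names, each store file is treated as a single block, the cache lookup happens when the scanner is opened rather than lazily on seek, and versions, timestamps and deletes are ignored.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Toy model of the read flow; none of these types are real HBase classes.
class ReadFlowSketch {
    // Stand-in for the single block cache shared by all store files on a region server.
    static final Map<String, TreeMap<String, String>> BLOCK_CACHE = new HashMap<>();
    // Counts how many times a block had to be loaded from the simulated HFile on HDFS.
    static int hdfsReads = 0;

    // A scanner positioned on its current row; the heap orders scanners by that row key.
    static final class RowScanner {
        final String source;
        final Iterator<Map.Entry<String, String>> rows;
        Map.Entry<String, String> current;

        RowScanner(String source, TreeMap<String, String> data) {
            this.source = source;
            this.rows = data.entrySet().iterator();
            advance();
        }

        void advance() {
            current = rows.hasNext() ? rows.next() : null;
        }
    }

    // Store file scanners try the block cache first and fall back to the HFile on a miss.
    static RowScanner openStoreFile(String fileName, TreeMap<String, String> onDisk) {
        TreeMap<String, String> block = BLOCK_CACHE.get(fileName);
        if (block != null) {
            return new RowScanner("blockcache", block);
        }
        hdfsReads++;                       // cache miss: simulate reading the block from HDFS
        block = new TreeMap<>(onDisk);
        BLOCK_CACHE.put(fileName, block);  // the block is now cached for later scans
        return new RowScanner("hfile", block);
    }

    static List<String> scan(Map<String, TreeMap<String, String>> storeFiles,
                             TreeMap<String, String> memstore) {
        // Store file scanners first, memstore scanner last (the order seen in getScanners).
        List<RowScanner> scanners = new ArrayList<>();
        for (Map.Entry<String, TreeMap<String, String>> f : storeFiles.entrySet()) {
            scanners.add(openStoreFile(f.getKey(), f.getValue()));
        }
        scanners.add(new RowScanner("memstore", memstore)); // the memstore never uses the block cache

        // List order does not decide read order: the heap always yields the smallest row key.
        PriorityQueue<RowScanner> heap =
            new PriorityQueue<>(Comparator.comparing((RowScanner s) -> s.current.getKey()));
        for (RowScanner s : scanners) {
            if (s.current != null) {
                heap.add(s);
            }
        }

        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            RowScanner s = heap.poll();
            out.add(s.current.getKey() + "<-" + s.source);
            s.advance();
            if (s.current != null) {
                heap.add(s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<String, String> hfile = new TreeMap<>();
        hfile.put("row1", "v1");                      // row1 was flushed to a store file
        TreeMap<String, String> memstore = new TreeMap<>();
        for (int i = 2; i <= 5; i++) {
            memstore.put("row" + i, "v" + i);         // row2..row5 are still in the memstore
        }
        Map<String, TreeMap<String, String>> storeFiles = new LinkedHashMap<>();
        storeFiles.put("hfile-1", hfile);

        System.out.println(scan(storeFiles, memstore)); // row1 comes from the HFile
        System.out.println(scan(storeFiles, memstore)); // row1 now comes from the block cache
    }
}
```

In this toy model the first scan reports row1 as coming from the HFile and the second from the block cache, while row2 to row5 come from the memstore both times. That is closer to Model A than Model B: the cache is a shared structure, but it is consulted per store file scanner rather than as a separate first step, and the merge order is driven by row key, not by the order in which the scanners were added.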