Thanks Alex,

Yes, looking at that code I believe you are correct - the memStore scanner
is appended after the block scanners.
The block scanners may or may not see hits in the block cache when they
read. If they don't get a hit, they'll open the block from the underlying
HFile(s).
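
The lookup described above can be sketched roughly as follows. This is a simplified model and not HBase code: ReadPathSketch, SharedBlockCache, FileScanner and scannerOrder are hypothetical names made up for illustration, and blocks are plain strings. It only shows the two points from this thread: each file scanner consults the one shared cache before falling back to its file, and the memstore scanner is appended after the file scanners.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model only; none of these types are real HBase classes. */
class ReadPathSketch {

    /** One cache shared by every file scanner (assumption: one per region server). */
    static final class SharedBlockCache {
        private final Map<String, String> blocks = new HashMap<>();
        int hits = 0;
        int misses = 0;

        /** Returns the cached block, or null on a miss, and counts the outcome. */
        String get(String blockKey) {
            String block = blocks.get(blockKey);
            if (block == null) {
                misses++;
            } else {
                hits++;
            }
            return block;
        }

        void put(String blockKey, String block) {
            blocks.put(blockKey, block);
        }
    }

    /** Hypothetical stand-in for a scanner over a single HFile. */
    static final class FileScanner {
        private final String fileName;
        private final Map<String, String> blocksOnDisk; // stands in for the HFile
        private final SharedBlockCache cache;

        FileScanner(String fileName, Map<String, String> blocksOnDisk, SharedBlockCache cache) {
            this.fileName = fileName;
            this.blocksOnDisk = blocksOnDisk;
            this.cache = cache;
        }

        /** Try the shared cache first; on a miss, read the file and cache the block. */
        String readBlock(String blockId) {
            String key = fileName + "#" + blockId;
            String cached = cache.get(key);
            if (cached != null) {
                return cached;
            }
            String loaded = blocksOnDisk.get(blockId);
            if (loaded != null) {
                cache.put(key, loaded);
            }
            return loaded;
        }
    }

    /** Builds the scanner list in the order quoted below: file scanners, then memstore. */
    static List<String> scannerOrder(List<String> fileScanners, String memStoreScanner) {
        List<String> scanners = new ArrayList<>(fileScanners.size() + 1);
        scanners.addAll(fileScanners);
        scanners.add(memStoreScanner);
        return scanners;
    }

    public static void main(String[] args) {
        SharedBlockCache cache = new SharedBlockCache();
        Map<String, String> disk = new HashMap<>();
        disk.put("b0", "row1=v1");
        FileScanner scanner = new FileScanner("hfile-1", disk, cache);

        System.out.println(scanner.readBlock("b0")); // cache miss, read from the file
        System.out.println(scanner.readBlock("b0")); // cache hit this time
        System.out.println("hits=" + cache.hits + " misses=" + cache.misses);
        System.out.println(scannerOrder(Arrays.asList("hfile-1", "hfile-2"), "memstore"));
    }
}
```

The cache key includes the file name so that scanners over different files can share one cache without colliding; that keying scheme is an assumption of this sketch.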



On Wed, May 2, 2018 at 10:41 PM, Xi Yang <alex.xi.y...@gmail.com> wrote:

> Hi Tim,
>
> Thank you for the detailed explanation. Yes, that really helps me! I really
> appreciate it!
>
>
> But I am still confused about the sequence:
>
> I've read this code in *HStore.getScanners*:
>
>
>     // TODO this used to get the store files in descending order,
>     // but now we get them in ascending order, which I think is
>     // actually more correct, since memstore get put at the end.
>     List<StoreFileScanner> sfScanners =
>       StoreFileScanner.getScannersForStoreFiles(storeFilesToScan,
>       cacheBlocks, usePread, isCompaction, false, matcher, readPt);
>     List<KeyValueScanner> scanners = new ArrayList<>(sfScanners.size() + 1);
>     scanners.addAll(sfScanners);
>     // Then the memstore scanners
>     scanners.addAll(memStoreScanners);
>
>
> Does this mean that this step:
>
>
> *2) It looks in the memstore to see if there are any writes still in
> memory ready to flush down to the HFiles that need to be merged with the
> data read in 1)*
>
> comes after the following step?
>
> *c) the data is read from the opened block *
>
>
>
>
> Here is an explanation of the images I drew before, so that we don't need the
> images:
>
> When a read request comes in:
> Model A
>
>    1. Get scanners (including StoreScanner and MemStoreScanner).
>    MemStoreScanner is the last one
>    2. Begin with the first StoreScanner
>    3. Try to get the block from BlockCache of the StoreScanner
>    4. Try to get the block from HFile of the StoreScanner
>    5. Go to the next StoreScanner
>    6. Loop #2 - #5 until all StoreScanners have been used
>    7. Try to get the block from memStore
>
>
> Model B
>
>    1. Try to get the block from BlockCache, if failed then go to #2
>    2. Get scanners (including StoreScanner and MemStoreScanner).
>    MemStoreScanner is the last one
>    3. Begin with the first StoreScanner
>    4. Try to get the block from HFile of the StoreScanner
>    5. Go to the next StoreScanner
>    6. Loop #4 - #5 until all StoreScanners have been used
>    7. Try to get the block from memStore
>
>
>
> Thanks,
> Alex
>
>
> 2018-05-02 1:04 GMT-07:00 Tim Robertson <timrobertson...@gmail.com>:
>
> > Hi Alex,
> >
> > I'm not sure I fully follow your question without the images but I'll try
> > and help.
> >
> > When a read request comes in, my understanding of the order of execution is
> > as follows (perhaps someone can verify this):
> >
> > 1) It looks in the block cache for the cells (this is a read-only cache
> > containing recently read data)
> > 2) It looks in the memstore to see if there are any writes still in memory
> > ready to flush down to the HFiles that need to be merged with the data read
> > in 1)
> > 3) Only if not found it starts locating the data from HFiles (note, there
> > can be multiple files per region until major compaction runs which merges
> > into 1 per column family, discarding stale data where possible)
> >   a) It uses bloom filters and the block cache indexes to locate the target
> > blocks (these are part of the HFiles, but read into memory when the region
> > servers start)
> >   b) those target blocks are then opened and occupy space on the block
> > cache on the region server (possibly evicting other blocks)
> >   c) the data is read from the opened block
> >
> > Does that help at all?
> >
> > Thanks,
> > Tim
> >
> >
> >
> > On Wed, May 2, 2018 at 9:49 AM, Xi Yang <alex.xi.y...@gmail.com> wrote:
> >
> > > OK, I got it. I understand Q2 now with your help, thanks!
> > >
> > >
> > >
> > > It seems I have to find some other way to draw my images. Here is the
> > > updated version of Q1:
> > >
> > >
> > > Q1
> > >
> > > I found that HFileScannerImpl.getCachedBlock(...) gets a block from the
> > > BlockCache. This cached block is used by StoreFileScanner. Does that mean
> > > the read model is like:
> > >
> > > *Model A*
> > >
> > > When a read request comes in:
> > >
> > >    1. Read 1st Store:
> > >       a. read BlockCache
> > >       b. read HFile
> > >    2. Read 2nd Store:
> > >       a. read BlockCache
> > >       b. read HFile
> > >    3. ......
> > >    4. Read Memstore
> > >
> > >
> > >
> > > Or is there only one BlockCache that all read requests go through first,
> > > like:
> > >
> > > *Model B:*
> > >
> > > When a read request comes in:
> > >
> > >    1. Read BlockCache
> > >    2. Read 1st Store -> read HFile
> > >    3. Read 2nd Store -> read HFile
> > >    4. ....
> > >    5. Read Memstore
> > >
> > >
> > >
> > > Thanks,
> > > Alex
> > >
> > >
> > >
> > > 2018-05-01 20:04 GMT-07:00 Josh Elser <els...@apache.org>:
> > >
> > > > FYI, the mailing list strips images.
> > > >
> > > > There is only one BlockCache per RS. Not sure if that answers your Q1 in
> > > > entirety though.
> > > >
> > > > Q2. The "Block" in "BlockCache" are the blocks that make up the HBase
> > > > HFiles in HDFS. Data in the Memstore does not yet exist in HFiles on HDFS.
> > > > Additionally, Memstore is already in memory; no need to have a different
> > > > cache to accomplish the same thing :)
> > > >
> > > > On 5/1/18 9:25 PM, Xi Yang wrote:
> > > >
> > > >> Sorry to bother you guys. May I ask 2 questions about HBase?
> > > >>
> > > >> Q1
> > > >>
> > > > >> I found that |HFileScannerImpl.getCachedBlock(...)| gets a block from
> > > > >> the BlockCache. This cached block is used by |StoreFileScanner|. Does
> > > > >> that mean the read model is like:
> > > >>
> > > >> *Model A*
> > > >>
> > > > >> Or is there only one BlockCache that all read requests go through
> > > > >> first, like:
> > > >>
> > > >> *Model B:*
> > > >>
> > > >> ​
> > > >> Q2
> > > > >> If data has been read from the Memstore, will it be put in the
> > > > >> BlockCache to accelerate the next read?
> > > >>
> > > >> ​
> > > >> Thanks,
> > > >> Alex
> > > >>
> > > >> ​
> > > >>
> > > >
> > >
> >
>
