Re: Fwd:

Xi Yang Wed, 02 May 2018 13:42:06 -0700

Hi Tim,

Thank you for the detailed explanation. Yes, that really helps! I really
appreciate it.



But I am still confused about the sequence.

I've read this code in *HStore.getScanners*:


    // TODO this used to get the store files in descending order,
    // but now we get them in ascending order, which I think is
    // actually more correct, since memstore get put at the end.
    List<StoreFileScanner> sfScanners = StoreFileScanner.getScannersForStoreFiles(storeFilesToScan,
        cacheBlocks, usePread, isCompaction, false, matcher, readPt);
    List<KeyValueScanner> scanners = new ArrayList<>(sfScanners.size() + 1);
    scanners.addAll(sfScanners);
    // Then the memstore scanners
    scanners.addAll(memStoreScanners);


Does this mean that this step:


*2) It looks in the memstore to see if there are any writes still in memory
ready to flush down to the HFiles that needs merged with the data read in 1)*

comes after the following step?

*c) the data is read from the opened block*
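
One possibility I would like to rule out: the order of the scanners in that
list may not be the order in which data is actually read. If the scanners are
merged by key (as far as I can tell, StoreScanner hands them to a
KeyValueHeap), then the position of the memstore scanner in the list would not
decide when it is read. Below is a toy sketch of such a merge in plain Java.
It is not HBase code: MergeSketch, Cursor and merge are names I made up, and
each scanner is reduced to a sorted list of row keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Toy k-way merge. All names here are invented for illustration only.
final class MergeSketch {

    /** Current head of one sorted source, plus an iterator over the rest. */
    static final class Cursor {
        String head;
        final Iterator<String> rest;

        Cursor(Iterator<String> it) {
            this.rest = it;
            this.head = it.next();
        }
    }

    /** Merges already-sorted sources into one sorted list of keys. */
    static List<String> merge(List<List<String>> sources) {
        // The heap always exposes the source whose current key is smallest.
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>((a, b) -> a.head.compareTo(b.head));
        for (List<String> source : sources) {
            if (!source.isEmpty()) {
                heap.add(new Cursor(source.iterator()));
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();
            out.add(c.head);
            if (c.rest.hasNext()) {
                c.head = c.rest.next();
                heap.add(c); // re-insert with its next key
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList("row1", "row4");
        List<String> file2 = Arrays.asList("row3");
        List<String> mem = Arrays.asList("row2", "row5");
        // Memstore last in the list, then memstore first in the list.
        System.out.println(merge(Arrays.asList(file1, file2, mem)));
        System.out.println(merge(Arrays.asList(mem, file2, file1)));
    }
}
```

Both calls print [row1, row2, row3, row4, row5], so in this toy version
putting the memstore last in the list does not mean it is read last. Whether
the real read path behaves like this is exactly what I am asking.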




Here is a text description of the images I drew before, so that we don't need
the images.

When a read request comes in:
Model A

   1. Get the scanners (including StoreScanner and MemStoreScanner).
   MemStoreScanner is the last one
   2. Begin with the first StoreScanner
   3. Try to get the block from the BlockCache of the StoreScanner
   4. Try to get the block from the HFile of the StoreScanner
   5. Go to the next StoreScanner
   6. Repeat #3 - #5 until all StoreScanners have been used
   7. Try to get the block from the memStore


Model B

   1. Try to get the block from the BlockCache; if that fails, go to #2
   2. Get the scanners (including StoreScanner and MemStoreScanner).
   MemStoreScanner is the last one
   3. Begin with the first StoreScanner
   4. Try to get the block from the HFile of the StoreScanner
   5. Go to the next StoreScanner
   6. Repeat #4 - #5 until all StoreScanners have been used
   7. Try to get the block from the memStore
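
To make the difference between the two models concrete, here is a rough
sketch of both in plain Java. Nothing in it is HBase API: ReadModelSketch,
modelA, modelB and the "file/row" cache key are all made up by me, store
files and the memstore are reduced to sets of row keys, and I assume that in
Model A the HFile is only consulted on a cache miss. Each method returns a
trace of which source was consulted for a row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy comparison of Model A and Model B. All names are hypothetical.
final class ReadModelSketch {

    /** Model A: per store file, try the cache, then the file; memstore last. */
    static List<String> modelA(String row, Map<String, Set<String>> storeFiles,
                               Set<String> blockCache, Set<String> memStore) {
        List<String> trace = new ArrayList<>();
        for (Map.Entry<String, Set<String>> file : storeFiles.entrySet()) {
            String cacheKey = file.getKey() + "/" + row; // made-up cache key
            if (blockCache.contains(cacheKey)) {
                trace.add("cache:" + file.getKey());
            } else if (file.getValue().contains(row)) {
                trace.add("hfile:" + file.getKey());
            }
        }
        if (memStore.contains(row)) {
            trace.add("memstore");
        }
        return trace;
    }

    /** Model B: cache first; only on a miss walk the store files, then memstore. */
    static List<String> modelB(String row, Map<String, Set<String>> storeFiles,
                               Set<String> blockCache, Set<String> memStore) {
        List<String> trace = new ArrayList<>();
        if (blockCache.contains(row)) {
            trace.add("cache");
            return trace; // a hit short-circuits everything else
        }
        for (Map.Entry<String, Set<String>> file : storeFiles.entrySet()) {
            if (file.getValue().contains(row)) {
                trace.add("hfile:" + file.getKey());
            }
        }
        if (memStore.contains(row)) {
            trace.add("memstore");
        }
        return trace;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> files = new LinkedHashMap<>();
        files.put("f1", new HashSet<>(Arrays.asList("row1")));
        files.put("f2", new HashSet<>(Arrays.asList("row1")));
        Set<String> mem = new HashSet<>(Arrays.asList("row1"));
        Set<String> cacheA = new HashSet<>(Arrays.asList("f2/row1"));
        System.out.println(modelA("row1", files, cacheA, mem));
        System.out.println(modelB("row1", files, new HashSet<>(), mem));
    }
}
```

With this data Model A prints [hfile:f1, cache:f2, memstore] and Model B
(with an empty cache) prints [hfile:f1, hfile:f2, memstore]. The point of the
sketch is only the control flow: in Model A the cache lookup happens inside
the loop over store files, in Model B it happens once, before the loop.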



Thanks,
Alex


2018-05-02 1:04 GMT-07:00 Tim Robertson <timrobertson...@gmail.com>:

> Hi Alex,
>
> I'm not sure I fully follow your question without the images but I'll try
> and help.
>
> When a read request comes in, my understanding of the order of execution is
> as follows (perhaps someone can verify this):
>
> 1) It looks in the block cache for the cells (this is a read only cache
> containing recently read data)
> 2) It looks in the memstore to see if there are any writes still in memory
> ready to flush down to the HFiles that needs merged with the data read in
> 1)
> 3) Only if not found it starts locating the data from HFiles (note, there
> can be multiple files per region until major compaction runs which merges
> into 1 per column family, discarding stale data where possible)
>   a) It uses bloom filters and the block cache indexes to locate the target
> blocks (these are part of the HFiles, but read into memory when the region
> servers start)
>   b) those target blocks are then opened and occupy space on the block
> cache on the region server (possibly evicting other blocks)
>   c) the data is read from the opened block
>
> Does that help at all?
>
> Thanks,
> Tim
>
>
>
> On Wed, May 2, 2018 at 9:49 AM, Xi Yang <alex.xi.y...@gmail.com> wrote:
>
> > OK, I got it. I understand Q2 now with your help, thanks!
> >
> >
> >
> > It seems I have to find some other way to show my images. Here is the
> > updated version of Q1:
> >
> >
> > Q1
> >
> > I found that HFileScannerImpl.getCachedBlock(...) gets a block from the
> > BlockCache. This cached block is used by StoreFileScanner. Does that mean
> > the read model is like:
> >
> > *Model A*
> >
> > When a read request comes in:
> >
> >    1. Read 1st Store:
> >       a. read BlockCache
> >       b. read HFile
> >    2. Read 2nd Store:
> >       a. read BlockCache
> >       b. read HFile
> >    3. ......
> >    4. Read Memstore
> >
> >
> >
> > Or is there only one BlockCache, and all read requests go through it
> > first, like:
> >
> > *Model B:*
> >
> > When a read request comes in:
> >
> >    1. Read BlockCache
> >    2. Read 1st Store -> read HFile
> >    3. Read 2nd Store -> read HFile
> >    4. ....
> >    5. Read Memstore
> >
> >
> > 
> >
> > Thanks,
> > Alex
> >
> >
> >
> > 2018-05-01 20:04 GMT-07:00 Josh Elser <els...@apache.org>:
> >
> > > FYI, the mailing list strips images.
> > >
> > > There is only one BlockCache per RS. Not sure if that answers your Q1 in
> > > entirety though.
> > >
> > > Q2. The "Block" in "BlockCache" are the blocks that make up the HBase
> > > HFiles in HDFS. Data in the Memstore does not yet exist in HFiles on HDFS.
> > > Additionally, Memstore is already in memory; no need to have a different
> > > cache to accomplish the same thing :)
> > >
> > > On 5/1/18 9:25 PM, Xi Yang wrote:
> > >
> > >> Sorry to bother you guys. May I ask 2 questions about HBase?
> > >>
> > >> Q1
> > >>
> > >> I found that |HFileScannerImpl.getCachedBlock(...)| gets a block from the
> > >> BlockCache. This cached block is used by |StoreFileScanner|. Does that
> > >> mean the read model is like:
> > >>
> > >> *Model A*
> > >>
> > >> Or is there only one BlockCache, and all read requests go through
> > >> it first, like:
> > >>
> > >> *Model B:*
> > >>
> > >> 
> > >> Q2
> > >> If data has been read from the Memstore, will it be put in the BlockCache
> > >> to accelerate the read process next time?
> > >>
> > >> 
> > >> Thanks,
> > >> Alex
> > >>
> > >> 
> > >>
> > >
> >
>
