Thank you all; your answers helped me a lot.

2018-03-19 22:31 GMT+08:00 Saad Mufti <saad.mu...@gmail.com>:
> Another option, if you have enough disk space/off-heap memory, is to
> enable the bucket cache to cache even more of your data, and set the
> PREFETCH_BLOCKS_ON_OPEN => true option on the column families you always
> want cached. That way HBase will prefetch your data into the bucket cache
> and your scan won't have that initial slowdown. Or, if you want to do it
> globally for all column families, set the configuration flag
> "hbase.rs.prefetchblocksonopen" to "true". Keep in mind though that if you
> do this, you should have enough bucket cache space for all your data;
> otherwise there will be a lot of useless eviction activity at HBase
> startup and even later.
>
> Also, where a region is located will be heavily impacted by which
> region balancer you have chosen and how you have tuned it in terms of how
> often to run and other parameters. A split region will stay, at least
> initially, on the same region server, but your balancer, if and when it
> runs, can move it (and indeed any region) elsewhere to satisfy its criteria.
>
> Cheers.
>
> ----
> Saad
>
>
> On Mon, Mar 19, 2018 at 1:14 AM, ramkrishna vasudevan <
> ramkrishna.s.vasude...@gmail.com> wrote:
>
> > Hi
> >
> > First regarding the scans,
> >
> > Generally the data resides in the store files, which are in HDFS. So
> > the first scan that you are doing is probably reading from HDFS, which
> > involves disk reads. Once the blocks are read, they are cached in the
> > block cache of HBase. So your further reads go through that, and hence
> > you see a speedup in the later scans.
> >
> > >> And another question about region split, I want to know which
> > >> RegionServer will load the new region after the split.
> > >> Will it be the same one as the old region?
> > Yes. Generally the same region server hosts it.
> >
> > In master the code is here,
> > https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
> >
> > You may need to understand the entire flow to know how the regions are
> > opened after a split.
> >
> > Regards
> > Ram
> >
> > On Sat, Mar 17, 2018 at 9:02 PM, Yang Zhang <zhang.yang...@gmail.com>
> > wrote:
> >
> > > Hello everyone
> > >
> > > I am running many Scans using RegionScanner in a coprocessor, and every
> > > time the first Scan takes about 10 times longer than the others.
> > > I don't know why this happens.
> > >
> > > OneBucket Scan cost is : 8794 ms Num is : 710
> > > OneBucket Scan cost is : 91 ms Num is : 776
> > > OneBucket Scan cost is : 87 ms Num is : 808
> > > OneBucket Scan cost is : 105 ms Num is : 748
> > > OneBucket Scan cost is : 68 ms Num is : 200
> > >
> > >
> > > And another question about region split: I want to know which
> > > RegionServer will load the new regions after a split.
> > > Will it be the same one that hosted the old region? Does anyone know
> > > where I can find the code to learn about that?
> > >
> > >
> > > Thanks for your help
> > >
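
For anyone reading this thread later: the cold-versus-warm behaviour described above (first scan reads from HDFS, later scans hit the block cache) and the eviction warning for an undersized cache can be illustrated with a toy model. This is only a sketch under stated assumptions. It does not use any HBase API. The class BlockCacheSketch, its methods, and the block counts are all hypothetical names and numbers chosen for illustration, and a plain LRU map stands in for the real block cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model (not HBase code): an LRU block cache in front of slow storage. */
public class BlockCacheSketch {
    private final Map<Integer, byte[]> cache;
    private int diskReads = 0;

    public BlockCacheSketch(final int maxBlocks) {
        // accessOrder=true keeps entries ordered from least to most recently used,
        // so removeEldestEntry evicts the least recently used block.
        this.cache = new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                return size() > maxBlocks;
            }
        };
    }

    /** Returns the block, loading it from "disk" on a cache miss. */
    public byte[] readBlock(int blockId) {
        byte[] block = cache.get(blockId);
        if (block == null) {
            diskReads++;            // stands in for a read from HDFS
            block = new byte[64];   // placeholder payload
            cache.put(blockId, block);
        }
        return block;
    }

    /** Scans blocks [0, blockCount) and returns how many came from disk. */
    public int scan(int blockCount) {
        int before = diskReads;
        for (int id = 0; id < blockCount; id++) {
            readBlock(id);
        }
        return diskReads - before;
    }

    public static void main(String[] args) {
        // Cache larger than the data: only the first scan touches disk.
        BlockCacheSketch roomy = new BlockCacheSketch(100);
        System.out.println("cold scan disk reads: " + roomy.scan(50));
        System.out.println("warm scan disk reads: " + roomy.scan(50));

        // Cache smaller than the data: sequential scans keep evicting
        // blocks before they are reused, so every scan goes to disk.
        BlockCacheSketch undersized = new BlockCacheSketch(10);
        undersized.scan(50);
        System.out.println("undersized, second scan disk reads: " + undersized.scan(50));
    }
}
```

In this model the roomy cache reports 50 disk reads for the first scan and 0 for the second, while the undersized cache reports 50 disk reads on every scan. That is the same shape as the timings in the original question and the reason the cache should be sized for the data before enabling prefetch.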