Oh great. Thanks for pointing that out. I think that is the exact place where the perf bottleneck was found.

Regards
Ram

On Thu, Jun 11, 2020 at 4:29 PM 张铎(Duo Zhang) <palomino...@gmail.com> wrote:

> Oh, good. I recall that there is a related issue, but I forgot the
> title so I could not find it...
>
> Thanks for chiming in.
>
> OpenInx <open...@gmail.com> wrote on Thu, Jun 11, 2020 at 6:39 PM:
>
> > Hi Zheng wang.
> >
> > Hope this issue will be helpful for you.
> > https://issues.apache.org/jira/browse/HBASE-21657
> > Thanks.
> >
> > On Tue, Jun 9, 2020 at 5:53 PM Anoop John <anoop.hb...@gmail.com> wrote:
> >
> > > Thanks for the detailed analysis and update zheng wang.
> > > >The code line below in StoreScanner.next() cost about 100ms in v2.1, and
> > > >it was added in v2.0, see HBASE-17647.
> > > So still there is some additional cost in 2.1 right? Do u have any other
> > > observation? Are we doing more cell compares in 2.x?
> > >
> > > Anoop
> > >
> > > On Mon, Jun 8, 2020 at 1:50 AM zheng wang <18031...@qq.com> wrote:
> > >
> > > > Hi guys:
> > > >
> > > > I did some tests on my PC to find the reason for what Jan Van Besien
> > > > mentioned in the user channel.
> > > >
> > > > #test env
> > > > OS : win10
> > > > JDK: 1.8
> > > > MEM: 8GB
> > > >
> > > > #test data:
> > > > 1 million rows with only one column family and one qualifier.
> > > >
> > > > rowkey: rowkey-#index#
> > > > value: value-#index#
> > > >
> > > > #test method:
> > > > just use client api to scan with default config several times, no pe,
> > > > no ycsb
> > > >
> > > > #test result(avg):
> > > > v1.2.0: 800ms
> > > > v2.1.0: 1050ms
> > > >
> > > > So it is certain that v2.1 is slower than v1.2. After this, I collected
> > > > some statistics on the regionserver.
> > > > I then found that part of the reason is related to the size estimation.
> > > >
> > > > The code line below in StoreScanner.next() cost about 100ms in v2.1, and
> > > > it was added in v2.0, see HBASE-17647.
> > > > "int cellSize = PrivateCellUtil.estimatedSerializedSizeOf(cell);"
> > > >
> > > > Should we support disabling the MaxResultSize limit (2MB by default
> > > > now) to be more efficient if users know their data exactly and could
> > > > limit results only by setBatch and setLimit?
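
For readers of the archive, the proposal at the end of the quoted thread (skip the per-cell size estimate when the caller does not want a MaxResultSize limit) can be illustrated with a toy model. This is only a sketch under stated assumptions and is not HBase code: ScanLimitSketch, Cell, estimatedSize, makeStore and scan are hypothetical names, and treating a non-positive maxResultSize as "disabled" is a convention of this sketch only.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Toy model of a scan loop; NOT HBase code. All names here are hypothetical.
final class ScanLimitSketch {

    // Minimal stand-in for a cell: one row key and one value.
    static final class Cell {
        final byte[] row;
        final byte[] value;

        Cell(String row, String value) {
            this.row = row.getBytes(StandardCharsets.UTF_8);
            this.value = value.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Counts how often the size estimate runs, to make its cost visible.
    static long estimateCalls = 0;

    // Stand-in for the per-cell size estimate discussed in the thread.
    static int estimatedSize(Cell cell) {
        estimateCalls++;
        return cell.row.length + cell.value.length;
    }

    // Builds n cells shaped like the test data: rowkey-<i> / value-<i>.
    static List<Cell> makeStore(int n) {
        List<Cell> store = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            store.add(new Cell("rowkey-" + i, "value-" + i));
        }
        return store;
    }

    // Returns cells until cellLimit is reached or, when maxResultSize > 0,
    // until the accumulated size estimate reaches maxResultSize.
    // A non-positive maxResultSize disables the size check entirely,
    // so the estimate is never computed in that case.
    static List<Cell> scan(List<Cell> store, int cellLimit, long maxResultSize) {
        List<Cell> out = new ArrayList<>();
        long accumulated = 0;
        for (Cell cell : store) {
            if (out.size() >= cellLimit) {
                break;
            }
            out.add(cell);
            if (maxResultSize > 0) {
                accumulated += estimatedSize(cell);
                if (accumulated >= maxResultSize) {
                    break;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> store = makeStore(1000);
        // Size limit disabled: only the cell limit stops the scan.
        System.out.println(scan(store, 100, -1).size() + " cells, "
                + estimateCalls + " estimates so far");
        // Size limit enabled: every returned cell pays for one estimate.
        System.out.println(scan(store, 1000, 50).size() + " cells, "
                + estimateCalls + " estimates so far");
    }
}
```

The point of the sketch is only that the estimate is paid once per returned cell when the size limit is active and not at all when it is off. Whether the real StoreScanner can skip the call safely is a question for the issues referenced in the thread.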