[jira] [Created] (HBASE-24544) Recommend upping zk jute.maxbuffer in all but minor installs
Michael Stack created HBASE-24544:

Summary: Recommend upping zk jute.maxbuffer in all but minor installs
Key: HBASE-24544
URL: https://issues.apache.org/jira/browse/HBASE-24544
Project: HBase
Issue Type: Bug
Components: documentation
Reporter: Michael Stack

Add a doc note in the upgrade and ZooKeeper sections recommending raising zk jute.maxbuffer above the default of 1M. Here is jute.maxbuffer from the zk doc.

{code}
jute.maxbuffer: (Java system property: jute.maxbuffer)
This option can only be set as a Java system property. There is no zookeeper prefix on it. It specifies the maximum size of the data that can be stored in a znode. The default is 0xfffff, or just under 1M. If this option is changed, the system property must be set on all servers and clients otherwise problems will arise. This is really a sanity check. ZooKeeper is designed to store data on the order of kilobytes in size.
{code}

It is easy enough to blow the 1MB default. Here is one such scenario. A peer is disabled so WALs back up on each RegionServer, or a bug keeps us from clearing WALs out from under the RegionServer promptly. Backed-up WALs get into the hundreds... easy enough on a busy cluster. Next, there is a power outage and the cluster crashes down. Recovery may require an SCP (ServerCrashProcedure) recovering hundreds of WALs. As is, the way our SCP works, we can end up with a /hbase/splitWAL dir with hundreds -- even thousands -- of WALs in it. The 1MB buffer limit in zk can't carry listings this big.

Of note, jute.maxbuffer needs to be set on the zk servers -- with a restart so the change is noticed -- and on the client side, in the hbase master at least.

This issue is about highlighting this old problem in our doc. It seems to be totally absent.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
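For a rough sense of scale in the scenario above, here is a minimal Java sketch that estimates how big a child listing gets. It assumes each znode name costs its byte length plus a 4-byte length prefix, and that an average name is 200 bytes. Both figures are illustrative assumptions, not measurements from a real /hbase/splitWAL directory. `JuteBufferEstimate` and its methods are hypothetical names. The limit is read from the jute.maxbuffer Java system property when it is set.

```java
public class JuteBufferEstimate {
    // Default limit quoted from the ZooKeeper doc: 0xfffff bytes, just under 1M.
    static final int DEFAULT_JUTE_MAXBUFFER = 0xfffff;

    // Assumption: the listing carries a 4-byte element count, and every
    // child name costs a 4-byte length prefix plus its own bytes.
    static long estimateListingBytes(int childCount, int avgNameBytes) {
        return 4L + (long) childCount * (4L + avgNameBytes);
    }

    // Largest number of children whose estimated listing still fits the limit.
    static int maxChildrenThatFit(int limitBytes, int avgNameBytes) {
        return (int) ((limitBytes - 4L) / (4L + avgNameBytes));
    }

    public static void main(String[] args) {
        // Reads -Djute.maxbuffer=... if present, else falls back to the default.
        int limit = Integer.getInteger("jute.maxbuffer", DEFAULT_JUTE_MAXBUFFER);
        int avgNameBytes = 200; // hypothetical average znode name length
        for (int wals : new int[] {500, 2000, 8000}) {
            long bytes = estimateListingBytes(wals, avgNameBytes);
            System.out.println(wals + " WALs -> ~" + bytes + " bytes, fits="
                + (bytes <= limit));
        }
        System.out.println("max children under limit: "
            + maxChildrenThatFit(limit, avgNameBytes));
    }
}
```

Running it with `-Djute.maxbuffer=4194304` (an example value only) shows how much headroom a larger buffer would give under the same assumptions.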
[jira] [Created] (HBASE-24543) ScheduledChore logging is too chatty, replace with metrics
Andrew Kyle Purtell created HBASE-24543:

Summary: ScheduledChore logging is too chatty, replace with metrics
Key: HBASE-24543
URL: https://issues.apache.org/jira/browse/HBASE-24543
Project: HBase
Issue Type: Improvement
Components: metrics, Operability
Reporter: Andrew Kyle Purtell

ScheduledChore logs at DEBUG level the execution time of each chore.

We used to log an average execution time across all chores every five minutes, which by consensus was judged to not be useful. Derived metrics like averages or histograms should be calculated per chore. So we modified the logging to dump the chore execution time each time it runs, to facilitate such calculations with the log aggregation and searching tool of choice. Per chore execution logging is more useful, in that sense, but may be too chatty. This is not unexpected but let me provide my observations so we can revisit this.

On the master, for example, this is logged every second:

{noformat}
2020-06-11 16:35:28,263 DEBUG [master/apurtell-ltm:8100.splitLogManager..Chore.1] hbase.ScheduledChore: SplitLogManager Timeout Monitor execution time: 0 ms.
{noformat}

Does the value of these lines outweigh the cost of 86,400 log lines per day per master instance? (At least.)

On the regionserver it is somewhat better, these are logged every 10 seconds:

{noformat}
2020-06-11 16:37:57,203 DEBUG [regionserver/apurtell-ltm:8120.Chore.1] hbase.ScheduledChore: CompactionChecker execution time: 0 ms.
2020-06-11 16:37:57,203 DEBUG [regionserver/apurtell-ltm:8120.Chore.1] hbase.ScheduledChore: MemstoreFlusherChore execution time: 0 ms.
{noformat}

So that will be 17,280 log lines per day per regionserver. (At least.)

Perhaps these should be moved to TRACE level.

We should definitely replace this logging with histogram metrics. There should be a separate metric for each distinct chore classname, allocated as needed.
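To make the numbers and the suggestion above concrete, here is a small Java sketch. It reproduces the lines-per-day arithmetic and shows one way a histogram could be allocated lazily per chore name. This is not HBase code: `ChoreTimings`, its bucket bounds, and its methods are hypothetical stand-ins built only on the JDK.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class ChoreTimings {
    // Inclusive upper bounds (ms) of the histogram buckets; illustrative values.
    static final long[] BOUNDS_MS = {1, 10, 100, 1000};

    static final class Histogram {
        // One counter per bound, plus a final overflow bucket.
        final LongAdder[] buckets = new LongAdder[BOUNDS_MS.length + 1];

        Histogram() {
            for (int i = 0; i < buckets.length; i++) {
                buckets[i] = new LongAdder();
            }
        }

        void update(long ms) {
            int i = 0;
            while (i < BOUNDS_MS.length && ms > BOUNDS_MS[i]) {
                i++;
            }
            buckets[i].increment();
        }

        long count(int bucket) {
            return buckets[bucket].sum();
        }
    }

    // One histogram per distinct chore name, created on first use.
    private final Map<String, Histogram> byChore = new ConcurrentHashMap<>();

    void record(String choreName, long executionMs) {
        byChore.computeIfAbsent(choreName, k -> new Histogram()).update(executionMs);
    }

    Histogram get(String choreName) {
        return byChore.get(choreName);
    }

    // DEBUG lines per day if every chore logs once per period.
    static long linesPerDay(int choreCount, int periodSeconds) {
        return choreCount * (86_400L / periodSeconds);
    }

    public static void main(String[] args) {
        System.out.println("master, 1 chore every 1s: " + linesPerDay(1, 1));
        System.out.println("regionserver, 2 chores every 10s: " + linesPerDay(2, 10));
        ChoreTimings timings = new ChoreTimings();
        timings.record("CompactionChecker", 0);
        timings.record("MemstoreFlusherChore", 0);
        System.out.println("CompactionChecker <=1ms count: "
            + timings.get("CompactionChecker").count(0));
    }
}
```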
[jira] [Resolved] (HBASE-24534) Delete reference off to Hadoop wiki's HBase FAQ
[ https://issues.apache.org/jira/browse/HBASE-24534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Dimiduk resolved HBASE-24534.
Fix Version/s: 3.0.0-alpha-1
Resolution: Fixed

> Delete reference off to Hadoop wiki's HBase FAQ
>
> Key: HBASE-24534
> URL: https://issues.apache.org/jira/browse/HBASE-24534
> Project: HBase
> Issue Type: Task
> Components: documentation
> Reporter: Nick Dimiduk
> Assignee: Nick Dimiduk
> Priority: Minor
> Fix For: 3.0.0-alpha-1
>
> Our `faq.adoc` has a link off to [a FAQ|https://cwiki.apache.org/confluence/display/HADOOP2/Hbase+FAQ] in the Hadoop wiki, which is empty other than a pointer back to our book. Let's just delete the reference to the hadoop wiki.
[jira] [Created] (HBASE-24542) Update anonymous git url on website
Nick Dimiduk created HBASE-24542:

Summary: Update anonymous git url on website
Key: HBASE-24542
URL: https://issues.apache.org/jira/browse/HBASE-24542
Project: HBase
Issue Type: Task
Components: documentation
Reporter: Nick Dimiduk

Our [source repository page|https://hbase.apache.org/source-repository.html] lists the anonymous gitbox url using {{git://}} protocol. They tell me over on {{#asfinfra}} that gitbox has never supported the {{git://}} protocol.
[jira] [Resolved] (HBASE-24400) Automatically download CMake Dependencies
[ https://issues.apache.org/jira/browse/HBASE-24400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bharath Vissapragada resolved HBASE-24400.
Fix Version/s: master
Resolution: Fixed

> Automatically download CMake Dependencies
>
> Key: HBASE-24400
> URL: https://issues.apache.org/jira/browse/HBASE-24400
> Project: HBase
> Issue Type: Sub-task
> Reporter: Marc Parisi
> Assignee: Marc Parisi
> Priority: Major
> Fix For: master
>
> To improve the ability to build we should download and link a local version of dependencies (in the build folder).
>
> This will help with skew of versions and the ability to build the project.
>
> This will help the build process in docker and allow people to develop locally. This will also pave the way for future work to support
Re: [DISCUSS] Change the IA for MutableSizeHistogram and MutableTimeHistogram to LimitedPrivate
That's unfortunate, but needs must, IMHO.

A potential benefit of also marking the impls LP(COPROC) is that this captures any implicit dependency on semantics and functionality of the implementation classes not directly exposed in the hbase-metrics-api facade.

So, let's do both? (Facade improvement, raise to LP the impl classes)

On Thu, Jun 11, 2020 at 12:00 PM Geoffrey Jacoby wrote:
> Couple points:
>
> 1. I like Andrew's proposed solution, and we should do it, but I'm not sure it's sufficient for Rushabh's purposes because of semver rules. Phoenix supports HBase 1.3-1.5 (soon to add 1.6) and HBase 2.0 (soon to gain 2.1 and 2.2, with 2.3 coming shortly after its release here). If we add the new sizeHistogram and timeHistogram methods to hbase-metrics, they'll be available in Phoenix only in HBase 1.7 and 2.4 (since 2.3 is mostly frozen).
>
> Since Phoenix will be supporting earlier versions of both HBase branches for a good while, there will need to be a compatibility shim. And the older-version instance of the shim will probably need to access the classes directly. (Please correct me if I'm wrong, Rushabh or Andrew.) So it still might need a LimitedPrivate IA.
>
> 2. I agree with Nick that it's better to use LimitedPrivate.COPROC rather than LimitedPrivate.PHOENIX.
>
> Geoffrey

--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's decrepit hands
   - A23, Crosstalk
Re: [DISCUSS] Change the IA for MutableSizeHistogram and MutableTimeHistogram to LimitedPrivate
Couple points:

1. I like Andrew's proposed solution, and we should do it, but I'm not sure it's sufficient for Rushabh's purposes because of semver rules. Phoenix supports HBase 1.3-1.5 (soon to add 1.6) and HBase 2.0 (soon to gain 2.1 and 2.2, with 2.3 coming shortly after its release here). If we add the new sizeHistogram and timeHistogram methods to hbase-metrics, they'll be available in Phoenix only in HBase 1.7 and 2.4 (since 2.3 is mostly frozen).

Since Phoenix will be supporting earlier versions of both HBase branches for a good while, there will need to be a compatibility shim. And the older-version instance of the shim will probably need to access the classes directly. (Please correct me if I'm wrong, Rushabh or Andrew.) So it still might need a LimitedPrivate IA.

2. I agree with Nick that it's better to use LimitedPrivate.COPROC rather than LimitedPrivate.PHOENIX.

Geoffrey

On Thu, Jun 11, 2020 at 11:28 AM Josh Elser wrote:
> Sounds reasonable to me!
Re: [DISCUSS] Change the IA for MutableSizeHistogram and MutableTimeHistogram to LimitedPrivate
Sounds reasonable to me!

On 6/11/20 1:06 PM, Andrew Purtell wrote:
> hbase-metrics-api is available for coprocessors already and interfaces within are already LimitedPrivate(COPROC). However, that package is mostly interface and seems geared toward consuming metrics instantiated and registered via private stuff. Or, rather, I didn't see how Phoenix could choose which of MutableSizeHistogram and MutableTimeHistogram to instantiate using those interfaces; there is only Histogram MetricRegistry#histogram(String name). So I think it is also worth some time to review the utility of hbase-metrics-api and decide if more need be done there. Would the addition of
>
> Histogram MetricRegistry#sizeHistogram(String name)
> Histogram MetricRegistry#timeHistogram(String name)
>
> achieve the desired objective instead?
Re: [DISCUSS] Change the IA for MutableSizeHistogram and MutableTimeHistogram to LimitedPrivate
hbase-metrics-api is available for coprocessors already and interfaces within are already LimitedPrivate(COPROC). However, that package is mostly interface and seems geared toward consuming metrics instantiated and registered via private stuff. Or, rather, I didn't see how Phoenix could choose which of MutableSizeHistogram and MutableTimeHistogram to instantiate using those interfaces; there is only Histogram MetricRegistry#histogram(String name). So I think it is also worth some time to review the utility of hbase-metrics-api and decide if more need be done there. Would the addition of

Histogram MetricRegistry#sizeHistogram(String name)
Histogram MetricRegistry#timeHistogram(String name)

achieve the desired objective instead?

On Thu, Jun 11, 2020 at 9:16 AM Nick Dimiduk wrote:
> I was just about to reply with the same -- Josh is faster :) +1 on considering the full surface area of the APIs being exposed.
>
> I also wonder if exposing the metrics infrastructure is something of interest more broadly than Phoenix. Seems like any coprocessor might want to provide or monitor some metric value.

--
Best regards,
Andrew
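The facade addition floated in this message can be pictured with a self-contained Java sketch. Everything here is a hypothetical stand-in: the `Histogram` and `MetricRegistry` interfaces below only mirror the method names mentioned in the thread and are not the real hbase-metrics-api types, and `CountingHistogram` merely counts updates instead of doing size or time bucketing. The point is only that a caller picks the histogram flavor through the registry without touching an implementation class.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class RegistrySketch {
    // Hypothetical stand-in for a facade histogram interface.
    interface Histogram {
        void update(long value);
        long getCount();
    }

    // Hypothetical stand-in for the registry facade.
    interface MetricRegistry {
        Histogram histogram(String name);      // the existing method named in the thread
        Histogram sizeHistogram(String name);  // proposed addition
        Histogram timeHistogram(String name);  // proposed addition
    }

    enum Kind { PLAIN, SIZE, TIME }

    // Toy implementation: remembers its kind and counts updates.
    static final class CountingHistogram implements Histogram {
        final Kind kind;
        private final LongAdder count = new LongAdder();

        CountingHistogram(Kind kind) {
            this.kind = kind;
        }

        @Override
        public void update(long value) {
            count.increment();
        }

        @Override
        public long getCount() {
            return count.sum();
        }
    }

    static final class SimpleRegistry implements MetricRegistry {
        private final Map<String, CountingHistogram> metrics = new ConcurrentHashMap<>();

        // The first registration under a name wins; later calls return the same instance.
        private Histogram getOrCreate(String name, Kind kind) {
            return metrics.computeIfAbsent(name, n -> new CountingHistogram(kind));
        }

        @Override
        public Histogram histogram(String name) {
            return getOrCreate(name, Kind.PLAIN);
        }

        @Override
        public Histogram sizeHistogram(String name) {
            return getOrCreate(name, Kind.SIZE);
        }

        @Override
        public Histogram timeHistogram(String name) {
            return getOrCreate(name, Kind.TIME);
        }

        Kind kindOf(String name) {
            return metrics.get(name).kind;
        }
    }

    public static void main(String[] args) {
        SimpleRegistry registry = new SimpleRegistry();
        Histogram scanSize = registry.sizeHistogram("scanResultSize"); // hypothetical metric name
        scanSize.update(4096);
        System.out.println(registry.kindOf("scanResultSize") + " count=" + scanSize.getCount());
    }
}
```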
Re: [DISCUSS] Change the IA for MutableSizeHistogram and MutableTimeHistogram to LimitedPrivate
I was just about to reply with the same -- Josh is faster :) +1 on considering the full surface area of the APIs being exposed.

I also wonder if exposing the metrics infrastructure is something of interest more broadly than Phoenix. Seems like any coprocessor might want to provide or monitor some metric value.

On Thu, Jun 11, 2020 at 9:08 AM Josh Elser wrote:
> My only concern is that you can't just mark these two classes as LimitedPrivate for Phoenix -- you would also have to mark MutableRangeHistogram, MutableHistogram (and the rest of the class hierarchy) to make sure that we don't make it super confusing as to what comes from LimitedPrivate classes and what is coming from Private classes.
>
> Would it be better to just say: make ./hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/lib LimitedPrivate?
>
> Do you also need the stuff in hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase to push metrics back through the HBase metrics subsystem?
>
> Sorry for the late reply. Just want to make sure that if we open up the audience, we open it sufficiently.
Re: [DISCUSS] Change the IA for MutableSizeHistogram and MutableTimeHistogram to LimitedPrivate
My only concern is that you can't just mark these two classes as LimitedPrivate for Phoenix -- you would also have to mark MutableRangeHistogram, MutableHistogram (and the rest of the class hierarchy) to make sure that we don't make it super confusing as to what comes from LimitedPrivate classes and what is coming from Private classes.

Would it be better to just say: make ./hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/lib LimitedPrivate?

Do you also need the stuff in hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase to push metrics back through the HBase metrics subsystem?

Sorry for the late reply. Just want to make sure that if we open up the audience, we open it sufficiently.

On 6/8/20 1:15 PM, Rushabh Shah wrote:
> Hi,
> Currently the IA for MutableSizeHistogram and MutableTimeHistogram is private. We want to use these classes in the Phoenix project, and I thought we could leverage the existing HBase histogram implementation. IIUC the private IA can't be used in other projects. Proposing to make it LimitedPrivate and mark HBaseInterfaceAudience.PHOENIX. Please suggest.
> Related jira: https://issues.apache.org/jira/browse/HBASE-24520
[jira] [Created] (HBASE-24541) Add support to run LoadIncrementalHFiles in a distributed manner
Constantin-Catalin Luca created HBASE-24541:

Summary: Add support to run LoadIncrementalHFiles in a distributed manner
Key: HBASE-24541
URL: https://issues.apache.org/jira/browse/HBASE-24541
Project: HBase
Issue Type: Improvement
Components: mapreduce, Performance
Affects Versions: 1.4.0
Reporter: Constantin-Catalin Luca

LoadIncrementalHFiles takes a very long time to complete when running HBase on top of S3 and attempting to bulkload 500K-700K files. The root cause is a combination of the higher latency of S3 (compared to HDFS) and the calls LoadIncrementalHFiles makes to the underlying filesystem: each file is opened, a seek is made to the trailer offset at the end, and then the trailer is read.

Increasing the parallelism does not yield any significant improvement. This seems to stem from the fact that once the trailer is read, the stream is not consumed to the end. This causes the underlying HTTP connection to be aborted, so it cannot be re-used.

The proposed solution is to also add support for running LoadIncrementalHFiles on multiple machines as a map reduce job.
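The per-file access pattern described above (open, seek near the end, read a fixed-size tail) can be illustrated with a short Java sketch against a local file. This is not HBase or S3 code: `TrailerReadSketch`, the 7-byte trailer size, and the temp-file helper are hypothetical, and a local `RandomAccessFile` has none of the latency or connection-reuse behavior the issue is about. It only shows why every file costs at least one open plus one positioned read.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class TrailerReadSketch {
    // Opens the file, seeks to (length - trailerSize), and reads the tail.
    // If the file is shorter than trailerSize, the whole file is returned.
    static byte[] readTrailer(Path file, int trailerSize) {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            int n = (int) Math.min(trailerSize, length);
            raf.seek(length - n);
            byte[] trailer = new byte[n];
            raf.readFully(trailer);
            return trailer;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo helper: writes the given bytes to a temp file that is removed on exit.
    static Path writeTempFile(byte[] contents) {
        try {
            Path path = Files.createTempFile("trailer-sketch", ".bin");
            Files.write(path, contents);
            path.toFile().deleteOnExit();
            return path;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] contents = "block-block-block-TRAILER".getBytes(StandardCharsets.US_ASCII);
        Path file = writeTempFile(contents);
        byte[] trailer = readTrailer(file, 7); // 7 is an arbitrary demo trailer size
        System.out.println(new String(trailer, StandardCharsets.US_ASCII));
    }
}
```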
Re: [DISCUSS]HBase2.1.0 is slower than HBase1.2.0
Oh great. Thanks for pointing that out. I think that is the exact place where the perf bottleneck was found.

Regards
Ram

On Thu, Jun 11, 2020 at 4:29 PM 张铎(Duo Zhang) wrote:
> Oh, good. I recall that there is a related issue but I forgot the title so I could not find it...
>
> Thanks for chiming in.
Re: [DISCUSS]HBase2.1.0 is slower than HBase1.2.0
Oh, good. I recall that there is a related issue but I forgot the title so I could not find it...

Thanks for chiming in.

On Thu, Jun 11, 2020 at 6:39 PM OpenInx wrote:
> Hi Zheng wang.
>
> Hope this issue will be helpful for you.
> https://issues.apache.org/jira/browse/HBASE-21657
> Thanks.
Re: [DISCUSS]HBase2.1.0 is slower than HBase1.2.0
Hi Zheng wang.

Hope this issue will be helpful for you.
https://issues.apache.org/jira/browse/HBASE-21657
Thanks.

On Tue, Jun 9, 2020 at 5:53 PM Anoop John wrote:
> Thanks for the detailed analysis and update, zheng wang.
>
> > The code line below in StoreScanner.next() costs about 100ms in v2.1, and it was added in v2.0, see HBASE-17647.
>
> So there is still some additional cost in 2.1, right? Do you have any other observations? Are we doing more cell compares in 2.x?
>
> Anoop
>
> On Mon, Jun 8, 2020 at 1:50 AM zheng wang <18031...@qq.com> wrote:
> > Hi guys:
> >
> > I did some tests on my PC to find the reason, as Jan Van Besien mentioned in the user channel.
> >
> > #test env
> > OS : win10
> > JDK: 1.8
> > MEM: 8GB
> >
> > #test data:
> > 1 million rows with only one columnfamily and one qualifier.
> >
> > rowkey: rowkey-#index#
> > value: value-#index#
> >
> > #test method:
> > just use the client api to scan with default config several times, no pe, no ycsb
> >
> > #test result(avg):
> > v1.2.0: 800ms
> > v2.1.0: 1050ms
> >
> > So v2.1 is definitely slower than v1.2. After this, I collected some statistics on the regionserver and found that part of the reason is the size estimation.
> >
> > The code line below in StoreScanner.next() costs about 100ms in v2.1, and it was added in v2.0, see HBASE-17647.
> > "int cellSize = PrivateCellUtil.estimatedSerializedSizeOf(cell);"
> >
> > Should we support disabling the MaxResultSize limit (2MB by default now) to be more efficient when users know their data well and can limit results with setBatch and setLimit alone?
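As a back-of-the-envelope reading of the figures quoted above, the Java sketch below works out what a 100ms cost over the test data means per cell and what share of the measured gap it covers. It assumes one size estimate per row (the test data has one cell per row) and takes the 800ms, 1050ms, and 100ms figures at face value. `ScanOverhead` and its methods are hypothetical names.

```java
public class ScanOverhead {
    // Average cost per cell in nanoseconds for a total cost spread over `cells` cells.
    static double perCellNanos(long totalMillis, long cells) {
        return totalMillis * 1_000_000.0 / cells;
    }

    // Fraction of the (slow - fast) gap explained by a single cost item.
    static double shareOfGap(long costMillis, long slowMillis, long fastMillis) {
        return (double) costMillis / (slowMillis - fastMillis);
    }

    public static void main(String[] args) {
        long cells = 1_000_000L;   // 1 million rows, one cell each (from the test data)
        long v12Millis = 800L;     // reported v1.2.0 average
        long v21Millis = 1050L;    // reported v2.1.0 average
        long estimateMillis = 100L; // reported cost of the size-estimate line

        System.out.println("per-cell estimate cost (ns): "
            + perCellNanos(estimateMillis, cells));
        System.out.println("share of the v1.2 -> v2.1 gap: "
            + shareOfGap(estimateMillis, v21Millis, v12Millis));
    }
}
```

Under these assumptions the size estimate accounts for a part of the gap but not all of it, which matches Anoop's question about where the remaining cost comes from.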