[jira] [Commented] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).
[ https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611632#comment-16611632 ] stack commented on HBASE-21164: --- One minute is prompt enough for shutdown and a much better period than the current 3-second heartbeat. Let's start with one minute. Can go up later. Thanks > reportForDuty should do (exponential) backoff rather than retry every 3 > seconds (default). > -- > > Key: HBASE-21164 > URL: https://issues.apache.org/jira/browse/HBASE-21164 > Project: HBase > Issue Type: Improvement > Components: regionserver >Reporter: stack >Assignee: Mingliang Liu >Priority: Minor > Attachments: HBASE-21164.005.patch, HBASE-21164.branch-2.1.001.patch, > HBASE-21164.branch-2.1.002.patch, HBASE-21164.branch-2.1.003.patch, > HBASE-21164.branch-2.1.004.patch > > > RegionServers do reportForDuty on startup to tell the Master they are available. > If the Master is initializing, which can take a while on a big cluster, particularly > if something is amiss, the log every three seconds is > annoying and doesn't do anything of use. Do backoff on failure, up to a > reasonable maximum period. Here is an example: > {code} > 2018-09-06 14:01:39,312 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to > master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, > startcode=1536266763109 > 2018-09-06 14:01:39,312 WARN > org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; > sleeping and then retrying. > > {code} > For example, I am looking at a large cluster now that had a backlog of > procedure WALs. It is taking a couple of hours to recreate the procedure state > because there are millions of procedures outstanding. Meantime, the Master > log is just full of the above message -- every three seconds... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
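The capped backoff discussed above could be sketched as follows; this is an illustrative model only (the class name, method name, and constants are assumptions, not taken from the attached patches):

```java
// Illustrative sketch of capped exponential backoff for reportForDuty retries.
// Names and constants are assumptions, not taken from the attached patches.
public class ReportForDutyBackoff {
    static final long INITIAL_SLEEP_MS = 3_000L; // current default retry period
    static final long MAX_SLEEP_MS = 60_000L;    // cap at one minute, per the discussion

    // Sleep doubles on each consecutive failure, capped at MAX_SLEEP_MS.
    static long computeSleep(int failedAttempts) {
        long sleep = INITIAL_SLEEP_MS << Math.min(failedAttempts, 30); // bound the shift to avoid overflow
        return Math.min(sleep, MAX_SLEEP_MS);
    }
}
```

On the shutdown concern raised in the thread, the sleeper would still need to be woken from stop/abort so a regionserver does not linger for up to the full cap.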
[jira] [Commented] (HBASE-21181) Use the same filesystem for wal archive directory and wal directory
[ https://issues.apache.org/jira/browse/HBASE-21181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611631#comment-16611631 ] Hudson commented on HBASE-21181: Results for branch branch-2.1 [build #312 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/312/]: (/) *{color:green}+1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/312//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/312//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/312//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Use the same filesystem for wal archive directory and wal directory > --- > > Key: HBASE-21181 > URL: https://issues.apache.org/jira/browse/HBASE-21181 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.1.1 >Reporter: Tak Lon (Stephen) Wu >Assignee: Tak Lon (Stephen) Wu >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21181.master.001.patch > > > When {{hbase.wal.dir}} is set to a filesystem other than the one > used by rootDir, e.g. {{hbase.wal.dir}} set to > {{hdfs://something/wal}} and {{hbase.rootdir}} set to {{s3://something}}, > before this change the WAL archive directory ({{walArchiveDir}}) could not be > created, which failed the WALProcedureStore on the HMaster. > The issue is that the WAL archive directory was assumed to be collocated with > {{hbase.rootdir}}, creating a subdirectory under it; this logic needs to be > updated. 
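The idea of the fix, qualifying the archive path against the WAL directory's own filesystem (scheme/authority) rather than {{hbase.rootdir}}'s, can be illustrated with plain URIs. The class and method names below are hypothetical, not the patch's API:

```java
import java.net.URI;

// Hypothetical sketch: resolve the WAL archive subdirectory under the WAL
// directory itself, so the result inherits the WAL dir's filesystem scheme
// (hdfs://...) rather than the root dir's (s3://...). Names are illustrative.
public class WalArchivePath {
    static URI archiveDir(URI walDir, String archiveSubdir) {
        String base = walDir.toString();
        if (!base.endsWith("/")) {
            base += "/"; // treat walDir as a directory for resolve()
        }
        return URI.create(base).resolve(archiveSubdir);
    }
}
```

For example, resolving a hypothetical "oldWALs" subdirectory against {{hdfs://nn/wal}} keeps the hdfs scheme even when {{hbase.rootdir}} points at S3.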
[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost
[ https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611629#comment-16611629 ] stack commented on HBASE-21035: --- .002 is a hack on top of [~allan163]'s patch. I used it doing fixup on a cluster here, one where I had to remove procedure WALs because there were too many to process, and when done, a bunch of state needed repair. It includes some miscellaneous fixes, but the main thing is the assign of meta and of namespace so master startup can continue (without these, the master exits because it can't scan meta, and later the namespace table). I've also been playing around with a FixMetaProcedure that forces meta online. The tricky part is finding all the meta WALs and then doing the inline split. Trying to see if I should remove meta files when done, and trying to figure out how to fence off meta if it is already open. Will be back later. > Meta Table should be able to online even if all procedures are lost > --- > > Key: HBASE-21035 > URL: https://issues.apache.org/jira/browse/HBASE-21035 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-21035.branch-2.0.001.patch, > HBASE-21035.branch-2.1.001.patch > > > After HBASE-20708, we changed the way we init after the master starts. It will > only check WAL dirs and compare to ZooKeeper RS nodes to decide which servers > need to expire. For servers whose dir ends with 'SPLITTING', we ensure > that there will be an SCP for it. > But, if the server with the meta region crashed before the master restarts, and > if all the procedure WALs are lost (due to a bug, or deleted manually, > whatever), the newly restarted master will be stuck when initing, since no one > will bring the meta region online. > Although it is an anomalous case, I think no matter what happens, we need > to online the meta region. Otherwise, we are sitting ducks; nothing can be done.
[jira] [Updated] (HBASE-21035) Meta Table should be able to online even if all procedures are lost
[ https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-21035: -- Attachment: HBASE-21035.branch-2.1.001.patch > Meta Table should be able to online even if all procedures are lost > --- > > Key: HBASE-21035 > URL: https://issues.apache.org/jira/browse/HBASE-21035 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-21035.branch-2.0.001.patch, > HBASE-21035.branch-2.1.001.patch > > > After HBASE-20708, we changed the way we init after the master starts. It will > only check WAL dirs and compare to ZooKeeper RS nodes to decide which servers > need to expire. For servers whose dir ends with 'SPLITTING', we ensure > that there will be an SCP for it. > But, if the server with the meta region crashed before the master restarts, and > if all the procedure WALs are lost (due to a bug, or deleted manually, > whatever), the newly restarted master will be stuck when initing, since no one > will bring the meta region online. > Although it is an anomalous case, I think no matter what happens, we need > to online the meta region. Otherwise, we are sitting ducks; nothing can be done.
[jira] [Commented] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).
[ https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611622#comment-16611622 ] Mingliang Liu commented on HBASE-21164: --- Thanks for the comments, [~allan163] and [~stack]. Learning from the discussion. {quote} I think System.currentTimeMillis() is good enough, we don't need to be so acute when sleeping. {quote} The motivation is to avoid the effect of system time changes on elapsed time calculations. The system time can change because users change the time settings, and/or because of internet time sync. I agree that suffering a system time change does not have critical consequences, but it is better if we can avoid it. There is a similar effort in Hadoop; see [HDFS-6841]. One side effect I can "imagine" if the system time changes in our case is a spurious warning like the following in Sleeper. {code:java} if (slept - this.period > MINIMAL_DELTA_FOR_LOGGING) { LOG.warn("We slept " + slept + "ms instead of " + this.period + "ms, this is likely due to a long " + "garbage collecting pause and it's usually bad, see " + "http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired"); } {code} {quote} There is a facility to wake this.sleeper. Could call from stop/abort? {quote} Can do that. As long as it's after {{this.stopped = true;}}, the sleeper should respect that. {quote} Could just have a max of a minute or two. {quote} One minute seems an easier pill to swallow here? > reportForDuty should do (exponential) backoff rather than retry every 3 > seconds (default). 
> -- > > Key: HBASE-21164 > URL: https://issues.apache.org/jira/browse/HBASE-21164 > Project: HBase > Issue Type: Improvement > Components: regionserver >Reporter: stack >Assignee: Mingliang Liu >Priority: Minor > Attachments: HBASE-21164.005.patch, HBASE-21164.branch-2.1.001.patch, > HBASE-21164.branch-2.1.002.patch, HBASE-21164.branch-2.1.003.patch, > HBASE-21164.branch-2.1.004.patch > > > RegionServers do reportForDuty on startup to tell the Master they are available. > If the Master is initializing, which can take a while on a big cluster, particularly > if something is amiss, the log every three seconds is > annoying and doesn't do anything of use. Do backoff on failure, up to a > reasonable maximum period. Here is an example: > {code} > 2018-09-06 14:01:39,312 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to > master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, > startcode=1536266763109 > 2018-09-06 14:01:39,312 WARN > org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; > sleeping and then retrying. > > {code} > For example, I am looking at a large cluster now that had a backlog of > procedure WALs. It is taking a couple of hours to recreate the procedure state > because there are millions of procedures outstanding. Meantime, the Master > log is just full of the above message -- every three seconds...
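The motivation for avoiding System.currentTimeMillis() in elapsed-time math can be shown with a monotonic clock. A minimal sketch, not the actual Sleeper code:

```java
// Minimal sketch: measure elapsed time with a monotonic clock so that
// wall-clock changes (manual adjustment, NTP sync) cannot skew the result.
// Illustrative only; not the actual HBase Sleeper implementation.
public class MonotonicElapsed {
    // System.nanoTime() is monotonic; System.currentTimeMillis() is not.
    static long elapsedMs(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000L;
    }
}
```

A caller records `long start = System.nanoTime();` before sleeping and passes it back to `elapsedMs(start)` afterwards; the result cannot go negative or jump if the wall clock is reset mid-sleep.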
[jira] [Commented] (HBASE-21181) Use the same filesystem for wal archive directory and wal directory
[ https://issues.apache.org/jira/browse/HBASE-21181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611613#comment-16611613 ] Hudson commented on HBASE-21181: Results for branch branch-2 [build #1236 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1236/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1236//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1236//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1236//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Use the same filesystem for wal archive directory and wal directory > --- > > Key: HBASE-21181 > URL: https://issues.apache.org/jira/browse/HBASE-21181 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.1.1 >Reporter: Tak Lon (Stephen) Wu >Assignee: Tak Lon (Stephen) Wu >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21181.master.001.patch > > > when {{hbase.wal.dir}} is set to any filesystem other than the same > filesystem used by rootDir e.g. {{hbase.wal.dir}} set to > {{hdfs://something/wal}} and {{hbase.rootdir}} set to {{s3://something}}, > before this change, WAL archive directory ({{walArchiveDir}}) cannot be > created and failed the WALProcedureStore on HMaster. > The issue is that WAL archive directory was considered to be collocated with > {{hbase.rootdir}} and creates a subdirectory under it. this logic needs to be > updated. 
[jira] [Commented] (HBASE-21021) Result returned by Append operation should be ordered
[ https://issues.apache.org/jira/browse/HBASE-21021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611612#comment-16611612 ] Hudson commented on HBASE-21021: Results for branch branch-2 [build #1236 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1236/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1236//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1236//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1236//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Result returned by Append operation should be ordered > - > > Key: HBASE-21021 > URL: https://issues.apache.org/jira/browse/HBASE-21021 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.0, 1.5.0 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Major > Fix For: 3.0.0, 1.5.0, 2.2.0 > > Attachments: HBASE-21021.branch-1.001.patch, > HBASE-21021.master.001.patch > > > *Problem:* > The result returned by the append operation should be ordered. Currently, it > returns an unordered list, which may cause problems: if the user calls > Result.getValue(byte[] family, byte[] qualifier), the method may return null > even when the returned result has a value for (family, qualifier), because it > performs a binary search over the unsorted result (which should have been sorted). 
> > The result is enumerated by iterating over each entry of tempMemstore hashmap > (which will never be ordered) and adding the values (see > [HRegion.java#L7882|https://github.com/apache/hbase/blob/1b50fe53724aa62a242b7f64adf7845048df/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L7882]). > > *Actual:* The returned result is unordered > *Expected:* Similar to increment op, the returned result should be ordered. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
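The failure mode is generic to binary search: on an unsorted list, an element that is present can still be reported missing. A small illustration, independent of the HBase classes:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrates why a getValue-style lookup can return "not found" on an
// unsorted cell list: binary search assumes sorted input, so a present
// element can be missed. Stand-in strings, not real HBase cells.
public class UnsortedBinarySearch {
    static int searchUnsorted() {
        // Unsorted, like values collected by iterating a HashMap (tempMemstore).
        List<String> qualifiers = Arrays.asList("q3", "q2", "q1");
        return Collections.binarySearch(qualifiers, "q1"); // negative although "q1" is present
    }

    static int searchSorted() {
        List<String> qualifiers = Arrays.asList("q1", "q2", "q3"); // sorted, as expected
        return Collections.binarySearch(qualifiers, "q1");         // found at index 0
    }
}
```

Sorting the appended cells before building the Result, as the increment path does, restores the precondition that the binary search relies on.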
[jira] [Commented] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3
[ https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611614#comment-16611614 ] Hudson commented on HBASE-21098: Results for branch branch-2 [build #1236 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1236/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1236//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1236//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1236//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Improve Snapshot Performance with Temporary Snapshot Directory when rootDir > on S3 > - > > Key: HBASE-21098 > URL: https://issues.apache.org/jira/browse/HBASE-21098 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0, 1.4.8, 2.1.1 >Reporter: Tyler Mi >Priority: Major > Attachments: HBASE-21098.master.001.patch, > HBASE-21098.master.002.patch, HBASE-21098.master.003.patch, > HBASE-21098.master.004.patch, HBASE-21098.master.005.patch, > HBASE-21098.master.006.patch, HBASE-21098.master.007.patch, > HBASE-21098.master.008.patch, HBASE-21098.master.009.patch, > HBASE-21098.master.010.patch, HBASE-21098.master.011.patch, > HBASE-21098.master.012.patch, HBASE-21098.master.013.patch > > > When using Apache HBase, the snapshot feature can be used to make a point in > time recovery. 
To do this, HBase creates a manifest of all the files in all > of the Regions so that those files can be referenced again when a user > restores a snapshot. With HBase's S3 storage mode, developers can store their > data off-cluster on Amazon S3. However, utilizing S3 as a file system is > inefficient in some operations, namely renames. Most Hadoop ecosystem > applications use an atomic rename as a method of committing data. However, > with S3, a rename is a separate copy and then a delete of every file which is > no longer atomic and, in fact, quite costly. In addition, puts and deletes on > S3 have latency issues that traditional filesystems do not encounter when > manipulating the region snapshots to consolidate into a single manifest. When > HBase on S3 users have a significant amount of regions, puts, deletes, and > renames (the final commit stage of the snapshot) become the bottleneck > causing snapshots to take many minutes or even hours to complete. > The purpose of this patch is to increase the overall performance of snapshots > while utilizing HBase on S3 through the use of a temporary directory for the > snapshots that exists on a traditional filesystem like HDFS to circumvent the > bottlenecks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
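The rename bottleneck described above can be expressed as a rough operation-count model. The counts below are an assumption for exposition (S3 rename = one copy plus one delete per file; HDFS rename = a single metadata operation), not measured data:

```java
// Rough operation-count model of the snapshot commit step (an assumption for
// exposition, not measured data): on S3 a "rename" is a copy plus a delete
// for every file, while on HDFS a directory rename is one metadata operation.
public class SnapshotCommitCost {
    static int s3RenameOps(int files) {
        return 2 * files; // one copy + one delete per file; not atomic
    }

    static int hdfsRenameOps(int files) {
        return 1; // a single atomic metadata operation, independent of file count
    }
}
```

Under this model, staging the snapshot working directory on HDFS and committing to S3 once turns a per-file cost into a constant one, which is the design choice the patch makes.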
[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log
[ https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611603#comment-16611603 ] Hadoop QA commented on HBASE-21179: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange} 0m 0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 
{color} | || || || || {color:brown} branch-1 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 37s{color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 2m 35s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | 
{color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 2m 39s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 1m 36s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 21s{color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 9s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 16m 44s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 | | JIRA Issue | HBASE-21179 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939357/HBASE-21179.branch-1.001.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 7291770b600c
[jira] [Commented] (HBASE-21181) Use the same filesystem for wal archive directory and wal directory
[ https://issues.apache.org/jira/browse/HBASE-21181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611600#comment-16611600 ] Hudson commented on HBASE-21181: Results for branch branch-2.0 [build #802 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/802/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/802//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/802//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/802//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > Use the same filesystem for wal archive directory and wal directory > --- > > Key: HBASE-21181 > URL: https://issues.apache.org/jira/browse/HBASE-21181 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.1.1 >Reporter: Tak Lon (Stephen) Wu >Assignee: Tak Lon (Stephen) Wu >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21181.master.001.patch > > > when {{hbase.wal.dir}} is set to any filesystem other than the same > filesystem used by rootDir e.g. {{hbase.wal.dir}} set to > {{hdfs://something/wal}} and {{hbase.rootdir}} set to {{s3://something}}, > before this change, WAL archive directory ({{walArchiveDir}}) cannot be > created and failed the WALProcedureStore on HMaster. > The issue is that WAL archive directory was considered to be collocated with > {{hbase.rootdir}} and creates a subdirectory under it. this logic needs to be > updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning
[ https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611598#comment-16611598 ] Hadoop QA commented on HBASE-21097: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 1s{color} | {color:blue} The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for instructions. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 7s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 7s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 5s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 48s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 31s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 12m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}136m 23s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}184m 38s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.TestWALLockup | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21097 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939343/21097.v3.txt | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux c0dce5058377 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 5c1b325b51 | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC3 | | unit |
[jira] [Updated] (HBASE-21179) Fix the number of actions in responseTooSlow log
[ https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guangxu Cheng updated HBASE-21179: -- Attachment: HBASE-21179.branch-1.001.patch > Fix the number of actions in responseTooSlow log > > > Key: HBASE-21179 > URL: https://issues.apache.org/jira/browse/HBASE-21179 > Project: HBase > Issue Type: Bug > Components: rpc >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Major > Attachments: HBASE-21179.branch-1.001.patch, > HBASE-21179.master.001.patch, HBASE-21179.master.002.patch > > > {panel:title=responseTooSlow|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#CE} > 2018-09-10 16:13:53,022 WARN > [B.DefaultRpcServer.handler=209,queue=29,port=60020] ipc.RpcServer: > (responseTooSlow): > {"processingtimems":321262,"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest)","client":"127.0.0.1:56149","param":"region= > > tsdb,\\x00\\x00.[\\x89\\x1F\\xB0\\x00\\x00\\x01\\x00\\x01Y\\x00\\x00\\x02\\x00\\x00\\x04,1536133210446.7c752de470bd5558a001117b123a5db5., > {color:red}for 1 actions and 1st row{color} > key=\\x00\\x00.[\\x96\\x16p","starttimems":1536566911759,"queuetimems":0,"class":"HRegionServer","responsesize":2,"method":"Multi"} > {panel} > The responseTooSlow log is printed when the processing time of a request > exceeds the specified threshold. The number of actions and the contents of > the first rowkey in the request are included in the log. > However, the number of actions is inaccurate; it is actually the number > of regions that the request needs to visit. > As in the log above, users may mistakenly think that 321262 ms was spent > processing a single action, which is incredible, so we need to fix it.
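The miscount can be pictured with a stand-in for the multi request's per-region grouping; the structure and names below are illustrative, not the real ClientProtos types:

```java
import java.util.List;

// Stand-in for a multi request grouped by region (not the real ClientProtos
// types): the old log reported the number of region groups, not actions.
public class MultiActionCount {
    // Correct count: total actions summed across all region groups.
    static int totalActions(List<List<String>> regionActions) {
        return regionActions.stream().mapToInt(List::size).sum();
    }

    // What the log previously reported: one entry per region visited.
    static int regionCount(List<List<String>> regionActions) {
        return regionActions.size();
    }
}
```

For a request carrying many actions against a single region, the two counts diverge sharply, which is exactly the "for 1 actions" confusion in the log above.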
[jira] [Commented] (HBASE-21164) reportForDuty should do (expotential) backoff rather than retry every 3 seconds (default).
[ https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611595#comment-16611595 ] stack commented on HBASE-21164: --- bq. ...we will wait a lot of time waiting the regionserver to stop when shut down. There is a facility to wake this.sleeper. Could we call it from stop/abort? [~liuml07] Pardon my not giving this attention. I have been running with this in place and have been trying to capture a good example of it in action to show how this patch is useful. [~allan163] has a point though. We can't sleep too long, or it will take a while to notice a shutdown. We could just cap the backoff at a minute or two if waking this.sleeper isn't workable. Thanks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
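Editorial note: the capped exponential backoff being discussed (start at the 3-second default, double on each failed reportForDuty, cap at about a minute so a shutdown is still noticed promptly) can be sketched as below. The class name and constants are illustrative assumptions, not the actual HBASE-21164 patch.

```java
// Capped exponential backoff for reportForDuty retries.
// Assumed values: 3 s base (the current default heartbeat) and a 60 s cap,
// per the discussion above; not the actual patch.
class ReportForDutyBackoff {
    static final long BASE_SLEEP_MS = 3_000L;  // current default retry period
    static final long MAX_SLEEP_MS = 60_000L;  // cap so shutdown is noticed promptly

    // sleepMs = min(base * 2^attempt, cap)
    static long backoffMs(int attempt) {
        long sleep = BASE_SLEEP_MS << Math.min(attempt, 30); // clamp shift to avoid overflow
        return Math.min(sleep, MAX_SLEEP_MS);
    }
}
```

With these values the retry sleeps run 3 s, 6 s, 12 s, 24 s, 48 s, then stay at 60 s, so the warning is logged roughly once a minute instead of every three seconds.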
[jira] [Commented] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning
[ https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611593#comment-16611593 ] Hadoop QA commented on HBASE-21097: --- (/) +1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| 0 | patch | 0m 1s | The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for instructions. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 5m 43s | master passed |
| +1 | compile | 1m 51s | master passed |
| +1 | checkstyle | 1m 12s | master passed |
| +1 | shadedjars | 4m 17s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 6s | master passed |
| +1 | javadoc | 0m 35s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 5m 19s | the patch passed |
| +1 | compile | 1m 50s | the patch passed |
| +1 | javac | 1m 50s | the patch passed |
| +1 | checkstyle | 1m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 17s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 10m 52s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 16s | the patch passed |
| +1 | javadoc | 0m 31s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 139m 24s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 30s | The patch does not generate ASF License warnings. |
| | | 182m 37s | |
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21097 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939342/21097.v3.txt |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 45844407283c 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
| git revision | master / 5c1b325b51 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14394/testReport/ |
| Max. process+thread count | 4466 (vs. ulimit of 1) |
| modules |
[jira] [Commented] (HBASE-21182) Failed to execute start-hbase.sh
[ https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611592#comment-16611592 ] Toshihiro Suzuki commented on HBASE-21182: -- I can reproduce this issue locally. Digging. > Failed to execute start-hbase.sh > > > Key: HBASE-21182 > URL: https://issues.apache.org/jira/browse/HBASE-21182 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Subrat Mishra >Priority: Major > > Built the master branch as below: > {code:java} > mvn clean install -DskipTests{code} > Then tried to execute start-hbase.sh, which failed with NoClassDefFoundError: > {code:java} > ./bin/start-hbase.sh > Error: A JNI error has occurred, please check your installation and try again > Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector > at java.lang.Class.getDeclaredMethods0(Native Method) > at java.lang.Class.privateGetDeclaredMethods(Class.java:2701) > at java.lang.Class.privateGetMethodRecursive(Class.java:3048) > at java.lang.Class.getMethod0(Class.java:3018) > at java.lang.Class.getMethod(Class.java:1784) > at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544) > at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526) > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code} > Note: It worked after reverting HBASE-21153 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes
[ https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611578#comment-16611578 ] Allan Yang commented on HBASE-21185:
{code}
+          op.put("total_size_sum", new Long(cell.getFamilyLength()
+              + cell.getQualifierLength()
+              + cell.getRowLength()
+              + cell.getTagsLength()
+              + cell.getValueLength()));
         }
{code}
You forgot that there is a timestamp in cells, which takes 8 bytes. You should use a method from the CellUtil class instead of writing one yourself, for better encapsulation. For this case, you can use getSumOfCellElementLengths (after making it public first). > WALPrettyPrinter: Additional useful info to be printed by wal printer tool, > for debugability purposes > - > > Key: HBASE-21185 > URL: https://issues.apache.org/jira/browse/HBASE-21185 > Project: HBase > Issue Type: Improvement >Reporter: Wellington Chevreuil >Assignee: Wellington Chevreuil >Priority: Trivial > Attachments: HBASE-21185.master.001.patch > > > *WALPrettyPrinter* is very useful for troubleshooting WAL issues, such as > faulty replication sinks. A useful piece of information to track is > the size of a single WAL entry edit, as well as the size of each edit cell. I am > proposing a patch that adds calculations for these two, as well as an option to > seek straight to a given position in the WAL file being analysed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
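Editorial note: the review's point, illustrated. Summing only row, family, qualifier, value, and tags undercounts a KeyValue-style cell, because the cell key also carries an 8-byte timestamp (and a 1-byte type). The helper below is a hypothetical sketch, not the actual CellUtil.getSumOfCellElementLengths implementation.

```java
// Sum the sizes of a KeyValue-style cell's elements, including the fixed
// 8-byte timestamp and 1-byte type the patch under review omitted.
// Hypothetical names; not the HBase CellUtil implementation.
class CellSizeSketch {
    static final int TIMESTAMP_SIZE = 8; // long timestamp in the cell key
    static final int TYPE_SIZE = 1;      // key type byte

    static long sumOfCellElementLengths(int rowLen, int familyLen, int qualifierLen,
            int valueLen, int tagsLen) {
        return (long) rowLen + familyLen + qualifierLen + valueLen + tagsLen
                + TIMESTAMP_SIZE + TYPE_SIZE;
    }
}
```

A cell with a 3-byte row, 1-byte family, 2-byte qualifier, 10-byte value, and no tags then counts 25 bytes, not 16.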
[jira] [Commented] (HBASE-21172) Reimplement the retry backoff logic for ReopenTableRegionsProcedure
[ https://issues.apache.org/jira/browse/HBASE-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611574#comment-16611574 ] Hadoop QA commented on HBASE-21172: --- (x) -1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || branch-2.1 Compile Tests ||
| 0 | mvndep | 0m 24s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 4s | branch-2.1 passed |
| +1 | compile | 2m 2s | branch-2.1 passed |
| +1 | checkstyle | 1m 24s | branch-2.1 passed |
| +1 | shadedjars | 3m 32s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 18s | branch-2.1 passed |
| +1 | javadoc | 0m 41s | branch-2.1 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 4s | the patch passed |
| +1 | compile | 2m 5s | the patch passed |
| +1 | javac | 2m 5s | the patch passed |
| +1 | checkstyle | 0m 14s | hbase-procedure: The patch generated 0 new + 6 unchanged - 2 fixed = 6 total (was 8) |
| +1 | checkstyle | 1m 12s | The patch hbase-server passed checkstyle |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 3m 36s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 8m 48s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 45s | the patch passed |
| +1 | javadoc | 0m 45s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 55s | hbase-procedure in the patch passed. |
| -1 | unit | 132m 12s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 49s | The patch does not generate ASF License warnings. |
| | | 175m 2s | |
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController |
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21172 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939340/HBASE-21172-branch-2.1-v1.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 1a286a2e4df8 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27
[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log
[ https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611562#comment-16611562 ] Guangxu Cheng commented on HBASE-21179: --- {quote}I think this is a nice to have for branch-1 too (smile){quote} [~carp84] branch-1 must support JDK 1.7, and this patch uses JDK 1.8 syntax, so let me generate another patch for branch-1. Thanks -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21164) reportForDuty should do (expotential) backoff rather than retry every 3 seconds (default).
[ https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611556#comment-16611556 ] Allan Yang commented on HBASE-21164:
{code}
 while (keepLooping()) {
   RegionServerStartupResponse w = reportForDuty();
   if (w == null) {
-    LOG.warn("reportForDuty failed; sleeping and then retrying.");
-    this.sleeper.sleep();
+    long sleepTime = rc.getBackoffTimeAndIncrementAttempts();
+    LOG.warn("reportForDuty failed; sleeping {} ms and then retrying.", sleepTime);
+    this.sleeper.sleep(sleepTime);
   } else {
{code}
I don't think backing off here is a good idea. If the sleep time is too long, we will wait a long time for the regionserver to stop when shutting down. Another point: I think System.currentTimeMillis() is good enough; we don't need to be so precise about sleeping. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
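Editorial note: the shutdown concern raised above (a long backoff sleep must not delay noticing a stop/abort) is usually solved with a wakeable sleeper that stop/abort can interrupt. Below is a minimal hypothetical sketch of that idea; it is not the actual HBase Sleeper class.

```java
// A minimal interruptible sleeper: sleep(ms) waits up to ms milliseconds,
// but skipSleepCycle() (called from stop/abort) wakes it immediately.
// Hypothetical sketch; not the actual HBase Sleeper implementation.
class WakeableSleeper {
    private final Object lock = new Object();
    private volatile boolean stopped = false;

    void sleep(long ms) {
        long deadline = System.nanoTime() + ms * 1_000_000L;
        synchronized (lock) {
            while (!stopped) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return; // slept the full interval
                }
                try {
                    lock.wait(remainingMs); // loop handles spurious wakeups
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    // Called from stop/abort so even a minute-long backoff ends promptly.
    void skipSleepCycle() {
        synchronized (lock) {
            stopped = true;
            lock.notifyAll();
        }
    }
}
```

With this in place, the backoff can grow to a minute or more without making shutdown any slower.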
[jira] [Commented] (HBASE-21172) Reimplement the retry backoff logic for ReopenTableRegionsProcedure
[ https://issues.apache.org/jira/browse/HBASE-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611526#comment-16611526 ] Duo Zhang commented on HBASE-21172: --- Yes, I think branch-2.1 and branch-2.0 can share the same patch? > Reimplement the retry backoff logic for ReopenTableRegionsProcedure > --- > > Key: HBASE-21172 > URL: https://issues.apache.org/jira/browse/HBASE-21172 > Project: HBase > Issue Type: Sub-task > Components: amv2, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21172-branch-2.1-v1.patch, > HBASE-21172-branch-2.1-v1.patch, HBASE-21172-branch-2.1.patch, > HBASE-21172-v1.patch, HBASE-21172-v2.patch, HBASE-21172-v3.patch, > HBASE-21172-v4.patch, HBASE-21172.patch > > > Now we just do a blocking sleep in the execute method, and there is no > exponential backoff. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21172) Reimplement the retry backoff logic for ReopenTableRegionsProcedure
[ https://issues.apache.org/jira/browse/HBASE-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611522#comment-16611522 ] Allan Yang commented on HBASE-21172: Will this go to branch-2.0? I see you have 2.0.3 as a fix version. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost
[ https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611514#comment-16611514 ] Allan Yang commented on HBASE-21035: {quote} Oh, I should have mentioned, I removed all master proc WALs at one stage (too many, no tooling yet to fix STUCK assigns, each start up accumulated more WALs). That would explain "nothing to assign" and why hbase:meta had no assign. Now I'm in the state where Master exits because zk has a location for meta, we keep trying to go there but it will never be open at the zk location. {quote} That explains it, then. You hit the same case I did -- no one brings the meta table online if all procedures are lost. > Meta Table should be able to online even if all procedures are lost > --- > > Key: HBASE-21035 > URL: https://issues.apache.org/jira/browse/HBASE-21035 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-21035.branch-2.0.001.patch > > > After HBASE-20708, we changed the way we init after master starts. It will > only check WAL dirs and compare them to Zookeeper RS nodes to decide which servers > need to expire. For servers whose dir ends with 'SPLITTING', we ensure > that there will be an SCP for it. > But, if the server with the meta region crashed before master restarts, and > if all the procedure WALs are lost (due to a bug, or deleted manually, > whatever), the newly restarted master will be stuck while initializing, since no one > will bring the meta region online. > Although it is an anomalous case, I think that no matter what happens, we need > to bring the meta region online. Otherwise, we are sitting ducks; nothing can be done. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21164) reportForDuty should do (expotential) backoff rather than retry every 3 seconds (default).
[ https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611511#comment-16611511 ] Mingliang Liu commented on HBASE-21164: --- The V5 patch replaces {{System.currentTimeMillis()}} with {{Time.monotonicNow()}} in Sleeper, to avoid the effect of system time changes on elapsed-time calculations. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
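Editorial note: the motivation for the V5 change is that wall-clock time can jump when the system clock is adjusted (NTP step, manual change), which can make elapsed-time math go negative or overshoot, while a monotonic source only moves forward. A minimal sketch using System.nanoTime() is below; to my understanding Hadoop's Time.monotonicNow() is built the same way, though that is an assumption here.

```java
// Monotonic elapsed-time measurement: System.nanoTime() is unaffected by
// wall-clock adjustments, unlike System.currentTimeMillis().
// Illustrative sketch, not the Hadoop Time class itself.
class MonotonicClock {
    static long monotonicNowMs() {
        return System.nanoTime() / 1_000_000L;
    }

    static long elapsedMs(long startMs) {
        return monotonicNowMs() - startMs;
    }
}
```

Note that monotonicNowMs() values are only meaningful relative to each other within one JVM run; they are not wall-clock timestamps.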
[jira] [Updated] (HBASE-21164) reportForDuty should do (expotential) backoff rather than retry every 3 seconds (default).
[ https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HBASE-21164: -- Attachment: HBASE-21164.005.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log
[ https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611507#comment-16611507 ] Yu Li commented on HBASE-21179: --- I think this is a nice to have for branch-1 too (smile) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log
[ https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611504#comment-16611504 ] Guangxu Cheng commented on HBASE-21179: --- Pushed to branch-2+. Thanks [~yuzhih...@gmail.com] [~allan163] [~stack] [~carp84] for the review. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21174) [REST] Failed to parse empty qualifier in TableResource#getScanResource
[ https://issues.apache.org/jira/browse/HBASE-21174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611498#comment-16611498 ] Guangxu Cheng commented on HBASE-21174: --- Pushed to branch-2+. Thanks [~yuzhih...@gmail.com] for review. > [REST] Failed to parse empty qualifier in TableResource#getScanResource > --- > > Key: HBASE-21174 > URL: https://issues.apache.org/jira/browse/HBASE-21174 > Project: HBase > Issue Type: Bug > Components: REST >Affects Versions: 3.0.0, 2.2.0 >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Major > Attachments: HBASE-21174.master.001.patch, > HBASE-21174.master.002.patch > > > {code:xml} > GET /t1/*?column=f:c1=f: > {code} > If I want to get the values of 'f:' (empty qualifier) for all rows in the > table via the REST server, I will send the above request. However, this request > returns all column values. > {code:java|title=TableResource#getScanResource|borderStyle=solid} > for (String csplit : column) { > String[] familysplit = csplit.trim().split(":"); > if (familysplit.length == 2) { > if (familysplit[1].length() > 0) { > if (LOG.isTraceEnabled()) { > LOG.trace("Scan family and column : " + familysplit[0] + " " + > familysplit[1]); > } > tableScan.addColumn(Bytes.toBytes(familysplit[0]), > Bytes.toBytes(familysplit[1])); > } else { > tableScan.addFamily(Bytes.toBytes(familysplit[0])); > if (LOG.isTraceEnabled()) { > LOG.trace("Scan family : " + familysplit[0] + " and empty > qualifier."); > } > tableScan.addColumn(Bytes.toBytes(familysplit[0]), null); > } > } else if (StringUtils.isNotEmpty(familysplit[0])) { > if (LOG.isTraceEnabled()) { > LOG.trace("Scan family : " + familysplit[0]); > } > tableScan.addFamily(Bytes.toBytes(familysplit[0])); > } > } > {code} > With the above code, when the column has an empty qualifier, the empty > qualifier cannot be parsed correctly. In other words, 'f:' (empty qualifier) > and 'f' (column family) are treated as having the same meaning, which is > wrong.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
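The parsing pitfall described above comes from `String.split`, which drops trailing empty strings: `"f:".split(":")` yields only `["f"]`, so the empty qualifier is lost and the spec is treated like a bare family. A minimal sketch of index-based parsing that keeps the distinction (a hypothetical helper, not the actual HBASE-21174 patch):

```java
// Sketch: distinguish "f:c1" (family + qualifier), "f:" (family + empty
// qualifier) and "f" (whole family). Hypothetical helper class, not the
// code committed for HBASE-21174.
public class ColumnSpecParser {
    /** Returns {family, qualifier}; qualifier is null for a bare family. */
    public static String[] parse(String spec) {
        int idx = spec.indexOf(':');
        if (idx < 0) {
            return new String[] { spec, null };          // whole family
        }
        return new String[] { spec.substring(0, idx),
                              spec.substring(idx + 1) }; // may be ""
    }

    public static void main(String[] args) {
        if (!java.util.Arrays.equals(parse("f:c1"), new String[] { "f", "c1" }))
            throw new AssertionError();
        if (!java.util.Arrays.equals(parse("f:"), new String[] { "f", "" }))
            throw new AssertionError();
        if (!java.util.Arrays.equals(parse("f"), new String[] { "f", null }))
            throw new AssertionError();
        // The split-based parsing collapses "f:" into a family-only spec,
        // which is exactly the bug: trailing empty strings are discarded.
        if ("f:".split(":").length != 1)
            throw new AssertionError();
        System.out.println("ok");
    }
}
```

With the qualifier preserved as an empty string, the caller can issue `addColumn(family, HConstants.EMPTY_BYTE_ARRAY)` for `f:` and `addFamily(family)` only for a bare `f`.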
[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log
[ https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611490#comment-16611490 ] Yu Li commented on HBASE-21179: --- +1, good catch. > Fix the number of actions in responseTooSlow log > > > Key: HBASE-21179 > URL: https://issues.apache.org/jira/browse/HBASE-21179 > Project: HBase > Issue Type: Bug > Components: rpc >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Major > Attachments: HBASE-21179.master.001.patch, > HBASE-21179.master.002.patch > > > {panel:title=responseTooSlow|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#CE} > 2018-09-10 16:13:53,022 WARN > [B.DefaultRpcServer.handler=209,queue=29,port=60020] ipc.RpcServer: > (responseTooSlow): > {"processingtimems":321262,"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest)","client":"127.0.0.1:56149","param":"region= > > tsdb,\\x00\\x00.[\\x89\\x1F\\xB0\\x00\\x00\\x01\\x00\\x01Y\\x00\\x00\\x02\\x00\\x00\\x04,1536133210446.7c752de470bd5558a001117b123a5db5., > {color:red}for 1 actions and 1st row{color} > key=\\x00\\x00.[\\x96\\x16p","starttimems":1536566911759,"queuetimems":0,"class":"HRegionServer","responsesize":2,"method":"Multi"} > {panel} > The responseTooSlow log is printed when the processing time of a request > exceeds the specified threshold. The number of actions and the contents of > the first rowkey in the request are included in the log. > However, the number of actions is inaccurate; it is actually the number > of regions that the request needs to visit. > From the log above, users may mistakenly conclude that 321262ms was spent > processing a single action, which would be alarming, so we need to fix it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning
[ https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21097: -- Component/s: regionserver > Flush pressure assertion may fail in testFlushThroughputTuning > --- > > Key: HBASE-21097 > URL: https://issues.apache.org/jira/browse/HBASE-21097 > Project: HBase > Issue Type: Test > Components: regionserver >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: 21097.v1.txt, 21097.v2.txt, 21097.v3.txt, > HBASE-21097.patch > > > From > https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt > : > {code} > [ERROR] > testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController) > Time elapsed: 17.446 s <<< FAILURE! > java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6> > at > org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185) > {code} > Here is the related assertion: > {code} > assertEquals(0.0, regionServer.getFlushPressure(), EPSILON); > {code} > where EPSILON = 1E-6 > In the above case, due to margin of 2.9E-7, the assertion didn't pass. > It seems the epsilon can be adjusted to accommodate different workload / > hardware combination. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning
[ https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21097: -- Fix Version/s: 2.0.3 2.1.1 > Flush pressure assertion may fail in testFlushThroughputTuning > --- > > Key: HBASE-21097 > URL: https://issues.apache.org/jira/browse/HBASE-21097 > Project: HBase > Issue Type: Test > Components: regionserver >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: 21097.v1.txt, 21097.v2.txt, 21097.v3.txt, > HBASE-21097.patch > > > From > https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt > : > {code} > [ERROR] > testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController) > Time elapsed: 17.446 s <<< FAILURE! > java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6> > at > org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185) > {code} > Here is the related assertion: > {code} > assertEquals(0.0, regionServer.getFlushPressure(), EPSILON); > {code} > where EPSILON = 1E-6 > In the above case, due to margin of 2.9E-7, the assertion didn't pass. > It seems the epsilon can be adjusted to accommodate different workload / > hardware combination. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning
[ https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611472#comment-16611472 ] Duo Zhang commented on HBASE-21097: --- The reason is that we changed the way we calculate the flush pressure. After HBASE-15787 or HBASE-18294, we use heap size instead of data size to calculate the flush pressure, so it will never be zero and the assertion is useless. > Flush pressure assertion may fail in testFlushThroughputTuning > --- > > Key: HBASE-21097 > URL: https://issues.apache.org/jira/browse/HBASE-21097 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: 21097.v1.txt, 21097.v2.txt, 21097.v3.txt, > HBASE-21097.patch > > > From > https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt > : > {code} > [ERROR] > testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController) > Time elapsed: 17.446 s <<< FAILURE! > java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6> > at > org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185) > {code} > Here is the related assertion: > {code} > assertEquals(0.0, regionServer.getFlushPressure(), EPSILON); > {code} > where EPSILON = 1E-6 > In the above case, due to margin of 2.9E-7, the assertion didn't pass. > It seems the epsilon can be adjusted to accommodate different workload / > hardware combination. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21145) Backport "HBASE-21126 Add ability for HBase Canary to ignore a configurable number of ZooKeeper down nodes" to branch-2.1
[ https://issues.apache.org/jira/browse/HBASE-21145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21145: -- Summary: Backport "HBASE-21126 Add ability for HBase Canary to ignore a configurable number of ZooKeeper down nodes" to branch-2.1 (was: (2.1) Add ability for HBase Canary to ignore a configurable number of ZooKeeper down nodes) > Backport "HBASE-21126 Add ability for HBase Canary to ignore a configurable > number of ZooKeeper down nodes" to branch-2.1 > - > > Key: HBASE-21145 > URL: https://issues.apache.org/jira/browse/HBASE-21145 > Project: HBase > Issue Type: Improvement > Components: canary, Zookeeper >Affects Versions: 1.0.0, 3.0.0, 2.0.0 >Reporter: David Manning >Assignee: Duo Zhang >Priority: Minor > Fix For: 2.1.1 > > Attachments: HBASE-21126.branch-1.001.patch, > HBASE-21126.master.001.patch, HBASE-21126.master.002.patch, > HBASE-21126.master.003.patch, zookeeperCanaryLocalTestValidation.txt > > Original Estimate: 48h > Remaining Estimate: 48h > > When running org.apache.hadoop.hbase.tool.Canary with args -zookeeper > -treatFailureAsError, the Canary will try to get a znode from each ZooKeeper > server in the ensemble. If any server is unavailable or unresponsive, the > canary will exit with a failure code. > If we use the Canary to gauge server health, and alert accordingly, this can > be too strict. For example, in a 5-node ZooKeeper cluster, having one node > down is safe and expected in rolling upgrades/patches. > This is a request to allow the Canary to take another parameter > {code:java} > -permittedZookeeperFailures {code} > If N=1, in the 5-node ZooKeeper ensemble example, then the Canary will still > pass if 4 ZooKeeper nodes are reachable, but fail if 3 or fewer are reachable. > (This is my first Jira posting... sorry if I messed anything up.) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
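The pass/fail rule described in this ticket reduces to a simple threshold check. A sketch of that policy (a hypothetical helper mirroring the `-permittedZookeeperFailures` semantics from the description, not the committed Canary code):

```java
// Sketch: the ZooKeeper canary passes when the number of unreachable
// ensemble members stays within the permitted-failures budget.
public class ZkCanaryPolicy {
    static boolean passes(int reachable, int ensembleSize, int permittedFailures) {
        return (ensembleSize - reachable) <= permittedFailures;
    }

    public static void main(String[] args) {
        // 5-node ensemble with -permittedZookeeperFailures 1,
        // the example from the issue description:
        if (!passes(5, 5, 1)) throw new AssertionError(); // all up: pass
        if (!passes(4, 5, 1)) throw new AssertionError(); // one down: pass
        if (passes(3, 5, 1)) throw new AssertionError();  // two down: fail
        System.out.println("ok");
    }
}
```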
[jira] [Updated] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning
[ https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-21097: --- Attachment: (was: 21097.v3.txt) > Flush pressure assertion may fail in testFlushThroughputTuning > --- > > Key: HBASE-21097 > URL: https://issues.apache.org/jira/browse/HBASE-21097 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: 21097.v1.txt, 21097.v2.txt, 21097.v3.txt, > HBASE-21097.patch > > > From > https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt > : > {code} > [ERROR] > testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController) > Time elapsed: 17.446 s <<< FAILURE! > java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6> > at > org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185) > {code} > Here is the related assertion: > {code} > assertEquals(0.0, regionServer.getFlushPressure(), EPSILON); > {code} > where EPSILON = 1E-6 > In the above case, due to margin of 2.9E-7, the assertion didn't pass. > It seems the epsilon can be adjusted to accommodate different workload / > hardware combination. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21125) Backport 'HBASE-20942 Improve RpcServer TRACE logging' to branch-2.1
[ https://issues.apache.org/jira/browse/HBASE-21125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21125: -- Summary: Backport 'HBASE-20942 Improve RpcServer TRACE logging' to branch-2.1 (was: CLONE - Improve RpcServer TRACE logging) > Backport 'HBASE-20942 Improve RpcServer TRACE logging' to branch-2.1 > > > Key: HBASE-21125 > URL: https://issues.apache.org/jira/browse/HBASE-21125 > Project: HBase > Issue Type: Task > Components: Operability >Affects Versions: 2.1.0, 1.4.6 >Reporter: Esteban Gutierrez >Assignee: Duo Zhang >Priority: Major > Fix For: 2.1.1 > > Attachments: HBASE-20942.002.patch, HBASE-20942.003.patch, > HBASE-20942.004.patch, HBASE-20942.005.patch > > > Two things: > * We truncate RpcServer output to 1000 characters for trace logging. Would > be better if that value was configurable. > * There is the chance for an ArrayIndexOutOfBounds when truncating the TRACE > log message. > Esteban mentioned this to me earlier, so I'm crediting him as the reporter. > cc: [~elserj] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning
[ https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-21097: --- Attachment: 21097.v3.txt > Flush pressure assertion may fail in testFlushThroughputTuning > --- > > Key: HBASE-21097 > URL: https://issues.apache.org/jira/browse/HBASE-21097 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: 21097.v1.txt, 21097.v2.txt, 21097.v3.txt, > HBASE-21097.patch > > > From > https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt > : > {code} > [ERROR] > testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController) > Time elapsed: 17.446 s <<< FAILURE! > java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6> > at > org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185) > {code} > Here is the related assertion: > {code} > assertEquals(0.0, regionServer.getFlushPressure(), EPSILON); > {code} > where EPSILON = 1E-6 > In the above case, due to margin of 2.9E-7, the assertion didn't pass. > It seems the epsilon can be adjusted to accommodate different workload / > hardware combination. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21125) CLONE - Improve RpcServer TRACE logging
[ https://issues.apache.org/jira/browse/HBASE-21125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang resolved HBASE-21125. --- Resolution: Fixed Cherry picked to branch-2.1. Thanks [~elserj] and [~esteban]. > CLONE - Improve RpcServer TRACE logging > --- > > Key: HBASE-21125 > URL: https://issues.apache.org/jira/browse/HBASE-21125 > Project: HBase > Issue Type: Task > Components: Operability >Affects Versions: 2.1.0, 1.4.6 >Reporter: Esteban Gutierrez >Assignee: Duo Zhang >Priority: Major > Fix For: 2.1.1 > > Attachments: HBASE-20942.002.patch, HBASE-20942.003.patch, > HBASE-20942.004.patch, HBASE-20942.005.patch > > > Two things: > * We truncate RpcServer output to 1000 characters for trace logging. Would > be better if that value was configurable. > * There is the chance for an ArrayIndexOutOfBounds when truncating the TRACE > log message. > Esteban mentioned this to me earlier, so I'm crediting him as the reporter. > cc: [~elserj] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning
[ https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611457#comment-16611457 ] Duo Zhang commented on HBASE-21097: --- Please add a comment to mention that we used to have an assertion here and why we removed it. > Flush pressure assertion may fail in testFlushThroughputTuning > --- > > Key: HBASE-21097 > URL: https://issues.apache.org/jira/browse/HBASE-21097 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: 21097.v1.txt, 21097.v2.txt, 21097.v3.txt, > HBASE-21097.patch > > > From > https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt > : > {code} > [ERROR] > testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController) > Time elapsed: 17.446 s <<< FAILURE! > java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6> > at > org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185) > {code} > Here is the related assertion: > {code} > assertEquals(0.0, regionServer.getFlushPressure(), EPSILON); > {code} > where EPSILON = 1E-6 > In the above case, due to margin of 2.9E-7, the assertion didn't pass. > It seems the epsilon can be adjusted to accommodate different workload / > hardware combination. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning
[ https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-21097: --- Attachment: 21097.v3.txt > Flush pressure assertion may fail in testFlushThroughputTuning > --- > > Key: HBASE-21097 > URL: https://issues.apache.org/jira/browse/HBASE-21097 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: 21097.v1.txt, 21097.v2.txt, 21097.v3.txt, > HBASE-21097.patch > > > From > https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt > : > {code} > [ERROR] > testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController) > Time elapsed: 17.446 s <<< FAILURE! > java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6> > at > org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185) > {code} > Here is the related assertion: > {code} > assertEquals(0.0, regionServer.getFlushPressure(), EPSILON); > {code} > where EPSILON = 1E-6 > In the above case, due to margin of 2.9E-7, the assertion didn't pass. > It seems the epsilon can be adjusted to accommodate different workload / > hardware combination. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning
[ https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611447#comment-16611447 ] Duo Zhang commented on HBASE-21097: --- Let's just remove the assertion, since we now use heap size and it is never zero. > Flush pressure assertion may fail in testFlushThroughputTuning > --- > > Key: HBASE-21097 > URL: https://issues.apache.org/jira/browse/HBASE-21097 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: 21097.v1.txt, 21097.v2.txt, HBASE-21097.patch > > > From > https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt > : > {code} > [ERROR] > testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController) > Time elapsed: 17.446 s <<< FAILURE! > java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6> > at > org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185) > {code} > Here is the related assertion: > {code} > assertEquals(0.0, regionServer.getFlushPressure(), EPSILON); > {code} > where EPSILON = 1E-6 > In the above case, due to margin of 2.9E-7, the assertion didn't pass. > It seems the epsilon can be adjusted to accommodate different workload / > hardware combination. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
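The failure discussed in this thread is just the delta comparison that JUnit's `assertEquals(double, double, double)` performs: the observed pressure of ~1.29E-6 overshoots EPSILON = 1E-6 by about 2.9E-7. A standalone illustration of that comparison (not HBase code):

```java
// Sketch: reproduce the delta-style comparison behind the failing
// assertEquals(0.0, regionServer.getFlushPressure(), EPSILON) call.
public class EpsilonCheck {
    /** Same rule JUnit applies: pass when |expected - actual| <= delta. */
    static boolean nearlyEqual(double expected, double actual, double eps) {
        return Math.abs(expected - actual) <= eps;
    }

    public static void main(String[] args) {
        double eps = 1e-6; // EPSILON from the test
        // The observed value from the build log exceeds EPSILON, so it fails.
        if (nearlyEqual(0.0, 1.2906294173808417e-6, eps))
            throw new AssertionError("should have failed the delta check");
        // A value inside the margin would have passed.
        if (!nearlyEqual(0.0, 2.9e-7, eps))
            throw new AssertionError("should have passed the delta check");
        System.out.println("ok");
    }
}
```

Since heap-based flush pressure always includes memstore overhead, no epsilon bump reliably fixes this, which is why the thread settles on removing the assertion instead.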
[jira] [Updated] (HBASE-21172) Reimplement the retry backoff logic for ReopenTableRegionsProcedure
[ https://issues.apache.org/jira/browse/HBASE-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21172: -- Attachment: HBASE-21172-branch-2.1-v1.patch > Reimplement the retry backoff logic for ReopenTableRegionsProcedure > --- > > Key: HBASE-21172 > URL: https://issues.apache.org/jira/browse/HBASE-21172 > Project: HBase > Issue Type: Sub-task > Components: amv2, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21172-branch-2.1-v1.patch, > HBASE-21172-branch-2.1-v1.patch, HBASE-21172-branch-2.1.patch, > HBASE-21172-v1.patch, HBASE-21172-v2.patch, HBASE-21172-v3.patch, > HBASE-21172-v4.patch, HBASE-21172.patch > > > Now we just do a blocking sleep in the execute method, and there is no > exponential backoff. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
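A capped exponential backoff of the kind this sub-task (and HBASE-21164 above) calls for can be sketched as follows; the 3-second base and 10-minute cap are illustrative defaults, not values taken from the patch:

```java
// Sketch: capped exponential backoff for retry scheduling. Constants
// are illustrative; the actual patch wires this into the procedure
// framework's timeout machinery rather than sleeping in execute().
public class RetryBackoff {
    /** Delay before the given retry attempt, doubling up to maxMs. */
    public static long backoffMillis(int attempt, long baseMs, long maxMs) {
        // Clamp the shift so the multiplier cannot overflow a long.
        long backoff = baseMs * (1L << Math.min(attempt, 30));
        return Math.min(backoff, maxMs);
    }

    public static void main(String[] args) {
        long base = 3000, max = 600_000; // 3s base, 10min cap (illustrative)
        if (backoffMillis(0, base, max) != 3000) throw new AssertionError();
        if (backoffMillis(1, base, max) != 6000) throw new AssertionError();
        if (backoffMillis(20, base, max) != 600_000) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Compared with a fixed 3-second retry, the delay grows geometrically and then plateaus at the cap, so a master that takes hours to initialize sees a handful of log lines instead of one every three seconds.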
[jira] [Commented] (HBASE-21168) BloomFilterUtil uses hardcoded randomness
[ https://issues.apache.org/jira/browse/HBASE-21168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611407#comment-16611407 ] Andrew Purtell commented on HBASE-21168: +1 > BloomFilterUtil uses hardcoded randomness > - > > Key: HBASE-21168 > URL: https://issues.apache.org/jira/browse/HBASE-21168 > Project: HBase > Issue Type: Task >Affects Versions: 2.0.0 >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Minor > Fix For: 3.0.0 > > Attachments: HBASE-21168.master.001.patch, > HBASE-21168.master.002.patch > > > This was flagged by a Fortify scan and while it doesn't appear to be a real > issue, it's pretty easy to take care of anyway. > The hard coded rand can be moved to the test class that actually needs it to > make the static analysis happy. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19418) RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable.
[ https://issues.apache.org/jira/browse/HBASE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611394#comment-16611394 ] Hadoop QA commented on HBASE-19418: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s{color} | {color:red} HBASE-19418 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HBASE-19418 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939336/HBASE-19418.master.000.patch | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14392/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable. > - > > Key: HBASE-19418 > URL: https://issues.apache.org/jira/browse/HBASE-19418 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-alpha-4 >Reporter: Jean-Marc Spaggiari >Assignee: Ramie Raufdeen >Priority: Major > Attachments: HBASE-19418.master.000.patch > > > When RSs have a LOT of regions and CFs, flushing everything within 5 minutes > is not always doable. It might be interesting to be able to increase the > RANGE_OF_DELAY. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19418) RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable.
[ https://issues.apache.org/jira/browse/HBASE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramie Raufdeen updated HBASE-19418: --- Attachment: HBASE-19418.master.000.patch Status: Patch Available (was: In Progress) > RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable. > - > > Key: HBASE-19418 > URL: https://issues.apache.org/jira/browse/HBASE-19418 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-alpha-4 >Reporter: Jean-Marc Spaggiari >Assignee: Ramie Raufdeen >Priority: Major > Attachments: HBASE-19418.master.000.patch > > > When RSs have a LOT of regions and CFs, flushing everything within 5 minutes > is not always doable. It might be interesting to be able to increase the > RANGE_OF_DELAY. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19418) RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable.
[ https://issues.apache.org/jira/browse/HBASE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramie Raufdeen updated HBASE-19418: --- Attachment: (was: patch.diff) > RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable. > - > > Key: HBASE-19418 > URL: https://issues.apache.org/jira/browse/HBASE-19418 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-alpha-4 >Reporter: Jean-Marc Spaggiari >Assignee: Ramie Raufdeen >Priority: Major > Attachments: HBASE-19418.master.000.patch > > > When RSs have a LOT of regions and CFs, flushing everything within 5 minutes > is not always doable. It might be interesting to be able to increase the > RANGE_OF_DELAY. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19418) RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable.
[ https://issues.apache.org/jira/browse/HBASE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramie Raufdeen updated HBASE-19418: --- Status: In Progress (was: Patch Available) > RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable. > - > > Key: HBASE-19418 > URL: https://issues.apache.org/jira/browse/HBASE-19418 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-alpha-4 >Reporter: Jean-Marc Spaggiari >Assignee: Ramie Raufdeen >Priority: Major > Attachments: patch.diff > > > When RSs have a LOT of regions and CFs, flushing everything within 5 minutes > is not always doable. It might be interesting to be able to increase the > RANGE_OF_DELAY. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
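Making a hard-coded constant like RANGE_OF_DELAY configurable usually means reading it from the configuration with the old constant as the default. A sketch with `java.util.Properties` standing in for HBase's `Configuration`, and a hypothetical property key (the actual key chosen by the patch may differ):

```java
import java.util.Properties;

// Sketch: fall back to the historical 5-minute RANGE_OF_DELAY unless the
// operator overrides it. The property name below is hypothetical.
public class FlushDelayConfig {
    static final int DEFAULT_RANGE_OF_DELAY_MS = 5 * 60 * 1000; // old constant

    static int rangeOfDelayMs(Properties conf) {
        return Integer.parseInt(conf.getProperty(
            "hbase.regionserver.periodicflush.max.delay", // hypothetical key
            String.valueOf(DEFAULT_RANGE_OF_DELAY_MS)));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        if (rangeOfDelayMs(conf) != 300_000) throw new AssertionError();
        conf.setProperty("hbase.regionserver.periodicflush.max.delay", "600000");
        if (rangeOfDelayMs(conf) != 600_000) throw new AssertionError();
        System.out.println("ok");
    }
}
```

On clusters where flushing everything inside 5 minutes is not feasible, the operator can then widen the window without a recompile.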
[jira] [Assigned] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramie Raufdeen reassigned HBASE-18451: -- Assignee: (was: Ramie Raufdeen) > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Priority: Major > Attachments: HBASE-18451.master.patch > > > If you run a big job every 4 hours, impacting many tables (they have 150 > regions per server), at the end all the regions might have some data to be > flushed, and we want, after one hour, to trigger a periodic flush. That's > totally fine. > Now, to avoid a flush storm, when we detect a region to be flushed, we add a > "randomDelay" to the delayed flush, that way we spread them out. > RANGE_OF_DELAY is 5 minutes. So we spread the flushes over the next 5 minutes, > which is very good. > However, because we don't check if there is already a request in the queue, > 10 seconds later, we create a new request with a new randomDelay. > If you generate a randomDelay every 10 seconds, at some point you will end > up with a small one, and the flush will be triggered almost immediately. > As a result, instead of spreading all the flushes over the next 5 minutes, > you end up getting them all much more quickly, like within the first minute, > which not only feeds the queue with too many flush requests, but also defeats the > purpose of the randomDelay. 
> {code} > @Override > protected void chore() { > final StringBuffer whyFlush = new StringBuffer(); > for (Region r : this.server.onlineRegions.values()) { > if (r == null) continue; > if (((HRegion)r).shouldFlush(whyFlush)) { > FlushRequester requester = server.getFlushRequester(); > if (requester != null) { > long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + > MIN_DELAY_TIME; > LOG.info(getName() + " requesting flush of " + > r.getRegionInfo().getRegionNameAsString() + " because " + > whyFlush.toString() + > " after random delay " + randomDelay + "ms"); > //Throttle the flushes by putting a delay. If we don't throttle, > and there > //is a balanced write-load on the regions in a table, we might > end up > //overwhelming the filesystem with too many flushes at once. > requester.requestDelayedFlush(r, randomDelay, false); > } > } > } > } > {code} > {code} > 2017-07-24 18:44:33,338 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 270785ms > 2017-07-24 18:44:43,328 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 200143ms > 2017-07-24 18:44:53,954 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. 
because f > has an old edit so flush to free WALs after random delay 191082ms > 2017-07-24 18:45:03,528 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 92532ms > 2017-07-24 18:45:14,201 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 238780ms > 2017-07-24 18:45:24,195 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 35390ms > 2017-07-24 18:45:33,362 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore
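One way to avoid re-rolling the random delay every chore run, as the logs above show happening, is to track regions that already have a queued flush request and skip them until the flush completes. A sketch of that guard (hypothetical helper, not the attached patch):

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Sketch: only schedule one delayed flush per region at a time, so the
// first randomDelay drawn for a region is the one that actually runs.
public class FlushRequestGuard {
    private final Set<String> pending = new HashSet<>();
    private final Random rand = new Random(42); // fixed seed for the demo

    /** Returns the scheduled delay in ms, or -1 if a request is already queued. */
    long maybeRequestDelayedFlush(String regionName, int rangeOfDelayMs) {
        if (!pending.add(regionName)) {
            return -1; // already queued: keep the original random delay
        }
        return rand.nextInt(rangeOfDelayMs);
    }

    /** Called when the region's flush actually runs. */
    void onFlushCompleted(String regionName) {
        pending.remove(regionName);
    }

    public static void main(String[] args) {
        FlushRequestGuard guard = new FlushRequestGuard();
        if (guard.maybeRequestDelayedFlush("r1", 300_000) < 0)
            throw new AssertionError();
        // Second chore run, 10s later: no new request, no new randomDelay.
        if (guard.maybeRequestDelayedFlush("r1", 300_000) != -1)
            throw new AssertionError();
        guard.onFlushCompleted("r1");
        if (guard.maybeRequestDelayedFlush("r1", 300_000) < 0)
            throw new AssertionError();
        System.out.println("ok");
    }
}
```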
[jira] [Commented] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3
[ https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611372#comment-16611372 ] Zach York commented on HBASE-21098: --- Starting to commit to master, branch-2, and (hopefully) branch-1 > Improve Snapshot Performance with Temporary Snapshot Directory when rootDir > on S3 > - > > Key: HBASE-21098 > URL: https://issues.apache.org/jira/browse/HBASE-21098 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0, 1.4.8, 2.1.1 >Reporter: Tyler Mi >Priority: Major > Attachments: HBASE-21098.master.001.patch, > HBASE-21098.master.002.patch, HBASE-21098.master.003.patch, > HBASE-21098.master.004.patch, HBASE-21098.master.005.patch, > HBASE-21098.master.006.patch, HBASE-21098.master.007.patch, > HBASE-21098.master.008.patch, HBASE-21098.master.009.patch, > HBASE-21098.master.010.patch, HBASE-21098.master.011.patch, > HBASE-21098.master.012.patch, HBASE-21098.master.013.patch > > > When using Apache HBase, the snapshot feature can be used to make a point in > time recovery. To do this, HBase creates a manifest of all the files in all > of the Regions so that those files can be referenced again when a user > restores a snapshot. With HBase's S3 storage mode, developers can store their > data off-cluster on Amazon S3. However, utilizing S3 as a file system is > inefficient in some operations, namely renames. Most Hadoop ecosystem > applications use an atomic rename as a method of committing data. However, > with S3, a rename is a separate copy and then a delete of every file which is > no longer atomic and, in fact, quite costly. In addition, puts and deletes on > S3 have latency issues that traditional filesystems do not encounter when > manipulating the region snapshots to consolidate into a single manifest. 
When > HBase on S3 users have a significant number of regions, puts, deletes, and > renames (the final commit stage of the snapshot) become the bottleneck, > causing snapshots to take many minutes or even hours to complete. > The purpose of this patch is to increase the overall performance of snapshots > while utilizing HBase on S3 through the use of a temporary directory for the > snapshots that exists on a traditional filesystem like HDFS to circumvent the > bottlenecks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21181) Use the same filesystem for wal archive directory and wal directory
[ https://issues.apache.org/jira/browse/HBASE-21181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21181: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.0.3 2.1.1 2.2.0 3.0.0 Status: Resolved (was: Patch Available) > Use the same filesystem for wal archive directory and wal directory > --- > > Key: HBASE-21181 > URL: https://issues.apache.org/jira/browse/HBASE-21181 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.1.1 >Reporter: Tak Lon (Stephen) Wu >Assignee: Tak Lon (Stephen) Wu >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21181.master.001.patch > > > When {{hbase.wal.dir}} is set to a filesystem other than the > filesystem used by rootDir, e.g. {{hbase.wal.dir}} set to > {{hdfs://something/wal}} and {{hbase.rootdir}} set to {{s3://something}}, > before this change the WAL archive directory ({{walArchiveDir}}) could not be > created, which failed the WALProcedureStore on HMaster. > The issue is that the WAL archive directory was assumed to be collocated with > {{hbase.rootdir}} and created as a subdirectory under it; this logic needs to be > updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
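The mismatch this issue fixes can be illustrated with plain java.net.URI handling. A sketch only, not the actual HBase code (the `child` helper and the example paths are hypothetical): deriving the archive directory from {{hbase.rootdir}} puts it on the wrong filesystem, so the WAL filesystem can neither create it nor {{fs.rename()}} WALs into it.

```java
import java.net.URI;

// Illustrative sketch only -- not HBase code. Shows why deriving the WAL
// archive directory from hbase.rootdir breaks when hbase.wal.dir lives on a
// different filesystem: the archive lands on the root FS while the WALs are
// on the WAL FS, so creation and fs.rename() both fail across schemes.
public class WalArchiveDirSketch {

    // Hypothetical helper: joins a child directory under a base directory URI.
    static URI child(URI base, String name) {
        return URI.create(base.getScheme() + "://" + base.getAuthority()
            + base.getPath() + "/" + name);
    }

    public static void main(String[] args) {
        URI rootDir = URI.create("s3://bucket/hbase");        // hbase.rootdir
        URI walDir  = URI.create("hdfs://namenode/hbasewal"); // hbase.wal.dir

        // Pre-fix behavior (conceptually): archive resolved against rootDir,
        // so it ends up on S3 while the WALs live on HDFS.
        URI badArchive = child(rootDir, "oldWALs");
        // Post-fix behavior: archive resolved against walDir, same FS as WALs.
        URI goodArchive = child(walDir, "oldWALs");

        System.out.println("bad:  " + badArchive);  // s3://bucket/hbase/oldWALs
        System.out.println("good: " + goodArchive); // hdfs://namenode/hbasewal/oldWALs
    }
}
```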
[jira] [Commented] (HBASE-21173) Remove the duplicate HRegion#close in TestHRegion
[ https://issues.apache.org/jira/browse/HBASE-21173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611354#comment-16611354 ] Hudson commented on HBASE-21173: Results for branch branch-1.3 [build #463 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/463/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/463//General_Nightly_Build_Report/] (/) {color:green}+1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/463//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/463//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > Remove the duplicate HRegion#close in TestHRegion > - > > Key: HBASE-21173 > URL: https://issues.apache.org/jira/browse/HBASE-21173 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 3.0.0, 2.2.0 >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Minor > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8 > > Attachments: HBASE-21173.branch-1.001.patch, > HBASE-21173.master.001.patch, HBASE-21173.master.002.patch > > > After HBASE-21138, some test methods still have the duplicate > HRegion#close. So this issue was opened to remove the duplicate close. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21181) Use the same filesystem for wal archive directory and wal directory
[ https://issues.apache.org/jira/browse/HBASE-21181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611352#comment-16611352 ] Mingliang Liu commented on HBASE-21181: --- +1 (non-binding) {{walDir}} and {{walArchiveDir}} dir should be managed together by the same file system (FS). Otherwise, the WAL FS will [fail to create|https://github.com/apache/hbase/blob/b83613fdce7c5e57766417768b2fa1a13fcb4106/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/wal/WALProcedureStore.java#L229] a {{walArchiveDir}} which is resolved on root FS. Other than that, this patch makes sense for another more obvious reason: we archive the wal log files by {{fs.rename()}}. > Use the same filesystem for wal archive directory and wal directory > --- > > Key: HBASE-21181 > URL: https://issues.apache.org/jira/browse/HBASE-21181 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.1.1 >Reporter: Tak Lon (Stephen) Wu >Assignee: Tak Lon (Stephen) Wu >Priority: Major > Attachments: HBASE-21181.master.001.patch > > > when {{hbase.wal.dir}} is set to any filesystem other than the same > filesystem used by rootDir e.g. {{hbase.wal.dir}} set to > {{hdfs://something/wal}} and {{hbase.rootdir}} set to {{s3://something}}, > before this change, WAL archive directory ({{walArchiveDir}}) cannot be > created and failed the WALProcedureStore on HMaster. > The issue is that WAL archive directory was considered to be collocated with > {{hbase.rootdir}} and creates a subdirectory under it. this logic needs to be > updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21181) Use the same filesystem for wal archive directory and wal directory
[ https://issues.apache.org/jira/browse/HBASE-21181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611335#comment-16611335 ] Andrew Purtell commented on HBASE-21181: Minor nit: package scoped new method Path getWalArchiveDir() is missing @VisibleForTesting annotation, presume we want it. I'll add it during commit. +1 > Use the same filesystem for wal archive directory and wal directory > --- > > Key: HBASE-21181 > URL: https://issues.apache.org/jira/browse/HBASE-21181 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.1.1 >Reporter: Tak Lon (Stephen) Wu >Assignee: Tak Lon (Stephen) Wu >Priority: Major > Attachments: HBASE-21181.master.001.patch > > > when {{hbase.wal.dir}} is set to any filesystem other than the same > filesystem used by rootDir e.g. {{hbase.wal.dir}} set to > {{hdfs://something/wal}} and {{hbase.rootdir}} set to {{s3://something}}, > before this change, WAL archive directory ({{walArchiveDir}}) cannot be > created and failed the WALProcedureStore on HMaster. > The issue is that WAL archive directory was considered to be collocated with > {{hbase.rootdir}} and creates a subdirectory under it. this logic needs to be > updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21021) Result returned by Append operation should be ordered
[ https://issues.apache.org/jira/browse/HBASE-21021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21021: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.2.0 3.0.0 Release Note: This change ensures Append operations are assembled into the expected order. Status: Resolved (was: Patch Available) > Result returned by Append operation should be ordered > - > > Key: HBASE-21021 > URL: https://issues.apache.org/jira/browse/HBASE-21021 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.0, 1.5.0 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Major > Fix For: 3.0.0, 1.5.0, 2.2.0 > > Attachments: HBASE-21021.branch-1.001.patch, > HBASE-21021.master.001.patch > > > *Problem:* > The result returned by the append operation should be ordered. Currently, it > returns an unordered list, which may cause problems like if the user tries to > perform Result.getValue(byte[] family, byte[] qualifier), even if the > returned result has a value corresponding to (family, qualifier), the method > may return null as it performs a binary search over the unsorted result > (which should have been sorted actually). > > The result is enumerated by iterating over each entry of tempMemstore hashmap > (which will never be ordered) and adding the values (see > [HRegion.java#L7882|https://github.com/apache/hbase/blob/1b50fe53724aa62a242b7f64adf7845048df/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L7882]). > > *Actual:* The returned result is unordered > *Expected:* Similar to increment op, the returned result should be ordered. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
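The null-return failure mode described above can be reproduced in miniature with plain Java (not HBase's Result internals): a binary search over unsorted data can miss an element that is present, which is why Result.getValue() could return null for a (family, qualifier) that was actually in the Append's result.

```java
import java.util.Arrays;

// Miniature reproduction of the bug's mechanism -- plain Java, not HBase
// code: binary search assumes sorted input, so over cells enumerated from an
// unordered map it can report a present qualifier as missing.
public class UnsortedLookup {
    public static void main(String[] args) {
        String[] qualifiers = {"c", "a", "b"}; // map iteration order, unsorted

        // "c" is present at index 0, yet binary search returns a negative
        // "not found" insertion point because the array is not sorted.
        System.out.println(Arrays.binarySearch(qualifiers, "c"));

        Arrays.sort(qualifiers); // the fix: keep the returned cells ordered
        System.out.println(Arrays.binarySearch(qualifiers, "c")); // 2
    }
}
```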
[jira] [Issue Comment Deleted] (HBASE-21182) Failed to execute start-hbase.sh
[ https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal Jain updated HBASE-21182: --- Comment: was deleted (was: Shaded jars would be built only with -Prelease profile. I think your shaded jars are empty. Hence classes could not be found.) > Failed to execute start-hbase.sh > > > Key: HBASE-21182 > URL: https://issues.apache.org/jira/browse/HBASE-21182 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Subrat Mishra >Priority: Major > > Built master branch like below: > {code:java} > mvn clean install -DskipTests{code} > Then tried to execute start-hbase.sh failed with NoClassDefFoundError > {code:java} > ./bin/start-hbase.sh > Error: A JNI error has occurred, please check your installation and try again > Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector > at java.lang.Class.getDeclaredMethods0(Native Method) > at java.lang.Class.privateGetDeclaredMethods(Class.java:2701) > at java.lang.Class.privateGetMethodRecursive(Class.java:3048) > at java.lang.Class.getMethod0(Class.java:3018) > at java.lang.Class.getMethod(Class.java:1784) > at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544) > at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526) > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code} > Note: It worked after reverting HBASE-21153 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21182) Failed to execute start-hbase.sh
[ https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611293#comment-16611293 ] Nihal Jain commented on HBASE-21182: Shaded jars would be built only with the -Prelease profile. I think your shaded jars are empty; hence the classes could not be found. > Failed to execute start-hbase.sh > > > Key: HBASE-21182 > URL: https://issues.apache.org/jira/browse/HBASE-21182 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Subrat Mishra >Priority: Major > > Built master branch like below: > {code:java} > mvn clean install -DskipTests{code} > Then tried to execute start-hbase.sh, which failed with NoClassDefFoundError > {code:java} > ./bin/start-hbase.sh > Error: A JNI error has occurred, please check your installation and try again > Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector > at java.lang.Class.getDeclaredMethods0(Native Method) > at java.lang.Class.privateGetDeclaredMethods(Class.java:2701) > at java.lang.Class.privateGetMethodRecursive(Class.java:3048) > at java.lang.Class.getMethod0(Class.java:3018) > at java.lang.Class.getMethod(Class.java:1784) > at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544) > at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526) > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code} > Note: It worked after reverting HBASE-21153 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21021) Result returned by Append operation should be ordered
[ https://issues.apache.org/jira/browse/HBASE-21021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611281#comment-16611281 ] Andrew Purtell commented on HBASE-21021: Thanks for the patch. Let me try to commit after some local checks. > Result returned by Append operation should be ordered > - > > Key: HBASE-21021 > URL: https://issues.apache.org/jira/browse/HBASE-21021 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.0, 1.5.0 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Major > Fix For: 1.5.0 > > Attachments: HBASE-21021.branch-1.001.patch, > HBASE-21021.master.001.patch > > > *Problem:* > The result returned by the append operation should be ordered. Currently, it > returns an unordered list, which may cause problems like if the user tries to > perform Result.getValue(byte[] family, byte[] qualifier), even if the > returned result has a value corresponding to (family, qualifier), the method > may return null as it performs a binary search over the unsorted result > (which should have been sorted actually). > > The result is enumerated by iterating over each entry of tempMemstore hashmap > (which will never be ordered) and adding the values (see > [HRegion.java#L7882|https://github.com/apache/hbase/blob/1b50fe53724aa62a242b7f64adf7845048df/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L7882]). > > *Actual:* The returned result is unordered > *Expected:* Similar to increment op, the returned result should be ordered. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21184) Update site-wide references from http to https
[ https://issues.apache.org/jira/browse/HBASE-21184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611260#comment-16611260 ] Hadoop QA commented on HBASE-21184: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 13s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue} 2m 53s{color} | {color:blue} Shelldocs was not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1679 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 33s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 15s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 1s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 8m 7s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 21m 18s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 5m 17s{color} | {color:blue} branch has no errors when building the reference guide. 
See footer for rendered docs, which you should manually inspect. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 10s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hbase-checkstyle hbase-build-support hbase-annotations hbase-build-configuration hbase-resource-bundle hbase-testing-util hbase-shaded hbase-shaded/hbase-shaded-client hbase-shaded/hbase-shaded-client-byo-hadoop hbase-shaded/hbase-shaded-mapreduce hbase-spark-it hbase-assembly hbase-shaded/hbase-shaded-check-invariants hbase-shaded/hbase-shaded-with-hadoop-check-invariants hbase-archetypes hbase-archetypes/hbase-archetype-builder . {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 23s{color} | {color:blue} hbase-hadoop2-compat in master has 18 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 10m 41s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} scaladoc {color} | {color:green} 9m 7s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 4m 55s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 6m 1s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 6m 1s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 6m 1s{color} | {color:red} root in the patch failed. 
{color} | | {color:red}-1{color} | {color:red} scalac {color} | {color:red} 6m 1s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 8m 10s{color} | {color:red} root: The patch generated 1 new + 15294 unchanged - 1 fixed = 15295 total (was 15295) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 2m 25s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} perlcritic {color} | {color:green} 0m 4s{color} | {color:green} There were no new perlcritic issues. {color} | | {color:green}+1{color} | {color:green} pylint {color} | {color:green} 0m 21s{color} | {color:green} There were no new pylint issues. {color} | | {color:green}+1{color} | {color:green} rubocop {color} | {color:green} 8m 24s{color} | {color:green} The patch generated 0 new + 2778 unchanged - 1 fixed = 2778 total (was 2779)
[jira] [Commented] (HBASE-20542) Better heap utilization for IMC with MSLABs
[ https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611249#comment-16611249 ] Tak Lon (Stephen) Wu commented on HBASE-20542: -- should we mark this patch as resolved? Or are we still pending performance tests for IMC in 2.2.0 (even so, we should open a new Jira)? > Better heap utilization for IMC with MSLABs > --- > > Key: HBASE-20542 > URL: https://issues.apache.org/jira/browse/HBASE-20542 > Project: HBase > Issue Type: Sub-task > Components: in-memory-compaction >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-20542-addendum.master.005.patch, > HBASE-20542.branch-2.001.patch, HBASE-20542.branch-2.003.patch, > HBASE-20542.branch-2.004.patch, HBASE-20542.branch-2.005.patch, > HBASE-20542.master.003.patch, HBASE-20542.master.005-addendum.patch, run.sh, > workloada, workloadc, workloadx, workloady > > > Following HBASE-20188 we realized in-memory compaction combined with MSLABs > may suffer from heap under-utilization due to internal fragmentation. This > jira presents a solution to circumvent this problem. The main idea is to have > each update operation check if it will cause overflow in the active segment > *before* writing the new value (instead of checking the size after the > write is completed), and if so, the active segment is atomically > swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to > the compaction pipeline. Later on the IMC daemon will run its compaction > operation (flatten index/merge indices/data compaction) in the background. > Some subtle concurrency issues should be handled with care. We next elaborate > on them. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
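The check-before-write idea in the description can be sketched as follows. This is a hedged, single-file illustration under simplified assumptions, not HBase's CompactingMemStore: the pipeline push is elided and the size accounting is not fully race-free, which is exactly the "subtle concurrency issues" the description warns about.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hedged sketch, NOT HBase's CompactingMemStore: an update first checks
// whether it WOULD overflow the active segment; if so, the
// full-yet-not-overflowed segment is atomically swapped out (in the real
// system it is pushed to the compaction pipeline for the IMC daemon) and the
// write retries against a fresh empty segment, so no segment ever exceeds
// its bound.
public class PreWriteSwapSketch {
    static final long LIMIT = 100; // segment size bound in bytes (illustrative)

    public static class Segment {
        public final AtomicLong size = new AtomicLong();
    }

    private final AtomicReference<Segment> active =
        new AtomicReference<>(new Segment());

    // Writes one cell of the given size; returns the segment it landed in.
    public Segment write(long cellSize) {
        while (true) {
            Segment seg = active.get();
            if (seg.size.get() + cellSize <= LIMIT) {
                seg.size.addAndGet(cellSize); // simplified: not atomic with the check above
                return seg;
            }
            // Would overflow: CAS guarantees only one thread installs the
            // replacement; losers simply retry against the new segment.
            active.compareAndSet(seg, new Segment());
        }
    }

    public static void main(String[] args) {
        PreWriteSwapSketch memstore = new PreWriteSwapSketch();
        memstore.write(40);
        memstore.write(40);             // active segment now holds 80 bytes
        Segment s = memstore.write(40); // 80 + 40 > 100: swap, write into new
        System.out.println(s.size.get()); // 40
    }
}
```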
[jira] [Commented] (HBASE-21173) Remove the duplicate HRegion#close in TestHRegion
[ https://issues.apache.org/jira/browse/HBASE-21173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611201#comment-16611201 ] Hudson commented on HBASE-21173: Results for branch branch-1.4 [build #458 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/458/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/458//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/458//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/458//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > Remove the duplicate HRegion#close in TestHRegion > - > > Key: HBASE-21173 > URL: https://issues.apache.org/jira/browse/HBASE-21173 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 3.0.0, 2.2.0 >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Minor > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8 > > Attachments: HBASE-21173.branch-1.001.patch, > HBASE-21173.master.001.patch, HBASE-21173.master.002.patch > > > After HBASE-21138, some test methods still have the duplicate > HRegion#close. So this issue was opened to remove the duplicate close. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-15728) Add remaining per-table region / store / flush / compaction related metrics
[ https://issues.apache.org/jira/browse/HBASE-15728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611177#comment-16611177 ] Xu Cang commented on HBASE-15728: - branch-1 related work will be tracked in https://issues.apache.org/jira/browse/HBASE-21140, I think, since the original patch for branch-1 has been reverted and there are other pending issues that need to be addressed. > Add remaining per-table region / store / flush / compaction related metrics > > > Key: HBASE-15728 > URL: https://issues.apache.org/jira/browse/HBASE-15728 > Project: HBase > Issue Type: Sub-task > Components: metrics >Reporter: Enis Soztutar >Assignee: Xu Cang >Priority: Major > Fix For: 3.0.0, 1.5.0, 2.2.0 > > Attachments: HBASE-15728.branch-1.001.patch, > HBASE-15728.branch-1.001.patch, HBASE-15728.branch-1.002.patch, > HBASE-15728.branch-2.addendum.001.patch, HBASE-15728.master.001.patch, > HBASE-15728.master.addendum.001.patch, hbase-15728_v1.patch > > > Continuing on the work for per-table metrics, HBASE-15518 and HBASE-15671. > We need to add some remaining metrics at the per-table level, so that we will > have the same metrics reported at the per-regionserver, per-region and > per-table levels. > After this patch, most of the metrics at the RS and all of the per-region > level are also reported at the per-table level. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-15728) Add remaining per-table region / store / flush / compaction related metrics
[ https://issues.apache.org/jira/browse/HBASE-15728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611168#comment-16611168 ] Andrew Purtell commented on HBASE-15728: Any concerns about the addendum? There should be an addendum patch for branch-1 too, right? Although it will be a little different > Add remaining per-table region / store / flush / compaction related metrics > > > Key: HBASE-15728 > URL: https://issues.apache.org/jira/browse/HBASE-15728 > Project: HBase > Issue Type: Sub-task > Components: metrics >Reporter: Enis Soztutar >Assignee: Xu Cang >Priority: Major > Fix For: 3.0.0, 1.5.0, 2.2.0 > > Attachments: HBASE-15728.branch-1.001.patch, > HBASE-15728.branch-1.001.patch, HBASE-15728.branch-1.002.patch, > HBASE-15728.branch-2.addendum.001.patch, HBASE-15728.master.001.patch, > HBASE-15728.master.addendum.001.patch, hbase-15728_v1.patch > > > Continuing on the work for per-table metrics, HBASE-15518 and HBASE-15671. > We need to add some remaining metrics at the per-table level, so that we will > have the same metrics reported at the per-regionserver, per-region and > per-table levels. > After this patch, most of the metrics at the RS and all of the per-region > level are also reported at the per-table level. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21140) Backport 'HBASE-21136 NPE in MetricsTableSourceImpl.updateFlushTime' to branch-1 . (and backport HBASE-15728 for branch-1)
[ https://issues.apache.org/jira/browse/HBASE-21140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611166#comment-16611166 ] Andrew Purtell commented on HBASE-21140: bq. May I open a new Jira issue to track this effort: "backport new IdReadWriteLock implementation from master to branch-1" and mark it as a dependency for this Jira issue. Sure, please go ahead > Backport 'HBASE-21136 NPE in MetricsTableSourceImpl.updateFlushTime' to > branch-1 . (and backport HBASE-15728 for branch-1) > --- > > Key: HBASE-21140 > URL: https://issues.apache.org/jira/browse/HBASE-21140 > Project: HBase > Issue Type: Bug > Components: metrics >Reporter: Duo Zhang >Assignee: Xu Cang >Priority: Major > Fix For: 1.5.0 > > Attachments: > HBASE-21140.diff_against_cf198a65e8d704d28538c4c165a941b9e5bac678.branch-1.001.patch > > > There is no computeIfAbsent method on branch-1 as we still need to support > JDK7, so the fix will be different with branch-2+. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21172) Reimplement the retry backoff logic for ReopenTableRegionsProcedure
[ https://issues.apache.org/jira/browse/HBASE-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610993#comment-16610993 ] Hudson commented on HBASE-21172: Results for branch branch-2 [build #1234 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1234/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1234//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1234//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1234//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Reimplement the retry backoff logic for ReopenTableRegionsProcedure > --- > > Key: HBASE-21172 > URL: https://issues.apache.org/jira/browse/HBASE-21172 > Project: HBase > Issue Type: Sub-task > Components: amv2, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21172-branch-2.1-v1.patch, > HBASE-21172-branch-2.1.patch, HBASE-21172-v1.patch, HBASE-21172-v2.patch, > HBASE-21172-v3.patch, HBASE-21172-v4.patch, HBASE-21172.patch > > > Now we just do a blocking sleep in the execute method, and there is no > exponential backoff. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
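The capped exponential backoff being introduced here (and discussed for reportForDuty in HBASE-21164, where a one-minute maximum was agreed on) can be sketched as below. A hedged illustration, not the actual ReopenTableRegionsProcedure patch; the constants are illustrative.

```java
import java.util.concurrent.TimeUnit;

// Hedged sketch of capped exponential backoff -- not the actual
// ReopenTableRegionsProcedure code. Instead of a fixed blocking sleep in
// execute(), each retry doubles the delay, bounded by a maximum period.
public class BackoffSketch {
    static final long INITIAL_MS = 3_000; // the old fixed 3-second retry period
    static final long MAX_MS = TimeUnit.MINUTES.toMillis(1); // illustrative cap

    // Delay before the given 0-based retry attempt.
    static long backoffMs(int attempt) {
        long delay = INITIAL_MS << Math.min(attempt, 30); // clamp shift, avoid overflow
        return Math.min(delay, MAX_MS);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 7; i++) {
            // 3000, 6000, 12000, 24000, 48000, 60000, 60000
            System.out.println("attempt " + i + " -> " + backoffMs(i) + " ms");
        }
    }
}
```

With the cap, a Master that takes hours to initialize sees a log line at most once per minute instead of every three seconds, while shutdown responsiveness stays acceptable.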
[jira] [Updated] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes
[ https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wellington Chevreuil updated HBASE-21185: - Attachment: HBASE-21185.master.001.patch > WALPrettyPrinter: Additional useful info to be printed by wal printer tool, > for debugability purposes > - > > Key: HBASE-21185 > URL: https://issues.apache.org/jira/browse/HBASE-21185 > Project: HBase > Issue Type: Improvement >Reporter: Wellington Chevreuil >Assignee: Wellington Chevreuil >Priority: Trivial > Attachments: HBASE-21185.master.001.patch > > > *WALPrettyPrinter* is very useful for troubleshooting wal issues, such as > faulty replication sinks. A useful piece of information one might want to track is > the size of a single WAL entry edit, as well as the size of each edit cell. I am > proposing a patch that adds calculations for these two, as well as an option to > seek straight to a given position on the WAL file being analysed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes
[ https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wellington Chevreuil updated HBASE-21185: - Summary: WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes (was: WALPrettyPrinter: Additional useful infos to be printed by wal printer tool, for debugability purposes) > WALPrettyPrinter: Additional useful info to be printed by wal printer tool, > for debugability purposes > - > > Key: HBASE-21185 > URL: https://issues.apache.org/jira/browse/HBASE-21185 > Project: HBase > Issue Type: Improvement >Reporter: Wellington Chevreuil >Assignee: Wellington Chevreuil >Priority: Trivial > > *WALPrettyPrinter* is very useful for troubleshooting wal issues, such as > faulty replication sinks. A useful piece of information one might want to track is > the size of a single WAL entry edit, as well as the size of each edit cell. I am > proposing a patch that adds calculations for these two, as well as an option to > seek straight to a given position on the WAL file being analysed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21185) WALPrettyPrinter: Additional useful infos to be printed by wal printer tool, for debugability purposes
Wellington Chevreuil created HBASE-21185: Summary: WALPrettyPrinter: Additional useful infos to be printed by wal printer tool, for debugability purposes Key: HBASE-21185 URL: https://issues.apache.org/jira/browse/HBASE-21185 Project: HBase Issue Type: Improvement Reporter: Wellington Chevreuil Assignee: Wellington Chevreuil *WALPrettyPrinter* is very useful for troubleshooting wal issues, such as faulty replication sinks. A useful piece of information one might want to track is the size of a single WAL entry edit, as well as the size of each edit cell. I am proposing a patch that adds calculations for these two, as well as an option to seek straight to a given position on the WAL file being analysed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
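The per-edit size calculation described above reduces to summing the serialized sizes of the edit's cells. A minimal sketch with a hypothetical helper, not the actual WALPrettyPrinter patch:

```java
import java.util.Arrays;

// Minimal sketch (hypothetical helper, not the actual WALPrettyPrinter
// patch): the size of a WAL entry's edit is the sum of its cells' serialized
// sizes -- exactly the per-edit and per-cell figures the tool would print.
public class EditSizeSketch {

    static long editSize(long[] cellSizes) {
        return Arrays.stream(cellSizes).sum();
    }

    public static void main(String[] args) {
        long[] cellSizes = {128, 256, 64}; // hypothetical serialized cell sizes
        System.out.println("edit size: " + editSize(cellSizes) + " bytes"); // 448 bytes
    }
}
```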
[jira] [Commented] (HBASE-21184) Update site-wide references from http to https
[ https://issues.apache.org/jira/browse/HBASE-21184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610969#comment-16610969 ] Hadoop QA commented on HBASE-21184: --- (!) A patch to the testing environment has been detected. Re-executing against the patched versions to perform further tests. The console is at https://builds.apache.org/job/PreCommit-HBASE-Build/14391/console in case of problems. > Update site-wide references from http to https > -- > > Key: HBASE-21184 > URL: https://issues.apache.org/jira/browse/HBASE-21184 > Project: HBase > Issue Type: Task > Components: scripts, website >Affects Versions: 2.0.2 >Reporter: Misty Linville >Assignee: Misty Linville >Priority: Major > Attachments: HBASE-21184-1.patch > > > This is a naive approach! I basically replaced http:// with https:// > everywhere in plaintext and source files. I'm not on an appropriate host to > build this today so I will try it later, unless someone else wants to try it > first. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21184) Update site-wide references from http to https
[ https://issues.apache.org/jira/browse/HBASE-21184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Linville updated HBASE-21184: --- Attachment: HBASE-21184-1.patch > Update site-wide references from http to https > -- > > Key: HBASE-21184 > URL: https://issues.apache.org/jira/browse/HBASE-21184 > Project: HBase > Issue Type: Task > Components: scripts, website >Affects Versions: 2.0.2 >Reporter: Misty Linville >Assignee: Misty Linville >Priority: Major > Attachments: HBASE-21184-1.patch > > > This is a naive approach! I basically replaced http:// with https:// > everywhere in plaintext and source files. I'm not on an appropriate host to > build this today so I will try it later, unless someone else wants to try it > first. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21184) Update site-wide references from http to https
[ https://issues.apache.org/jira/browse/HBASE-21184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Linville updated HBASE-21184: --- Status: Patch Available (was: Open) > Update site-wide references from http to https > -- > > Key: HBASE-21184 > URL: https://issues.apache.org/jira/browse/HBASE-21184 > Project: HBase > Issue Type: Task > Components: scripts, website >Affects Versions: 2.0.2 >Reporter: Misty Linville >Assignee: Misty Linville >Priority: Major > Attachments: HBASE-21184-1.patch > > > This is a naive approach! I basically replaced http:// with https:// > everywhere in plaintext and source files. I'm not on an appropriate host to > build this today so I will try it later, unless someone else wants to try it > first. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21184) Update site-wide references from http to https
Misty Linville created HBASE-21184: -- Summary: Update site-wide references from http to https Key: HBASE-21184 URL: https://issues.apache.org/jira/browse/HBASE-21184 Project: HBase Issue Type: Task Components: scripts, website Affects Versions: 2.0.2 Reporter: Misty Linville Assignee: Misty Linville This is a naive approach! I basically replaced http:// with https:// everywhere in plaintext and source files. I'm not on an appropriate host to build this today so I will try it later, unless someone else wants to try it first. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21172) Reimplement the retry backoff logic for ReopenTableRegionsProcedure
[ https://issues.apache.org/jira/browse/HBASE-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610941#comment-16610941 ] Hadoop QA commented on HBASE-21172: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 48s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} branch-2.1 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 37s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 48s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 16s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 24s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 50s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 50s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} branch-2.1 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} hbase-procedure: The patch generated 0 new + 6 unchanged - 2 fixed = 6 total (was 8) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 5s{color} | {color:green} The patch hbase-server passed checkstyle {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 37s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 8m 51s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 58s{color} | {color:green} hbase-procedure in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}214m 44s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}260m 35s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController | | | hadoop.hbase.client.TestAsyncTableGetMultiThreaded | | | hadoop.hbase.client.TestFromClientSide | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 | | JIRA Issue | HBASE-21172 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939250/HBASE-21172-branch-2.1-v1.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck
[jira] [Updated] (HBASE-21183) loadincrementalHFiles sometimes throws FileNotFoundException on retry
[ https://issues.apache.org/jira/browse/HBASE-21183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Robertson updated HBASE-21183: -- Description: On a nightly batch job which prepares 100s of well-balanced HFiles at around 2GB each, we see sporadic failures in a bulk load. I'm unable to paste the logs here (different network) but they show e.g. the following on a failing day: {code:java} Trying to load hfile... /my/input/path/... Attempt to bulk load region containing ... failed. This is recoverable and will be retried Attempt to bulk load region containing ... failed. This is recoverable and will be retried Attempt to bulk load region containing ... failed. This is recoverable and will be retried Split occurred while grouping HFiles, retry attempt 1 with 3 files remaining to group or split Trying to load hfile... IOException during splitting java.io.FileNotFoundException: File does not exist: /my/input/path/... {code} The exception gets thrown from [this line|https://github.com/apache/hbase/blob/branch-1.2/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java#L685]. I should note that this is a secure cluster (CDH 5.12.x). I've tried to go through the code, and don't spot an obvious race condition. I don't spot any changes related to this for the later 1.x versions so I presume this exists in 1.5. I'm yet to get access to the NameNode audit logs when this occurs to trace through the rename() calls around these particular files. I don't see timeouts like HBASE-4030 was: On a nightly batch job which prepares 100s of well-balanced HFiles at around 2GB each, we see sporadic failures in a bulk load. I'm unable to paste the logs here (different network) but they show e.g. the following on a failing day: {code} Trying to load hfile... /my/input/path/... Attempt to bulk load region containing ... failed. This is recoverable and will be retried Attempt to bulk load region containing ... failed. 
This is recoverable and will be retried Attempt to bulk load region containing ... failed. This is recoverable and will be retried Split occurred while grouping HFiles, retry attempt 1 with 3 files remaining to group or split Trying to load hfile... IOException during splitting java.io.FileNotFoundException: File does not exist: /my/input/path/... {code} The exception gets thrown from [this line|https://github.com/apache/hbase/blob/branch-1.2/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java#L685]. I should note that this is a secure cluster (CDH 5.12.x). I've tried to go through the code, and don't spot an obvious race condition. I don't spot any changes related to this for the later 1.x versions so I presume this exists in 1.5. I'm yet to get access to the NameNode audit logs when this occurs to trace through the rename() calls around these particular files. > loadincrementalHFiles sometimes throws FileNotFoundException on retry > - > > Key: HBASE-21183 > URL: https://issues.apache.org/jira/browse/HBASE-21183 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Tim Robertson >Priority: Major > > On a nightly batch job which prepares 100s of well-balanced HFiles at around > 2GB each, we see sporadic failures in a bulk load. > I'm unable to paste the logs here (different network) but they show e.g. the > following on a failing day: > {code:java} > Trying to load hfile... /my/input/path/... > Attempt to bulk load region containing ... failed. This is recoverable and > will be retried > Attempt to bulk load region containing ... failed. This is recoverable and > will be retried > Attempt to bulk load region containing ... failed. This is recoverable and > will be retried > Split occurred while grouping HFiles, retry attempt 1 with 3 files remaining > to group or split > Trying to load hfile... > IOException during splitting > java.io.FileNotFoundException: File does not exist: /my/input/path/... 
> {code} > The exception gets thrown from [this > line|https://github.com/apache/hbase/blob/branch-1.2/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java#L685]. > > I should note that this is a secure cluster (CDH 5.12.x). > I've tried to go through the code, and don't spot an obvious race condition. > I don't spot any changes related to this for the later 1.x versions so I > presume this exists in 1.5. > I'm yet to get access to the NameNode audit logs when this occurs to trace > through the rename() calls around these particular files. > I don't see timeouts like HBASE-4030 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
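The failure mode reported above (a retry reusing a source path that an earlier attempt already renamed away) can be reproduced in miniature with plain java.nio.file. This is purely illustrative under that assumption; the class and file names are hypothetical and this is not LoadIncrementalHFiles code:

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: a retry that reuses a stale source path fails once an
// earlier attempt has already renamed the file. Not HBase code.
public class StaleRetryDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hfiles");
        Path src = Files.createFile(dir.resolve("hfile-0001"));

        // First attempt "succeeds": the loader moves the HFile away.
        Files.move(src, dir.resolve("hfile-0001.loaded"));

        // A retry that still points at the old location fails to reopen it.
        try (FileInputStream in = new FileInputStream(src.toFile())) {
            System.out.println("unexpected: file still present");
        } catch (FileNotFoundException e) {
            System.out.println("retry failed: FileNotFoundException");
        }
    }
}
```

Whether the real bug is exactly this shape would need the NameNode audit logs the reporter mentions; the sketch only shows why a stale path list turns into FileNotFoundException on retry.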
[jira] [Created] (HBASE-21183) loadincrementalHFiles sometimes throws FileNotFoundException on retry
Tim Robertson created HBASE-21183: - Summary: loadincrementalHFiles sometimes throws FileNotFoundException on retry Key: HBASE-21183 URL: https://issues.apache.org/jira/browse/HBASE-21183 Project: HBase Issue Type: Bug Affects Versions: 1.2.0 Reporter: Tim Robertson On a nightly batch job which prepares 100s of well-balanced HFiles at around 2GB each, we see sporadic failures in a bulk load. I'm unable to paste the logs here (different network) but they show e.g. the following on a failing day: {code} Trying to load hfile... /my/input/path/... Attempt to bulk load region containing ... failed. This is recoverable and will be retried Attempt to bulk load region containing ... failed. This is recoverable and will be retried Attempt to bulk load region containing ... failed. This is recoverable and will be retried Split occurred while grouping HFiles, retry attempt 1 with 3 files remaining to group or split Trying to load hfile... IOException during splitting java.io.FileNotFoundException: File does not exist: /my/input/path/... {code} The exception gets thrown from [this line|https://github.com/apache/hbase/blob/branch-1.2/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java#L685]. I should note that this is a secure cluster (CDH 5.12.x). I've tried to go through the code, and don't spot an obvious race condition. I don't spot any changes related to this for the later 1.x versions so I presume this exists in 1.5. I'm yet to get access to the NameNode audit logs when this occurs to trace through the rename() calls around these particular files. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21172) Reimplement the retry backoff logic for ReopenTableRegionsProcedure
[ https://issues.apache.org/jira/browse/HBASE-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610619#comment-16610619 ] Hudson commented on HBASE-21172: Results for branch master [build #485 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/485/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/485//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/485//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/485//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Reimplement the retry backoff logic for ReopenTableRegionsProcedure > --- > > Key: HBASE-21172 > URL: https://issues.apache.org/jira/browse/HBASE-21172 > Project: HBase > Issue Type: Sub-task > Components: amv2, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21172-branch-2.1-v1.patch, > HBASE-21172-branch-2.1.patch, HBASE-21172-v1.patch, HBASE-21172-v2.patch, > HBASE-21172-v3.patch, HBASE-21172-v4.patch, HBASE-21172.patch > > > Now we just do a blocking sleep in the execute method, and there is no > exponential backoff. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
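The HBASE-21172 description above says the procedure currently just does a blocking sleep in execute with no exponential backoff. A minimal sketch of capped exponential backoff follows; the class and method names are hypothetical and this is not the actual HBASE-21172 patch:

```java
// Hypothetical sketch of capped exponential backoff, as discussed in the
// issue; illustrative only, not the actual ReopenTableRegionsProcedure code.
public final class Backoff {

    /** Delay in millis for the given retry attempt: base * 2^attempt, capped. */
    public static long backoffMillis(long baseMillis, long maxMillis, int attempt) {
        // Clamp the shift so the multiplication cannot overflow for sane bases.
        long delay = baseMillis << Math.min(attempt, 30);
        return Math.min(delay, maxMillis);
    }

    public static void main(String[] args) {
        // With a 1s base and a 60s cap: 1000, 2000, 4000, ..., 60000, 60000.
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println(backoffMillis(1000L, 60000L, attempt));
        }
    }
}
```

A real procedure would likely also add jitter and persist the attempt count, so the delay survives a master restart instead of resetting to the base each time.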
[jira] [Commented] (HBASE-21158) Empty qualifier cell is always returned when using QualifierFilter
[ https://issues.apache.org/jira/browse/HBASE-21158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610617#comment-16610617 ] Hudson commented on HBASE-21158: Results for branch master [build #485 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/485/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/485//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/485//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/485//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Empty qualifier cell is always returned when using QualifierFilter > -- > > Key: HBASE-21158 > URL: https://issues.apache.org/jira/browse/HBASE-21158 > Project: HBase > Issue Type: Bug > Components: Filters >Affects Versions: 3.0.0, 2.2.0 >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3 > > Attachments: HBASE-21158.branch-1.001.patch, > HBASE-21158.master.001.patch, HBASE-21158.master.002.patch, > HBASE-21158.master.003.patch, HBASE-21158.master.004.patch > > > {code:xml} > hbase(main):002:0> put 'testTable','testrow','f:testcol1','testvalue1' > 0 row(s) in 0.0040 seconds > hbase(main):003:0> put 'testTable','testrow','f:','testvalue2' > 0 row(s) in 0.0070 seconds > # get row with empty column f:, result is correct. 
> hbase(main):004:0> scan 'testTable',{FILTER => "QualifierFilter (=, > 'binary:')"}
> ROW COLUMN+CELL
> testrow column=f:, timestamp=1536218563581, value=testvalue2
> 1 row(s) in 0.0460 seconds
> # get row with column f:testcol1, result is incorrect.
> hbase(main):005:0> scan 'testTable',{FILTER => "QualifierFilter (=, > 'binary:testcol1')"}
> ROW COLUMN+CELL
> testrow column=f:, timestamp=1536218563581, value=testvalue2
> testrow column=f:testcol1, timestamp=1536218550827, value=testvalue1
> 1 row(s) in 0.0070 seconds
> {code}
> As the operations above show, when the row contains an empty-qualifier column, the empty-qualifier cell is always returned when using QualifierFilter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21173) Remove the duplicate HRegion#close in TestHRegion
[ https://issues.apache.org/jira/browse/HBASE-21173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610618#comment-16610618 ] Hudson commented on HBASE-21173: Results for branch master [build #485 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/485/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/485//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/485//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/485//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Remove the duplicate HRegion#close in TestHRegion > - > > Key: HBASE-21173 > URL: https://issues.apache.org/jira/browse/HBASE-21173 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 3.0.0, 2.2.0 >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Minor > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8 > > Attachments: HBASE-21173.branch-1.001.patch, > HBASE-21173.master.001.patch, HBASE-21173.master.002.patch > > > After HBASE-21138, some test methods still have the duplicate > HRegion#close.So open this issue to remove the duplicate close -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-21182) Failed to execute start-hbase.sh
[ https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610592#comment-16610592 ] Sean Busbey edited comment on HBASE-21182 at 9/11/18 1:40 PM: -- please open these kinds of concerns on the [dev@hbase mailing list|https://lists.apache.org/list.html?d...@hbase.apache.org] if the automated build system doesn't show a problem. The nightly test that builds and runs an HBase instance on top of Hadoop 2 and Hadoop 3 passed with HBASE-21153 change in place, so something else is probably going on. https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/master/483/artifact/output-integration/ was (Author: busbey): please open these kinds of concerns on the [dev@hbase mailing list|https://lists.apache.org/list.html?d...@hbase.apache.org] if the automate build system doesn't show a problem. The nightly test that builds and runs an HBase instance on top of Hadoop 2 and Hadoop 3 passed with HBASE-21153 change in place, so something else is probably going on. 
https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/master/483/artifact/output-integration/ > Failed to execute start-hbase.sh > > > Key: HBASE-21182 > URL: https://issues.apache.org/jira/browse/HBASE-21182 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Subrat Mishra >Priority: Major > > Built master branch like below: > {code:java} > mvn clean install -DskipTests{code} > Then tried to execute start-hbase.sh failed with NoClassDefFoundError > {code:java} > ./bin/start-hbase.sh > Error: A JNI error has occurred, please check your installation and try again > Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector > at java.lang.Class.getDeclaredMethods0(Native Method) > at java.lang.Class.privateGetDeclaredMethods(Class.java:2701) > at java.lang.Class.privateGetMethodRecursive(Class.java:3048) > at java.lang.Class.getMethod0(Class.java:3018) > at java.lang.Class.getMethod(Class.java:1784) > at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544) > at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526) > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code} > Note: It worked after reverting HBASE-21153 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21182) Failed to execute start-hbase.sh
[ https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610592#comment-16610592 ] Sean Busbey commented on HBASE-21182: - please open these kinds of concerns on the [dev@hbase mailing list|https://lists.apache.org/list.html?d...@hbase.apache.org] if the automate build system doesn't show a problem. The nightly test that builds and runs an HBase instance on top of Hadoop 2 and Hadoop 3 passed with HBASE-21153 change in place, so something else is probably going on. https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/master/483/artifact/output-integration/ > Failed to execute start-hbase.sh > > > Key: HBASE-21182 > URL: https://issues.apache.org/jira/browse/HBASE-21182 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Subrat Mishra >Priority: Major > > Built master branch like below: > {code:java} > mvn clean install -DskipTests{code} > Then tried to execute start-hbase.sh failed with NoClassDefFoundError > {code:java} > ./bin/start-hbase.sh > Error: A JNI error has occurred, please check your installation and try again > Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector > at java.lang.Class.getDeclaredMethods0(Native Method) > at java.lang.Class.privateGetDeclaredMethods(Class.java:2701) > at java.lang.Class.privateGetMethodRecursive(Class.java:3048) > at java.lang.Class.getMethod0(Class.java:3018) > at java.lang.Class.getMethod(Class.java:1784) > at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544) > at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526) > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code} > Note: It worked after reverting HBASE-21153 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21172) Reimplement the retry backoff logic for ReopenTableRegionsProcedure
[ https://issues.apache.org/jira/browse/HBASE-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610575#comment-16610575 ] Hudson commented on HBASE-21172: Results for branch branch-2 [build #1233 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1233/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1233//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1233//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1233//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Reimplement the retry backoff logic for ReopenTableRegionsProcedure > --- > > Key: HBASE-21172 > URL: https://issues.apache.org/jira/browse/HBASE-21172 > Project: HBase > Issue Type: Sub-task > Components: amv2, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21172-branch-2.1-v1.patch, > HBASE-21172-branch-2.1.patch, HBASE-21172-v1.patch, HBASE-21172-v2.patch, > HBASE-21172-v3.patch, HBASE-21172-v4.patch, HBASE-21172.patch > > > Now we just do a blocking sleep in the execute method, and there is no > exponential backoff. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost
[ https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610550#comment-16610550 ] stack commented on HBASE-21035: --- Oh, I should have mentioned: I removed all master proc WALs at one stage (too many, and no tooling yet to fix STUCK assigns; each startup accumulated more WALs). That would explain "nothing to assign" and why hbase:meta had no assign. Now I'm in the state where Master exits because zk has a location for meta; we keep trying to go there, but it will never be open at the zk location. > Meta Table should be able to online even if all procedures are lost > --- > > Key: HBASE-21035 > URL: https://issues.apache.org/jira/browse/HBASE-21035 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-21035.branch-2.0.001.patch > > > After HBASE-20708, we changed the way we init after the master starts. It will > only check WAL dirs and compare them to Zookeeper RS nodes to decide which servers > need to expire. For servers whose dir ends with 'SPLITTING', we assume > that there will be an SCP for it. > But if the server with the meta region crashed before the master restarts, and > if all the procedure WALs are lost (due to a bug, or deleted manually, > whatever), the newly restarted master will be stuck when initing, since no one > will bring the meta region online. > Although it is an anomalous case, I think that no matter what happens, we need > to online the meta region. Otherwise, we are sitting ducks; nothing can be done. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost
[ https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610534#comment-16610534 ] stack commented on HBASE-21035: --- bq. I think Allan Yang has a valid point. I suppose they have no regions to assign. Let me try to do a better accounting. Thanks boys. > Meta Table should be able to online even if all procedures are lost > --- > > Key: HBASE-21035 > URL: https://issues.apache.org/jira/browse/HBASE-21035 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-21035.branch-2.0.001.patch > > > After HBASE-20708, we changed the way we init after the master starts. It will > only check WAL dirs and compare them to Zookeeper RS nodes to decide which servers > need to expire. For servers whose dir ends with 'SPLITTING', we assume > that there will be an SCP for it. > But if the server with the meta region crashed before the master restarts, and > if all the procedure WALs are lost (due to a bug, or deleted manually, > whatever), the newly restarted master will be stuck when initing, since no one > will bring the meta region online. > Although it is an anomalous case, I think that no matter what happens, we need > to online the meta region. Otherwise, we are sitting ducks; nothing can be done. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21172) Reimplement the retry backoff logic for ReopenTableRegionsProcedure
[ https://issues.apache.org/jira/browse/HBASE-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21172: -- Attachment: HBASE-21172-branch-2.1-v1.patch > Reimplement the retry backoff logic for ReopenTableRegionsProcedure > --- > > Key: HBASE-21172 > URL: https://issues.apache.org/jira/browse/HBASE-21172 > Project: HBase > Issue Type: Sub-task > Components: amv2, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21172-branch-2.1-v1.patch, > HBASE-21172-branch-2.1.patch, HBASE-21172-v1.patch, HBASE-21172-v2.patch, > HBASE-21172-v3.patch, HBASE-21172-v4.patch, HBASE-21172.patch > > > Now we just do a blocking sleep in the execute method, and there is no > exponential backoff. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21162) Revert suspicious change to BoundedByteBufferPool and disable use of direct buffers for IPC reservoir by default
[ https://issues.apache.org/jira/browse/HBASE-21162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610521#comment-16610521 ] Hudson commented on HBASE-21162: Results for branch branch-1 [build #455 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/455/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/455//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/455//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/455//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > Revert suspicious change to BoundedByteBufferPool and disable use of direct > buffers for IPC reservoir by default > > > Key: HBASE-21162 > URL: https://issues.apache.org/jira/browse/HBASE-21162 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.7 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Critical > Fix For: 1.5.0, 1.4.8 > > Attachments: HBASE-21162-branch-1.patch, HBASE-21162-branch-1.patch, > HBASE-21162-branch-1.patch > > > We had a production incident where we traced the issue to a direct buffer > leak. On a hunch we tried setting hbase.ipc.server.reservoir.enabled = false > and after that no native memory leak could be observed in any regionserver > process under the triggering load. > On HBASE-19239 (Fix findbugs and error-prone issues) I made a change to > BoundedByteBufferPool that is suspicious given this finding. It was committed > to branch-1.4 and branch-1. I'm going to revert this change. 
> In addition the allocation of direct memory for the server RPC reservoir is a > bit problematic in that tracing native memory or direct buffer leaks to a > particular class or compilation unit is difficult, so I also propose > allocating the reservoir on the heap by default instead. Should there be a > leak it is much easier to do an analysis of a heap dump with familiar tools > to find it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
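For operators hitting a similar native-memory leak, the workaround described in this report can be sketched as an hbase-site.xml fragment. This is only a sketch of the setting named in the comment above (hbase.ipc.server.reservoir.enabled), not a recommended default:

```xml
<!-- Workaround sketch, per the incident report above: disable the
     server-side IPC reservoir so no pooled (direct) buffers are used.
     Property name is the one quoted in the comment. -->
<property>
  <name>hbase.ipc.server.reservoir.enabled</name>
  <value>false</value>
</property>
```

As the report notes, the longer-term change proposed here is different: keep the reservoir but allocate it on the heap by default, so any leak shows up in an ordinary heap dump.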
[jira] [Commented] (HBASE-21158) Empty qualifier cell is always returned when using QualifierFilter
[ https://issues.apache.org/jira/browse/HBASE-21158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610522#comment-16610522 ] Hudson commented on HBASE-21158: Results for branch branch-1 [build #455 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/455/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/455//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/455//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/455//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > Empty qualifier cell is always returned when using QualifierFilter > -- > > Key: HBASE-21158 > URL: https://issues.apache.org/jira/browse/HBASE-21158 > Project: HBase > Issue Type: Bug > Components: Filters >Affects Versions: 3.0.0, 2.2.0 >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3 > > Attachments: HBASE-21158.branch-1.001.patch, > HBASE-21158.master.001.patch, HBASE-21158.master.002.patch, > HBASE-21158.master.003.patch, HBASE-21158.master.004.patch > > > {code:xml} > hbase(main):002:0> put 'testTable','testrow','f:testcol1','testvalue1' > 0 row(s) in 0.0040 seconds > hbase(main):003:0> put 'testTable','testrow','f:','testvalue2' > 0 row(s) in 0.0070 seconds > # get row with empty column f:, result is correct. 
> hbase(main):004:0> scan 'testTable',{FILTER => "QualifierFilter (=, 'binary:')"}
> ROW                 COLUMN+CELL
>  testrow            column=f:, timestamp=1536218563581, value=testvalue2
> 1 row(s) in 0.0460 seconds
> # get row with column f:testcol1, result is incorrect.
> hbase(main):005:0> scan 'testTable',{FILTER => "QualifierFilter (=, 'binary:testcol1')"}
> ROW                 COLUMN+CELL
>  testrow            column=f:, timestamp=1536218563581, value=testvalue2
>  testrow            column=f:testcol1, timestamp=1536218550827, value=testvalue1
> 1 row(s) in 0.0070 seconds
> {code}
> As the operations above show, when a row contains a cell with an empty qualifier, that empty-qualifier cell is always returned when using QualifierFilter.
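The misbehavior described above can be modeled outside HBase with a small Python sketch. This is a hypothetical simplification for illustration only, not the actual HBase filter code: a buggy qualifier filter that unconditionally lets empty-qualifier cells through, versus a fixed one that compares every qualifier against the target.

```python
# Hypothetical model of the bug described above -- not HBase code.
# Cells in one row are modeled as (qualifier, value) pairs.
row = [("", "testvalue2"), ("testcol1", "testvalue1")]

def buggy_qualifier_filter(cells, target):
    # Bug: a cell with an empty qualifier is always returned,
    # regardless of the qualifier the filter was asked to match.
    return [c for c in cells if c[0] == "" or c[0] == target]

def fixed_qualifier_filter(cells, target):
    # Fix: compare every qualifier against the target, empty or not.
    return [c for c in cells if c[0] == target]

# Filtering for 'testcol1' should not return the empty-qualifier cell.
print(buggy_qualifier_filter(row, "testcol1"))
print(fixed_qualifier_filter(row, "testcol1"))
```

Filtering for the empty qualifier itself ('binary:') still works in both versions, which matches the shell session above: only the non-empty-qualifier scan misbehaves.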
[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost
[ https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610520#comment-16610520 ] Duo Zhang commented on HBASE-21035: --- I think [~allan163] has a valid point. For SCP, if meta is not online, i.e., we haven't finished the loadMeta yet for AssignmentManager, then the SCP will be suspended. So it is a bit strange that all the SCPs are finished but the meta is still not online... > Meta Table should be able to online even if all procedures are lost > --- > > Key: HBASE-21035 > URL: https://issues.apache.org/jira/browse/HBASE-21035 > Project: HBase > Issue Type: Sub-task > Affects Versions: 2.1.0 > Reporter: Allan Yang > Assignee: Allan Yang > Priority: Major > Attachments: HBASE-21035.branch-2.0.001.patch > > > After HBASE-20708, we changed the way we init after the master starts. It will only check WAL dirs and compare them to the ZooKeeper RS nodes to decide which servers need to expire. For servers whose dir ends with 'SPLITTING', we assume that there will be an SCP for it. > But, if the server with the meta region crashed before the master restarts, and if all the procedure WALs are lost (due to a bug, or deleted manually, whatever), the newly restarted master will be stuck when initializing, since no one will bring the meta region online. > Although it is an anomalous case, I think that no matter what happens we need to online the meta region. Otherwise we are sitting ducks; nothing can be done.
[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log
[ https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610517#comment-16610517 ] stack commented on HBASE-21179: --- +1 > Fix the number of actions in responseTooSlow log > > Key: HBASE-21179 > URL: https://issues.apache.org/jira/browse/HBASE-21179 > Project: HBase > Issue Type: Bug > Components: rpc > Reporter: Guangxu Cheng > Assignee: Guangxu Cheng > Priority: Major > Attachments: HBASE-21179.master.001.patch, HBASE-21179.master.002.patch > > {panel:title=responseTooSlow|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#CE} > 2018-09-10 16:13:53,022 WARN [B.DefaultRpcServer.handler=209,queue=29,port=60020] ipc.RpcServer: (responseTooSlow): {"processingtimems":321262,"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest)","client":"127.0.0.1:56149","param":"region= tsdb,\\x00\\x00.[\\x89\\x1F\\xB0\\x00\\x00\\x01\\x00\\x01Y\\x00\\x00\\x02\\x00\\x00\\x04,1536133210446.7c752de470bd5558a001117b123a5db5., {color:red}for 1 actions and 1st row{color} key=\\x00\\x00.[\\x96\\x16p","starttimems":1536566911759,"queuetimems":0,"class":"HRegionServer","responsesize":2,"method":"Multi"} > {panel} > The responseTooSlow log is printed when the processing time of a request exceeds the specified threshold. The number of actions and the contents of the first rowkey in the request are included in the log. > However, the number of actions is inaccurate; it is actually the number of regions that the request needs to visit. > As in the log above, users may mistakenly think 321262 ms was spent processing a single action, which is misleading, so we need to fix it.
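The accounting error described in the report can be sketched in a few lines of Python. This is a hypothetical model of the bookkeeping, not the HBase RpcServer code: a multi request groups actions by region, and the buggy count reports regions where it should report total actions.

```python
# Hypothetical model of the responseTooSlow accounting fix -- not HBase code.
# A multi request is modeled as region -> list of actions for that region.
multi_request = {
    "region-A": ["put-1", "put-2", "put-3"],
    "region-B": ["put-4"],
}

def buggy_action_count(request):
    # Bug described above: this reports the number of regions visited,
    # which is what ended up in the "for N actions" log fragment.
    return len(request)

def fixed_action_count(request):
    # The log should instead report the total number of actions
    # summed across all regions in the request.
    return sum(len(actions) for actions in request.values())
```

With the sample request above, the buggy count says 2 (regions) while the fixed count says 4 (actions), which is the distinction the patch addresses.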
[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log
[ https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610509#comment-16610509 ] Hadoop QA commented on HBASE-21179: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange} 0m 0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 58s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 53s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 46s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 9m 44s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 58s{color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 8s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 35m 37s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21179 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939240/HBASE-21179.master.002.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux c08a22b395f7 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 2ab8122a24 | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC3 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14389/testReport/ | | Max. process+thread count | 292 (vs. ulimit of 1) | | modules | C: hbase-client U: hbase-client | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14389/console | |
[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost
[ https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610503#comment-16610503 ] stack commented on HBASE-21035: --- bq. stack, sir, I don't get it, how can SCP finish without meta online? No region to assign? It's hard to trace why on my big cluster, but we somehow lose accounting of meta after a bunch of crashing and restarting of the master. I've also been playing with the startup sequence, which probably messed things up (Masters do not progress beyond waitForMasterActive -- they don't seem to get to the run method). Anyways, I can get into a state where all SCPs are done but meta is not online (last night meta was in the OPENING state after all SCPs were done). If meta is not online and we can't scan it to loadMeta, then the Master shuts down after a minute. I'm working on having the Master hold before the first meta scan if it can't find an online meta, so the operator can insert an assign at least (I think we might just auto-assign if all SCPs are done and there is still no meta online). We need the 'hold' at least. Replaying all the WALs can take a while. It would be frustrating for the operator to watch hundreds of backed-up WALs replaying and then see the Master exit when done.
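The "hold" behavior stack describes -- waiting for meta to come online so an operator can intervene, instead of the Master exiting after a minute -- amounts to a bounded wait loop. A hypothetical sketch (all names are illustrative; this is not the Master's actual startup code):

```python
import time

# Hypothetical sketch of the "hold" discussed above -- not HBase code.
# Old behavior: give up after a fixed budget and shut down.
# Proposed behavior (max_attempts=None): keep waiting so an operator
# can step in and assign meta manually.
def wait_for_meta(is_meta_online, poll_secs=1.0, max_attempts=None):
    attempts = 0
    while not is_meta_online():
        attempts += 1
        if max_attempts is not None and attempts >= max_attempts:
            return False  # budget exhausted: caller shuts down
        time.sleep(poll_secs)
    return True  # meta came online; startup can proceed
```

The trade-off mirrored here is the one in the comment: a bounded wait can exit just as hours of WAL replay finish, while an unbounded hold leaves the decision to the operator.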
[jira] [Commented] (HBASE-21162) Revert suspicious change to BoundedByteBufferPool and disable use of direct buffers for IPC reservoir by default
[ https://issues.apache.org/jira/browse/HBASE-21162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610491#comment-16610491 ] Hudson commented on HBASE-21162: Results for branch branch-1.4 [build #457 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/457/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/457//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/457//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/457//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details.
[jira] [Commented] (HBASE-21158) Empty qualifier cell is always returned when using QualifierFilter
[ https://issues.apache.org/jira/browse/HBASE-21158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610492#comment-16610492 ] Hudson commented on HBASE-21158: Results for branch branch-1.4 [build #457 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/457/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/457//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/457//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/457//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details.
[jira] [Commented] (HBASE-21173) Remove the duplicate HRegion#close in TestHRegion
[ https://issues.apache.org/jira/browse/HBASE-21173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610469#comment-16610469 ] Hudson commented on HBASE-21173: SUCCESS: Integrated in Jenkins build HBase-1.3-IT #474 (See [https://builds.apache.org/job/HBase-1.3-IT/474/]) HBASE-21173 Remove the duplicate HRegion#close in TestHRegion (guangxucheng: rev b7dfb7462ce51f49b38bae5cadeb076805e6cfae) * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java > Remove the duplicate HRegion#close in TestHRegion > - > > Key: HBASE-21173 > URL: https://issues.apache.org/jira/browse/HBASE-21173 > Project: HBase > Issue Type: Bug > Components: test > Affects Versions: 3.0.0, 2.2.0 > Reporter: Guangxu Cheng > Assignee: Guangxu Cheng > Priority: Minor > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8 > > Attachments: HBASE-21173.branch-1.001.patch, HBASE-21173.master.001.patch, HBASE-21173.master.002.patch > > > After HBASE-21138, some test methods still have the duplicate HRegion#close. So open this issue to remove the duplicate close.
[jira] [Commented] (HBASE-21162) Revert suspicious change to BoundedByteBufferPool and disable use of direct buffers for IPC reservoir by default
[ https://issues.apache.org/jira/browse/HBASE-21162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610464#comment-16610464 ] Hudson commented on HBASE-21162: Results for branch branch-1 [build #456 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/456/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/456//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/456//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/456//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details.
[jira] [Commented] (HBASE-21158) Empty qualifier cell is always returned when using QualifierFilter
[ https://issues.apache.org/jira/browse/HBASE-21158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610465#comment-16610465 ] Hudson commented on HBASE-21158: Results for branch branch-1 [build #456 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/456/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/456//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/456//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/456//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details.
[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log
[ https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610443#comment-16610443 ] Allan Yang commented on HBASE-21179: +1
[jira] [Commented] (HBASE-21158) Empty qualifier cell is always returned when using QualifierFilter
[ https://issues.apache.org/jira/browse/HBASE-21158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610441#comment-16610441 ] Hudson commented on HBASE-21158: Results for branch branch-1.3 [build #462 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/462/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/462//General_Nightly_Build_Report/] (/) {color:green}+1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/462//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/462//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details.
[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log
[ https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610439#comment-16610439 ] Guangxu Cheng commented on HBASE-21179: --- bq. The "actions" can be written as "action(s)" 002 addressed this. Thanks [~yuzhih...@gmail.com]
[jira] [Updated] (HBASE-21179) Fix the number of actions in responseTooSlow log
[ https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guangxu Cheng updated HBASE-21179: -- Attachment: HBASE-21179.master.002.patch > Fix the number of actions in responseTooSlow log > > > Key: HBASE-21179 > URL: https://issues.apache.org/jira/browse/HBASE-21179 > Project: HBase > Issue Type: Bug > Components: rpc >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Major > Attachments: HBASE-21179.master.001.patch, > HBASE-21179.master.002.patch > > > {panel:title=responseTooSlow|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#CE} > 2018-09-10 16:13:53,022 WARN > [B.DefaultRpcServer.handler=209,queue=29,port=60020] ipc.RpcServer: > (responseTooSlow): > {"processingtimems":321262,"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest)","client":"127.0.0.1:56149","param":"region= > > tsdb,\\x00\\x00.[\\x89\\x1F\\xB0\\x00\\x00\\x01\\x00\\x01Y\\x00\\x00\\x02\\x00\\x00\\x04,1536133210446.7c752de470bd5558a001117b123a5db5., > {color:red}for 1 actions and 1st row{color} > key=\\x00\\x00.[\\x96\\x16p","starttimems":1536566911759,"queuetimems":0,"class":"HRegionServer","responsesize":2,"method":"Multi"} > {panel} > The responseTooSlow log is printed when the processing time of a request > exceeds the specified threshold. The number of actions and the contents of > the first rowkey in the request are included in the log. > However, the number of actions is inaccurate: it is actually the number > of regions that the request needs to visit. > As in the log above, users may mistakenly conclude that a single action took > 321262 ms to process, which looks implausible, so we need to fix it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21158) Empty qualifier cell is always returned when using QualifierFilter
[ https://issues.apache.org/jira/browse/HBASE-21158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610428#comment-16610428 ] Hudson commented on HBASE-21158: Results for branch branch-1.2 [build #469 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/469/]: (/) *{color:green}+1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/469//General_Nightly_Build_Report/] (/) {color:green}+1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/469//JDK7_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/469//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > Empty qualifier cell is always returned when using QualifierFilter > -- > > Key: HBASE-21158 > URL: https://issues.apache.org/jira/browse/HBASE-21158 > Project: HBase > Issue Type: Bug > Components: Filters >Affects Versions: 3.0.0, 2.2.0 >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3 > > Attachments: HBASE-21158.branch-1.001.patch, > HBASE-21158.master.001.patch, HBASE-21158.master.002.patch, > HBASE-21158.master.003.patch, HBASE-21158.master.004.patch > > > {code:xml} > hbase(main):002:0> put 'testTable','testrow','f:testcol1','testvalue1' > 0 row(s) in 0.0040 seconds > hbase(main):003:0> put 'testTable','testrow','f:','testvalue2' > 0 row(s) in 0.0070 seconds > # get row with empty column f:, result is correct. 
> hbase(main):004:0> scan 'testTable',{FILTER => "QualifierFilter (=, > 'binary:')"} > ROW COLUMN+CELL > > > testrowcolumn=f:, > timestamp=1536218563581, value=testvalue2 > > 1 row(s) in 0.0460 seconds > # get row with column f:testcol1, result is incorrect. > hbase(main):005:0> scan 'testTable',{FILTER => "QualifierFilter (=, > 'binary:testcol1')"} > ROW COLUMN+CELL > > > testrowcolumn=f:, > timestamp=1536218563581, value=testvalue2 > > testrowcolumn=f:testcol1, > timestamp=1536218550827, value=testvalue1 > > 1 row(s) in 0.0070 seconds > {code} > As the operations above show, when a row contains an empty-qualifier column, the > empty-qualifier cell is always returned when using QualifierFilter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
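The behavior reported above can be sketched in isolation. This is a hypothetical simplification, not HBase's actual QualifierFilter implementation: it assumes the buggy path treats an empty cell qualifier as an automatic match instead of comparing it against the filter's target, which reproduces the symptom shown in the shell output.

```java
import java.util.Arrays;

public class QualifierFilterSketch {
    // Hypothetical buggy match: an empty-qualifier cell always passes
    // the filter, regardless of the target qualifier being matched.
    static boolean buggyMatch(byte[] cellQualifier, byte[] target) {
        if (cellQualifier.length == 0) {
            return true; // bug: skips the comparison entirely
        }
        return Arrays.equals(cellQualifier, target);
    }

    // Fixed match: compare the qualifier bytes even when the cell's
    // qualifier is empty, so 'f:' no longer matches 'binary:testcol1'.
    static boolean fixedMatch(byte[] cellQualifier, byte[] target) {
        return Arrays.equals(cellQualifier, target);
    }
}
```

Under this sketch, a scan for `testcol1` would include the `f:` cell through `buggyMatch` but exclude it through `fixedMatch`, matching the incorrect and expected shell results respectively.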
[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log
[ https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610426#comment-16610426 ] Ted Yu commented on HBASE-21179: Looks good. The "actions" can be written as "action(s)" > Fix the number of actions in responseTooSlow log > > > Key: HBASE-21179 > URL: https://issues.apache.org/jira/browse/HBASE-21179 > Project: HBase > Issue Type: Bug > Components: rpc >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Major > Attachments: HBASE-21179.master.001.patch > > > {panel:title=responseTooSlow|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#CE} > 2018-09-10 16:13:53,022 WARN > [B.DefaultRpcServer.handler=209,queue=29,port=60020] ipc.RpcServer: > (responseTooSlow): > {"processingtimems":321262,"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest)","client":"127.0.0.1:56149","param":"region= > > tsdb,\\x00\\x00.[\\x89\\x1F\\xB0\\x00\\x00\\x01\\x00\\x01Y\\x00\\x00\\x02\\x00\\x00\\x04,1536133210446.7c752de470bd5558a001117b123a5db5., > {color:red}for 1 actions and 1st row{color} > key=\\x00\\x00.[\\x96\\x16p","starttimems":1536566911759,"queuetimems":0,"class":"HRegionServer","responsesize":2,"method":"Multi"} > {panel} > The responseTooSlow log is printed when the processing time of a request > exceeds the specified threshold. The number of actions and the contents of > the first rowkey in the request are included in the log. > However, the number of actions is inaccurate: it is actually the number > of regions that the request needs to visit. > As in the log above, users may mistakenly conclude that a single action took > 321262 ms to process, which looks implausible, so we need to fix it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
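The miscount described in HBASE-21179 can be illustrated with a minimal sketch. The names and data shapes here are illustrative, not HBase's MultiRequest API: the assumption is only that a multi request groups its actions by region, so the correct "actions" figure sums across all region groups rather than counting the groups themselves.

```java
import java.util.List;
import java.util.Map;

public class ActionCount {
    // Buggy figure: the number of region groups the request touches,
    // which is what the current responseTooSlow log reports as "actions".
    static int regionCount(Map<String, List<String>> actionsByRegion) {
        return actionsByRegion.size();
    }

    // Correct figure: the total number of actions, summed across all
    // region groups in the multi request.
    static int totalActions(Map<String, List<String>> actionsByRegion) {
        int total = 0;
        for (List<String> actions : actionsByRegion.values()) {
            total += actions.size();
        }
        return total;
    }
}
```

A request with many actions against a single region would log "1 actions" under the buggy figure, which is why a 321262 ms multi call can look like one impossibly slow action.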
[jira] [Commented] (HBASE-21172) Reimplement the retry backoff logic for ReopenTableRegionsProcedure
[ https://issues.apache.org/jira/browse/HBASE-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610420#comment-16610420 ] Hadoop QA commented on HBASE-21172: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} HBASE-21172 does not apply to branch-2.1. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HBASE-21172 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939232/HBASE-21172-branch-2.1.patch | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14388/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > Reimplement the retry backoff logic for ReopenTableRegionsProcedure > --- > > Key: HBASE-21172 > URL: https://issues.apache.org/jira/browse/HBASE-21172 > Project: HBase > Issue Type: Sub-task > Components: amv2, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21172-branch-2.1.patch, HBASE-21172-v1.patch, > HBASE-21172-v2.patch, HBASE-21172-v3.patch, HBASE-21172-v4.patch, > HBASE-21172.patch > > > Now we just do a blocking sleep in the execute method, and there is no > exponential backoff. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
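The exponential backoff discussed for HBASE-21172 (and the related HBASE-21164) can be sketched as follows. This is a minimal illustration, assuming a doubling delay capped at a maximum; the method name and parameter values are illustrative, not the actual patch:

```java
public class Backoff {
    // Delay before retry attempt `attempt` (0-based): base * 2^attempt,
    // capped at maxMillis so late retries never wait longer than the cap.
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        // Clamp the shift so the left-shift cannot overflow a long.
        long delay = baseMillis << Math.min(attempt, 30);
        return Math.min(delay, maxMillis);
    }
}
```

With a 3-second base and a 60-second cap (the values being discussed on HBASE-21164), the delays grow 3 s, 6 s, 12 s, 24 s, 48 s and then level off at 60 s, instead of retrying and logging every 3 seconds indefinitely.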
[jira] [Commented] (HBASE-21173) Remove the duplicate HRegion#close in TestHRegion
[ https://issues.apache.org/jira/browse/HBASE-21173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610413#comment-16610413 ] Hudson commented on HBASE-21173: Results for branch branch-2 [build #1232 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1232/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1232//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1232//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1232//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Remove the duplicate HRegion#close in TestHRegion > - > > Key: HBASE-21173 > URL: https://issues.apache.org/jira/browse/HBASE-21173 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 3.0.0, 2.2.0 >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Minor > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8 > > Attachments: HBASE-21173.branch-1.001.patch, > HBASE-21173.master.001.patch, HBASE-21173.master.002.patch > > > After HBASE-21138, some test methods still contain duplicate > HRegion#close calls, so open this issue to remove the duplicate close. -- This message was sent by Atlassian JIRA (v7.6.3#76005)