[jira] [Commented] (HBASE-25890) [branch-1] add -U for maven build to force a check for updated releases and snapshots on remote repositories
[ https://issues.apache.org/jira/browse/HBASE-25890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344691#comment-17344691 ] Reid Chan commented on HBASE-25890: --- I tried to find out where the script used by this pipeline is located (https://ci-hadoop.apache.org/blue/organizations/jenkins/HBase%2FHBase-PreCommit-GitHub-PR/detail/PR-3264/3/pipeline), but couldn't. I thought it was hbase-personality.sh, but that seems to be only part of the build. > [branch-1] add -U for maven build to force a check for updated releases and > snapshots on remote repositories > > > Key: HBASE-25890 > URL: https://issues.apache.org/jira/browse/HBASE-25890 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > Fix For: 1.7.0 > > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Could not > find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release > (https://repository.apache.org/content/repositories/releases/) -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. 
> [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > [ERROR] > [ERROR] After correcting the problems, you can resume the build with the > command > [ERROR] mvn -rf :hbase-assembly > {code} > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure > to find org.apache.hbase:hbase-thrift:jar:1.7.0 in > https://repository.apache.org/content/repositories/releases/ was cached in > the local repository, resolution will not be reattempted until the update > interval of apache release has elapsed or updates are forced -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > [ERROR] > [ERROR] After correcting the problems, you can resume the build with the > command > [ERROR] mvn -rf :hbase-assembly > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25890) [branch-1] add -U for maven build to force a check for updated releases and snapshots on remote repositories
[ https://issues.apache.org/jira/browse/HBASE-25890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344609#comment-17344609 ] Reid Chan commented on HBASE-25890: --- The build command in {{mvninstall}} is {code} mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-3264/yetus-m2/hbase-branch-1-patch-1 -DHBasePatchProcess -Dhttps.protocols=TLSv1.2 -fae clean install -DskipTests=true -Dmaven.javadoc.skip=true -Dcheckstyle.skip=true -Dfindbugs.skip=true -Dspotbugs.skip=true {code} and it doesn't contain -U.
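For comparison, here is a rough sketch of the same invocation with -U added. The helper name `mvn_install_cmd` and the `/tmp/yetus-m2` path are made up for illustration; the flags are copied from the {{mvninstall}} command quoted in this thread, and where the precommit job actually assembles that command is not established here.

```shell
# Hypothetical helper: print the mvninstall command line with -U added.
# -U (--update-snapshots) asks Maven to re-check remote repositories
# instead of trusting lookups cached in the local repository.
mvn_install_cmd() {
  repo="$1"   # local repository path (assumed; e.g. the yetus-m2 directory)
  printf '%s\n' "mvn --batch-mode -U -Dmaven.repo.local=${repo} -DHBasePatchProcess -Dhttps.protocols=TLSv1.2 -fae clean install -DskipTests=true -Dmaven.javadoc.skip=true -Dcheckstyle.skip=true -Dfindbugs.skip=true -Dspotbugs.skip=true"
}

# Print the command for inspection; nothing is executed here.
mvn_install_cmd /tmp/yetus-m2
```

Printing rather than running the command keeps the sketch independent of a Maven installation; the printed line can be compared against what the CI log shows.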
[jira] [Commented] (HBASE-25890) [branch-1] add -U for maven build to force a check for updated releases and snapshots on remote repositories
[ https://issues.apache.org/jira/browse/HBASE-25890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344605#comment-17344605 ] Reid Chan commented on HBASE-25890: --- the -U didn't take effect in {{mvninstall}}. Checking.
[jira] [Issue Comment Deleted] (HBASE-25890) [branch-1] add -U for maven build to force a check for updated releases and snapshots on remote repositories
[ https://issues.apache.org/jira/browse/HBASE-25890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25890: -- Comment: was deleted (was: After PR#3265, the 2nd warning ".* cached in the local repository" is gone. But we still have the 1st warning.)
[jira] [Commented] (HBASE-25890) [branch-1] add -U for maven build to force a check for updated releases and snapshots on remote repositories
[ https://issues.apache.org/jira/browse/HBASE-25890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344591#comment-17344591 ] Reid Chan commented on HBASE-25890: --- After PR#3265, the 2nd warning ".* cached in the local repository" is gone. But we still have the 1st warning.
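The first error in the issue description ("Could not find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release") means Maven looked for the jar in the release repository and did not find it. As a small sketch, assuming the standard Maven 2 repository layout, the coordinates from the log map to a URL like this. The helper name `artifact_url` is hypothetical.

```shell
# Hypothetical helper: map Maven coordinates to a URL in a Maven 2 layout
# repository (groupId dots become path separators).
artifact_url() {
  base="$1"; group="$2"; artifact="$3"; version="$4"; ext="${5:-jar}"
  group_path=$(printf '%s' "$group" | tr '.' '/')
  # ${base%/} drops one trailing slash so the URL has no double slash.
  printf '%s/%s/%s/%s/%s-%s.%s\n' "${base%/}" "$group_path" "$artifact" "$version" "$artifact" "$version" "$ext"
}

# Coordinates and repository URL taken from the error message in this issue.
artifact_url https://repository.apache.org/content/repositories/releases/ org.apache.hbase hbase-thrift 1.7.0
# prints https://repository.apache.org/content/repositories/releases/org/apache/hbase/hbase-thrift/1.7.0/hbase-thrift-1.7.0.jar
```

Requesting the printed URL (for example with `curl -I`) would show whether the jar is actually published there.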
[jira] [Commented] (HBASE-25858) [branch-1] make hbase-thrift optional in hbase-assembly module
[ https://issues.apache.org/jira/browse/HBASE-25858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344589#comment-17344589 ] Reid Chan commented on HBASE-25858: --- Filed HBASE-25890, cc [~apurtell] > [branch-1] make hbase-thrift optional in hbase-assembly module > -- > > Key: HBASE-25858 > URL: https://issues.apache.org/jira/browse/HBASE-25858 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure > to find org.apache.hbase:hbase-thrift:jar:1.7.0 in > https://repository.apache.org/content/repositories/releases/ was cached in > the local repository, resolution will not be reattempted until the update > interval of apache release has elapsed or updates are forced -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25890) [branch-1] add -U for maven build to force a check for updated releases and snapshots on remote repositories
[ https://issues.apache.org/jira/browse/HBASE-25890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25890: -- Description: {code} [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Could not find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release (https://repository.apache.org/content/repositories/releases/) -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :hbase-assembly {code} {code} [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure to find org.apache.hbase:hbase-thrift:jar:1.7.0 in https://repository.apache.org/content/repositories/releases/ was cached in the local repository, resolution will not be reattempted until the update interval of apache release has elapsed or updates are forced -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. 
[ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :hbase-assembly {code} was: {code} [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Could not find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release (https://repository.apache.org/content/repositories/releases/) -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :hbase-assembly {code} > [branch-1] add -U for maven build to force a check for updated releases and > snapshots on remote repositories > > > Key: HBASE-25890 > URL: https://issues.apache.org/jira/browse/HBASE-25890 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > Fix For: 1.7.0 > > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Could not > find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release > (https://repository.apache.org/content/repositories/releases/) -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. 
> [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > [ERROR] > [ERROR] After correcting the problems, you can resume the build with the > command > [ERROR] mvn -rf :hbase-assembly > {code} > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure > to find org.apache.hbase:hbase-thrift:jar:1.7.0 in > https://repository.apache.org/content/repositories/releases/ was cached in > the local repository, resolution will not be reattempted until the update > interval of apache release has elapsed or updates are forced -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/disp
[jira] [Updated] (HBASE-25890) [branch-1] add -U for maven build to force a check for updated releases and snapshots on remote repositories
[ https://issues.apache.org/jira/browse/HBASE-25890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25890: -- Fix Version/s: 1.7.0 > [branch-1] add -U for maven build to force a check for updated releases and > snapshots on remote repositories > > > Key: HBASE-25890 > URL: https://issues.apache.org/jira/browse/HBASE-25890 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > Fix For: 1.7.0 > > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Could not > find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release > (https://repository.apache.org/content/repositories/releases/) -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > [ERROR] > [ERROR] After correcting the problems, you can resume the build with the > command > [ERROR] mvn -rf :hbase-assembly > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25890) [branch-1] add -U for maven build to force a check for updated releases and snapshots on remote repositories
[ https://issues.apache.org/jira/browse/HBASE-25890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344378#comment-17344378 ] Reid Chan commented on HBASE-25890: --- Refer to [here|https://stackoverflow.com/questions/4856307/when-maven-says-resolution-will-not-be-reattempted-until-the-update-interval-of] > [branch-1] add -U for maven build to force a check for updated releases and > snapshots on remote repositories > > > Key: HBASE-25890 > URL: https://issues.apache.org/jira/browse/HBASE-25890 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Could not > find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release > (https://repository.apache.org/content/repositories/releases/) -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > [ERROR] > [ERROR] After correcting the problems, you can resume the build with the > command > [ERROR] mvn -rf :hbase-assembly > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
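The "was cached in the local repository, resolution will not be reattempted" variant of this error comes from Maven remembering a failed lookup. Besides forcing updates with -U, one alternative is to delete the cached failure markers. A minimal sketch, assuming those markers are the `*.lastUpdated` files Maven leaves under the local repository; the helper name `clear_cached_failures` is hypothetical, and this is not something the thread itself tried.

```shell
# Hypothetical helper: list and delete Maven's cached-failure markers
# (*.lastUpdated) under a local repository directory.
clear_cached_failures() {
  repo="${1:-$HOME/.m2/repository}"   # defaults to Maven's usual local repo
  [ -d "$repo" ] || return 0          # nothing to do if the directory is absent
  find "$repo" -type f -name '*.lastUpdated' -print -exec rm -f {} +
}

# Example (not run here): clear_cached_failures /path/to/yetus-m2
```

Only the marker files are removed; downloaded jars and poms stay in place, so the next build re-attempts just the lookups that previously failed.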
[jira] [Created] (HBASE-25890) [branch-1] add -U for maven build to force a check for updated releases and snapshots on remote repositories
Reid Chan created HBASE-25890: - Summary: [branch-1] add -U for maven build to force a check for updated releases and snapshots on remote repositories Key: HBASE-25890 URL: https://issues.apache.org/jira/browse/HBASE-25890 Project: HBase Issue Type: Task Reporter: Reid Chan Assignee: Reid Chan {code} [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Could not find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release (https://repository.apache.org/content/repositories/releases/) -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :hbase-assembly {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25858) [branch-1] make hbase-thrift optional in hbase-assembly module
[ https://issues.apache.org/jira/browse/HBASE-25858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344285#comment-17344285 ] Reid Chan commented on HBASE-25858: --- I want to update the build command with one more parameter, `mvn -U *args`. The failure seems related to the local repo containing the hbase-thrift.jar.
[jira] [Commented] (HBASE-25858) [branch-1] make hbase-thrift optional in hbase-assembly module
[ https://issues.apache.org/jira/browse/HBASE-25858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344256#comment-17344256 ] Reid Chan commented on HBASE-25858: --- From HBASE-25887's PR.
[jira] [Commented] (HBASE-25858) [branch-1] make hbase-thrift optional in hbase-assembly module
[ https://issues.apache.org/jira/browse/HBASE-25858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344255#comment-17344255 ] Reid Chan commented on HBASE-25858: --- Oops, I'm still seeing this after the merge: {code} [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure to find org.apache.hbase:hbase-thrift:jar:1.7.0 in https://repository.apache.org/content/repositories/releases/ was cached in the local repository, resolution will not be reattempted until the update interval of apache release has elapsed or updates are forced -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :hbase-assembly {code} Any thoughts? [~apurtell]
[jira] [Commented] (HBASE-25887) Corrupt wal while region server is aborting.
[ https://issues.apache.org/jira/browse/HBASE-25887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344251#comment-17344251 ] Reid Chan commented on HBASE-25887: --- ok > Corrupt wal while region server is aborting. > > > Key: HBASE-25887 > URL: https://issues.apache.org/jira/browse/HBASE-25887 > Project: HBase > Issue Type: Improvement > Components: regionserver, wal >Affects Versions: 1.6.0 >Reporter: Rushabh Shah >Assignee: Rushabh Shah >Priority: Major > Fix For: 1.7.0 > > > We have seen a case in our production cluster where we ended up in corrupt > wal. WALSplitter logged the below error > {noformat} > 2021-05-12 00:42:46,786 FATAL [:60020-1] regionserver.HRegionServer - > ABORTING region server HOST-B,60020,16207794418 > 88: Caught throwable while processing event RS_LOG_REPLAY > java.lang.NullPointerException > at org.apache.hadoop.hbase.CellUtil.matchingFamily(CellUtil.java:411) > at > org.apache.hadoop.hbase.regionserver.wal.WALEdit.isMetaEditFamily(WALEdit.java:145) > at > org.apache.hadoop.hbase.regionserver.wal.WALEdit.isMetaEdit(WALEdit.java:150) > at > org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:408) > at > org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:261) > at > org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:105) > at > org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:72) > at > org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {noformat} > Looking at the raw wal file, we could see that the last WALEdit contains the > region id, tablename and sequence number but cells were not persisted. 
> Looking at the logs of the RS that generated that corrupt wal file, > {noformat} > 2021-05-11 23:29:22,114 DEBUG [/HOST-A:60020] wal.FSHLog - Closing WAL writer > in /hbase/WALs/HOST-A,60020,1620774393046 > 2021-05-11 23:29:22,196 DEBUG [/HOST-A:60020] ipc.AbstractRpcClient - > Stopping rpc client > 2021-05-11 23:29:22,198 INFO [/HOST-A:60020] regionserver.Leases - > regionserver/HOST-A/:60020 closing leases > 2021-05-11 23:29:22,198 INFO [/HOST-A:60020] regionserver.Leases - > regionserver/HOST-A:/HOST-A:60020 closed leases > 2021-05-11 23:29:22,198 WARN [0020.append-pool8-t1] wal.FSHLog - Append > sequenceId=7147823, requesting roll of WAL > java.nio.channels.ClosedChannelException > at > org.apache.hadoop.hdfs.DataStreamer$LastExceptionInStreamer.throwException4Close(DataStreamer.java:331) > at > org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:151) > at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:105) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58) > at java.io.DataOutputStream.write(DataOutputStream.java:107) > at org.apache.hadoop.hbase.KeyValue.write(KeyValue.java:2543) > at > org.apache.phoenix.hbase.index.wal.KeyValueCodec.write(KeyValueCodec.java:104) > at > org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec$IndexKeyValueEncoder.write(IndexedWALEditCodec.java:218) > at > org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.append(ProtobufLogWriter.java:128) > at > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:2083) > at > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1941) > at > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1857) > at > com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:129) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {noformat} > These 2 lines are interesting. > {quote}2021-05-11 23:29:22,114 DEBUG [/HOST-A:60020] wal.FSHLog - Closing WAL > writer in /hbase/WALs/HOST-A,60020,1620774393046 > > > 2021-05-11 23:29:22,198 WARN [0020.append-pool8-t1] wal.FSHLog - Append > sequenceId=7147823, requesting roll of WAL > java.nio.channels.ClosedChannelException > {quote} > The append thread encountered java.nio.channels.ClosedChannelException while > writing to wal file because the wal file was a
[jira] [Updated] (HBASE-25879) [branch-1] Update CHANGES.txt and tag 1.7.0RC0 to the most recent commit
[ https://issues.apache.org/jira/browse/HBASE-25879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25879: -- Summary: [branch-1] Update CHANGES.txt and tag 1.7.0RC0 to the most recent commit (was: [branch-1] Update CHANGES.txt and the 1.7.0RC0 the most recent commit) > [branch-1] Update CHANGES.txt and tag 1.7.0RC0 to the most recent commit > > > Key: HBASE-25879 > URL: https://issues.apache.org/jira/browse/HBASE-25879 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25032) Wait for region server to become online before adding it to online servers in Master
[ https://issues.apache.org/jira/browse/HBASE-25032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342633#comment-17342633 ] Reid Chan commented on HBASE-25032: --- Oh, I see it has already been reverted. > Wait for region server to become online before adding it to online servers in > Master > > > Key: HBASE-25032 > URL: https://issues.apache.org/jira/browse/HBASE-25032 > Project: HBase > Issue Type: Bug >Reporter: Sandeep Guggilam >Assignee: Caroline Zhou >Priority: Major > Labels: master, regionserver > Fix For: 3.0.0-alpha-1, 2.5.0 > > > As part of RS start up, RS reports for duty to Master. Master acknowledges > the request and adds it to the onlineServers list for further assigning any > regions to the RS. > Once Master acknowledges the reportForDuty and sends back the response, RS > does a bunch of stuff like initializing replication sources etc. before > becoming online. However, sometimes there could be an issue with initializing > replication sources when it is unable to connect to peer clusters because of > some Kerberos configuration, and there would be a delay of around 20 mins in > becoming online. > > Since master considers it online, it tries to assign regions, which fails > with ServerNotRunningYet exception, then the master tries to unassign, which > again fails with the same exception, leading the region to FAILED_CLOSE state. > > It would be good to have a check to see if the RS is ready to accept the > assignment requests before adding it to online servers list, which would > account for any such delays as described above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25032) Wait for region server to become online before adding it to online servers in Master
[ https://issues.apache.org/jira/browse/HBASE-25032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342630#comment-17342630 ] Reid Chan commented on HBASE-25032: --- Catching up late: do I still need to revert it from branch-1? > Wait for region server to become online before adding it to online servers in > Master > > > Key: HBASE-25032 > URL: https://issues.apache.org/jira/browse/HBASE-25032 > Project: HBase > Issue Type: Bug >Reporter: Sandeep Guggilam >Assignee: Caroline Zhou >Priority: Major > Labels: master, regionserver > Fix For: 3.0.0-alpha-1, 2.5.0 > > > As part of RS start up, RS reports for duty to Master. Master acknowledges > the request and adds it to the onlineServers list for further assigning any > regions to the RS. > Once Master acknowledges the reportForDuty and sends back the response, RS > does a bunch of stuff like initializing replication sources etc. before > becoming online. However, sometimes there could be an issue with initializing > replication sources when it is unable to connect to peer clusters because of > some Kerberos configuration, and there would be a delay of around 20 mins in > becoming online. > > Since master considers it online, it tries to assign regions, which fails > with ServerNotRunningYet exception, then the master tries to unassign, which > again fails with the same exception, leading the region to FAILED_CLOSE state. > > It would be good to have a check to see if the RS is ready to accept the > assignment requests before adding it to online servers list, which would > account for any such delays as described above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
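The check requested in the HBASE-25032 description above could be sketched roughly as follows. This is only an illustration and not the committed patch: `OnlineServerGate`, `promoteReadyServers`, `isAssignable` and the readiness probe are hypothetical names, and the sketch assumes the master has some way to ask a region server whether it has finished initializing.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch: keeps a reporting server out of the assignable set until it is ready. */
public class OnlineServerGate {
    // Servers that have reported for duty but have not yet passed the readiness probe.
    private final Set<String> pending = new LinkedHashSet<>();
    // Servers the master may assign regions to.
    private final Set<String> online = new LinkedHashSet<>();
    // Assumed hook: returns true once the named server can accept assignment requests.
    private final Predicate<String> readinessProbe;

    public OnlineServerGate(Predicate<String> readinessProbe) {
        this.readinessProbe = readinessProbe;
    }

    /** Acknowledge the report, but only park the server as pending. */
    public synchronized void reportForDuty(String serverName) {
        if (!online.contains(serverName)) {
            pending.add(serverName);
        }
    }

    /** Move every pending server whose probe succeeds into the online set. */
    public synchronized void promoteReadyServers() {
        pending.removeIf(server -> {
            if (readinessProbe.test(server)) {
                online.add(server);
                return true;
            }
            return false;
        });
    }

    /** Assignment code would consult this before sending regions to a server. */
    public synchronized boolean isAssignable(String serverName) {
        return online.contains(serverName);
    }
}
```

With a gate like this, a region server that is stuck initializing replication sources stays pending, so the master never attempts the assign/unassign pair that ends in FAILED_CLOSE.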
[jira] [Resolved] (HBASE-25804) [branch-1] Make hbase-thrift module build with jdk8
[ https://issues.apache.org/jira/browse/HBASE-25804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan resolved HBASE-25804. --- Resolution: Fixed > [branch-1] Make hbase-thrift module build with jdk8 > --- > > Key: HBASE-25804 > URL: https://issues.apache.org/jira/browse/HBASE-25804 > Project: HBase > Issue Type: Task > Components: build >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > Fix For: 1.7.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-25879) [branch-1] Update CHANGES.txt and the 1.7.0RC0 the most recent commit
Reid Chan created HBASE-25879: - Summary: [branch-1] Update CHANGES.txt and the 1.7.0RC0 the most recent commit Key: HBASE-25879 URL: https://issues.apache.org/jira/browse/HBASE-25879 Project: HBase Issue Type: Task Reporter: Reid Chan Assignee: Reid Chan -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25858) [branch-1] make hbase-thrift optional in hbase-assembly module
[ https://issues.apache.org/jira/browse/HBASE-25858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342273#comment-17342273 ] Reid Chan commented on HBASE-25858: --- Try merging, FYI [~apurtell] > [branch-1] make hbase-thrift optional in hbase-assembly module > -- > > Key: HBASE-25858 > URL: https://issues.apache.org/jira/browse/HBASE-25858 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure > to find org.apache.hbase:hbase-thrift:jar:1.7.0 in > https://repository.apache.org/content/repositories/releases/ was cached in > the local repository, resolution will not be reattempted until the update > interval of apache release has elapsed or updates are forced -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25858) [branch-1] make hbase-thrift optional in hbase-assembly module
[ https://issues.apache.org/jira/browse/HBASE-25858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340670#comment-17340670 ] Reid Chan commented on HBASE-25858: --- QA complained a lot; it looks like we need to merge it to see the exact effect. WDYT? > [branch-1] make hbase-thrift optional in hbase-assembly module > -- > > Key: HBASE-25858 > URL: https://issues.apache.org/jira/browse/HBASE-25858 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure > to find org.apache.hbase:hbase-thrift:jar:1.7.0 in > https://repository.apache.org/content/repositories/releases/ was cached in > the local repository, resolution will not be reattempted until the update > interval of apache release has elapsed or updates are forced -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25858) [branch-1] make hbase-thrift optional in hbase-assembly module
[ https://issues.apache.org/jira/browse/HBASE-25858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25858: -- Summary: [branch-1] make hbase-thrift optional in hbase-assembly module (was: [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8) > [branch-1] make hbase-thrift optional in hbase-assembly module > -- > > Key: HBASE-25858 > URL: https://issues.apache.org/jira/browse/HBASE-25858 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure > to find org.apache.hbase:hbase-thrift:jar:1.7.0 in > https://repository.apache.org/content/repositories/releases/ was cached in > the local repository, resolution will not be reattempted until the update > interval of apache release has elapsed or updates are forced -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25858) [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8
[ https://issues.apache.org/jira/browse/HBASE-25858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340565#comment-17340565 ] Reid Chan commented on HBASE-25858: --- Thanks Andrew. I tried the optional tag, then built it with both JDK 7 and JDK 8 using the command `mvn -DskipTests package assembly:single`. Both builds succeeded, but I expected the tarball built with JDK 7 not to contain the thrift-related jars, and it does contain them. Is that the expected behavior? > [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8 > --- > > Key: HBASE-25858 > URL: https://issues.apache.org/jira/browse/HBASE-25858 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure > to find org.apache.hbase:hbase-thrift:jar:1.7.0 in > https://repository.apache.org/content/repositories/releases/ was cached in > the local repository, resolution will not be reattempted until the update > interval of apache release has elapsed or updates are forced -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25858) [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8
[ https://issues.apache.org/jira/browse/HBASE-25858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340274#comment-17340274 ] Reid Chan commented on HBASE-25858: --- maven profile seems possible, let me try. > [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8 > --- > > Key: HBASE-25858 > URL: https://issues.apache.org/jira/browse/HBASE-25858 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure > to find org.apache.hbase:hbase-thrift:jar:1.7.0 in > https://repository.apache.org/content/repositories/releases/ was cached in > the local repository, resolution will not be reattempted until the update > interval of apache release has elapsed or updates are forced -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25612) HMaster should abort if ReplicationLogCleaner is not able to delete oldWALs.
[ https://issues.apache.org/jira/browse/HBASE-25612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25612: -- Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > HMaster should abort if ReplicationLogCleaner is not able to delete oldWALs. > > > Key: HBASE-25612 > URL: https://issues.apache.org/jira/browse/HBASE-25612 > Project: HBase > Issue Type: Improvement >Affects Versions: 1.6.0 >Reporter: Rushabh Shah >Assignee: Rushabh Shah >Priority: Major > Fix For: 1.7.0 > > > In our production cluster, we encountered an issue where the number of files > within /hbase/oldWALs directory were growing exponentially from about 4000 > baseline to 15 and growing at the rate of 333 files per minute. > On further investigation we found that ReplicatonLogCleaner thread was > getting aborted since it was not able to talk to zookeeper. Stack trace below > {noformat} > 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] zookeeper.ZKUtil - > replicationLogCleaner-0x302e05e0d8f, > quorum=zookeeper-0:2181,zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181,zookeeper-4:2181, > baseZNode=/hbase Unable to get data of znode /hbase/replication/rs > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase/replication/rs > at org.apache.zookeeper.KeeperException.create(KeeperException.java:130) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) > at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229) > at > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:374) > at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:713) > at > org.apache.hadoop.hbase.replication.ReplicationQueuesClientZKImpl.getQueuesZNodeCversion(ReplicationQueuesClientZKImpl.java:87) > at > org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.loadWALsFromQueues(ReplicationLogCleaner.java:99) > at > 
org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.getDeletableFiles(ReplicationLogCleaner.java:70) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteFiles(CleanerChore.java:262) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.access$200(CleanerChore.java:52) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$3.act(CleanerChore.java:413) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$3.act(CleanerChore.java:410) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.deleteAction(CleanerChore.java:481) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.traverseAndDelete(CleanerChore.java:410) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.access$100(CleanerChore.java:52) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$1.run(CleanerChore.java:220) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - ReplicationLogCleaner received abort, > ignoring. Reason: Failed to get stat of replication rs node > 2021-02-25 23:05:01,149 DEBUG [an-pool3-thread-1729] > master.ReplicationLogCleaner - > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase/replication/rs > 2021-02-25 23:05:01,150 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - Failed to read zookeeper, skipping checking > deletable files > {noformat} > > {quote} 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - ReplicationLogCleaner received abort, > ignoring. Reason: Failed to get stat of replication rs node > {quote} > > This line is more scary where HMaster invoked Abortable but just ignored and > HMaster was doing it business as usual. 
> We have a max files per directory configuration in the namenode, which is set to 1M > in our clusters. If this directory had reached that limit, it would have > brought down the whole cluster. > We shouldn't ignore Abortable and should crash the HMaster if Abortable is > invoked. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
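The difference between ignoring an abort and propagating it, as discussed in HBASE-25612 above, can be sketched in a few lines. Everything here is a hypothetical stand-in written for illustration (`AbortHandler`, `IgnoringHandler`, `PropagatingHandler`, `RecordingOwner`); it is not the code of ReplicationLogCleaner or HMaster, and the actual patch may look quite different.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch contrasting a swallowed abort with a propagated one. */
public class AbortPropagationSketch {

    /** Assumed abort callback, loosely modelled on the idea of an "Abortable". */
    public interface AbortHandler {
        void abort(String why, Throwable cause);
        boolean isAborted();
    }

    /** Mirrors the logged behaviour: the abort is noted and then dropped. */
    public static final class IgnoringHandler implements AbortHandler {
        @Override
        public void abort(String why, Throwable cause) {
            System.err.println("received abort, ignoring. Reason: " + why);
        }

        @Override
        public boolean isAborted() {
            return false;
        }
    }

    /** Forwards the first abort to the owning process instead of swallowing it. */
    public static final class PropagatingHandler implements AbortHandler {
        private final AbortHandler owner;
        private final AtomicBoolean aborted = new AtomicBoolean(false);

        public PropagatingHandler(AbortHandler owner) {
            this.owner = owner;
        }

        @Override
        public void abort(String why, Throwable cause) {
            // compareAndSet ensures the owner is notified only once.
            if (aborted.compareAndSet(false, true)) {
                owner.abort(why, cause);
            }
        }

        @Override
        public boolean isAborted() {
            return aborted.get();
        }
    }

    /** Stand-in for the master: records the reason instead of exiting the JVM. */
    public static final class RecordingOwner implements AbortHandler {
        private volatile String reason;

        @Override
        public void abort(String why, Throwable cause) {
            reason = why;
        }

        @Override
        public boolean isAborted() {
            return reason != null;
        }

        public String reason() {
            return reason;
        }
    }
}
```

In the ignoring variant the cleaner keeps skipping deletable files while oldWALs grows; in the propagating variant the owner learns about the failure and can decide what to do with it.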
[jira] [Commented] (HBASE-25858) [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8
[ https://issues.apache.org/jira/browse/HBASE-25858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340262#comment-17340262 ] Reid Chan commented on HBASE-25858: --- Is it possible to include the dependency only when some condition is satisfied, such as building with jdk8, as the root pom does? > [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8 > --- > > Key: HBASE-25858 > URL: https://issues.apache.org/jira/browse/HBASE-25858 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure > to find org.apache.hbase:hbase-thrift:jar:1.7.0 in > https://repository.apache.org/content/repositories/releases/ was cached in > the local repository, resolution will not be reattempted until the update > interval of apache release has elapsed or updates are forced -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HBASE-25856) sorry for my mistake, could someone delete it.
[ https://issues.apache.org/jira/browse/HBASE-25856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan resolved HBASE-25856. --- Resolution: Invalid > sorry for my mistake, could someone delete it. > -- > > Key: HBASE-25856 > URL: https://issues.apache.org/jira/browse/HBASE-25856 > Project: HBase > Issue Type: Improvement >Reporter: junwen yang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (HBASE-25856) sorry for my mistake, could someone delete it.
[ https://issues.apache.org/jira/browse/HBASE-25856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan reopened HBASE-25856: --- > sorry for my mistake, could someone delete it. > -- > > Key: HBASE-25856 > URL: https://issues.apache.org/jira/browse/HBASE-25856 > Project: HBase > Issue Type: Improvement >Reporter: junwen yang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25858) [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8
[ https://issues.apache.org/jira/browse/HBASE-25858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340145#comment-17340145 ] Reid Chan commented on HBASE-25858: --- [~andrew.purt...@gmail.com] > [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8 > --- > > Key: HBASE-25858 > URL: https://issues.apache.org/jira/browse/HBASE-25858 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure > to find org.apache.hbase:hbase-thrift:jar:1.7.0 in > https://repository.apache.org/content/repositories/releases/ was cached in > the local repository, resolution will not be reattempted until the update > interval of apache release has elapsed or updates are forced -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25858) [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8
[ https://issues.apache.org/jira/browse/HBASE-25858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25858: -- Description: {code} [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure to find org.apache.hbase:hbase-thrift:jar:1.7.0 in https://repository.apache.org/content/repositories/releases/ was cached in the local repository, resolution will not be reattempted until the update interval of apache release has elapsed or updates are forced -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException {code} > [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8 > --- > > Key: HBASE-25858 > URL: https://issues.apache.org/jira/browse/HBASE-25858 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > {code} > [ERROR] Failed to execute goal on project hbase-assembly: Could not resolve > dependencies for project org.apache.hbase:hbase-assembly:pom:1.7.0: Failure > to find org.apache.hbase:hbase-thrift:jar:1.7.0 in > https://repository.apache.org/content/repositories/releases/ was cached in > the local repository, resolution will not be reattempted until the update > interval of apache release has elapsed or updates are forced -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. 
> [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-25858) [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8
Reid Chan created HBASE-25858: - Summary: [branch-1] hbase-assembly module includes hbase-thrift only when using jdk8 Key: HBASE-25858 URL: https://issues.apache.org/jira/browse/HBASE-25858 Project: HBase Issue Type: Task Reporter: Reid Chan Assignee: Reid Chan -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HBASE-25831) [branch-1] remove thrift examples out of hbase-examples module for bypassing the thrift version check
[ https://issues.apache.org/jira/browse/HBASE-25831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan resolved HBASE-25831. --- Fix Version/s: 1.7.0 Hadoop Flags: Reviewed Resolution: Fixed > [branch-1] remove thrift examples out of hbase-examples module for bypassing > the thrift version check > - > > Key: HBASE-25831 > URL: https://issues.apache.org/jira/browse/HBASE-25831 > Project: HBase > Issue Type: Task > Components: Thrift >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > Fix For: 1.7.0 > > > [ERROR] Failed to execute goal on project hbase-examples: Could not resolve > dependencies for project org.apache.hbase:hbase-examples:jar:1.7.0: Could not > find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release > (https://repository.apache.org/content/repositories/releases/) -> [Help 1] > This is the message I got when I tried to run make_rc.sh. We need to remove the thrift-related > code from hbase-examples so that the release can be built successfully. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25831) [branch-1] remove thrift examples out of hbase-examples module for bypassing the thrift version check
[ https://issues.apache.org/jira/browse/HBASE-25831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340143#comment-17340143 ] Reid Chan commented on HBASE-25831: --- Checked, hbase-examples is clean now. One last module, hbase-assembly, still has the hbase-thrift issue. > [branch-1] remove thrift examples out of hbase-examples module for bypassing > the thrift version check > - > > Key: HBASE-25831 > URL: https://issues.apache.org/jira/browse/HBASE-25831 > Project: HBase > Issue Type: Task > Components: Thrift >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > [ERROR] Failed to execute goal on project hbase-examples: Could not resolve > dependencies for project org.apache.hbase:hbase-examples:jar:1.7.0: Could not > find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release > (https://repository.apache.org/content/repositories/releases/) -> [Help 1] > This is the message I got when I tried to run make_rc.sh. We need to remove the thrift-related > code from hbase-examples so that the release can be built successfully. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25612) HMaster should abort if ReplicationLogCleaner is not able to delete oldWALs.
[ https://issues.apache.org/jira/browse/HBASE-25612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17339983#comment-17339983 ] Reid Chan commented on HBASE-25612: --- I need some time to review this approach and its background, as I am not familiar with it. For now, aborting the master sounds aggressive: when the backup master becomes active, will it be aborted again (this might be caused by a zk cluster issue), and then flip back and forth? > HMaster should abort if ReplicationLogCleaner is not able to delete oldWALs. > > > Key: HBASE-25612 > URL: https://issues.apache.org/jira/browse/HBASE-25612 > Project: HBase > Issue Type: Improvement >Affects Versions: 1.6.0 >Reporter: Rushabh Shah >Assignee: Rushabh Shah >Priority: Major > Fix For: 1.7.0 > > > In our production cluster, we encountered an issue where the number of files > within /hbase/oldWALs directory was growing exponentially from about 4000 > baseline to 15 and growing at the rate of 333 files per minute. > On further investigation we found that ReplicationLogCleaner thread was > getting aborted since it was not able to talk to zookeeper.
Stack trace below > {noformat} > 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] zookeeper.ZKUtil - > replicationLogCleaner-0x302e05e0d8f, > quorum=zookeeper-0:2181,zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181,zookeeper-4:2181, > baseZNode=/hbase Unable to get data of znode /hbase/replication/rs > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase/replication/rs > at org.apache.zookeeper.KeeperException.create(KeeperException.java:130) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) > at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229) > at > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:374) > at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:713) > at > org.apache.hadoop.hbase.replication.ReplicationQueuesClientZKImpl.getQueuesZNodeCversion(ReplicationQueuesClientZKImpl.java:87) > at > org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.loadWALsFromQueues(ReplicationLogCleaner.java:99) > at > org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.getDeletableFiles(ReplicationLogCleaner.java:70) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteFiles(CleanerChore.java:262) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.access$200(CleanerChore.java:52) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$3.act(CleanerChore.java:413) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$3.act(CleanerChore.java:410) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.deleteAction(CleanerChore.java:481) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.traverseAndDelete(CleanerChore.java:410) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.access$100(CleanerChore.java:52) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$1.run(CleanerChore.java:220) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - ReplicationLogCleaner received abort, > ignoring. Reason: Failed to get stat of replication rs node > 2021-02-25 23:05:01,149 DEBUG [an-pool3-thread-1729] > master.ReplicationLogCleaner - > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase/replication/rs > 2021-02-25 23:05:01,150 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - Failed to read zookeeper, skipping checking > deletable files > {noformat} > > {quote} 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - ReplicationLogCleaner received abort, > ignoring. Reason: Failed to get stat of replication rs node > {quote} > > This line is more scary where HMaster invoked Abortable but just ignored and > HMaster was doing it business as usual. > We have max files per directory configuration in namenode which is set to 1M > in our clusters. If this directory reached that limit then that would have > brought down the whole cluster. > We shouldn't ignore Abortable and should crash the Hmaster if Abortable is > invoked. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25831) [branch-1] remove thrift examples out of hbase-examples module for bypassing the thrift version check
[ https://issues.apache.org/jira/browse/HBASE-25831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17339730#comment-17339730 ] Reid Chan commented on HBASE-25831: --- cc [~busbey], it would be better if I could get your review and help as well. > [branch-1] remove thrift examples out of hbase-examples module for bypassing > the thrift version check > - > > Key: HBASE-25831 > URL: https://issues.apache.org/jira/browse/HBASE-25831 > Project: HBase > Issue Type: Task > Components: Thrift >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > [ERROR] Failed to execute goal on project hbase-examples: Could not resolve > dependencies for project org.apache.hbase:hbase-examples:jar:1.7.0: Could not > find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release > (https://repository.apache.org/content/repositories/releases/) -> [Help 1] > This is the message I got when I tried to run make_rc.sh. We need to remove the > thrift-related code from hbase-examples to make the release successfully. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-21674) Port HBASE-21652 (Refactor ThriftServer making thrift2 server inherited from thrift1 server) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-21674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17339725#comment-17339725 ] Reid Chan commented on HBASE-21674: --- Thanks Sean! The thrift issue, which I've been struggling with recently, is quite a tough task for me. When you have time, could you take a look at this [thread|https://lists.apache.org/thread.html/r118b08134676d9234362a28898249186fe73a1fb08535d6eec6a91d3%40%3Cdev.hbase.apache.org%3E]? It is somewhat related. > Port HBASE-21652 (Refactor ThriftServer making thrift2 server inherited from > thrift1 server) to branch-1 > > > Key: HBASE-21674 > URL: https://issues.apache.org/jira/browse/HBASE-21674 > Project: HBase > Issue Type: Sub-task >Reporter: Andrew Kyle Purtell >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.8.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HBASE-25846) Backport 'HBASE-25825 RSGroupBasedLoadBalancer.onConfigurationChange should chain the request to internal balancer' to branch-1
[ https://issues.apache.org/jira/browse/HBASE-25846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan resolved HBASE-25846. --- Hadoop Flags: Reviewed Resolution: Fixed > Backport 'HBASE-25825 RSGroupBasedLoadBalancer.onConfigurationChange should > chain the request to internal balancer' to branch-1 > --- > > Key: HBASE-25846 > URL: https://issues.apache.org/jira/browse/HBASE-25846 > Project: HBase > Issue Type: Improvement >Reporter: Caroline Zhou >Assignee: Caroline Zhou >Priority: Minor > Fix For: 1.7.0 > > > In branch-1, > [RSGroupBasedLoadBalancer#onConfigurationChange|https://github.com/apache/hbase/blob/branch-1/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java#L452] > doesn't do anything – it should call the internal balancer's > onConfigurationChange(). -- This message was sent by Atlassian Jira (v8.3.4#803005)
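The chaining that HBASE-25846 asks for can be sketched roughly as follows. This is only a minimal illustration, not the actual patch: `ConfigObserver`, `InnerBalancer`, `NoOpWrapper` and `ChainingWrapper` are hypothetical stand-ins invented here, and a plain `Map<String, String>` stands in for the Hadoop configuration object.

```java
import java.util.HashMap;
import java.util.Map;

public class ChainedConfigChangeSketch {
  /** Hypothetical observer interface; not the real HBase one. */
  interface ConfigObserver {
    void onConfigurationChange(Map<String, String> conf);
  }

  /** Hypothetical internal balancer that remembers the last config it was given. */
  static class InnerBalancer implements ConfigObserver {
    Map<String, String> lastSeen = null;

    @Override
    public void onConfigurationChange(Map<String, String> conf) {
      lastSeen = new HashMap<>(conf);
    }
  }

  /** Wrapper with an empty handler: the behaviour the issue describes. */
  static class NoOpWrapper implements ConfigObserver {
    final ConfigObserver inner;

    NoOpWrapper(ConfigObserver inner) {
      this.inner = inner;
    }

    @Override
    public void onConfigurationChange(Map<String, String> conf) {
      // Intentionally empty: the inner balancer never hears about the change.
    }
  }

  /** Wrapper that forwards the change to the balancer it wraps. */
  static class ChainingWrapper implements ConfigObserver {
    final ConfigObserver inner;

    ChainingWrapper(ConfigObserver inner) {
      this.inner = inner;
    }

    @Override
    public void onConfigurationChange(Map<String, String> conf) {
      inner.onConfigurationChange(conf);
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("example.key", "example-value"); // made-up key, for illustration only

    InnerBalancer ignored = new InnerBalancer();
    new NoOpWrapper(ignored).onConfigurationChange(conf);
    System.out.println("no-op wrapper, inner saw: " + ignored.lastSeen);

    InnerBalancer notified = new InnerBalancer();
    new ChainingWrapper(notified).onConfigurationChange(conf);
    System.out.println("chaining wrapper, inner saw: " + notified.lastSeen);
  }
}
```

With the no-op wrapper the inner balancer keeps running on its old settings after a configuration reload; with the chaining wrapper it receives the same configuration as the outer balancer.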
[jira] [Updated] (HBASE-25846) Backport 'HBASE-25825 RSGroupBasedLoadBalancer.onConfigurationChange should chain the request to internal balancer' to branch-1
[ https://issues.apache.org/jira/browse/HBASE-25846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25846: -- Fix Version/s: 1.7.0 > Backport 'HBASE-25825 RSGroupBasedLoadBalancer.onConfigurationChange should > chain the request to internal balancer' to branch-1 > --- > > Key: HBASE-25846 > URL: https://issues.apache.org/jira/browse/HBASE-25846 > Project: HBase > Issue Type: Improvement >Reporter: Caroline Zhou >Assignee: Caroline Zhou >Priority: Minor > Fix For: 1.7.0 > > > In branch-1, > [RSGroupBasedLoadBalancer#onConfigurationChange|https://github.com/apache/hbase/blob/branch-1/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java#L452] > doesn't do anything – it should call the internal balancer's > onConfigurationChange(). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25612) HMaster should abort if ReplicationLogCleaner is not able to delete oldWALs.
[ https://issues.apache.org/jira/browse/HBASE-25612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17339385#comment-17339385 ] Reid Chan commented on HBASE-25612: --- Known issue, working on it.. > HMaster should abort if ReplicationLogCleaner is not able to delete oldWALs. > > > Key: HBASE-25612 > URL: https://issues.apache.org/jira/browse/HBASE-25612 > Project: HBase > Issue Type: Improvement >Affects Versions: 1.6.0 >Reporter: Rushabh Shah >Assignee: Rushabh Shah >Priority: Major > Fix For: 1.7.0 > > > In our production cluster, we encountered an issue where the number of files > within /hbase/oldWALs directory were growing exponentially from about 4000 > baseline to 15 and growing at the rate of 333 files per minute. > On further investigation we found that ReplicatonLogCleaner thread was > getting aborted since it was not able to talk to zookeeper. Stack trace below > {noformat} > 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] zookeeper.ZKUtil - > replicationLogCleaner-0x302e05e0d8f, > quorum=zookeeper-0:2181,zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181,zookeeper-4:2181, > baseZNode=/hbase Unable to get data of znode /hbase/replication/rs > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase/replication/rs > at org.apache.zookeeper.KeeperException.create(KeeperException.java:130) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) > at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229) > at > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:374) > at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:713) > at > org.apache.hadoop.hbase.replication.ReplicationQueuesClientZKImpl.getQueuesZNodeCversion(ReplicationQueuesClientZKImpl.java:87) > at > org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.loadWALsFromQueues(ReplicationLogCleaner.java:99) > at > 
org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.getDeletableFiles(ReplicationLogCleaner.java:70) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteFiles(CleanerChore.java:262) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.access$200(CleanerChore.java:52) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$3.act(CleanerChore.java:413) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$3.act(CleanerChore.java:410) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.deleteAction(CleanerChore.java:481) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.traverseAndDelete(CleanerChore.java:410) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.access$100(CleanerChore.java:52) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$1.run(CleanerChore.java:220) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - ReplicationLogCleaner received abort, > ignoring. Reason: Failed to get stat of replication rs node > 2021-02-25 23:05:01,149 DEBUG [an-pool3-thread-1729] > master.ReplicationLogCleaner - > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase/replication/rs > 2021-02-25 23:05:01,150 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - Failed to read zookeeper, skipping checking > deletable files > {noformat} > > {quote} 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - ReplicationLogCleaner received abort, > ignoring. Reason: Failed to get stat of replication rs node > {quote} > > This line is more scary where HMaster invoked Abortable but just ignored and > HMaster was doing it business as usual. 
> We have a max-files-per-directory configuration in the NameNode, which is set to 1M > in our clusters. If this directory had reached that limit, it would have > brought down the whole cluster. > We shouldn't ignore Abortable; we should crash the HMaster if Abortable is > invoked. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
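The difference between swallowing an abort and propagating it, as discussed in HBASE-25612, can be sketched like this. It is a simplified illustration under stated assumptions, not the HBase code: the `Abortable` interface below is a local stand-in shaped like the HBase one, and `IgnoringAbortable`, `MasterLike` and `LogCleaner` are hypothetical classes written for this example.

```java
public class AbortPropagationSketch {
  /** Local stand-in for an abortable component. */
  interface Abortable {
    void abort(String why, Throwable cause);

    boolean isAborted();
  }

  /** Logs the abort and carries on, which is what the WARN line in the report shows. */
  static class IgnoringAbortable implements Abortable {
    @Override
    public void abort(String why, Throwable cause) {
      System.out.println("received abort, ignoring. Reason: " + why);
    }

    @Override
    public boolean isAborted() {
      return false;
    }
  }

  /** Hypothetical master that records the abort so the process can shut down. */
  static class MasterLike implements Abortable {
    private volatile boolean aborted = false;
    private volatile String reason = null;

    @Override
    public void abort(String why, Throwable cause) {
      reason = why;
      aborted = true;
    }

    @Override
    public boolean isAborted() {
      return aborted;
    }

    String reason() {
      return reason;
    }
  }

  /** Hypothetical cleaner that reports an unreadable ZooKeeper to its Abortable. */
  static class LogCleaner {
    private final Abortable abortable;

    LogCleaner(Abortable abortable) {
      this.abortable = abortable;
    }

    /** Returns true if cleaning could proceed, false if it had to be skipped. */
    boolean tryClean(boolean zkReachable) {
      if (!zkReachable) {
        abortable.abort("Failed to get stat of replication rs node",
            new IllegalStateException("session expired"));
        return false;
      }
      return true;
    }
  }

  public static void main(String[] args) {
    IgnoringAbortable ignoring = new IgnoringAbortable();
    new LogCleaner(ignoring).tryClean(false);
    System.out.println("ignoring variant aborted: " + ignoring.isAborted());

    MasterLike master = new MasterLike();
    new LogCleaner(master).tryClean(false);
    System.out.println("propagating variant aborted: " + master.isAborted());
  }
}
```

In the ignoring variant every later run skips deletion again and nothing upstream ever learns about it; in the propagating variant the owner sees `isAborted()` become true and can stop.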
[jira] [Commented] (HBASE-25831) [branch-1] remove thrift examples out of hbase-examples module for bypassing the thrift version check
[ https://issues.apache.org/jira/browse/HBASE-25831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17339377#comment-17339377 ] Reid Chan commented on HBASE-25831: --- There's no hbase-thrift module. Not sure whether I should merge the commit, but it looks like the PR's modifications didn't take effect? {code:java} [INFO] --< org.apache.hbase:hbase-examples >--- [INFO] Building Apache HBase - Examples 1.7.0 [INFO] [ jar ]- [INFO] [INFO] --- maven-dependency-plugin:3.1.1:tree (default-cli) @ hbase-examples --- [INFO] org.apache.hbase:hbase-examples:jar:1.7.0 [INFO] +- org.apache.hbase:hbase-annotations:test-jar:tests:1.7.0:test [INFO] | \- jdk.tools:jdk.tools:jar:1.8:system [INFO] +- com.fasterxml.jackson.jaxrs:jackson-jaxrs-json-provider:jar:2.9.10:test [INFO] | +- com.fasterxml.jackson.jaxrs:jackson-jaxrs-base:jar:2.9.10:test [INFO] | \- com.fasterxml.jackson.module:jackson-module-jaxb-annotations:jar:2.9.10:test [INFO] +- com.fasterxml.jackson.core:jackson-annotations:jar:2.9.10:test [INFO] +- com.fasterxml.jackson.core:jackson-core:jar:2.9.10:test [INFO] +- com.fasterxml.jackson.core:jackson-databind:jar:2.9.10.1:test [INFO] +- org.apache.hbase:hbase-common:jar:1.7.0:compile [INFO] | +- org.apache.hbase:hbase-annotations:jar:1.7.0:compile [INFO] | +- com.google.guava:guava:jar:12.0.1:compile [INFO] | +- commons-codec:commons-codec:jar:1.9:compile [INFO] | +- commons-lang:commons-lang:jar:2.6:compile [INFO] | +- commons-collections:commons-collections:jar:3.2.2:compile [INFO] | +- commons-io:commons-io:jar:2.4:compile [INFO] | +- org.apache.hbase.thirdparty:hbase-shaded-gson:jar:3.0.0:compile [INFO] | +- org.mortbay.jetty:jetty-util:jar:6.1.26:compile [INFO] | \- org.apache.htrace:htrace-core:jar:3.1.0-incubating:compile [INFO] +- org.apache.hbase:hbase-protocol:jar:1.7.0:compile [INFO] +- org.apache.hbase:hbase-client:jar:1.7.0:compile [INFO] | +- io.netty:netty-all:jar:4.1.8.Final:compile [INFO] | +- 
org.jruby.jcodings:jcodings:jar:1.0.8:compile [INFO] | +- org.jruby.joni:joni:jar:2.1.2:compile [INFO] | +- com.yammer.metrics:metrics-core:jar:2.2.0:compile [INFO] | \- org.apache.hadoop:hadoop-auth:jar:2.8.5:compile [INFO] | +- org.apache.directory.server:apacheds-kerberos-codec:jar:2.0.0-M15:compile [INFO] | | +- org.apache.directory.server:apacheds-i18n:jar:2.0.0-M15:compile [INFO] | | +- org.apache.directory.api:api-asn1-api:jar:1.0.0-M20:compile [INFO] | | \- org.apache.directory.api:api-util:jar:1.0.0-M20:compile [INFO] | \- org.apache.curator:curator-framework:jar:2.7.1:compile [INFO] +- org.apache.hbase:hbase-server:jar:1.7.0:compile [INFO] | +- org.apache.hbase:hbase-procedure:jar:1.7.0:compile [INFO] | +- org.apache.hbase:hbase-prefix-tree:jar:1.7.0:runtime [INFO] | +- org.apache.hbase:hbase-metrics-api:jar:1.7.0:compile [INFO] | +- org.apache.hbase:hbase-metrics:jar:1.7.0:compile [INFO] | | \- io.dropwizard.metrics:metrics-core:jar:3.1.2:compile [INFO] | +- commons-httpclient:commons-httpclient:jar:3.1:compile [INFO] | +- org.apache.hbase:hbase-hadoop-compat:jar:1.7.0:compile [INFO] | +- org.apache.hbase:hbase-hadoop2-compat:jar:1.7.0:compile [INFO] | +- com.sun.jersey:jersey-core:jar:1.9:compile [INFO] | +- com.sun.jersey:jersey-server:jar:1.9:compile [INFO] | | \- asm:asm:jar:3.1:compile [INFO] | +- commons-cli:commons-cli:jar:1.2:compile [INFO] | +- org.apache.commons:commons-math:jar:2.2:compile [INFO] | +- org.mortbay.jetty:jetty:jar:6.1.26:compile [INFO] | +- org.mortbay.jetty:jetty-sslengine:jar:6.1.26:compile [INFO] | +- org.mortbay.jetty:jsp-2.1:jar:6.1.14:compile [INFO] | +- org.mortbay.jetty:jsp-api-2.1:jar:6.1.14:compile [INFO] | +- org.mortbay.jetty:servlet-api-2.5:jar:6.1.14:compile [INFO] | +- tomcat:jasper-compiler:jar:5.5.23:runtime [INFO] | +- tomcat:jasper-runtime:jar:5.5.23:runtime [INFO] | | \- commons-el:commons-el:jar:1.0:runtime [INFO] | +- org.jamon:jamon-runtime:jar:2.4.1:compile [INFO] | +- com.lmax:disruptor:jar:3.4.2:compile 
[INFO] | +- org.apache.httpcomponents:httpclient:jar:4.5.2:compile [INFO] | +- org.apache.httpcomponents:httpcore:jar:4.4.4:compile [INFO] | +- org.apache.hadoop:hadoop-client:jar:2.8.5:compile [INFO] | | +- org.apache.hadoop:hadoop-hdfs-client:jar:2.8.5:compile [INFO] | | +- org.apache.hadoop:hadoop-mapreduce-client-app:jar:2.8.5:compile [INFO] | | \- org.apache.hadoop:hadoop-yarn-api:jar:2.8.5:compile [INFO] | \- org.apache.hadoop:hadoop-hdfs:jar:2.8.5:compile [INFO] | +- commons-daemon:commons-daemon:jar:1.0.13:compile [INFO] | \- org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:compile [INFO] +- org.apache.hbase:hbase-testing-util:jar:1.7.0:test [INFO] | +- org.apache.hbase:hbase-common:test-jar:tests:1.7.0:test [INFO] | +- org.apache.hbase:hbase-server:test-jar:tests:1.7.0:tes
[jira] [Commented] (HBASE-25831) [branch-1] remove thrift examples out of hbase-examples module for bypassing the thrift version check
[ https://issues.apache.org/jira/browse/HBASE-25831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338863#comment-17338863 ] Reid Chan commented on HBASE-25831: --- Oh, I spotted that the thrift.version in the hbase root pom.xml is still 0.13.0 (seems unrelated, though). > [branch-1] remove thrift examples out of hbase-examples module for bypassing > the thrift version check > - > > Key: HBASE-25831 > URL: https://issues.apache.org/jira/browse/HBASE-25831 > Project: HBase > Issue Type: Task > Components: Thrift >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > [ERROR] Failed to execute goal on project hbase-examples: Could not resolve > dependencies for project org.apache.hbase:hbase-examples:jar:1.7.0: Could not > find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release > (https://repository.apache.org/content/repositories/releases/) -> [Help 1] > This is the message I got when I tried to run make_rc.sh. We need to remove the > thrift-related code from hbase-examples to make the release successfully. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25831) [branch-1] remove thrift examples out of hbase-examples module for bypassing the thrift version check
[ https://issues.apache.org/jira/browse/HBASE-25831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338861#comment-17338861 ] Reid Chan commented on HBASE-25831: --- Not really understand: {code} [INFO] [INFO] Building Apache HBase - Examples 1.7.0 [INFO] Downloading: https://repository.apache.org/content/repositories/releases/org/apache/hbase/hbase-thrift/1.7.0/hbase-thrift-1.7.0.pom Downloading: https://repository.jboss.org/nexus/content/groups/public-jboss/org/apache/hbase/hbase-thrift/1.7.0/hbase-thrift-1.7.0.pom Downloading: https://repo.maven.apache.org/maven2/org/apache/hbase/hbase-thrift/1.7.0/hbase-thrift-1.7.0.pom [WARNING] The POM for org.apache.hbase:hbase-thrift:jar:1.7.0 is missing, no dependency information available Downloading: https://repository.apache.org/content/repositories/releases/org/apache/thrift/libthrift/0.13.0/libthrift-0.13.0.pom Downloaded: https://repository.apache.org/content/repositories/releases/org/apache/thrift/libthrift/0.13.0/libthrift-0.13.0.pom (3 KB at 4.0 KB/sec) Downloading: https://repository.apache.org/content/repositories/releases/org/apache/httpcomponents/httpclient/4.5.6/httpclient-4.5.6.pom Downloaded: https://repository.apache.org/content/repositories/releases/org/apache/httpcomponents/httpclient/4.5.6/httpclient-4.5.6.pom (7 KB at 19.8 KB/sec) Downloading: https://repository.apache.org/content/repositories/releases/org/apache/httpcomponents/httpcomponents-client/4.5.6/httpcomponents-client-4.5.6.pom Downloaded: https://repository.apache.org/content/repositories/releases/org/apache/httpcomponents/httpcomponents-client/4.5.6/httpcomponents-client-4.5.6.pom (16 KB at 31.3 KB/sec) Downloading: https://repository.apache.org/content/repositories/releases/org/apache/httpcomponents/httpcomponents-parent/10/httpcomponents-parent-10.pom Downloaded: 
https://repository.apache.org/content/repositories/releases/org/apache/httpcomponents/httpcomponents-parent/10/httpcomponents-parent-10.pom (33 KB at 67.6 KB/sec) Downloading: https://repository.apache.org/content/repositories/releases/javax/annotation/javax.annotation-api/1.3.2/javax.annotation-api-1.3.2.pom Downloading: https://repository.jboss.org/nexus/content/groups/public-jboss/javax/annotation/javax.annotation-api/1.3.2/javax.annotation-api-1.3.2.pom Downloaded: https://repository.jboss.org/nexus/content/groups/public-jboss/javax/annotation/javax.annotation-api/1.3.2/javax.annotation-api-1.3.2.pom (15 KB at 93.0 KB/sec) Downloading: https://repository.apache.org/content/repositories/releases/net/java/jvnet-parent/3/jvnet-parent-3.pom Downloading: https://repository.jboss.org/nexus/content/groups/public-jboss/net/java/jvnet-parent/3/jvnet-parent-3.pom Downloaded: https://repository.jboss.org/nexus/content/groups/public-jboss/net/java/jvnet-parent/3/jvnet-parent-3.pom (5 KB at 31.2 KB/sec) Downloading: https://repository.apache.org/content/repositories/releases/org/apache/hbase/hbase-thrift/1.7.0/hbase-thrift-1.7.0.jar Downloading: https://repository.apache.org/content/repositories/releases/org/apache/thrift/libthrift/0.13.0/libthrift-0.13.0.jar Downloading: https://repository.apache.org/content/repositories/releases/javax/annotation/javax.annotation-api/1.3.2/javax.annotation-api-1.3.2.jar Downloaded: https://repository.apache.org/content/repositories/releases/org/apache/thrift/libthrift/0.13.0/libthrift-0.13.0.jar (241 KB at 166.4 KB/sec) Downloading: https://repository.jboss.org/nexus/content/groups/public-jboss/org/apache/hbase/hbase-thrift/1.7.0/hbase-thrift-1.7.0.jar Downloading: https://repository.jboss.org/nexus/content/groups/public-jboss/javax/annotation/javax.annotation-api/1.3.2/javax.annotation-api-1.3.2.jar Downloaded: 
https://repository.jboss.org/nexus/content/groups/public-jboss/javax/annotation/javax.annotation-api/1.3.2/javax.annotation-api-1.3.2.jar (26 KB at 170.8 KB/sec) Downloading: https://repo.maven.apache.org/maven2/org/apache/hbase/hbase-thrift/1.7.0/hbase-thrift-1.7.0.jar {code} I already removed the hbase-thrift dependency from pom.xml, so why is the build still trying to download hbase-thrift-1.7.0.jar? Any idea? > [branch-1] remove thrift examples out of hbase-examples module for bypassing > the thrift version check > - > > Key: HBASE-25831 > URL: https://issues.apache.org/jira/browse/HBASE-25831 > Project: HBase > Issue Type: Task > Components: Thrift >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > [ERROR] Failed to execute goal on project hbase-examples: Could not resolve > dependencies for project org.apache.hbase:hbase
[jira] [Resolved] (HBASE-25845) [branch-1] Precommit fails to build docker due to python-dateutil
[ https://issues.apache.org/jira/browse/HBASE-25845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan resolved HBASE-25845. --- Fix Version/s: 1.7.0 Resolution: Fixed > [branch-1] Precommit fails to build docker due to python-dateutil > - > > Key: HBASE-25845 > URL: https://issues.apache.org/jira/browse/HBASE-25845 > Project: HBase > Issue Type: Task >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Major > Fix For: 1.7.0 > > > 02:34:22 Downloading/unpacking python-dateutil > 02:34:22Cannot fetch index base URL https://pypi.python.org/simple/ > 02:34:22Could not find any downloads that satisfy the requirement > python-dateutil > 02:34:22 Cleaning up... > 02:34:22 No distributions at all found for python-dateutil > 02:34:22 Storing debug log for failure in /root/.pip/pip.log > 02:34:22 The command '/bin/sh -c pip install python-dateutil' returned a > non-zero code: 1 > 02:34:22 ERROR: Docker failed to build yetus/hbase:edccfe439a. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25845) [branch-1] Precommit fails to build docker due to python-dateutil
[ https://issues.apache.org/jira/browse/HBASE-25845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25845: -- Component/s: build > [branch-1] Precommit fails to build docker due to python-dateutil > - > > Key: HBASE-25845 > URL: https://issues.apache.org/jira/browse/HBASE-25845 > Project: HBase > Issue Type: Task > Components: build >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Major > Fix For: 1.7.0 > > > 02:34:22 Downloading/unpacking python-dateutil > 02:34:22Cannot fetch index base URL https://pypi.python.org/simple/ > 02:34:22Could not find any downloads that satisfy the requirement > python-dateutil > 02:34:22 Cleaning up... > 02:34:22 No distributions at all found for python-dateutil > 02:34:22 Storing debug log for failure in /root/.pip/pip.log > 02:34:22 The command '/bin/sh -c pip install python-dateutil' returned a > non-zero code: 1 > 02:34:22 ERROR: Docker failed to build yetus/hbase:edccfe439a. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25612) HMaster should abort if ReplicationLogCleaner is not able to delete oldWALs.
[ https://issues.apache.org/jira/browse/HBASE-25612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338824#comment-17338824 ] Reid Chan commented on HBASE-25612: --- could you update the branch-1 pull request again, I think the pre-commit works now. > HMaster should abort if ReplicationLogCleaner is not able to delete oldWALs. > > > Key: HBASE-25612 > URL: https://issues.apache.org/jira/browse/HBASE-25612 > Project: HBase > Issue Type: Improvement >Affects Versions: 1.6.0 >Reporter: Rushabh Shah >Assignee: Rushabh Shah >Priority: Major > Fix For: 1.8.0 > > > In our production cluster, we encountered an issue where the number of files > within /hbase/oldWALs directory were growing exponentially from about 4000 > baseline to 15 and growing at the rate of 333 files per minute. > On further investigation we found that ReplicatonLogCleaner thread was > getting aborted since it was not able to talk to zookeeper. Stack trace below > {noformat} > 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] zookeeper.ZKUtil - > replicationLogCleaner-0x302e05e0d8f, > quorum=zookeeper-0:2181,zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181,zookeeper-4:2181, > baseZNode=/hbase Unable to get data of znode /hbase/replication/rs > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase/replication/rs > at org.apache.zookeeper.KeeperException.create(KeeperException.java:130) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) > at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229) > at > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:374) > at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:713) > at > org.apache.hadoop.hbase.replication.ReplicationQueuesClientZKImpl.getQueuesZNodeCversion(ReplicationQueuesClientZKImpl.java:87) > at > 
org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.loadWALsFromQueues(ReplicationLogCleaner.java:99) > at > org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.getDeletableFiles(ReplicationLogCleaner.java:70) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteFiles(CleanerChore.java:262) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.access$200(CleanerChore.java:52) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$3.act(CleanerChore.java:413) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$3.act(CleanerChore.java:410) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.deleteAction(CleanerChore.java:481) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.traverseAndDelete(CleanerChore.java:410) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.access$100(CleanerChore.java:52) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$1.run(CleanerChore.java:220) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - ReplicationLogCleaner received abort, > ignoring. Reason: Failed to get stat of replication rs node > 2021-02-25 23:05:01,149 DEBUG [an-pool3-thread-1729] > master.ReplicationLogCleaner - > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase/replication/rs > 2021-02-25 23:05:01,150 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - Failed to read zookeeper, skipping checking > deletable files > {noformat} > > {quote} 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - ReplicationLogCleaner received abort, > ignoring. 
Reason: Failed to get stat of replication rs node > {quote} > > This line is scarier: HMaster invoked Abortable, but the abort was ignored and > HMaster carried on with business as usual. > We have a max-files-per-directory configuration in the NameNode, which is set to 1M > in our clusters. If this directory had reached that limit, it would have > brought down the whole cluster. > We shouldn't ignore Abortable; we should crash the HMaster if Abortable is > invoked. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25612) HMaster should abort if ReplicationLogCleaner is not able to delete oldWALs.
[ https://issues.apache.org/jira/browse/HBASE-25612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338726#comment-17338726 ] Reid Chan commented on HBASE-25612: --- can, let me fix the precommit error first. > HMaster should abort if ReplicationLogCleaner is not able to delete oldWALs. > > > Key: HBASE-25612 > URL: https://issues.apache.org/jira/browse/HBASE-25612 > Project: HBase > Issue Type: Improvement >Affects Versions: 1.6.0 >Reporter: Rushabh Shah >Assignee: Rushabh Shah >Priority: Major > Fix For: 1.8.0 > > > In our production cluster, we encountered an issue where the number of files > within /hbase/oldWALs directory were growing exponentially from about 4000 > baseline to 15 and growing at the rate of 333 files per minute. > On further investigation we found that ReplicatonLogCleaner thread was > getting aborted since it was not able to talk to zookeeper. Stack trace below > {noformat} > 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] zookeeper.ZKUtil - > replicationLogCleaner-0x302e05e0d8f, > quorum=zookeeper-0:2181,zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181,zookeeper-4:2181, > baseZNode=/hbase Unable to get data of znode /hbase/replication/rs > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase/replication/rs > at org.apache.zookeeper.KeeperException.create(KeeperException.java:130) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) > at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229) > at > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:374) > at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:713) > at > org.apache.hadoop.hbase.replication.ReplicationQueuesClientZKImpl.getQueuesZNodeCversion(ReplicationQueuesClientZKImpl.java:87) > at > 
org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.loadWALsFromQueues(ReplicationLogCleaner.java:99) > at > org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.getDeletableFiles(ReplicationLogCleaner.java:70) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteFiles(CleanerChore.java:262) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.access$200(CleanerChore.java:52) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$3.act(CleanerChore.java:413) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$3.act(CleanerChore.java:410) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.deleteAction(CleanerChore.java:481) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.traverseAndDelete(CleanerChore.java:410) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.access$100(CleanerChore.java:52) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$1.run(CleanerChore.java:220) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - ReplicationLogCleaner received abort, > ignoring. Reason: Failed to get stat of replication rs node > 2021-02-25 23:05:01,149 DEBUG [an-pool3-thread-1729] > master.ReplicationLogCleaner - > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase/replication/rs > 2021-02-25 23:05:01,150 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - Failed to read zookeeper, skipping checking > deletable files > {noformat} > > {quote} 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - ReplicationLogCleaner received abort, > ignoring. 
Reason: Failed to get stat of replication rs node > {quote} > > This line is scarier: HMaster invoked Abortable, but the abort was ignored and > HMaster carried on with business as usual. > We have a max-files-per-directory configuration in the NameNode, which is set to 1M > in our clusters. If this directory had reached that limit, it would have > brought down the whole cluster. > We shouldn't ignore Abortable; we should crash the HMaster if Abortable is > invoked. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
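To put the numbers from the HBASE-25612 description in perspective, here is a rough back-of-the-envelope sketch. It assumes a constant growth rate of 333 files per minute and starts from the roughly 4000-file baseline, since the file count at the time of the report is not legible in this archive. `OldWalsHeadroom` and `minutesUntilLimit` are names made up for this example.

```java
public class OldWalsHeadroom {
  /**
   * Whole minutes of constant growth that fit between start and limit.
   * All three inputs are plain file counts; the rate is files per minute.
   */
  static long minutesUntilLimit(long start, long limit, long filesPerMinute) {
    if (filesPerMinute <= 0) {
      throw new IllegalArgumentException("rate must be positive");
    }
    if (start >= limit) {
      return 0;
    }
    return (limit - start) / filesPerMinute;
  }

  public static void main(String[] args) {
    // Figures quoted in the issue: ~4000 baseline, 1M per-directory limit, 333 files/min.
    long minutes = minutesUntilLimit(4_000L, 1_000_000L, 333L);
    System.out.println(minutes + " minutes, about " + (minutes / 60) + " hours");
  }
}
```

Under those assumptions the directory would hit the limit in about two days counted from the baseline, and sooner than that from whatever the count actually was when the problem was noticed.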
[jira] [Created] (HBASE-25845) [branch-1] Precommit fails to build docker due to python-dateutil
Reid Chan created HBASE-25845:
---------------------------------
Summary: [branch-1] Precommit fails to build docker due to python-dateutil
Key: HBASE-25845
URL: https://issues.apache.org/jira/browse/HBASE-25845
Project: HBase
Issue Type: Task
Reporter: Reid Chan
Assignee: Reid Chan

02:34:22 Downloading/unpacking python-dateutil
02:34:22   Cannot fetch index base URL https://pypi.python.org/simple/
02:34:22   Could not find any downloads that satisfy the requirement python-dateutil
02:34:22 Cleaning up...
02:34:22 No distributions at all found for python-dateutil
02:34:22 Storing debug log for failure in /root/.pip/pip.log
02:34:22 The command '/bin/sh -c pip install python-dateutil' returned a non-zero code: 1
02:34:22 ERROR: Docker failed to build yetus/hbase:edccfe439a.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25831) [branch-1] remove thrift examples out of hbase-examples module for bypassing the thrift version check
[ https://issues.apache.org/jira/browse/HBASE-25831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17337200#comment-17337200 ] Reid Chan commented on HBASE-25831: --- [~andrew.purt...@gmail.com] > [branch-1] remove thrift examples out of hbase-examples module for bypassing > the thrift version check > - > > Key: HBASE-25831 > URL: https://issues.apache.org/jira/browse/HBASE-25831 > Project: HBase > Issue Type: Task > Components: Thrift >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > [ERROR] Failed to execute goal on project hbase-examples: Could not resolve > dependencies for project org.apache.hbase:hbase-examples:jar:1.7.0: Could not > find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release > (https://repository.apache.org/content/repositories/releases/) -> [Help 1] > This is the msg when I tried to run make_rc.sh, we need to remove thrift > related codes from hbase-examples for making release successfully. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25831) [branch-1] remove thrift examples out of hbase-examples module for bypassing the thrift version check
[ https://issues.apache.org/jira/browse/HBASE-25831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25831: -- Description: [ERROR] Failed to execute goal on project hbase-examples: Could not resolve dependencies for project org.apache.hbase:hbase-examples:jar:1.7.0: Could not find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release (https://repository.apache.org/content/repositories/releases/) -> [Help 1] This is the message I got when I tried to run make_rc.sh. We need to remove the thrift-related code from hbase-examples to make the release successfully. > [branch-1] remove thrift examples out of hbase-examples module for bypassing > the thrift version check > - > > Key: HBASE-25831 > URL: https://issues.apache.org/jira/browse/HBASE-25831 > Project: HBase > Issue Type: Task > Components: Thrift >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > > [ERROR] Failed to execute goal on project hbase-examples: Could not resolve > dependencies for project org.apache.hbase:hbase-examples:jar:1.7.0: Could not > find artifact org.apache.hbase:hbase-thrift:jar:1.7.0 in apache release > (https://repository.apache.org/content/repositories/releases/) -> [Help 1] > This is the message I got when I tried to run make_rc.sh. We need to remove the > thrift-related code from hbase-examples to make the release successfully. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25831) [branch-1] remove thrift examples out of hbase-examples module for bypassing the thrift version check
[ https://issues.apache.org/jira/browse/HBASE-25831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25831: -- Summary: [branch-1] remove thrift examples out of hbase-examples module for bypassing the thrift version check (was: [branch-1] remove thrift examples out of hbaes-examples module for bypassing the thrift version check) > [branch-1] remove thrift examples out of hbase-examples module for bypassing > the thrift version check > - > > Key: HBASE-25831 > URL: https://issues.apache.org/jira/browse/HBASE-25831 > Project: HBase > Issue Type: Task > Components: Thrift >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-25831) [branch-1] remove thrift examples out of hbaes-examples module for bypassing the thrift version check
Reid Chan created HBASE-25831: - Summary: [branch-1] remove thrift examples out of hbaes-examples module for bypassing the thrift version check Key: HBASE-25831 URL: https://issues.apache.org/jira/browse/HBASE-25831 Project: HBase Issue Type: Task Components: Thrift Reporter: Reid Chan Assignee: Reid Chan -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25809) [branch-1] TestAtomicOperation.testMultiRowMutationMultiThreads deadlock
[ https://issues.apache.org/jira/browse/HBASE-25809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25809: -- Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > [branch-1] TestAtomicOperation.testMultiRowMutationMultiThreads deadlock > > > Key: HBASE-25809 > URL: https://issues.apache.org/jira/browse/HBASE-25809 > Project: HBase > Issue Type: Bug >Reporter: Andrew Kyle Purtell >Assignee: Andrew Kyle Purtell >Priority: Major > Fix For: 1.7.0 > > > TestAtomicOperation.testMultiRowMutationMultiThreads deadlocks. > There is an easy fix for the test that synchronizes on the CHM instead of the > object. We already have a Findbugs exception for synchronization on the CHM, > and get-then-set on it is what we are synchronizing for anyway. > This is only relevant for branch-1 because we have to synchronize > get-then-set due to Java 7 compatibility. For branch-2 and master we use > CHM#computeIfPresent. -- This message was sent by Atlassian Jira (v8.3.4#803005)
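The difference between the Java 7 style get-then-set and CHM#computeIfPresent described above can be sketched roughly as follows. This is only an illustration, not the code from TestAtomicOperation; the class name `RowCounters` and its methods are hypothetical. It assumes that every writer that does get-then-set takes the same lock on the map.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration; not taken from the HBase test code.
class RowCounters {
    private final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

    void put(String row, int value) {
        counts.put(row, value);
    }

    Integer get(String row) {
        return counts.get(row);
    }

    // Java 7 style: get and put are two separate map calls, so the pair is
    // guarded by a lock on the map itself to make it one atomic step.
    void incrementIfPresentJava7(String row) {
        synchronized (counts) {
            Integer current = counts.get(row);
            if (current != null) {
                counts.put(row, current + 1);
            }
        }
    }

    // Java 8+ style: computeIfPresent performs the read-modify-write atomically
    // for the given key, so no external lock is needed.
    void incrementIfPresentJava8(String row) {
        counts.computeIfPresent(row, (key, value) -> value + 1);
    }
}
```

Both methods leave a missing key untouched; the Java 8 variant simply moves the atomicity guarantee into the map.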
[jira] [Updated] (HBASE-25808) [branch-1] Backport improvements to FSHLog from branch-2
[ https://issues.apache.org/jira/browse/HBASE-25808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25808: -- Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > [branch-1] Backport improvements to FSHLog from branch-2 > > > Key: HBASE-25808 > URL: https://issues.apache.org/jira/browse/HBASE-25808 > Project: HBase > Issue Type: New Feature > Components: regionserver, wal >Affects Versions: 1.6.0 >Reporter: Andrew Kyle Purtell >Assignee: Andrew Kyle Purtell >Priority: Major > Fix For: 1.7.0 > > > From c96b642f15d (HBASE-15265 Implement an asynchronous FSHLog) > - Better formatting; use { } instead of dangling ifs, elses, etc. in > RingBufferEventHandler. > > From b1269ec57ff (HBASE-19811 Fix findbugs and error-prone warnings in > hbase-server (branch-2)) > - Change syncFuturesCount in RingBufferEventHandler from 'volatile int' > to AtomicInteger. > 'volatile' is insufficient for multithreaded access. > > From afc1746757f (HBASE-24034 [Flakey Tests] A couple of fixes and > cleanups) > - Make a local copy of takeSyncFuture after we get it in SyncRunner#run. > This looks like a workaround for a JIT/compiler bug in some Java > versions. (SCARY) -- This message was sent by Atlassian Jira (v8.3.4#803005)
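As a rough illustration of why 'volatile int' is insufficient for a counter that several threads increment, consider the sketch below. It is not the RingBufferEventHandler code; the class `VolatileVsAtomic` and its methods are hypothetical names made up for this example. `volatile` only guarantees visibility, while `++` is still a separate read, add, and write, so concurrent increments can be lost.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not taken from FSHLog.
class VolatileVsAtomic {
    private volatile int volatileCount = 0;
    private final AtomicInteger atomicCount = new AtomicInteger();

    // Not atomic: read, add, and write are three separate steps.
    void incrementVolatile() {
        volatileCount++;
    }

    // Atomic: the increment happens as a single indivisible operation.
    void incrementAtomic() {
        atomicCount.incrementAndGet();
    }

    int volatileValue() {
        return volatileCount;
    }

    int atomicValue() {
        return atomicCount.get();
    }

    // Runs the given number of threads, each incrementing both counters.
    static VolatileVsAtomic runThreads(int threads, final int perThread) {
        final VolatileVsAtomic counter = new VolatileVsAtomic();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.incrementVolatile();
                    counter.incrementAtomic();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter;
    }
}
```

After `runThreads(4, 10000)` the atomic counter is always 40000, while the volatile counter may end up lower because some increments can overwrite each other.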
[jira] [Comment Edited] (HBASE-25804) [branch-1] Make hbase-thrift module build with jdk8
[ https://issues.apache.org/jira/browse/HBASE-25804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17329130#comment-17329130 ] Reid Chan edited comment on HBASE-25804 at 4/22/21, 1:51 PM: - Not sure if I understand correctly, ping [~andrew.purt...@gmail.com]. https://github.com/apache/hbase/pull/3193 was (Author: reidchan): Not sure if I understand correctly, ping [~andrew.purt...@gmail.com] > [branch-1] Make hbase-thrift module build with jdk8 > --- > > Key: HBASE-25804 > URL: https://issues.apache.org/jira/browse/HBASE-25804 > Project: HBase > Issue Type: Task > Components: build >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > Fix For: 1.7.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25804) [branch-1] Make hbase-thrift module build with jdk8
[ https://issues.apache.org/jira/browse/HBASE-25804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17329130#comment-17329130 ] Reid Chan commented on HBASE-25804: --- Not sure if I understand correctly, ping [~andrew.purt...@gmail.com] > [branch-1] Make hbase-thrift module build with jdk8 > --- > > Key: HBASE-25804 > URL: https://issues.apache.org/jira/browse/HBASE-25804 > Project: HBase > Issue Type: Task > Components: build >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > Fix For: 1.7.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-25804) [branch-1] Make hbase-thrift module build with jdk8
Reid Chan created HBASE-25804: - Summary: [branch-1] Make hbase-thrift module build with jdk8 Key: HBASE-25804 URL: https://issues.apache.org/jira/browse/HBASE-25804 Project: HBase Issue Type: Task Components: build Reporter: Reid Chan Assignee: Reid Chan Fix For: 1.7.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HBASE-25748) [Flake Test][branch-1] TestAdmin2
[ https://issues.apache.org/jira/browse/HBASE-25748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan resolved HBASE-25748. --- Resolution: Fixed > [Flake Test][branch-1] TestAdmin2 > - > > Key: HBASE-25748 > URL: https://issues.apache.org/jira/browse/HBASE-25748 > Project: HBase > Issue Type: Test > Components: test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Major > Fix For: 1.7.0 > > > Attempt to improve the stability. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25748) [Flake Test][branch-1] TestAdmin2
[ https://issues.apache.org/jira/browse/HBASE-25748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319145#comment-17319145 ] Reid Chan commented on HBASE-25748: --- https://ci-hadoop.apache.org/job/HBase/job/HBase-Flaky-Tests/job/branch-1/1135/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin2/testCreateTableRPCTimeOut/ keeps failing, but it passes locally. > [Flake Test][branch-1] TestAdmin2 > - > > Key: HBASE-25748 > URL: https://issues.apache.org/jira/browse/HBASE-25748 > Project: HBase > Issue Type: Test > Components: test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Major > Fix For: 1.7.0 > > > Attempt to improve the stability. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (HBASE-25748) [Flake Test][branch-1] TestAdmin2
[ https://issues.apache.org/jira/browse/HBASE-25748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan reopened HBASE-25748: --- Spotted a bug in testCreateTableRPCTimeOut, let me try to fix it. > [Flake Test][branch-1] TestAdmin2 > - > > Key: HBASE-25748 > URL: https://issues.apache.org/jira/browse/HBASE-25748 > Project: HBase > Issue Type: Test > Components: test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Major > Fix For: 1.7.0 > > > Attempt to improve the stability. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HBASE-25753) [Flake Test][branch-1] TestSnapshotCloneIndependence's teardown() is always timeout
[ https://issues.apache.org/jira/browse/HBASE-25753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan resolved HBASE-25753. --- Resolution: Fixed > [Flake Test][branch-1] TestSnapshotCloneIndependence's teardown() is always > timeout > --- > > Key: HBASE-25753 > URL: https://issues.apache.org/jira/browse/HBASE-25753 > Project: HBase > Issue Type: Test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Major > Fix For: 1.7.0 > > > The tearDown method that runs after each test always times out. > So the fix is to > 1. remove the After annotation from the tearDown method and make it private > 2. move the call to it into each test > 3. increase the timeout to 3 for each test, since it is in MediumTests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-25753) [Flake Test][branch-1] TestSnapshotCloneIndependence's teardown() is always timeout
Reid Chan created HBASE-25753: - Summary: [Flake Test][branch-1] TestSnapshotCloneIndependence's teardown() is always timeout Key: HBASE-25753 URL: https://issues.apache.org/jira/browse/HBASE-25753 Project: HBase Issue Type: Test Reporter: Reid Chan Assignee: Reid Chan Fix For: 1.7.0 The tearDown method that runs after each test always times out. So the fix is to 1. remove the After annotation from the tearDown method and make it private 2. move the call to it into each test 3. increase the timeout to 3 for each test, since it is in MediumTests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HBASE-25748) [Flake Test][branch-1] TestAdmin2
[ https://issues.apache.org/jira/browse/HBASE-25748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan resolved HBASE-25748. --- Hadoop Flags: Reviewed Resolution: Fixed > [Flake Test][branch-1] TestAdmin2 > - > > Key: HBASE-25748 > URL: https://issues.apache.org/jira/browse/HBASE-25748 > Project: HBase > Issue Type: Test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Major > Fix For: 1.7.0 > > > Attempt to improve the stability. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25748) [Flake Test][branch-1] TestAdmin2
[ https://issues.apache.org/jira/browse/HBASE-25748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25748: -- Component/s: test > [Flake Test][branch-1] TestAdmin2 > - > > Key: HBASE-25748 > URL: https://issues.apache.org/jira/browse/HBASE-25748 > Project: HBase > Issue Type: Test > Components: test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Major > Fix For: 1.7.0 > > > Attempt to improve the stability. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25748) [Flake Test][branch-1] TestAdmin2
[ https://issues.apache.org/jira/browse/HBASE-25748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317079#comment-17317079 ] Reid Chan commented on HBASE-25748: --- Most of them are "java.util.concurrent.TimeoutException: The procedure $id is still running" > [Flake Test][branch-1] TestAdmin2 > - > > Key: HBASE-25748 > URL: https://issues.apache.org/jira/browse/HBASE-25748 > Project: HBase > Issue Type: Test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Major > Fix For: 1.7.0 > > > Attempt to improve the stability. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-25748) [Flake Test][branch-1] TestAdmin2
Reid Chan created HBASE-25748: - Summary: [Flake Test][branch-1] TestAdmin2 Key: HBASE-25748 URL: https://issues.apache.org/jira/browse/HBASE-25748 Project: HBase Issue Type: Test Reporter: Reid Chan Assignee: Reid Chan Fix For: 1.7.0 Attempt to improve the stability. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25740) Backport HBASE-25629 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-25740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17316837#comment-17316837 ] Reid Chan commented on HBASE-25740: --- As for the tests that failed with timeouts, I think they are related to resources. Some failures are probabilistic rather than constant like the previous TestConnectionImplementation failure. You could run the tests locally to verify whether they pass (I did). LGTM, although I'm not sure whether I missed any failed tests, so I'll try more runs. Please ping me if you create a new ticket about branch-1, thanks [~gjacoby]! > Backport HBASE-25629 to branch-1 > > > Key: HBASE-25740 > URL: https://issues.apache.org/jira/browse/HBASE-25740 > Project: HBase > Issue Type: Test >Affects Versions: 1.7.0 >Reporter: Geoffrey Jacoby >Assignee: Geoffrey Jacoby >Priority: Major > Fix For: 1.7.0 > > > HBASE-25629 recently fixed an issue where TestCurrentHourProvider > consistently failed on certain OSes due to quirks in time zone > implementations. This test is also failing in branch-1, so in order to > expedite a potential 1.7.0 release we should backport to branch-1 as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25740) Backport HBASE-25629 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-25740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17316095#comment-17316095 ] Reid Chan commented on HBASE-25740: --- Hi [~gjacoby], do you still have any pending JIRAs targeting branch-1? I'm ready for RC0, so please feel free to ping me if you have any. > Backport HBASE-25629 to branch-1 > > > Key: HBASE-25740 > URL: https://issues.apache.org/jira/browse/HBASE-25740 > Project: HBase > Issue Type: Test >Affects Versions: 1.7.0 >Reporter: Geoffrey Jacoby >Assignee: Geoffrey Jacoby >Priority: Major > Fix For: 1.7.0 > > > HBASE-25629 recently fixed an issue where TestCurrentHourProvider > consistently failed on certain OSes due to quirks in time zone > implementations. This test is also failing in branch-1, so in order to > expedite a potential 1.7.0 release we should backport to branch-1 as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25740) Backport HBASE-25629 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-25740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25740: -- Fix Version/s: 1.7.0 > Backport HBASE-25629 to branch-1 > > > Key: HBASE-25740 > URL: https://issues.apache.org/jira/browse/HBASE-25740 > Project: HBase > Issue Type: Test >Affects Versions: 1.7.0 >Reporter: Geoffrey Jacoby >Assignee: Geoffrey Jacoby >Priority: Major > Fix For: 1.7.0 > > > HBASE-25629 recently fixed an issue where TestCurrentHourProvider > consistently failed on certain OSes due to quirks in time zone > implementations. This test is also failing in branch-1, so in order to > expedite a potential 1.7.0 release we should backport to branch-1 as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HBASE-25740) Backport HBASE-25629 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-25740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan resolved HBASE-25740. --- Hadoop Flags: Reviewed Resolution: Fixed > Backport HBASE-25629 to branch-1 > > > Key: HBASE-25740 > URL: https://issues.apache.org/jira/browse/HBASE-25740 > Project: HBase > Issue Type: Test >Affects Versions: 1.7.0 >Reporter: Geoffrey Jacoby >Assignee: Geoffrey Jacoby >Priority: Major > Fix For: 1.7.0 > > > HBASE-25629 recently fixed an issue where TestCurrentHourProvider > consistently failed on certain OSes due to quirks in time zone > implementations. This test is also failing in branch-1, so in order to > expedite a potential 1.7.0 release we should backport to branch-1 as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25687) Backport "HBASE-25681 Add a switch for server/table queryMeter" to branch-2 and branch-1
[ https://issues.apache.org/jira/browse/HBASE-25687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25687: -- Fix Version/s: 1.7.0 > Backport "HBASE-25681 Add a switch for server/table queryMeter" to branch-2 > and branch-1 > > > Key: HBASE-25687 > URL: https://issues.apache.org/jira/browse/HBASE-25687 > Project: HBase > Issue Type: Sub-task >Reporter: Baiqiang Zhao >Assignee: Baiqiang Zhao >Priority: Major > Fix For: 1.7.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25122) [Flake Test][branch-1] TestExportSnapshotWithTemporaryDirectory
[ https://issues.apache.org/jira/browse/HBASE-25122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25122: -- Issue Type: Test (was: Improvement) > [Flake Test][branch-1] TestExportSnapshotWithTemporaryDirectory > --- > > Key: HBASE-25122 > URL: https://issues.apache.org/jira/browse/HBASE-25122 > Project: HBase > Issue Type: Test > Components: test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Minor > Fix For: 1.7.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25114) [Flake Test][branch-1] TestFromClientSide#testCacheOnWriteEvictOnClose
[ https://issues.apache.org/jira/browse/HBASE-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25114: -- Issue Type: Test (was: Improvement) > [Flake Test][branch-1] TestFromClientSide#testCacheOnWriteEvictOnClose > -- > > Key: HBASE-25114 > URL: https://issues.apache.org/jira/browse/HBASE-25114 > Project: HBase > Issue Type: Test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Minor > Fix For: 1.7.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25025) [Flaky Test][branch-1] TestFromClientSide#testCheckAndDeleteWithCompareOp
[ https://issues.apache.org/jira/browse/HBASE-25025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25025: -- Issue Type: Test (was: Improvement) > [Flaky Test][branch-1] TestFromClientSide#testCheckAndDeleteWithCompareOp > - > > Key: HBASE-25025 > URL: https://issues.apache.org/jira/browse/HBASE-25025 > Project: HBase > Issue Type: Test > Components: test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Minor > Fix For: 1.7.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25031) [Flaky Test] TestReplicationDisableInactivePeer#testDisableInactivePeer
[ https://issues.apache.org/jira/browse/HBASE-25031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25031: -- Issue Type: Test (was: Improvement) > [Flaky Test] TestReplicationDisableInactivePeer#testDisableInactivePeer > --- > > Key: HBASE-25031 > URL: https://issues.apache.org/jira/browse/HBASE-25031 > Project: HBase > Issue Type: Test > Components: test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Minor > Fix For: 1.7.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25030) [Flaky Test] TestRestartCluster#testClusterRestart
[ https://issues.apache.org/jira/browse/HBASE-25030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25030: -- Issue Type: Test (was: Improvement) > [Flaky Test] TestRestartCluster#testClusterRestart > -- > > Key: HBASE-25030 > URL: https://issues.apache.org/jira/browse/HBASE-25030 > Project: HBase > Issue Type: Test > Components: test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Minor > Fix For: 1.7.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-23330) Expose cluster ID for clients using it for delegation token based auth
[ https://issues.apache.org/jira/browse/HBASE-23330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-23330: -- Fix Version/s: 1.7.0 > Expose cluster ID for clients using it for delegation token based auth > > > Key: HBASE-23330 > URL: https://issues.apache.org/jira/browse/HBASE-23330 > Project: HBase > Issue Type: Sub-task > Components: Client, master >Affects Versions: 3.0.0-alpha-1 >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Major > Fix For: 3.0.0-alpha-1, 2.3.0, 1.7.0 > > > As Gary Helmling noted in HBASE-18095, some clients use Cluster ID for > delegation based auth. > {quote} > There is an additional complication here for token-based authentication. When > a delegation token is used for SASL authentication, the client uses the > cluster ID obtained from Zookeeper to select the token identifier to use. So > there would also need to be some Zookeeper-less, unauthenticated way to > obtain the cluster ID as well. > {quote} > Once we move ZK out of the picture, cluster ID sits behind an end point that > needs to be authenticated. Figure out a way to expose this to clients. > One suggestion in the comments (from Andrew) > {quote} > Cluster ID lookup is most easily accomplished with a new servlet on the > HTTP(S) endpoint on the masters, serving the cluster ID as plain text. It > can't share the RPC server endpoint when SASL is enabled because any > interaction with that endpoint must be authenticated. This is ugly but > alternatives seem worse. One alternative would be a second RPC port for APIs > that do not / cannot require prior authentication. > {quote} > There could be implications if SPNEGO is enabled on these http(s) end points. > We need to make sure that it is handled. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24765) Dynamic master discovery
[ https://issues.apache.org/jira/browse/HBASE-24765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-24765: -- Fix Version/s: 1.7.0 > Dynamic master discovery > > > Key: HBASE-24765 > URL: https://issues.apache.org/jira/browse/HBASE-24765 > Project: HBase > Issue Type: Sub-task > Components: Client >Affects Versions: 3.0.0-alpha-1, 2.4.0 >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0 > > > [~stack]'s idea in the design doc for splittable-meta. > We can keep a live list of masters to query by fetching the list of available > masters from any of the available masters configured in the seed list. User > configured list of masters ("hbase.masters") would be used as a seed list. > The endpoints are refreshed every 5mins or if any of the registry RPCs hit an > error (which ever happens first). -- This message was sent by Atlassian Jira (v8.3.4#803005)
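The refresh policy described above (seed list from "hbase.masters", refresh every 5 minutes or after a registry RPC error, whichever happens first) could look roughly like the sketch below. Everything in it is an assumption made for illustration: `MasterEndpointCache`, `MasterListFetcher`, and their methods are hypothetical names and do not come from the HBase code base, and the caller passes the current time explicitly so the logic can be exercised without waiting.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the HBase code base.
class MasterEndpointCache {
    // Assumed interface: asks one master for the current list of masters.
    interface MasterListFetcher {
        List<String> fetchFrom(String master) throws Exception;
    }

    static final long REFRESH_INTERVAL_MS = 5 * 60 * 1000L;

    private final List<String> seeds;
    private final MasterListFetcher fetcher;
    private List<String> live;
    private long lastRefreshMs;
    private boolean refreshRequested = true; // force a refresh on first use

    MasterEndpointCache(List<String> seeds, MasterListFetcher fetcher) {
        this.seeds = new ArrayList<>(seeds);
        this.fetcher = fetcher;
        this.live = new ArrayList<>(seeds);
    }

    // Called when any registry RPC fails; the next lookup refreshes early.
    synchronized void onRpcError() {
        refreshRequested = true;
    }

    // Returns the live master list, refreshing it first if it is stale.
    synchronized List<String> getMasters(long nowMs) {
        if (refreshRequested || nowMs - lastRefreshMs >= REFRESH_INTERVAL_MS) {
            refresh(nowMs);
        }
        return new ArrayList<>(live);
    }

    private void refresh(long nowMs) {
        // Try the currently known masters first, then fall back to the seeds.
        List<String> candidates = new ArrayList<>(live);
        candidates.addAll(seeds);
        for (String master : candidates) {
            try {
                List<String> fetched = fetcher.fetchFrom(master);
                if (fetched != null && !fetched.isEmpty()) {
                    live = new ArrayList<>(fetched);
                    break;
                }
            } catch (Exception e) {
                // This master is unreachable; try the next candidate.
            }
        }
        lastRefreshMs = nowMs;
        refreshRequested = false;
    }
}
```

If no candidate answers, the sketch keeps the previous list instead of clearing it, which is one possible design choice rather than something the ticket specifies.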
[jira] [Updated] (HBASE-23305) Master based registry implementation
[ https://issues.apache.org/jira/browse/HBASE-23305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-23305: -- Fix Version/s: 1.7.0 > Master based registry implementation > > > Key: HBASE-23305 > URL: https://issues.apache.org/jira/browse/HBASE-23305 > Project: HBase > Issue Type: Sub-task > Components: master >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Major > Fix For: 3.0.0-alpha-1, 2.3.0, 1.7.0 > > > Once we have all the RPCs in place (via HBASE-23304), implement a pluggable > master based AsyncRegistry (like ZKAsyncRegistry) which clients can use to > directly connect to master and fetch all the meta information needed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-23604) Clarify AsyncRegistry usage in the code
[ https://issues.apache.org/jira/browse/HBASE-23604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-23604: -- Fix Version/s: 1.7.0 > Clarify AsyncRegistry usage in the code > --- > > Key: HBASE-23604 > URL: https://issues.apache.org/jira/browse/HBASE-23604 > Project: HBase > Issue Type: Task > Components: Client >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Minor > Fix For: 3.0.0-alpha-1, 2.3.0, 1.7.0, HBASE-18095 > > > As [~stack] noted in the code review > https://github.com/apache/hbase/pull/954, the usage of registry in the client > code is not super clear. The ask here is to rename it to something that makes > the context clearer. > Creating a separate jira because the patch touches a lot of files. I don't > want to mix it with the patch for HBASE-23305. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-23304) Implement RPCs needed for master based registry
[ https://issues.apache.org/jira/browse/HBASE-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-23304: -- Fix Version/s: 1.7.0 > Implement RPCs needed for master based registry > --- > > Key: HBASE-23304 > URL: https://issues.apache.org/jira/browse/HBASE-23304 > Project: HBase > Issue Type: Sub-task > Components: master >Affects Versions: 3.0.0-alpha-1 >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Major > Fix For: 3.0.0-alpha-1, 2.3.0, 1.7.0 > > > We need to implement RPCs on masters needed by client to fetch information > like clusterID, active master server name, meta locations etc. These RPCs are > used by clients during connection init. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-23275) Track active master server name in ActiveMasterManager
[ https://issues.apache.org/jira/browse/HBASE-23275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-23275: -- Fix Version/s: 1.7.0 > Track active master server name in ActiveMasterManager > -- > > Key: HBASE-23275 > URL: https://issues.apache.org/jira/browse/HBASE-23275 > Project: HBase > Issue Type: Sub-task > Components: master >Affects Versions: 3.0.0-alpha-1 >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Major > Fix For: 3.0.0-alpha-1, 2.3.0, 1.7.0 > > > We already track whether the cluster has an active master, it is just another > RPC to the zookeeper to fetch the active master's hostname. Tracking it helps > load balance client requests to fetch the active master information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-23257) Track ClusterID in stand by masters
[ https://issues.apache.org/jira/browse/HBASE-23257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-23257: -- Fix Version/s: 1.7.0 > Track ClusterID in stand by masters > --- > > Key: HBASE-23257 > URL: https://issues.apache.org/jira/browse/HBASE-23257 > Project: HBase > Issue Type: Sub-task > Components: master >Affects Versions: 3.0.0-alpha-1 >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Major > Fix For: 3.0.0-alpha-1, 2.3.0, 1.7.0 > > > Currently, only active master tracks the cluster ID. As a part of removing > client dependency on ZK (HBASE-18095), it was noted that having stand by > masters serve ClusterID will help load balance the client requests instead of > hot-spotting the active master. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24807) Backport HBASE-20417 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-24807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-24807: -- Fix Version/s: 1.7.0 > Backport HBASE-20417 to branch-1 > > > Key: HBASE-24807 > URL: https://issues.apache.org/jira/browse/HBASE-24807 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 1.4.14 >Reporter: Wellington Chevreuil >Assignee: Wellington Chevreuil >Priority: Major > Fix For: 1.7.0, 1.4.14 > > > The wal reader shouldn't keep running with peer disabled. We need to backport > HBASE-20417 to branch-1 or do something similar, if backport isn't possible > due to differences in the code base. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24807) Backport HBASE-20417 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-24807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-24807: -- Fix Version/s: (was: 1.4.14) > Backport HBASE-20417 to branch-1 > > > Key: HBASE-24807 > URL: https://issues.apache.org/jira/browse/HBASE-24807 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 1.4.14 >Reporter: Wellington Chevreuil >Assignee: Wellington Chevreuil >Priority: Major > Fix For: 1.7.0 > > > The wal reader shouldn't keep running with peer disabled. We need to backport > HBASE-20417 to branch-1 or do something similar, if backport isn't possible > due to differences in the code base. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24086) Disable output stream capability enforcement when running in standalone mode
[ https://issues.apache.org/jira/browse/HBASE-24086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315933#comment-17315933 ] Reid Chan commented on HBASE-24086: --- Oh, I see the invert in later commits, pardon~. > Disable output stream capability enforcement when running in standalone mode > > > Key: HBASE-24086 > URL: https://issues.apache.org/jira/browse/HBASE-24086 > Project: HBase > Issue Type: Task > Components: master, Operability >Affects Versions: 3.0.0-alpha-1, 2.3.0 >Reporter: Nick Dimiduk >Priority: Critical > > {noformat} > $ > JAVA_HOME=/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home > mvn clean install -DskipTests > $ > JAVA_HOME=/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home > ./bin/hbase master start > {noformat} > gives > {noformat} > 2020-03-30 17:12:43,857 ERROR > [master/192.168.111.13:16000:becomeActiveMaster] master.HMaster: Failed to > become active master > > java.io.IOException: cannot get log writer > > > at > org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createAsyncWriter(AsyncFSWALProvider.java:118) > > > at > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createAsyncWriter(AsyncFSWAL.java:704) > > > at > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:710) > > > at > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:128) > > > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:839) > > > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:549) > > > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.init(AbstractFSWAL.java:490) > > > at > org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:156) > > > at > org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:61) > > > at 
org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:297) > > > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.createWAL(RegionProcedureStore.java:256) > > > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.bootstrap(RegionProcedureStore.java:273) > > > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.recoverLease(RegionProcedureStore.java:482) > > > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:587) > > > at > org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1575) > > > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:961) > > > at > org.apache.hadoop.hbase.master.HMaster.start
[jira] [Commented] (HBASE-24086) Disable output stream capability enforcement when running in standalone mode
[ https://issues.apache.org/jira/browse/HBASE-24086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315931#comment-17315931 ] Reid Chan commented on HBASE-24086: --- Resolution is Won't Fix, but I see it applied to some branches, but there's no Fix version/s.. > Disable output stream capability enforcement when running in standalone mode > > > Key: HBASE-24086 > URL: https://issues.apache.org/jira/browse/HBASE-24086 > Project: HBase > Issue Type: Task > Components: master, Operability >Affects Versions: 3.0.0-alpha-1, 2.3.0 >Reporter: Nick Dimiduk >Priority: Critical > > {noformat} > $ > JAVA_HOME=/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home > mvn clean install -DskipTests > $ > JAVA_HOME=/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home > ./bin/hbase master start > {noformat} > gives > {noformat} > 2020-03-30 17:12:43,857 ERROR > [master/192.168.111.13:16000:becomeActiveMaster] master.HMaster: Failed to > become active master > > java.io.IOException: cannot get log writer > > > at > org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createAsyncWriter(AsyncFSWALProvider.java:118) > > > at > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createAsyncWriter(AsyncFSWAL.java:704) > > > at > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:710) > > > at > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:128) > > > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:839) > > > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:549) > > > at > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.init(AbstractFSWAL.java:490) > > > at > org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:156) > > > at > org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:61) > > > at 
org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:297) > > > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.createWAL(RegionProcedureStore.java:256) > > > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.bootstrap(RegionProcedureStore.java:273) > > > at > org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.recoverLease(RegionProcedureStore.java:482) > > > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:587) > > > at > org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1575) > > > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:961) > > > at
[jira] [Updated] (HBASE-24081) Provide documentation for running Yetus with HBase
[ https://issues.apache.org/jira/browse/HBASE-24081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-24081: -- Fix Version/s: 1.7.0 > Provide documentation for running Yetus with HBase > -- > > Key: HBASE-24081 > URL: https://issues.apache.org/jira/browse/HBASE-24081 > Project: HBase > Issue Type: Task > Components: documentation >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0 > > > A colleague asked how to use Yetus with HBase, so I wrote up a little how-to > doc. Maybe it's useful to someone else? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24016) Change nightly poll from cron @daily to pollSCM @daily; i.e. run nightly if a change ONLY
[ https://issues.apache.org/jira/browse/HBASE-24016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-24016: -- Fix Version/s: 1.7.0 > Change nightly poll from cron @daily to pollSCM @daily; i.e. run nightly if a > change ONLY > - > > Key: HBASE-24016 > URL: https://issues.apache.org/jira/browse/HBASE-24016 > Project: HBase > Issue Type: Bug >Reporter: Michael Stack >Assignee: Michael Stack >Priority: Major > Fix For: 3.0.0-alpha-1, 2.3.0, 1.3.7, 1.7.0, 2.1.10, 1.4.14, 2.2.5 > > Attachments: > 0001-HBASE-24016-Change-nightly-poll-from-cron-daily-to-p.patch > > > Change build on branch-1.3, 1.4, 2.1, and feature branches > HBASE-23162-branch-1 and HBASE-22114-branch-1 to be pollSCM @daily -- i.e. > poll once a day and if change run nightly -- rather than build every night > regardless. > See > https://lists.apache.org/thread.html/r5dca2cacc123f2e5719c622add6853ac62b56b2a77885fe0b2eb53c3%40%3Cdev.hbase.apache.org%3E > for dev list discussion on downing our nightly load. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25731) TestConnectionImplementation BadHostname tests fail in branch-1
[ https://issues.apache.org/jira/browse/HBASE-25731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315228#comment-17315228 ] Reid Chan commented on HBASE-25731: --- Thanks [~gjacoby] > TestConnectionImplementation BadHostname tests fail in branch-1 > --- > > Key: HBASE-25731 > URL: https://issues.apache.org/jira/browse/HBASE-25731 > Project: HBase > Issue Type: Test >Affects Versions: 1.7.0 >Reporter: Geoffrey Jacoby >Assignee: Geoffrey Jacoby >Priority: Major > > TestConnectionImplementation.testGetAdminBadHostname and > testGetClientBadHostname are consistently failing in branch-1. > This is because they're assuming that the validity of the host is checked > immediately upon getting the protobuf service object, when instead the > service code purposefully waits until the first service call to check. > I'll revise the tests to make service calls and verify that they return the > correct exceptions (ServiceException wrapping an UnknownHostException). -- This message was sent by Atlassian Jira (v8.3.4#803005)
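The behaviour described in the issue above (a bad host name is accepted when the service object is obtained and only rejected on the first service call) can be illustrated with a small stand-alone sketch. Everything here is hypothetical and is not HBase code: `LazyStub`, `Resolver`, `ALWAYS_UNKNOWN`, and `StubServiceException` (a stand-in for the `ServiceException` the issue mentions) are assumed names, and the resolver hook exists only so the example runs without DNS.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class LazyHostCheckSketch {
    /** Hypothetical resolver hook so the sketch can run without real DNS. */
    interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    /** Hypothetical stand-in for the ServiceException named in the issue. */
    static final class StubServiceException extends Exception {
        StubServiceException(Throwable cause) {
            super(cause);
        }
    }

    /** Hypothetical stub: stores the host name and resolves it on first use. */
    static final class LazyStub {
        private final String host;
        private final Resolver resolver;

        LazyStub(String host, Resolver resolver) {
            // No lookup happens here, so a bad host name is accepted at construction.
            this.host = host;
            this.resolver = resolver;
        }

        /** The first "service call": resolution, and therefore failure, happens here. */
        String call() throws StubServiceException {
            try {
                return resolver.resolve(host).getHostAddress();
            } catch (UnknownHostException e) {
                throw new StubServiceException(e);
            }
        }
    }

    /** Resolver that rejects every host, simulating a bad host name. */
    static final Resolver ALWAYS_UNKNOWN = new Resolver() {
        @Override
        public InetAddress resolve(String host) throws UnknownHostException {
            throw new UnknownHostException(host);
        }
    };

    public static void main(String[] args) {
        // Obtaining the stub does not throw, even though the host is bad.
        LazyStub stub = new LazyStub("bad.hostname.invalid", ALWAYS_UNKNOWN);
        try {
            stub.call();
            System.out.println("unexpected success");
        } catch (StubServiceException e) {
            System.out.println("cause: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

A test written against this shape would assert on the exception thrown by `call()` rather than expecting a failure when the stub is created, which is the direction the issue proposes.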
[jira] [Resolved] (HBASE-25731) TestConnectionImplementation BadHostname tests fail in branch-1
[ https://issues.apache.org/jira/browse/HBASE-25731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan resolved HBASE-25731. --- Fix Version/s: 1.7.0 Hadoop Flags: Reviewed Resolution: Fixed > TestConnectionImplementation BadHostname tests fail in branch-1 > --- > > Key: HBASE-25731 > URL: https://issues.apache.org/jira/browse/HBASE-25731 > Project: HBase > Issue Type: Test >Affects Versions: 1.7.0 >Reporter: Geoffrey Jacoby >Assignee: Geoffrey Jacoby >Priority: Major > Fix For: 1.7.0 > > > TestConnectionImplementation.testGetAdminBadHostname and > testGetClientBadHostname are consistently failing in branch-1. > This is because they're assuming that the validity of the host is checked > immediately upon getting the protobuf service object, when instead the > service code purposefully waits until the first service call to check. > I'll revise the tests to make service calls and verify that they return the > correct exceptions (ServiceException wrapping an UnknownHostException). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-17890) FuzzyRowFilter fail if unaligned support is false
[ https://issues.apache.org/jira/browse/HBASE-17890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-17890: -- Fix Version/s: (was: 1.7.0) 1.8.0 > FuzzyRowFilter fail if unaligned support is false > - > > Key: HBASE-17890 > URL: https://issues.apache.org/jira/browse/HBASE-17890 > Project: HBase > Issue Type: Sub-task > Components: util >Affects Versions: 1.2.5, 2.0.0 >Reporter: Jerry He >Assignee: Chia-Ping Tsai >Priority: Major > Fix For: 3.0.0-alpha-1, 1.8.0 > > Attachments: HBASE-17890.v0.branch-1.patch, HBASE-17890.v0.patch, > HBASE-17890.v1.branch-1.patch, HBASE-17890.v1.patch, HBASE-17890.v2.patch, > HBASE-17890.v3.patch, HBASE-17890.v3.patch, HBASE-17890.v3.patch, > HBASE-17890.v3.patch, HBASE-17890.v3.patch > > > When unaligned support is false, FuzzyRow tests fail: > {noformat} > Failed tests: > TestFuzzyRowAndColumnRangeFilter.Test:134->runTest:157->runScanner:186 > expected:<10> but was:<0> > TestFuzzyRowFilter.testSatisfiesForward:81 expected: but was: > TestFuzzyRowFilter.testSatisfiesReverse:121 expected: but > was: > TestFuzzyRowFilterEndToEnd.testEndToEnd:247->runTest1:278->runScanner:343 > expected:<6250> but was:<0> > TestFuzzyRowFilterEndToEnd.testFilterList:385->runTest:417->runScanner:445 > expected:<5> but was:<0> > TestFuzzyRowFilterEndToEnd.testHBASE14782:204 expected:<6> but was:<0> > {noformat} > This can be reproduced in the case described in HBASE-17869. Or on a platform > really without unaligned support. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-21140) Backport 'HBASE-21136 NPE in MetricsTableSourceImpl.updateFlushTime' to branch-1 . (and backport HBASE-15728 for branch-1)
[ https://issues.apache.org/jira/browse/HBASE-21140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-21140: -- Fix Version/s: (was: 1.7.0) 1.8.0 > Backport 'HBASE-21136 NPE in MetricsTableSourceImpl.updateFlushTime' to > branch-1 . (and backport HBASE-15728 for branch-1) > --- > > Key: HBASE-21140 > URL: https://issues.apache.org/jira/browse/HBASE-21140 > Project: HBase > Issue Type: Bug > Components: metrics >Reporter: Duo Zhang >Assignee: Xu Cang >Priority: Major > Fix For: 1.8.0 > > Attachments: > HBASE-21140.diff_against_cf198a65e8d704d28538c4c165a941b9e5bac678.branch-1.001.patch > > > There is no computeIfAbsent method available on branch-1, as we still need to support > JDK7, so the fix will differ from the one on branch-2+. -- This message was sent by Atlassian Jira (v8.3.4#803005)
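For readers unfamiliar with the JDK7 constraint mentioned above: `computeIfAbsent` was added to the map interfaces in Java 8, so JDK7-compatible code usually falls back to `ConcurrentMap.putIfAbsent`. The following is a minimal sketch of that general idiom only, not the content of the attached patch; the class name, the `counters` field, and `getOrCreate` are assumptions made for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

final class Jdk7ComputeIfAbsentSketch {
    // Hypothetical per-key counters; explicit type arguments keep this JDK7-friendly.
    private final ConcurrentMap<String, AtomicLong> counters =
        new ConcurrentHashMap<String, AtomicLong>();

    /** JDK7-compatible replacement for counters.computeIfAbsent(key, k -> new AtomicLong()). */
    AtomicLong getOrCreate(String key) {
        AtomicLong existing = counters.get(key);
        if (existing == null) {
            AtomicLong created = new AtomicLong();
            // putIfAbsent returns the previous value, or null if ours was installed.
            existing = counters.putIfAbsent(key, created);
            if (existing == null) {
                existing = created;
            }
        }
        return existing;
    }

    public static void main(String[] args) {
        Jdk7ComputeIfAbsentSketch sketch = new Jdk7ComputeIfAbsentSketch();
        sketch.getOrCreate("flushTime").incrementAndGet();
        sketch.getOrCreate("flushTime").incrementAndGet();
        System.out.println(sketch.getOrCreate("flushTime").get());
    }
}
```

Unlike `computeIfAbsent`, this idiom may construct a value that is then discarded when another thread wins the race, so it suits cheap, side-effect-free values.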
[jira] [Updated] (HBASE-21674) Port HBASE-21652 (Refactor ThriftServer making thrift2 server inherited from thrift1 server) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-21674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-21674: -- Fix Version/s: (was: 1.7.0) 1.8.0 > Port HBASE-21652 (Refactor ThriftServer making thrift2 server inherited from > thrift1 server) to branch-1 > > > Key: HBASE-21674 > URL: https://issues.apache.org/jira/browse/HBASE-21674 > Project: HBase > Issue Type: Sub-task >Reporter: Andrew Kyle Purtell >Assignee: Yutong Xiao >Priority: Major > Fix For: 1.8.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22114) Port HBASE-15560 (TinyLFU-based BlockCache) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-22114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-22114: -- Fix Version/s: (was: 1.7.0) 1.8.0 > Port HBASE-15560 (TinyLFU-based BlockCache) to branch-1 > --- > > Key: HBASE-22114 > URL: https://issues.apache.org/jira/browse/HBASE-22114 > Project: HBase > Issue Type: Sub-task >Reporter: Andrew Kyle Purtell >Assignee: Andrew Kyle Purtell >Priority: Major > Fix For: 1.8.0 > > Attachments: HBASE-22114-branch-1.patch, HBASE-22114-branch-1.patch, > HBASE-22114-branch-1.patch > > > HBASE-15560 introduces the TinyLFU cache policy for the blockcache. > W-TinyLFU ([research paper|http://arxiv.org/pdf/1512.00727.pdf]) records the > frequency in a counting sketch, ages periodically by halving the counters, > and orders entries by SLRU. An entry is discarded by comparing the frequency > of the new arrival (candidate) to the SLRU's victim, and keeping the one with > the highest frequency. This allows the operations to be performed in O(1) > time and, through the use of a compact sketch, a much larger history is > retained beyond the current working set. In a variety of real world traces > the policy had [near optimal hit > rates|https://github.com/ben-manes/caffeine/wiki/Efficiency]. > The implementation of HBASE-15560 uses several Java 8 idioms, depends on the JRE > 8+ type Optional, and has dependencies on libraries compiled with Java 8+ > bytecode. It could be backported to branch-1 but must be made optional both > at compile time and runtime, enabled by the 'build-with-jdk8' build profile. > The TinyLFU policy must go into its own build module. > The blockcache must be modified to load the L1 implementation/policy dynamically > at startup by reflection if the policy is "TinyLFU" -- This message was sent by Atlassian Jira (v8.3.4#803005)
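The admission rule described above (count frequencies in a compact sketch, age by halving, keep whichever of candidate and victim is more frequent) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the HBASE-15560 or Caffeine implementation: the class name, the four-row counter table, the hash mixing, and the `sampleSize` aging trigger are all hypothetical choices, and the SLRU ordering is left out entirely.

```java
final class TinyLfuAdmissionSketch {
    private static final int ROWS = 4;   // assumed number of hash rows
    private final int[][] table;         // one counter array per row
    private final int mask;
    private final int sampleSize;        // assumed aging trigger: halve after this many recordings
    private int additions;

    TinyLfuAdmissionSketch(int width, int sampleSize) {
        if (Integer.bitCount(width) != 1) {
            throw new IllegalArgumentException("width must be a power of two");
        }
        this.table = new int[ROWS][width];
        this.mask = width - 1;
        this.sampleSize = sampleSize;
    }

    /** Per-row index: multiply by a row-specific odd constant, then fold high bits down. */
    private int index(Object key, int row) {
        int h = key.hashCode() * (0x9E3779B1 + 2 * row);
        h ^= h >>> 15;
        return h & mask;
    }

    /** Records one access and ages the counters once the sample is full. */
    void recordAccess(Object key) {
        for (int r = 0; r < ROWS; r++) {
            table[r][index(key, r)]++;
        }
        if (++additions >= sampleSize) {
            age();
        }
    }

    /** Count-min style estimate: the smallest counter across rows (never an underestimate between agings). */
    int estimate(Object key) {
        int min = Integer.MAX_VALUE;
        for (int r = 0; r < ROWS; r++) {
            min = Math.min(min, table[r][index(key, r)]);
        }
        return min;
    }

    /** Periodic aging: halve every counter so stale history fades. */
    private void age() {
        for (int[] row : table) {
            for (int i = 0; i < row.length; i++) {
                row[i] >>>= 1;
            }
        }
        additions = 0;
    }

    /** Admit the candidate only if it is estimated to be more frequent than the victim. */
    boolean admit(Object candidate, Object victim) {
        return estimate(candidate) > estimate(victim);
    }
}
```

Both recording and estimating touch a fixed number of counters, which is where the O(1) cost mentioned in the description comes from; the price is that hash collisions can inflate an estimate.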
[jira] [Updated] (HBASE-24893) TestLogLevel failing on hadoop-ci (branch-1)
[ https://issues.apache.org/jira/browse/HBASE-24893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-24893: -- Fix Version/s: (was: 1.8.0) 1.7.0 > TestLogLevel failing on hadoop-ci (branch-1) > > > Key: HBASE-24893 > URL: https://issues.apache.org/jira/browse/HBASE-24893 > Project: HBase > Issue Type: Bug > Components: test >Reporter: Andrew Kyle Purtell >Assignee: Abhey Rana >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > TestLogLevel is failing the branch-1 builds on hadoop-ci. > The test needs some improvement. The code seems to be doing the right thing > but the error condition the test is expecting varies by JVM or JVM version: > {noformat} > Expected to find 'Unrecognized SSL message' but got unexpected exception: > javax.net.ssl.SSLException: Unsupported or unrecognized SSL message > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
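One way a test can tolerate the JVM-dependent wording quoted above is to compare message fragments case-insensitively, since "Unsupported or unrecognized SSL message" contains "Unrecognized SSL message" only when case is ignored. The helper below is a hedged sketch of that idea and is not the change that was committed for this issue; the class and method names are hypothetical.

```java
import java.util.Locale;

final class SslMessageMatchSketch {
    /**
     * Returns true if the throwable, or any of its causes, has a message
     * containing the expected fragment, ignoring case.
     */
    static boolean mentions(Throwable t, String expectedFragment) {
        String needle = expectedFragment.toLowerCase(Locale.ROOT);
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            String msg = cur.getMessage();
            if (msg != null && msg.toLowerCase(Locale.ROOT).contains(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Exception newer = new javax.net.ssl.SSLException("Unsupported or unrecognized SSL message");
        System.out.println(mentions(newer, "Unrecognized SSL message"));
    }
}
```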
[jira] [Resolved] (HBASE-24893) TestLogLevel failing on hadoop-ci (branch-1)
[ https://issues.apache.org/jira/browse/HBASE-24893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan resolved HBASE-24893. --- Resolution: Fixed > TestLogLevel failing on hadoop-ci (branch-1) > > > Key: HBASE-24893 > URL: https://issues.apache.org/jira/browse/HBASE-24893 > Project: HBase > Issue Type: Bug > Components: test >Reporter: Andrew Kyle Purtell >Assignee: Abhey Rana >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > TestLogLevel is failing the branch-1 builds on hadoop-ci. > The test needs some improvement. The code seems to be doing the right thing > but the error condition the test is expecting varies by JVM or JVM version: > {noformat} > Expected to find 'Unrecognized SSL message' but got unexpected exception: > javax.net.ssl.SSLException: Unsupported or unrecognized SSL message > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22126) TestBlocksRead is flaky
[ https://issues.apache.org/jira/browse/HBASE-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-22126: -- Fix Version/s: (was: 1.7.0) 1.8.0 > TestBlocksRead is flaky > --- > > Key: HBASE-22126 > URL: https://issues.apache.org/jira/browse/HBASE-22126 > Project: HBase > Issue Type: Test > Components: test >Affects Versions: 1.5.0 >Reporter: Andrew Kyle Purtell >Assignee: Sandeep Pal >Priority: Major > Labels: branch-1 > Fix For: 1.8.0 > > > TestBlocksRead does not fail when invoked by itself but is flaky when run as > part of the suite. > Some kind of race during setup. > [ERROR] > testBlocksStoredWhenCachingDisabled(org.apache.hadoop.hbase.regionserver.TestBlocksRead) > Time elapsed: 0.19 s <<< ERROR! > java.net.ConnectException: Call From $HOST/$IP to localhost:59658 failed on > connection exception: java.net.ConnectException: Connection refused; For more > details see: http://wiki.apache.org/hadoop/ConnectionRefused > at > org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112) > at > org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389) > Caused by: java.net.ConnectException: Connection refused > at > org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112) > at > org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389) > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24893) TestLogLevel failing on hadoop-ci (branch-1)
[ https://issues.apache.org/jira/browse/HBASE-24893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-24893: -- Fix Version/s: (was: 1.7.0) 1.8.0 > TestLogLevel failing on hadoop-ci (branch-1) > > > Key: HBASE-24893 > URL: https://issues.apache.org/jira/browse/HBASE-24893 > Project: HBase > Issue Type: Bug > Components: test >Reporter: Andrew Kyle Purtell >Assignee: Abhey Rana >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > > TestLogLevel is failing the branch-1 builds on hadoop-ci. > The test needs some improvement. The code seems to be doing the right thing > but the error condition the test is expecting varies by JVM or JVM version: > {noformat} > Expected to find 'Unrecognized SSL message' but got unexpected exception: > javax.net.ssl.SSLException: Unsupported or unrecognized SSL message > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24525) [branch-1] Support ZooKeeper 3.6.0+
[ https://issues.apache.org/jira/browse/HBASE-24525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-24525: -- Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > [branch-1] Support ZooKeeper 3.6.0+ > --- > > Key: HBASE-24525 > URL: https://issues.apache.org/jira/browse/HBASE-24525 > Project: HBase > Issue Type: Improvement > Components: Zookeeper >Reporter: Andrew Kyle Purtell >Assignee: Andrew Kyle Purtell >Priority: Minor > Fix For: 1.7.0 > > > Fix compilation issues against ZooKeeper 3.6.0. Backwards compatible changes > with 3.4 and 3.5. Tested with: > {{ mvn clean install -Dtest=org.apache.hadoop.hbase.zookeeper.**}} > {{ mvn clean install -Dzookeeper.version=3.5.8 > -Dtest=org.apache.hadoop.hbase.zookeeper.**}} > {{ mvn clean install -Dzookeeper.version=3.6.0 > -Dtest=org.apache.hadoop.hbase.zookeeper.**}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-25612) HMaster should abort if ReplicationLogCleaner is not able to delete oldWALs.
[ https://issues.apache.org/jira/browse/HBASE-25612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-25612: -- Fix Version/s: (was: 1.7.0) 1.8.0 > HMaster should abort if ReplicationLogCleaner is not able to delete oldWALs. > > > Key: HBASE-25612 > URL: https://issues.apache.org/jira/browse/HBASE-25612 > Project: HBase > Issue Type: Improvement >Affects Versions: 1.6.0 >Reporter: Rushabh Shah >Assignee: Rushabh Shah >Priority: Major > Fix For: 3.0.0-alpha-1, 2.5.0, 1.8.0, 2.4.3 > > > In our production cluster, we encountered an issue where the number of files > within the /hbase/oldWALs directory was growing exponentially from about 4000 > baseline to 15 and growing at the rate of 333 files per minute. > On further investigation we found that the ReplicationLogCleaner thread was > getting aborted since it was not able to talk to zookeeper. Stack trace below: > {noformat} > 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] zookeeper.ZKUtil - > replicationLogCleaner-0x302e05e0d8f, > quorum=zookeeper-0:2181,zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181,zookeeper-4:2181, > baseZNode=/hbase Unable to get data of znode /hbase/replication/rs > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase/replication/rs > at org.apache.zookeeper.KeeperException.create(KeeperException.java:130) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) > at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229) > at > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:374) > at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:713) > at > org.apache.hadoop.hbase.replication.ReplicationQueuesClientZKImpl.getQueuesZNodeCversion(ReplicationQueuesClientZKImpl.java:87) > at > org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.loadWALsFromQueues(ReplicationLogCleaner.java:99) > at > 
org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.getDeletableFiles(ReplicationLogCleaner.java:70) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteFiles(CleanerChore.java:262) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.access$200(CleanerChore.java:52) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$3.act(CleanerChore.java:413) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$3.act(CleanerChore.java:410) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.deleteAction(CleanerChore.java:481) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.traverseAndDelete(CleanerChore.java:410) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.access$100(CleanerChore.java:52) > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore$1.run(CleanerChore.java:220) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - ReplicationLogCleaner received abort, > ignoring. Reason: Failed to get stat of replication rs node > 2021-02-25 23:05:01,149 DEBUG [an-pool3-thread-1729] > master.ReplicationLogCleaner - > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase/replication/rs > 2021-02-25 23:05:01,150 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - Failed to read zookeeper, skipping checking > deletable files > {noformat} > > {quote} 2021-02-25 23:05:01,149 WARN [an-pool3-thread-1729] > master.ReplicationLogCleaner - ReplicationLogCleaner received abort, > ignoring. Reason: Failed to get stat of replication rs node > {quote} > > This line is scarier: HMaster invoked Abortable but just ignored it, and > HMaster went on with business as usual. 
> We have a max-files-per-directory configuration in the namenode, which is set to 1M > in our clusters. If this directory had reached that limit, it would have > brought down the whole cluster. > We shouldn't ignore Abortable; we should crash the HMaster if Abortable is > invoked. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
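The difference between ignoring an abort and propagating it, as requested in the issue above, can be shown with a small stand-alone sketch. All types here are hypothetical and deliberately do not use HBase's own classes: `AbortHook` is an assumed stand-in for an Abortable-style callback, `IgnoringHook` models the "received abort, ignoring" behaviour from the log, and `DelegatingHook` forwards the abort to an owning process.

```java
final class AbortPropagationSketch {
    /** Hypothetical stand-in for an Abortable-style callback. */
    interface AbortHook {
        void abort(String why, Throwable cause);
        boolean isAborted();
    }

    /** Models the logged behaviour: the abort is noted and then dropped. */
    static final class IgnoringHook implements AbortHook {
        @Override
        public void abort(String why, Throwable cause) {
            System.out.println("received abort, ignoring. Reason: " + why);
        }

        @Override
        public boolean isAborted() {
            return false;
        }
    }

    /** Models the owning process (for example a master) that records the abort. */
    static final class ProcessHook implements AbortHook {
        private volatile boolean aborted;
        private volatile String reason;

        @Override
        public void abort(String why, Throwable cause) {
            this.reason = why;
            this.aborted = true; // a real process would begin shutting down here
        }

        @Override
        public boolean isAborted() {
            return aborted;
        }

        String reason() {
            return reason;
        }
    }

    /** Models the proposed behaviour: the cleaner forwards the abort to its owner. */
    static final class DelegatingHook implements AbortHook {
        private final AbortHook owner;

        DelegatingHook(AbortHook owner) {
            this.owner = owner;
        }

        @Override
        public void abort(String why, Throwable cause) {
            owner.abort(why, cause);
        }

        @Override
        public boolean isAborted() {
            return owner.isAborted();
        }
    }

    public static void main(String[] args) {
        ProcessHook master = new ProcessHook();
        AbortHook cleaner = new DelegatingHook(master);
        cleaner.abort("Failed to get stat of replication rs node",
            new IllegalStateException("session expired"));
        System.out.println("master aborted: " + master.isAborted());
    }
}
```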