[jira] [Created] (HBASE-22852) hbase nightlies leaking gpg-agents
Allen Wittenauer created HBASE-22852:

Summary: hbase nightlies leaking gpg-agents
Key: HBASE-22852
URL: https://issues.apache.org/jira/browse/HBASE-22852
Project: HBase
Issue Type: Bug
Reporter: Allen Wittenauer

FYI, I just triggered yetus master, which includes code to find and kill long-running processes still attached to the Jenkins workspace directory. It came up with this:

https://builds.apache.org/view/S-Z/view/Yetus/job/yetus-github-multibranch/job/master/134/console

{code}
USER     PID %CPU %MEM   VSZ RSS TTY STAT START TIME COMMAND
jenkins  752  0.0  0.0 93612 584 ?   Ss   Aug12 0:00 gpg-agent --homedir /home/jenkins/jenkins-slave/workspace/HBase_Nightly_HBASE-20952/downloads-hadoop-2/.gpg --use-standard-socket --daemon
Killing 752 ***
{code}

(repeated tens of times, with slightly different dates, pids, versions, etc.)

Also, be aware that any other process running on the node (such as the other executor) has extremely easy access to whatever gpg creds you are using...

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
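For anyone who wants to check a node by hand, here is a minimal sketch of the "find processes still attached to the workspace" idea. It is not the Yetus implementation. The function name and the approach of scanning `ps aux`-style text are assumptions made for illustration, and the sample row is the one from the listing in this report.

```python
from typing import List


def find_workspace_pids(ps_output: str, workspace: str) -> List[int]:
    """Return PIDs of processes whose command line mentions `workspace`.

    Hypothetical helper: expects `ps aux`-style text with a header row.
    """
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # `ps aux` prints 10 fixed columns; COMMAND is everything after them.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        pid, command = fields[1], fields[10]
        if pid.isdigit() and workspace in command:
            pids.append(int(pid))
    return pids


# Sample row copied from the console output quoted in this report.
SAMPLE = (
    "USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND\n"
    "jenkins 752 0.0 0.0 93612 584 ? Ss Aug12 0:00 gpg-agent --homedir "
    "/home/jenkins/jenkins-slave/workspace/HBase_Nightly_HBASE-20952/"
    "downloads-hadoop-2/.gpg --use-standard-socket --daemon\n"
)

LEAKED = find_workspace_pids(SAMPLE, "/home/jenkins/jenkins-slave/workspace")
```

The sketch only reports PIDs. Whether to kill them, and after how long, is a policy decision left to the job.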
[jira] [Comment Edited] (HBASE-22167) Unify the new github based pre commit job and our nightly job
[ https://issues.apache.org/jira/browse/HBASE-22167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815496#comment-16815496 ]

Allen Wittenauer edited comment on HBASE-22167 at 4/11/19 3:14 PM:

Definitely not supported yet. Jenkins auth tokens aren't supported by Pipeline jobs. This means that the jenkins-admin code needs to actually have a real user to auth against Jenkins in order to submit jobs.

The alternative is to have jenkins-admin write something that can be read by groovy code sitting in a Jenkins pipeline that does a job submission without needing to auth. If that path is taken, then it also needs to have a {project list} -> {job list} mapping, since there is no real 1:1 mapping anymore. (e.g., HADOOP, HDFS, YARN, ... -> hadoop-multibranch-pipeline)

Yet Another Alternative is to try and replace jenkins-admin with the jenkins-jira plugin. It's loaded on our Jenkins server, but my attempts to use it in any meaningful way fell apart since it didn't seem to understand attachments very well. But the theory was that a pipeline job could be written that would take that plugin's input and just resubmit to the appropriate multibranch job.

All-in-all, it's a lot of work. I had some sample code written to implement the jenkins-admin-as-a-pipeline-job but I can't seem to find it. Plus I'm not doing much with the ASF anymore so it sort of fell off my priority list.

> Unify the new github based pre commit job and our nightly job
>
> Key: HBASE-22167
> URL: https://issues.apache.org/jira/browse/HBASE-22167
> Project: HBase
> Issue Type: Improvement
> Reporter: Duo Zhang
> Priority: Minor
>
> Now we use two jenkins files and set up two jobs on jenkins. They both use yetus and it seems yetus 0.9.0 can have a PR tab and a branch tab in the same job. So we can unify them together.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
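To make the {project list} -> {job list} idea from the comment concrete, here is a minimal sketch. Only the HADOOP, HDFS, YARN -> hadoop-multibranch-pipeline entries come from the comment. The table name, the function, and the lookup-by-issue-key approach are hypothetical, and nothing here reflects actual jenkins-admin code.

```python
from typing import Dict, Optional

# Hypothetical lookup table: several JIRA projects share one multibranch job,
# so the old 1:1 project-to-job assumption no longer holds.
PROJECT_TO_JOB: Dict[str, str] = {
    "HADOOP": "hadoop-multibranch-pipeline",
    "HDFS": "hadoop-multibranch-pipeline",
    "YARN": "hadoop-multibranch-pipeline",
}


def job_for_issue(issue_key: str) -> Optional[str]:
    """Map an issue key such as 'HDFS-1234' to the job that should build it."""
    project = issue_key.split("-", 1)[0].upper()
    return PROJECT_TO_JOB.get(project)
```

Returning `None` for an unmapped project lets the caller decide whether to skip the issue or report it.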
[jira] [Commented] (HBASE-22167) Unify the new github based pre commit job and our nightly job
[ https://issues.apache.org/jira/browse/HBASE-22167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815496#comment-16815496 ]

Allen Wittenauer commented on HBASE-22167:

Definitely not supported yet. Jenkins auth tokens aren't supported by Pipeline jobs. This means that the jenkins-admin code needs to actually have a real user to auth against Jenkins in order to submit jobs.

The alternative is to have jenkins-admin write something that can be read by groovy code sitting in a Jenkins pipeline that does a job submission without needing to auth. If that path is taken, then it also needs to have a {project list} -> {job list} mapping, since there is no real 1:1 mapping anymore. (e.g., HADOOP, HDFS, YARN, ... -> hadoop-multibranch-pipeline)

Yet Another Alternative is to try and replace jenkins-admin with the jenkins-jira plugin. It's loaded on our Jenkins server, but my attempts to use it in any meaningful way fell apart since it didn't seem to understand attachments very well. But the theory was that a pipeline job could be written that would take that plugin's input and just resubmit to the appropriate multibranch job.

All-in-all, it's a lot of work. I had some sample code written to implement the latter but I can't seem to find it. Plus I'm not doing much with the ASF anymore so it sort of fell off my priority list.
[jira] [Commented] (HBASE-21955) Auto insert release changes and releasenotes in release scripts
[ https://issues.apache.org/jira/browse/HBASE-21955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797836#comment-16797836 ]

Allen Wittenauer commented on HBASE-21955:

BTW, yetus 0.9.0 has a maven plugin that can run releasedocmaker. No need to download yetus, etc., if you take that approach.

> Auto insert release changes and releasenotes in release scripts
>
> Key: HBASE-21955
> URL: https://issues.apache.org/jira/browse/HBASE-21955
> Project: HBase
> Issue Type: Sub-task
> Reporter: stack
> Priority: Major
> Attachments: rns.sh
>
> Should be able to script updating changes and releasenotes as part of create-releases/release-build.sh.
[jira] [Commented] (HBASE-21432) [hbase-connectors] Add Apache Yetus integration for hbase-connectors repository
[ https://issues.apache.org/jira/browse/HBASE-21432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16697372#comment-16697372 ]

Allen Wittenauer commented on HBASE-21432:

I've set up a job on the ASF Jenkins pointing to my tree and using Github Branch Source plugin:

https://builds.apache.org/view/S-Z/view/Yetus/job/yetus-buretoolbox-demo/

(Normally all branches would be listed, but I've got it configured to only work with PRs to cut down the amount of extra output in the tabs and because I don't want to trigger a storm of jobs.)

By far, the biggest gotcha when working with this plugin is that jobs absolutely must delete their workspace on exit. Otherwise slaves fill up fast. (And I suspect this is one of the key problems with the ASF Jenkins infra and space... it isn't particularly obvious that each branch, PR, etc., gets its *own* workspace dir.)

> [hbase-connectors] Add Apache Yetus integration for hbase-connectors repository
>
> Key: HBASE-21432
> URL: https://issues.apache.org/jira/browse/HBASE-21432
> Project: HBase
> Issue Type: Task
> Components: build, hbase-connectors
> Affects Versions: connector-1.0.0
> Reporter: Peter Somogyi
> Priority: Major
>
> Add automated testing for pull requests and patch files created for hbase-connectors repository.
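The "must delete the workspace on exit" rule is essentially a try/finally around the job. A minimal sketch of that pattern follows, in Python only to show the shape. In a real Jenkins Pipeline the cleanup would live in the job definition itself, and the `ephemeral_workspace` name is made up for this example.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def ephemeral_workspace(prefix: str = "workspace-") -> Iterator[Path]:
    """Create a per-run directory and always remove it, even if the run fails."""
    path = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        yield path
    finally:
        # Cleanup runs on success and on failure, so directories do not
        # accumulate one per branch/PR on the build node.
        shutil.rmtree(path, ignore_errors=True)
```

Usage would look like `with ephemeral_workspace() as ws: ...`, with everything the run writes going under `ws`.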
[jira] [Comment Edited] (HBASE-21432) [hbase-connectors] Add Apache Yetus integration for hbase-connectors repository
[ https://issues.apache.org/jira/browse/HBASE-21432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16696270#comment-16696270 ]

Allen Wittenauer edited comment on HBASE-21432 at 11/23/18 5:13 PM:

The Github Pull Request Builder has been mostly replaced by the Github Branch Source Plugin (which is what I meant above). The big gotcha is that gprb is for freestyle jobs, while gbsp is for pipeline jobs. Using Jenkins with a Multibranch Pipeline + Github Branch Source Plugin enables a tab-based system that allows one to re-run branches, PRs, and, if enabled, tags on demand from the Jenkins UI.

The Yetus integration that is done as part of YETUS-681 (and committed into my dev branch) just makes test-patch smart enough to know it is running under Jenkins, where the PR is, etc., so that there isn't a need to manually configure or pass parameters for details that Jenkins itself shares via environment variables. This means that one stanza in the Pipeline can do both full builds and incremental/PR builds with no work required in the Pipeline to figure out what type of run it is.
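A minimal sketch of the "figure out the run type from the environment" idea in the comment above. This is not the test-patch logic from the YETUS ticket. It assumes the multibranch job exports `CHANGE_ID` for pull-request builds and `BRANCH_NAME` for all builds, and the function name is hypothetical.

```python
from typing import Mapping


def detect_run_type(env: Mapping[str, str]) -> str:
    """Classify a build as 'pr', 'branch', or 'unknown'.

    Assumes Jenkins-style variables: CHANGE_ID is set only for
    pull-request builds, BRANCH_NAME for any multibranch build.
    """
    if env.get("CHANGE_ID"):
        return "pr"  # incremental / PR build
    if env.get("BRANCH_NAME"):
        return "branch"  # full build of a branch
    return "unknown"  # probably not running under a multibranch job
```

Passing the environment in as a mapping (for example `detect_run_type(os.environ)`) keeps the function easy to exercise outside Jenkins.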
[jira] [Commented] (HBASE-21432) [hbase-connectors] Add Apache Yetus integration for hbase-connectors repository
[ https://issues.apache.org/jira/browse/HBASE-21432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16696270#comment-16696270 ]

Allen Wittenauer commented on HBASE-21432:

The Github Pull Request Builder has been mostly replaced by the Github Branch Source Plugin (which is what I meant above). The big gotcha is that gprb is for freestyle jobs, while gbsp is for pipeline jobs. Using Jenkins with a Multibranch Pipeline + Github Branch Source Plugin enables a tab-based system that allows one to re-run branches, PRs, and, if enabled, tags on demand from the Jenkins UI.

The Yetus integration that is done as part of YETUS-708 (and committed into my dev branch) just makes test-patch smart enough to know it is running under Jenkins, where the PR is, etc., so that there isn't a need to manually configure or pass parameters for details that Jenkins itself shares via environment variables. This means that one stanza in the Pipeline can do both full builds and incremental/PR builds with no work required in the Pipeline to figure out what type of run it is.
[jira] [Commented] (HBASE-21432) [hbase-connectors] Add Apache Yetus integration for hbase-connectors repository
[ https://issues.apache.org/jira/browse/HBASE-21432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16692743#comment-16692743 ]

Allen Wittenauer commented on HBASE-21432:

FWIW, I've got native support for Jenkins' Github Source Plugin ready to roll, just waiting for a series of patch reviews (since it is part of a chain). With it, test-patch can use webhooks from github, so no poll script is required. If you want to play with it, it's all sitting in either my github or gitlab repos.
[jira] [Resolved] (HBASE-14163) hbase master stop loops both processes forever
[ https://issues.apache.org/jira/browse/HBASE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved HBASE-14163.
Resolution: Won't Fix

> hbase master stop loops both processes forever
>
> Key: HBASE-14163
> URL: https://issues.apache.org/jira/browse/HBASE-14163
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 2.0.0
> Reporter: Allen Wittenauer
> Assignee: Andrew Purtell
> Priority: Major
>
> It would appear that there is an infinite loop in the zk client connection code when performing a master stop when no external zk servers are configured.
[jira] [Commented] (HBASE-20971) Please add OWASP Dependency Check to the core build (pom.xml) and all sub-component builds.
[ https://issues.apache.org/jira/browse/HBASE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569357#comment-16569357 ]

Allen Wittenauer commented on HBASE-20971:

FWIW, the Apache Yetus team is adding support for the OWASP dependency checker to precommit and qbt. See YETUS-441 for details.

> Please add OWASP Dependency Check to the core build (pom.xml) and all sub-component builds.
>
> Key: HBASE-20971
> URL: https://issues.apache.org/jira/browse/HBASE-20971
> Project: HBase
> Issue Type: New Feature
> Components: build
> Affects Versions: 3.0.0, 2.2.0, 2.1.1
> Environment: All development, build, test, environments.
> Reporter: Albert Baker
> Priority: Major
> Labels: build, easy-fix, security
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Please add OWASP Dependency Check to the build (pom.xml). OWASP DC makes an outbound REST call to MITRE Common Vulnerabilities & Exposures (CVE) to perform a lookup for each dependent .jar to list any/all known vulnerabilities for each jar. This step is needed because a manual MITRE CVE lookup/check on the main component does not include checking for vulnerabilities in components or in dependent libraries.
>
> OWASP Dependency Check: https://www.owasp.org/index.php/OWASP_Dependency_Check has plug-ins for most Java build/make types (ant, maven, ivy, gradle).
>
> Also, add the appropriate command to the nightly build to generate a report of all known vulnerabilities in any/all third party libraries/dependencies that get pulled in. Example: mvn -Powasp -Dtest=false -DfailIfNoTests=false clean aggregate
>
> Generating this report nightly/weekly will help inform the project's development team if any dependent libraries have reported known vulnerabilities. Project teams that keep up with removing vulnerabilities on a weekly basis will help protect businesses that rely on these open source components.
[jira] [Commented] (HBASE-19902) Current Jenkins Madness: OOME, can't start minihbasecluster, etc.
[ https://issues.apache.org/jira/browse/HBASE-19902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349025#comment-16349025 ]

Allen Wittenauer commented on HBASE-19902:

https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/11330/console is INFRA-15920.

It looks like test-patch averages a bit over 5k when doing certain modules, with the 6020 being a bit of an outlier. But as you said, 6020 might even be low under the particular conditions. I can't think of any other parameters that might be useful here.

Filed YETUS-612 to keep track of memory in the docker container. It'd be nice to know how much mem and IO is being used.

> Current Jenkins Madness: OOME, can't start minihbasecluster, etc.
>
> Key: HBASE-19902
> URL: https://issues.apache.org/jira/browse/HBASE-19902
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: stack
> Priority: Major
> Attachments: HBASE-19902.temporary-2.001.patch
>
> Trying to figure what is going on w/ jenkins build
> Changed the hadoopqa config to output long process listing rather than just 'java'...
> I can't get loadavg... tried dumping /proc...
> /tmp/jenkins6485196190911961762.sh: line 48: /loadavg: Permission denied
> Looking at https://builds.apache.org/job/PreCommit-HBASE-Build/11273/console, see 7 java processes running on H2. Extra args on ps may help show whether they are zombies of ours.
> Test run was fine then fell into hbase-server second part and soon after started failing..
> https://builds.apache.org/job/PreCommit-HBASE-Build/11273/artifact/patchprocess/patch-unit-hbase-server.txt
> Looking at first test failure... this is where main thread is, trying to get thread info:
> {code}
> Thread 23 (Time-limited test):
>   State: RUNNABLE
>   Blocked count: 118
>   Waited count: 58
>   Stack:
>     sun.management.ThreadImpl.getThreadInfo1(Native Method)
>     sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:178)
>     sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:139)
>     org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:168)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     java.lang.reflect.Method.invoke(Method.java:498)
>     org.apache.hadoop.hbase.util.Threads$PrintThreadInfoLazyHolder$1.printThreadInfo(Threads.java:294)
>     org.apache.hadoop.hbase.util.Threads.printThreadInfo(Threads.java:341)
>     org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:191)
>     org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:391)
>     org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:262)
>     org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:119)
>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1025)
>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:971)
>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:842)
>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:824)
>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806)
>     org.apache.hadoop.hbase.AcidGuaranteesTestBase.setUpBeforeClass(AcidGuaranteesTestBase.java:61)
> {code}
> Master is not coming up
> {code}
> 2018-01-31 02:22:31,474 ERROR [Time-limited test] hbase.MiniHBaseCluster(267): Error starting cluster
> java.lang.RuntimeException: Master not active after 3ms
>   at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:192)
>   at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:391)
>   at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:262)
>   at org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:119)
>   at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1025)
>   at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:971)
>   at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:842)
>   at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:824)
>   at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806)
>   at org.apache.hadoop.hbase.AcidGuaranteesTestBase.setUpBeforeClass(AcidGuaranteesTestBase.java:61)
[jira] [Commented] (HBASE-19902) Current Jenkins Madness: OOME, can't start minihbasecluster, etc.
[ https://issues.apache.org/jira/browse/HBASE-19902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348987#comment-16348987 ]

Allen Wittenauer commented on HBASE-19902:

Youch. 11314 went well over 5k:

| Max. process+thread count | 6020 (vs. ulimit of 1) |
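For a rough feel of where a process+thread number like the one in this comment can come from, here is a minimal sketch. It is not how test-patch computes its figure. It assumes Linux, where each `/proc/<pid>/status` file carries a `Threads:` line, and the helper names are made up. The 5000 threshold in the usage note below is only the "5k" mark mentioned in the thread.

```python
from typing import Iterable


def thread_count(status_text: str) -> int:
    """Extract the Threads: value from the text of a /proc/<pid>/status file."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1])
    return 0  # no Threads: line found (e.g. not a Linux status file)


def total_threads(status_texts: Iterable[str]) -> int:
    """Sum thread counts across processes; every process has at least one."""
    return sum(thread_count(text) for text in status_texts)


def over_limit(total: int, limit: int) -> bool:
    """True when the observed count exceeds the configured limit."""
    return total > limit
```

With the figures from the thread, `over_limit(6020, 5000)` is true, which matches the "well over 5k" remark.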
[jira] [Comment Edited] (HBASE-19902) Current Jenkins Madness: OOME, can't start minihbasecluster, etc.
[ https://issues.apache.org/jira/browse/HBASE-19902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348032#comment-16348032 ]

Allen Wittenauer edited comment on HBASE-19902 at 2/1/18 5:28 AM:

Awesome work! Thanks [~stack].

I spent some time looking over the output of various jobs. At this point, I'm not entirely convinced that hbase is hitting the proc limit [*]. I'm more inclined to think that it's actually hitting the Docker memory limit. By chance, did anyone raise the --dockermemlimit setting? If not, try --dockermemlimit=20g . That should be less than half of the node's RAM.

EDIT:

* - at least, at anything past the 5k mark.
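The "less than half of the node's RAM" sanity check is simple arithmetic. A minimal sketch follows. Only the `20g` value comes from the comment. The helper names are hypothetical, treating `g` as GiB is an assumption, and the node's real RAM size is not stated in the thread, so the figure in the usage note is purely illustrative.

```python
# Binary multipliers for a trailing unit letter (assumption: g means GiB).
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a size such as '20g' or '512m' into bytes."""
    text = text.strip().lower()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)  # plain byte count


def fits_under_half(limit: str, node_ram_bytes: int) -> bool:
    """True when the container limit is below half of the node's RAM."""
    return parse_size(limit) < node_ram_bytes / 2
```

For a hypothetical 64 GiB node, `fits_under_half("20g", 64 * 1024 ** 3)` holds, while it would not on a 32 GiB node.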
[jira] [Commented] (HBASE-19902) Current Jenkins Madness: OOME, can't start minihbasecluster, etc.
[ https://issues.apache.org/jira/browse/HBASE-19902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348032#comment-16348032 ]

Allen Wittenauer commented on HBASE-19902:

Awesome work! Thanks [~stack].

I spent some time looking over the output of various jobs. At this point, I'm not entirely convinced that hbase is hitting the proc limit. I'm more inclined to think that it's actually hitting the Docker memory limit. By chance, did anyone raise the --dockermemlimit setting? If not, try --dockermemlimit=20g . That should be less than half of the node's RAM.
[jira] [Comment Edited] (HBASE-19902) Current Jenkins Madness: OOME, can't start minihbasecluster, etc.
[ https://issues.apache.org/jira/browse/HBASE-19902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347336#comment-16347336 ] Allen Wittenauer edited comment on HBASE-19902 at 1/31/18 6:30 PM: --- bq. proclimit of 20k instead of 5k (looks like machines allow for 60k fds). The proclimit option sets ulimit -u, the maximum number of processes allowed. There is no correlation with fds. [Yetus does not set that ulimit value.] The process limit is exceedingly tricky. There is the actual value set by ulimit -u and friends. Then there are cgroup settings enforced by systemd. The cgroup limit (set by the UserTasksMax in systemd settings) is the ultimate authority. It also counts across the entire node, not by process group or session or any of the other normal boundaries. The default limit ends up being a bit over 12k on the build nodes. To make matters worse, Java native threads (on Linux, at least) count against this limit. Running 'ps -L -u jenkins -o lwp' will give an approximate idea of how many processes are in play at any given time. [The number reported by Yetus when in Docker mode is this number but only present in the container.] In the end, this means that all threads/processes consumed by BOTH executors and the jenkins slave process must be less than ~13k. was (Author: aw): bq. proclimit of 20k instead of 5k (looks like machines allow for 60k fds). The proclimit option sets ulimit -u, the maximum number of processes allowed. There is no correlation with fds. [Yetus does not set that ulimit value.] The process limit is exceedingly tricky. There is the actual value set by ulimit -u and friends. Then there are cgroup settings enforced by systemd. The cgroup limit (set by the UserTasksMax in systemd settings) is the ultimate authority. It also counts across the entire node, not by process group or session or any of the other normal boundaries. The default limit ends up being a bit over 12k on the build nodes. 
To make matters worse, Java native threads count against this limit. Running 'ps -L -u jenkins -o lwp' will give an approximate idea of how many processes are in play at any given time. [The number reported by Yetus when in Docker mode is this number but only present in the container.] In the end, this means that all threads/processes consumed by BOTH executors and the jenkins slave process must be less than ~13k. > Current Jenkins Madness: OOME, can't start minihbasecluster, etc. > - > > Key: HBASE-19902 > URL: https://issues.apache.org/jira/browse/HBASE-19902 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19902.temporary-2.001.patch > > > Trying to figure what is going on w/ jenkins build > Changed the hadoopqa config to output long process listing rather than just > 'java'... > I can't get loadavg... tried dumping /proc... > /tmp/jenkins6485196190911961762.sh: line 48: /loadavg: Permission denied > Looking at https://builds.apache.org/job/PreCommit-HBASE-Build/11273/console, > see 7 java processes running on H2. Extra args on ps may help here whether it > zombies of us. > Test run was find then fell into hbase-server second part and soon after > started failing.. > https://builds.apache.org/job/PreCommit-HBASE-Build/11273/artifact/patchprocess/patch-unit-hbase-server.txt > Looking at first test failure... 
this is where main thread is, trying to get > thread info: > {code} > Thread 23 (Time-limited test): > State: RUNNABLE > Blocked count: 118 > Waited count: 58 > Stack: > sun.management.ThreadImpl.getThreadInfo1(Native Method) > sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:178) > sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:139) > > org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:168) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > java.lang.reflect.Method.invoke(Method.java:498) > > org.apache.hadoop.hbase.util.Threads$PrintThreadInfoLazyHolder$1.printThreadInfo(Threads.java:294) > org.apache.hadoop.hbase.util.Threads.printThreadInfo(Threads.java:341) > > org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:191) > > org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:391) > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:262) > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:119) > > org.apache.hadoop.hbase.HBaseTestingUtili
[jira] [Commented] (HBASE-19902) Current Jenkins Madness: OOME, can't start minihbasecluster, etc.
[ https://issues.apache.org/jira/browse/HBASE-19902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347336#comment-16347336 ] Allen Wittenauer commented on HBASE-19902: -- bq. proclimit of 20k instead of 5k (looks like machines allow for 60k fds). The proclimit option sets ulimit -u, the maximum number of processes allowed. There is no correlation with fds. [Yetus does not set that ulimit value.] The process limit is exceedingly tricky. There is the actual value set by ulimit -u and friends. Then there are cgroup settings enforced by systemd. The cgroup limit (set by the UserTasksMax in systemd settings) is the ultimate authority. It also counts across the entire node, not by process group or session or any of the other normal boundaries. The default limit ends up being a bit over 12k on the build nodes. To make matters worse, Java native threads count against this limit. Running 'ps -L -u jenkins -o lwp' will give an approximate idea of how many processes are in play at any given time. [The number reported by Yetus when in Docker mode is this number but only present in the container.] In the end, this means that all threads/processes consumed by BOTH executors and the jenkins slave process must be less than ~13k. > Current Jenkins Madness: OOME, can't start minihbasecluster, etc. > - > > Key: HBASE-19902 > URL: https://issues.apache.org/jira/browse/HBASE-19902 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19902.temporary-2.001.patch > > > Trying to figure what is going on w/ jenkins build > Changed the hadoopqa config to output long process listing rather than just > 'java'... > I can't get loadavg... tried dumping /proc... > /tmp/jenkins6485196190911961762.sh: line 48: /loadavg: Permission denied > Looking at https://builds.apache.org/job/PreCommit-HBASE-Build/11273/console, > see 7 java processes running on H2. 
Extra args on ps may help here whether it > zombies of us. > Test run was find then fell into hbase-server second part and soon after > started failing.. > https://builds.apache.org/job/PreCommit-HBASE-Build/11273/artifact/patchprocess/patch-unit-hbase-server.txt > Looking at first test failure... this is where main thread is, trying to get > thread info: > {code} > Thread 23 (Time-limited test): > State: RUNNABLE > Blocked count: 118 > Waited count: 58 > Stack: > sun.management.ThreadImpl.getThreadInfo1(Native Method) > sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:178) > sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:139) > > org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:168) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > java.lang.reflect.Method.invoke(Method.java:498) > > org.apache.hadoop.hbase.util.Threads$PrintThreadInfoLazyHolder$1.printThreadInfo(Threads.java:294) > org.apache.hadoop.hbase.util.Threads.printThreadInfo(Threads.java:341) > > org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:191) > > org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:391) > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:262) > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:119) > > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1025) > > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:971) > > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:842) > > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:824) > > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806) > > 
org.apache.hadoop.hbase.AcidGuaranteesTestBase.setUpBeforeClass(AcidGuaranteesTestBase.java:61) > {code} > Master is not coming up > {code} > 2018-01-31 02:22:31,474 ERROR [Time-limited test] > hbase.MiniHBaseCluster(267): Error starting cluster > java.lang.RuntimeException: Master not active after 3ms > at > org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:192) > at > org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:391) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:262) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:119) > at > org.apache.hadoop.hbase.HBaseTes
[jira] [Commented] (HBASE-19902) Current Jenkins Madness: OOME, can't start minihbasecluster, etc.
[ https://issues.apache.org/jira/browse/HBASE-19902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347195#comment-16347195 ] Allen Wittenauer commented on HBASE-19902: -- Copying my comment from HBASE-19887: === Chances are that the unit tests are going over the 5k mark. The number in the output is what was measured as successfully launched in a given interval. It does not measure how many threads were attempted. One way to further test this is to set proclimit to something higher (like 10k) and run on H30, which has a higher UserTasksMax configured. === Two other things: * Be aware of parallelism. If parallelism is set to five, two tests are running, and three new tests try to launch at the same time, but each needs 900, the run will blow up but the number reported will be low. * One of the outcomes of HDFS-12711 was finding out that surefire will not always report test failures under certain circumstances, such as when surefire itself starts to OOM. In other words, if surefire fails to launch a test, it may not record ANY result for it. This means tests may have been failing before but were never reported as either a success or a failure. They just never existed as far as the harness is concerned. Now, these tests are getting reported because the lower limit means troubled tests fail more quickly, freeing up more resources for surefire to keep pounding away. See also SUREFIRE-1447. > Current Jenkins Madness: OOME, can't start minihbasecluster, etc. > - > > Key: HBASE-19902 > URL: https://issues.apache.org/jira/browse/HBASE-19902 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Attachments: HBASE-19902.temporary-2.001.patch > > > Trying to figure what is going on w/ jenkins build > Changed the hadoopqa config to output long process listing rather than just > 'java'... > I can't get loadavg... tried dumping /proc... 
> /tmp/jenkins6485196190911961762.sh: line 48: /loadavg: Permission denied > Looking at https://builds.apache.org/job/PreCommit-HBASE-Build/11273/console, > see 7 java processes running on H2. Extra args on ps may help here whether it > zombies of us. > Test run was find then fell into hbase-server second part and soon after > started failing.. > https://builds.apache.org/job/PreCommit-HBASE-Build/11273/artifact/patchprocess/patch-unit-hbase-server.txt > Looking at first test failure... this is where main thread is, trying to get > thread info: > {code} > Thread 23 (Time-limited test): > State: RUNNABLE > Blocked count: 118 > Waited count: 58 > Stack: > sun.management.ThreadImpl.getThreadInfo1(Native Method) > sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:178) > sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:139) > > org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:168) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > java.lang.reflect.Method.invoke(Method.java:498) > > org.apache.hadoop.hbase.util.Threads$PrintThreadInfoLazyHolder$1.printThreadInfo(Threads.java:294) > org.apache.hadoop.hbase.util.Threads.printThreadInfo(Threads.java:341) > > org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:191) > > org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:391) > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:262) > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:119) > > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1025) > > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:971) > > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:842) > > 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:824) > > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806) > > org.apache.hadoop.hbase.AcidGuaranteesTestBase.setUpBeforeClass(AcidGuaranteesTestBase.java:61) > {code} > Master is not coming up > {code} > 2018-01-31 02:22:31,474 ERROR [Time-limited test] > hbase.MiniHBaseCluster(267): Error starting cluster > java.lang.RuntimeException: Master not active after 3ms > at > org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:192) > at > org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:391) > at > org.apache.hadoop.hbase.MiniHBaseClust
[jira] [Commented] (HBASE-19887) Do not overwrite the surefire junit listener property in the pom of sub modules
[ https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347061#comment-16347061 ] Allen Wittenauer commented on HBASE-19887: -- Chances are that the unit tests are going over the 5k mark. The number in the output is what was measured as successfully launched in a given interval. It does not measure how many threads were attempted. One way to further test this is to set proclimit to something higher (like 10k) and run on H30, which has a higher UserTasksMax configured. > Do not overwrite the surefire junit listener property in the pom of sub > modules > --- > > Key: HBASE-19887 > URL: https://issues.apache.org/jira/browse/HBASE-19887 > Project: HBase > Issue Type: Sub-task > Components: build >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19887-v1.patch, HBASE-19887-v1.patch, > HBASE-19887-v1.patch, HBASE-19887-v1.patch, HBASE-19887-v1.patch, > HBASE-19887-v1.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19887) Do not overwrite the surefire junit listener property in the pom of sub modules
[ https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345775#comment-16345775 ] Allen Wittenauer commented on HBASE-19887: -- BTW, 0.7.0 now really does abort when Jenkins sends a kill signal to Yetus docker containers. :) > Do not overwrite the surefire junit listener property in the pom of sub > modules > --- > > Key: HBASE-19887 > URL: https://issues.apache.org/jira/browse/HBASE-19887 > Project: HBase > Issue Type: Sub-task > Components: build >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-19887-v1.patch, HBASE-19887-v1.patch, > HBASE-19887.patch, HBASE-19887.patch, HBASE-19887.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19887) Do not overwrite the surefire junit listener property in the pom of sub modules
[ https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345695#comment-16345695 ] Allen Wittenauer commented on HBASE-19887: -- The job is hitting the new process limit code in 0.7.0: |Max. process+thread count|923 (vs. ulimit of 1000)| See https://issues.apache.org/jira/browse/HBASE-19898?focusedCommentId=16345664&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16345664 > Do not overwrite the surefire junit listener property in the pom of sub > modules > --- > > Key: HBASE-19887 > URL: https://issues.apache.org/jira/browse/HBASE-19887 > Project: HBase > Issue Type: Sub-task > Components: build >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-19887-v1.patch, HBASE-19887.patch, > HBASE-19887.patch, HBASE-19887.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
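A minimal sketch of how a process+thread figure like the one reported above could be sampled by hand, assuming a Linux procps `ps`. The function names are hypothetical and this is not how Yetus computes its number; it only wraps the `ps -L -u jenkins -o lwp` command mentioned in the HBASE-19902 discussion and compares the count against a limit supplied by the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of Yetus.

# Count the lightweight processes (threads) owned by a user.
# 'lwp=' suppresses the header line, so every output line is one thread.
count_lwps() {
  local user="$1"
  local n
  n=$(ps -L -u "${user}" -o lwp= | wc -l)
  # Arithmetic expansion strips any padding that wc adds on some platforms.
  echo $(( n ))
}

# Report how many more threads could start before reaching the limit.
headroom() {
  local limit="$1" in_use="$2"
  echo $(( limit - in_use ))
}
```

Something like `headroom 1000 "$(count_lwps jenkins)"` would then give a rough idea of the remaining room. Pass the limit as a plain number, since `ulimit -u` may print `unlimited`, and remember that a single sample can miss short-lived spikes.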
[jira] [Commented] (HBASE-19898) Canary should choose RegionStdOutSink automatically when write sniffing is specified
[ https://issues.apache.org/jira/browse/HBASE-19898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345664#comment-16345664 ] Allen Wittenauer commented on HBASE-19898: -- The mvn command never ran. It looks like the job is hitting either the Yetus resource limits or the global resource limits: https://builds.apache.org/job/PreCommit-HBASE-Build/11260/artifact/patchprocess/coprocessors.txt Add something like --proclimit=5000 to the command line, which is slightly less than half of the max processes that the ASF Infra team has configured. Any more than that and jobs will either randomly fail (best case) or cause the node to fail (worst case, see HDFS-12711). See INFRA-15685 where I'm trying to get it raised to something more reasonable. > Canary should choose RegionStdOutSink automatically when write sniffing is > specified > > > Key: HBASE-19898 > URL: https://issues.apache.org/jira/browse/HBASE-19898 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Attachments: 19898.v1.txt > > > Currently RegionServerStdOutSink is instantiated by default, even if user > specifies -writeSniffing on the command line. > Write sniffing would be ignored since Sink instance is of > RegionServerStdOutSink class: > {code} > if (this.sink instanceof RegionServerStdOutSink || this.regionServerMode) > { > monitor = > new RegionServerMonitor(connection, monitorTargets, this.useRegExp, > (StdOutSink) this.sink, this.executor, > this.regionServerAllRegions, > this.treatFailureAsError); > {code} > RegionStdOutSink should be used for write sniffing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
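The reasoning behind keeping `--proclimit` well under the node-wide ceiling can be sketched as simple arithmetic. The function below is a hypothetical illustration, not Yetus code: it checks whether a burst of test launches fits under a limit, using the scenario from the HBASE-19902 comment (two tests running, three more launching, each needing roughly 900 threads). The 4000 figure is an assumed value chosen only to show the failing case.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration; not part of Yetus.

# Succeeds (exit 0) if the tests already running plus the tests about to
# launch stay at or under the limit, assuming each test needs 'per_test'
# processes+threads.
burst_fits() {
  local limit="$1" running="$2" launching="$3" per_test="$4"
  local needed=$(( (running + launching) * per_test ))
  [ "${needed}" -le "${limit}" ]
}

# Two tests running, three launching, about 900 threads each: 4500 needed.
if burst_fits 5000 2 3 900; then
  echo "fits under 5000"
fi
if ! burst_fits 4000 2 3 900; then
  echo "exceeds 4000"
fi
```

In the failing case the two tests already running account for only 1800 threads, which is one way the reported peak can look low even though the launch itself failed.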
[jira] [Commented] (HBASE-17285) Misconfiguration of JVM GC options in HADOOP_CLIENT_OPTS may break `bin/hbase`
[ https://issues.apache.org/jira/browse/HBASE-17285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15740383#comment-15740383 ] Allen Wittenauer commented on HBASE-17285: -- Unfortunately, all of the _OPTS handling in most of the Hadoop ecosystem scripts I've looked at does very bad things and is pretty much dependent upon using space delimiters. This means no, folks can't properly quote it in scripts and there are some limitations on these values. This obviously causes other problems (the biggest one probably being the inability to use directory paths with spaces) which is why shellcheck is throwing a fit. The only real solution I've found is to convert them all to arrays. This can be done in a somewhat backward compatible change, but it's a massive amount of work, even for the rewritten scripts. See HADOOP-13365 for what I've started doing in Hadoop. > Misconfiguration of JVM GC options in HADOOP_CLIENT_OPTS may break `bin/hbase` > -- > > Key: HBASE-17285 > URL: https://issues.apache.org/jira/browse/HBASE-17285 > Project: HBase > Issue Type: Bug > Components: scripts >Reporter: Josh Elser >Assignee: Josh Elser >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-17285.001.patch > > > Had the great fun of digging through this one. Had a user reporting that > hiveserver2 was no longer finding HBase jars on the classpath. This is > supposed to happen via {{hbase mapredcp}}. > It turned out that they had configured hive-env.sh to set > {{HADOOP_CLIENT_OPTS="-XX:+PrintGCDetails"}} (among other things), which > creates a big multi-line string instead of just a directory. Because of poor > quoting in {{bin/hbase}}, this gives you a wonderfully intuitive error: > {noformat} > Error: Could not find or load main class Heap > {noformat} > That {{Heap}} is actually from the JVM GC details that it was told to print. > While I don't expect this to be a common problem people run into, it's one > that we can address with better quoting. 
e.g. > {noformat} > + exec > /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/bin/java > -Dproc_mapredcp '-XX:OnOutOfMemoryError=kill -9 %p' -XX:+UseConcMarkSweepGC > -Dhbase.log.dir=/usr/local/lib/hbase//logs -Dhbase.log.file=hbase.log > -Dhbase.home.dir=/usr/local/lib/hbase/ -Dhbase.id.str= > -Dhbase.root.logger=INFO,console > '-Djava.library.path='\''/usr/local/lib/hadoop//lib/native' Heap PSYoungGen > total 76800K, used 7942K '[0x0007f550,' 0x0007faa8, > '0x0008)' eden space 66048K, 12% used > '[0x0007f550,0x0007f5cc19c0,0x0007f958)' from space > 10752K, 0% used '[0x0007fa00,0x0007fa00,0x0007faa8)' > to space 10752K, 0% used > '[0x0007f958,0x0007f958,0x0007fa00)' ParOldGen total > 174592K, used 0K '[0x0007e000,' 0x0007eaa8, > '0x0007f550)' object space 174592K, 0% used > '[0x0007e000,0x0007e000,0x0007eaa8)' PSPermGen total > 21504K, used 2756K '[0x0007dae0,' 0x0007dc30, > '0x0007e000)' object space 21504K, 12% used > '[0x0007dae0,0x0007db0b11b8,0x0007dc30)'\''' > -Dhbase.security.logger=INFO,NullAppender > org.apache.hadoop.hbase.util.MapreduceDependencyClasspathTool > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
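The difference between space-delimited option strings and arrays, as discussed in the HBASE-17285 comment above, can be seen in a small bash sketch. This is a hypothetical illustration, not taken from `bin/hbase` or from HADOOP-13365; the variable names, the helper, and the path containing a space are all made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical illustration; not code from bin/hbase or HADOOP-13365.

# Print each argument on its own bracketed line so word boundaries are visible.
show_args() {
  local arg
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# A space-delimited string: the path containing a space is split in two.
OPTS_STRING='-Dlog.dir=/tmp/my logs -Xmx1g'
# shellcheck disable=SC2086  # unquoted on purpose to show the word splitting
string_count=$(( $(show_args ${OPTS_STRING} | wc -l) ))

# An array keeps each option as one element, embedded spaces included.
OPTS_ARRAY=('-Dlog.dir=/tmp/my logs' '-Xmx1g')
array_count=$(( $(show_args "${OPTS_ARRAY[@]}" | wc -l) ))

echo "string form yields ${string_count} words"
echo "array form yields ${array_count} words"
```

The string form hands three words to the command while the array form hands two, which is the kind of breakage that quoting alone cannot fix once a value has been flattened into a single string.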
[jira] [Commented] (HBASE-13525) Update test-patch to leverage Apache Yetus
[ https://issues.apache.org/jira/browse/HBASE-13525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088511#comment-15088511 ] Allen Wittenauer commented on HBASE-13525: -- FWIW, my current plan for test-patch, etc, in hadoop is in HADOOP-12651. It basically replaces them with wrappers that do downloads, etc. > Update test-patch to leverage Apache Yetus > -- > > Key: HBASE-13525 > URL: https://issues.apache.org/jira/browse/HBASE-13525 > Project: HBase > Issue Type: Improvement > Components: build >Reporter: Sean Busbey >Assignee: Sean Busbey > Labels: jenkins > Fix For: 2.0.0 > > Attachments: HBASE-13525.1.patch > > > Once HADOOP-11746 lands over in Hadoop, incorporate its changes into our > test-patch. Most likely easiest approach is to start with the Hadoop version > and add in the features we have locally that they don't. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14175) Adopt releasedocmaker for better generated release notes
[ https://issues.apache.org/jira/browse/HBASE-14175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648539#comment-14648539 ] Allen Wittenauer commented on HBASE-14175: -- OK, docs are now viewable online here: https://github.com/apache/hadoop/blob/HADOOP-12111/dev-support/docs/releasedocmaker.md :D > Adopt releasedocmaker for better generated release notes > > > Key: HBASE-14175 > URL: https://issues.apache.org/jira/browse/HBASE-14175 > Project: HBase > Issue Type: Task >Reporter: Andrew Purtell > Fix For: 2.0.0 > > > We should consider adopting Hadoop's releasedocmaker for better release > notes. This would pull out text from the JIRA 'release notes' field with > clean presentation and is vastly superior to our current notes, which are > simply JIRA's list of issues by fix version. Could hook it into the site > build. A convenient part of Yetus to get up and running with. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14175) Adopt releasedocmaker for better generated release notes
[ https://issues.apache.org/jira/browse/HBASE-14175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648524#comment-14648524 ] Allen Wittenauer commented on HBASE-14175: -- BTW, documentation is currently sitting in HADOOP-12228 . If someone from Yetus could +1 it, I'll commit it *hint hint* bq. so not sure 0.98.0 will work - will be fun to test. I did a run a while back. ( https://github.com/aw-altiscale/eco-release-metadata/tree/master/HBASE ) It works as well as expected. A few people that have played with the output have taken the opportunity to clean things up since it tends to highlight things like bogus release notes. The lint mode tries to help with some of those things, but it's tuned pretty closely to Hadoop's needs. I suspect HBase is going to be in better shape due to building release notes from JIRA anyway. > Adopt releasedocmaker for better generated release notes > > > Key: HBASE-14175 > URL: https://issues.apache.org/jira/browse/HBASE-14175 > Project: HBase > Issue Type: Task >Reporter: Andrew Purtell > Fix For: 2.0.0 > > > We should consider adopting Hadoop's releasedocmaker for better release > notes. This would pull out text from the JIRA 'release notes' field with > clean presentation and is vastly superior to our current notes, which are > simply JIRA's list of issues by fix version. Could hook it into the site > build. A convenient part of Yetus to get up and running with. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14163) hbase master stop loops both processes forever
[ https://issues.apache.org/jira/browse/HBASE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647987#comment-14647987 ] Allen Wittenauer commented on HBASE-14163: -- I wonder if this is a race condition. > hbase master stop loops both processes forever > -- > > Key: HBASE-14163 > URL: https://issues.apache.org/jira/browse/HBASE-14163 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > > It would appear that there is an infinite loop in the zk client connection > code when performing a master stop when no external zk servers are configured. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14163) hbase master stop loops both processes forever
[ https://issues.apache.org/jira/browse/HBASE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646330#comment-14646330 ] Allen Wittenauer commented on HBASE-14163: -- So, I just set -Djava.net.preferIPv4Stack=true for HBASE_OPTS in hbase-env.sh and still see the same behavior, minus trying to use IPv6. This is on Mac OS X 10.9.5 with JDK 1.7.0_67. > hbase master stop loops both processes forever > -- > > Key: HBASE-14163 > URL: https://issues.apache.org/jira/browse/HBASE-14163 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > > It would appear that there is an infinite loop in the zk client connection > code when performing a master stop when no external zk servers are configured. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14163) hbase master stop loops both processes forever
[ https://issues.apache.org/jira/browse/HBASE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646153#comment-14646153 ] Allen Wittenauer commented on HBASE-14163: -- How long did it take for your hbase master to shut down? > hbase master stop loops both processes forever > -- > > Key: HBASE-14163 > URL: https://issues.apache.org/jira/browse/HBASE-14163 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > > It would appear that there is an infinite loop in the zk client connection > code when performing a master stop when no external zk servers are configured. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13231) shell script rewrite
[ https://issues.apache.org/jira/browse/HBASE-13231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645011#comment-14645011 ] Allen Wittenauer commented on HBASE-13231: -- Linking HBASE-14163 as a blocker. > shell script rewrite > > > Key: HBASE-13231 > URL: https://issues.apache.org/jira/browse/HBASE-13231 > Project: HBase > Issue Type: New Feature > Components: scripts >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > Attachments: HBASE-13231-donotuse.patch > > > This JIRA is for updating the HBase bash scripts to something remotely > modern. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14163) hbase master stop loops both processes forever
[ https://issues.apache.org/jira/browse/HBASE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HBASE-14163: - Component/s: master > hbase master stop loops both processes forever > -- > > Key: HBASE-14163 > URL: https://issues.apache.org/jira/browse/HBASE-14163 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > > It would appear that there is an infinite loop in the zk client connection > code when performing a master stop when no external zk servers are configured. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14163) hbase master stop loops both processes forever
[ https://issues.apache.org/jira/browse/HBASE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HBASE-14163: - Description: It would appear that there is an infinite loop in the zk client connection code when performing a master stop when no external zk servers are configured. (was: It would appear that there is an infinite loop in the zk client connection code when performing a master stop when no external zk servers are available.) > hbase master stop loops both processes forever > -- > > Key: HBASE-14163 > URL: https://issues.apache.org/jira/browse/HBASE-14163 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > > It would appear that there is an infinite loop in the zk client connection > code when performing a master stop when no external zk servers are configured. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14163) hbase master stop loops both processes forever
[ https://issues.apache.org/jira/browse/HBASE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644985#comment-14644985 ] Allen Wittenauer commented on HBASE-14163: -- I can reproduce this repeatedly using master. # untar a fresh install with nothing configured except what ships out of the box # bin/hbase master start # let it start up # in another window, bin/hbase master stop Both processes are now looping: {code} 2015-07-28 13:16:11,985 INFO [10.248.3.81:53113.activeMasterManager-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) 2015-07-28 13:16:11,985 WARN [10.248.3.81:53113.activeMasterManager-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session 0x14ed64a3ddd0004 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739) at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) 2015-07-28 13:16:12,603 INFO [10.248.3.81:53113.activeMasterManager-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. 
Will not attempt to authenticate using SASL (unknown error) 2015-07-28 13:16:12,603 WARN [10.248.3.81:53113.activeMasterManager-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session 0x14ed64a3ddd0004 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739) at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) {code} ... until a kill or ctrl-c is sent. > hbase master stop loops both processes forever > -- > > Key: HBASE-14163 > URL: https://issues.apache.org/jira/browse/HBASE-14163 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > > It would appear that there is an infinite loop in the zk client connection > code when performing a master stop when no external zk servers are available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
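The reproduction steps in the comment above can be sketched as a small script. Only the two bin/hbase commands come from the report; the function name, the log file names, and the 60-second waits are hypothetical choices made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: name, log files, and wait times are assumptions.
# Usage: reproduce_master_stop_loop /path/to/fresh/untarred/hbase
# The body runs in a subshell so the caller's working directory is untouched.
reproduce_master_stop_loop() (
  hbase_home="$1"
  if [ ! -x "${hbase_home}/bin/hbase" ]; then
    echo "no executable bin/hbase under '${hbase_home}'" >&2
    exit 1   # leaves the subshell; the function returns 1
  fi
  cd "${hbase_home}" || exit 1

  # Step 2 and 3: start the master and give it time to come up.
  bin/hbase master start > master-start.log 2>&1 &
  sleep 60

  # Step 4: issue the stop from a second process.
  bin/hbase master stop > master-stop.log 2>&1 &
  stop_pid=$!
  sleep 60

  # If the stop command is still alive, count the refused connections logged.
  if kill -0 "${stop_pid}" 2>/dev/null; then
    echo "master stop is still running after 60s"
    grep -c 'Connection refused' master-start.log master-stop.log || true
  fi
)
```

Both background processes are left running on purpose so they can be inspected; as the report notes, they need a kill or ctrl-c to go away.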
[jira] [Created] (HBASE-14163) hbase master stop loops both processes forever
Allen Wittenauer created HBASE-14163: Summary: hbase master stop loops both processes forever Key: HBASE-14163 URL: https://issues.apache.org/jira/browse/HBASE-14163 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Allen Wittenauer It would appear that there is an infinite loop in the zk client connection code when performing a master stop when no external zk servers are available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13525) Update test-patch to leverage rewrite in Hadoop
[ https://issues.apache.org/jira/browse/HBASE-13525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554973#comment-14554973 ] Allen Wittenauer commented on HBASE-13525: -- I'll be around off and on over the weekend if you need help. I know that HADOOP-11929 will be helpful here too, but I need to get HADOOP-11933 in first since it's less of an invasive change. I'll likely finish 11929 relatively soon though. *crosses fingers* > Update test-patch to leverage rewrite in Hadoop > --- > > Key: HBASE-13525 > URL: https://issues.apache.org/jira/browse/HBASE-13525 > Project: HBase > Issue Type: Improvement > Components: build >Reporter: Sean Busbey >Assignee: Sean Busbey > Labels: jenkins > Fix For: 2.0.0 > > > Once HADOOP-11746 lands over in Hadoop, incorporate its changes into our > test-patch. Most likely easiest approach is to start with the Hadoop version > and add in the features we have locally that they don't. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13680) Unhandled exception thrown when "hadoop.rpc.protection" is "privacy" in hdfs and in hbase it is "authentication"
[ https://issues.apache.org/jira/browse/HBASE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542096#comment-14542096 ] Allen Wittenauer commented on HBASE-13680: -- Moving this to HBase. There's nothing Hadoop can do about an HBase stack trace. > Unhandled exception thrown when "hadoop.rpc.protection" is "privacy" in hdfs > and in hbase it is "authentication" > > > Key: HBASE-13680 > URL: https://issues.apache.org/jira/browse/HBASE-13680 > Project: HBase > Issue Type: Bug >Reporter: Archana T >Assignee: surendra singh lilhore >Priority: Minor > > Unhandled exception thrown when "hadoop.rpc.protection" is "privacy" in hdfs > and in hbase it is "authentication" > 2015-05-13 22:40:18,772 | FATAL | master:51-196-28-1:21300 | Master server > abort: loaded coprocessors are: [org.apache.hadoop.hbase.JMXListener] | > org.apache.hadoop.hbase.master.HMaster.abort(HMaster.java:2279) > 2015-05-13 22:40:18,773 | FATAL | master:51-196-28-1:21300 | Unhandled > exception. Starting shutdown. | > org.apache.hadoop.hbase.master.HMaster.abort(HMaster.java:2284) > org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlocks(DatanodeManager.java:375) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1631) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:500) > at -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Moved] (HBASE-13680) Unhandled exception thrown when "hadoop.rpc.protection" is "privacy" in hdfs and in hbase it is "authentication"
[ https://issues.apache.org/jira/browse/HBASE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer moved HDFS-8389 to HBASE-13680: Affects Version/s: (was: 2.4.0) Key: HBASE-13680 (was: HDFS-8389) Project: HBase (was: Hadoop HDFS) > Unhandled exception thrown when "hadoop.rpc.protection" is "privacy" in hdfs > and in hbase it is "authentication" > > > Key: HBASE-13680 > URL: https://issues.apache.org/jira/browse/HBASE-13680 > Project: HBase > Issue Type: Bug >Reporter: Archana T >Assignee: surendra singh lilhore >Priority: Minor > > Unhandled exception thrown when "hadoop.rpc.protection" is "privacy" in hdfs > and in hbase it is "authentication" > 2015-05-13 22:40:18,772 | FATAL | master:51-196-28-1:21300 | Master server > abort: loaded coprocessors are: [org.apache.hadoop.hbase.JMXListener] | > org.apache.hadoop.hbase.master.HMaster.abort(HMaster.java:2279) > 2015-05-13 22:40:18,773 | FATAL | master:51-196-28-1:21300 | Unhandled > exception. Starting shutdown. | > org.apache.hadoop.hbase.master.HMaster.abort(HMaster.java:2284) > org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlocks(DatanodeManager.java:375) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1631) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:500) > at -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13525) Update test-patch to leverage rewrite in Hadoop
[ https://issues.apache.org/jira/browse/HBASE-13525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505782#comment-14505782 ] Allen Wittenauer commented on HBASE-13525: -- One of my unstated goals with HADOOP-11746 was for it to be a mostly drop-in replacement for any project currently using the older bits. (A quick pass through most of the ecosystem reveals that the majority are using some form of it or another). The plug-in capabilities certainly make it easier to add custom stuff, but it's a lot harder to fix some assumptions made about the source tree layout and maven usage (of course). (Weirdly, this is the second time that having some generic bits that the entire ecosystem could leverage has come up with some major rewrite I've undertaken. Roman suggested pulling the base shell scripts out of Hadoop and forming a completely separate project!) Anyway, for better or worse, HBase probably has the most customized version out of all of them, based upon my quick pass through. I'll try to offer guidance where I can. It'll be interesting to see what does/doesn't work, especially with the new framework. The biggest thing I'm worried about is the backward compatibility bits. I *hope* override works there, but I haven't had a chance to actually test that part ;) > Update test-patch to leverage rewrite in Hadoop > --- > > Key: HBASE-13525 > URL: https://issues.apache.org/jira/browse/HBASE-13525 > Project: HBase > Issue Type: Improvement > Components: build >Reporter: Sean Busbey > Labels: jenkins > Fix For: 2.0.0 > > > Once HADOOP-11746 lands over in Hadoop, incorporate its changes into our > test-patch. Most likely easiest approach is to start with the Hadoop version > and add in the features we have locally that they don't. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13231) shell script rewrite
[ https://issues.apache.org/jira/browse/HBASE-13231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HBASE-13231: - Attachment: HBASE-13231-donotuse.patch -donotuse: * "initial" revision from a few months ago This is just a 'thought balloon' patch. It does not work. It is incomplete. It is full of embarrassing mistakes. There is a lot of copypasta. Chances are good I'll start over. It is just to stimulate some discussion and generate some ideas of the things we'd like to see done differently. > shell script rewrite > > > Key: HBASE-13231 > URL: https://issues.apache.org/jira/browse/HBASE-13231 > Project: HBase > Issue Type: New Feature > Components: scripts >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > Attachments: HBASE-13231-donotuse.patch > > > This JIRA is for updating the HBase bash scripts to something remotely > modern. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13231) shell script rewrite
[ https://issues.apache.org/jira/browse/HBASE-13231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HBASE-13231: - Description: This JIRA is for updating the HBase bash scripts to something remotely modern. was: This JIRA is for updating the HBase shell code to something remotely modern. > shell script rewrite > > > Key: HBASE-13231 > URL: https://issues.apache.org/jira/browse/HBASE-13231 > Project: HBase > Issue Type: New Feature > Components: scripts >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > > This JIRA is for updating the HBase bash scripts to something remotely > modern. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13231) shell script rewrite
[ https://issues.apache.org/jira/browse/HBASE-13231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360707#comment-14360707 ] Allen Wittenauer commented on HBASE-13231: -- Just the bash code. I'll remove the shell component. Thanks! > shell script rewrite > > > Key: HBASE-13231 > URL: https://issues.apache.org/jira/browse/HBASE-13231 > Project: HBase > Issue Type: New Feature > Components: scripts >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > > This JIRA is for updating the HBase shell code to something remotely modern. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13231) shell script rewrite
[ https://issues.apache.org/jira/browse/HBASE-13231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HBASE-13231: - Component/s: (was: shell) > shell script rewrite > > > Key: HBASE-13231 > URL: https://issues.apache.org/jira/browse/HBASE-13231 > Project: HBase > Issue Type: New Feature > Components: scripts >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > > This JIRA is for updating the HBase shell code to something remotely modern. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13231) shell script rewrite
[ https://issues.apache.org/jira/browse/HBASE-13231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360670#comment-14360670 ] Allen Wittenauer commented on HBASE-13231: -- errr, "not near a HBase expert". Woops. haha. > shell script rewrite > > > Key: HBASE-13231 > URL: https://issues.apache.org/jira/browse/HBASE-13231 > Project: HBase > Issue Type: New Feature > Components: scripts, shell >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > > This JIRA is for updating the HBase shell code to something remotely modern. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13231) shell script rewrite
[ https://issues.apache.org/jira/browse/HBASE-13231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360661#comment-14360661 ] Allen Wittenauer commented on HBASE-13231: -- About a year ago, [~apurtell] and I were talking about HADOOP-9902 and how the work was progressing. He mentioned that the HBase scripts were based on -ancient tomes- the Hadoop shell scripts and that it would be nice to see them rewritten as well. After many -beers- objective points to back his position, he convinced me that I should probably take a look and work on them. It took a while to finish a big chunk of the work in Hadoop, followed on with more work as bugs and new ideas popped up (HADOOP-11010). With the help of a lot of other folks after the base work was done, that work has mostly slowed down to a very stable state, with just a few things to finish up (minus unit tests). A few months back, I started to see what the state of the HBase code actually was. Again after many -beers- hours of deep analysis, I did a bit of playing around, using the Hadoop code as a base. I had some basic stuff, but hit a few potholes esp when it came to bw compat. I sort of put things on hold as HBase 1.0 had shipped and other, non-Apache stuff floated to the top. [~busbey] knew I was working on said scripts off & on over the past few months and suggested I open this JIRA so that he could -hold something over my head- potentially get something for 1.1 or (more likely) 2.0. I need to do some cleanup, but I'll try and post what I have thus far. It's in an incomplete state (read: not usable), but it will give the community a sense of what I think the direction should probably be. Feedback is always great, esp if I've done something completely idiotic. Just bear in mind I'm near a HBase expert so there is a very high probability of that occurring. 
Just to expectation set: when it comes to this type of thing, compatibility is usually a secondary concern, with future capabilities and ease of use usually taking priority. In the case of Hadoop, I estimate it is around 80-90% backward compat with lots of things triggering deprecation warnings. Of course, the community ultimately decides but I wanted to throw that out there. > shell script rewrite > > > Key: HBASE-13231 > URL: https://issues.apache.org/jira/browse/HBASE-13231 > Project: HBase > Issue Type: New Feature > Components: scripts, shell >Affects Versions: 2.0.0 >Reporter: Allen Wittenauer > > This JIRA is for updating the HBase shell code to something remotely modern. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13231) shell script rewrite
Allen Wittenauer created HBASE-13231: Summary: shell script rewrite Key: HBASE-13231 URL: https://issues.apache.org/jira/browse/HBASE-13231 Project: HBase Issue Type: New Feature Components: scripts, shell Affects Versions: 2.0.0 Reporter: Allen Wittenauer This JIRA is for updating the HBase shell code to something remotely modern. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11534) Remove broken JAVA_HOME autodetection in hbase-config.sh
[ https://issues.apache.org/jira/browse/HBASE-11534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064985#comment-14064985 ] Allen Wittenauer commented on HBASE-11534: -- FWIW, I'm a big fan of "Let the installer figure this out." i.e., this is one place where having a distribution (bigtop or otherwise) is ideal because they can get away with taking a lot of time to configure and tune the system as a one-time operation. Taking the hit every time is...excessive. Anyway, kudos for fixing this. > Remove broken JAVA_HOME autodetection in hbase-config.sh > > > Key: HBASE-11534 > URL: https://issues.apache.org/jira/browse/HBASE-11534 > Project: HBase > Issue Type: Bug >Reporter: Andrew Purtell >Assignee: Esteban Gutierrez >Priority: Minor > Fix For: 0.99.0, 0.96.3, 0.98.5, 0.94.22, 2.0.0 > > Attachments: HBASE-11534.patch > > > [~aw] mentioned on Twitter that the old JAVA_HOME autodetection script we > have in hbase-config.sh is very unlikely to do the right thing now. Rip it > out. -- This message was sent by Atlassian JIRA (v6.2#6252)
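To illustrate the "let the installer figure this out" point from the comment above: rather than probing for a JVM on every invocation, a launcher script can simply require that JAVA_HOME was set once at install time and fail fast otherwise. The sketch below is an assumption about what such a check could look like; require_java_home is a hypothetical name, and this is not the code from the attached HBASE-11534.patch or from hbase-config.sh.

```shell
# Hypothetical check; not the actual hbase-config.sh code.
# Returns 0 when JAVA_HOME points at a directory with an executable bin/java,
# and 1 (with a message on stderr) otherwise. No autodetection is attempted.
require_java_home() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set; set it once, e.g. in hbase-env.sh" >&2
    return 1
  fi
  if [ ! -x "${JAVA_HOME}/bin/java" ]; then
    echo "JAVA_HOME=${JAVA_HOME} has no executable bin/java" >&2
    return 1
  fi
  return 0
}
```

A launcher would call it near the top, for example `require_java_home || exit 1`, so the per-invocation cost is two cheap tests instead of a filesystem search.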