[jira] [Commented] (YARN-10058) Capacity Scheduler dispatcher hangs when async thread crashes
[ https://issues.apache.org/jira/browse/YARN-10058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002713#comment-17002713 ] tuyu commented on YARN-10058: - After applying the YARN-8737 patch to our local repo, it still does not fix the race condition: the async thread still crashes, and the Capacity Scheduler hangs. I think that no matter what happens, if the global scheduler's async thread crashes, the current RM should exit or transition to standby. > Capacity Scheduler dispatcher hangs when async thread crashes > -- > > Key: YARN-10058 > URL: https://issues.apache.org/jira/browse/YARN-10058 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.2.0, 3.2.1 >Reporter: tuyu >Priority: Major > Fix For: 3.2.1 > > Attachments: 0001-global-scheduling-standby-hang.patch > > > When the Capacity Scheduler enables global scheduling and the global scheduler's > AsyncScheduleThread crashes, the Capacity Scheduler dispatcher will hang for a > long time. This behavior is unreasonable. > If this situation happens in HA mode, the current RM should transition to standby. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
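A minimal sketch of the guard the commenter is asking for: catch any crash of the async scheduling loop and fail fast (or go standby) instead of leaving the dispatcher hung. The helper names below (schedule(), rmContext.handleCriticalFailure()) are illustrative assumptions, not the actual CapacityScheduler code.

{code:java}
// Sketch only; not the real AsyncScheduleThread implementation.
Thread asyncScheduleThread = new Thread(() -> {
  try {
    while (!Thread.currentThread().isInterrupted()) {
      schedule();  // one async scheduling iteration (hypothetical helper)
    }
  } catch (Throwable t) {
    // A crash must not be silent: either exit the RM or, in HA mode,
    // trigger a transition to standby so another RM can take over.
    LOG.error("Async scheduling thread died", t);
    rmContext.handleCriticalFailure(t);  // hypothetical fail-fast hook
  }
}, "AsyncScheduleThread");
asyncScheduleThread.setDaemon(true);
asyncScheduleThread.start();
{code}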
[jira] [Commented] (YARN-10059) Final states of failed-to-localize containers are not recorded in NM state store
[ https://issues.apache.org/jira/browse/YARN-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002700#comment-17002700 ] Tao Yang commented on YARN-10059: - Attached v1 patch for review. > Final states of failed-to-localize containers are not recorded in NM state > store > > > Key: YARN-10059 > URL: https://issues.apache.org/jira/browse/YARN-10059 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-10059.001.patch > > > We found an issue where many localizers of completed containers were launched > after an NM restart and exhausted the memory/CPU of that machine. These containers > had all failed and completed while localizing on a non-existent local directory > (caused by another problem), but their final states were not recorded in the NM > state store. > The process flow of a failed-to-localize container is as follows: > {noformat} > ResourceLocalizationService$LocalizerRunner#run > -> ContainerImpl$ResourceFailedTransition#transition handle LOCALIZING -> > LOCALIZATION_FAILED upon RESOURCE_FAILED > dispatch LocalizationEventType.CLEANUP_CONTAINER_RESOURCES > -> ResourceLocalizationService#handleCleanupContainerResources handle > CLEANUP_CONTAINER_RESOURCES > dispatch ContainerEventType.CONTAINER_RESOURCES_CLEANEDUP > -> ContainerImpl$LocalizationFailedToDoneTransition#transition > handle LOCALIZATION_FAILED -> DONE upon CONTAINER_RESOURCES_CLEANEDUP > {noformat} > There is currently no state-store update in this flow, which is required to > avoid unnecessary localizations after NM restarts. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10059) Final states of failed-to-localize containers are not recorded in NM state store
[ https://issues.apache.org/jira/browse/YARN-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-10059: Attachment: YARN-10059.001.patch > Final states of failed-to-localize containers are not recorded in NM state > store > > > Key: YARN-10059 > URL: https://issues.apache.org/jira/browse/YARN-10059 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-10059.001.patch > > > We found an issue where many localizers of completed containers were launched > after an NM restart and exhausted the memory/CPU of that machine. These containers > had all failed and completed while localizing on a non-existent local directory > (caused by another problem), but their final states were not recorded in the NM > state store. > The process flow of a failed-to-localize container is as follows: > {noformat} > ResourceLocalizationService$LocalizerRunner#run > -> ContainerImpl$ResourceFailedTransition#transition handle LOCALIZING -> > LOCALIZATION_FAILED upon RESOURCE_FAILED > dispatch LocalizationEventType.CLEANUP_CONTAINER_RESOURCES > -> ResourceLocalizationService#handleCleanupContainerResources handle > CLEANUP_CONTAINER_RESOURCES > dispatch ContainerEventType.CONTAINER_RESOURCES_CLEANEDUP > -> ContainerImpl$LocalizationFailedToDoneTransition#transition > handle LOCALIZATION_FAILED -> DONE upon CONTAINER_RESOURCES_CLEANEDUP > {noformat} > There is currently no state-store update in this flow, which is required to > avoid unnecessary localizations after NM restarts. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10059) Final states of failed-to-localize containers are not recorded in NM state store
Tao Yang created YARN-10059: --- Summary: Final states of failed-to-localize containers are not recorded in NM state store Key: YARN-10059 URL: https://issues.apache.org/jira/browse/YARN-10059 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Tao Yang Assignee: Tao Yang We found an issue where many localizers of completed containers were launched after an NM restart and exhausted the memory/CPU of that machine. These containers had all failed and completed while localizing on a non-existent local directory (caused by another problem), but their final states were not recorded in the NM state store. The process flow of a failed-to-localize container is as follows: {noformat} ResourceLocalizationService$LocalizerRunner#run -> ContainerImpl$ResourceFailedTransition#transition handle LOCALIZING -> LOCALIZATION_FAILED upon RESOURCE_FAILED dispatch LocalizationEventType.CLEANUP_CONTAINER_RESOURCES -> ResourceLocalizationService#handleCleanupContainerResources handle CLEANUP_CONTAINER_RESOURCES dispatch ContainerEventType.CONTAINER_RESOURCES_CLEANEDUP -> ContainerImpl$LocalizationFailedToDoneTransition#transition handle LOCALIZATION_FAILED -> DONE upon CONTAINER_RESOURCES_CLEANEDUP {noformat} There is currently no state-store update in this flow, which is required to avoid unnecessary localizations after NM restarts. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
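The fix implied by the description is to persist the container's final state inside the LOCALIZATION_FAILED -> DONE transition quoted above, so recovery after an NM restart skips re-localization. A rough sketch follows; it assumes the NM state store exposes a storeContainerCompleted-style call (check NMStateStoreService for the real signature in your branch) and is not the actual YARN-10059 patch.

{code:java}
// Sketch only; not the actual patch. Method and field names are assumptions
// modeled on ContainerImpl's transition classes.
static class LocalizationFailedToDoneTransition extends ContainerTransition {
  @Override
  public void transition(ContainerImpl container, ContainerEvent event) {
    try {
      // Persist the final state so a restarted NM will not launch new
      // localizers for this already-failed container.
      container.context.getNMStateStore().storeContainerCompleted(
          container.containerId, container.exitCode);
    } catch (java.io.IOException e) {
      LOG.error("Unable to store completed state for " + container.containerId, e);
    }
  }
}
{code}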
[jira] [Assigned] (YARN-10042) Upgrade grpc-xxx dependencies to 1.26.0
[ https://issues.apache.org/jira/browse/YARN-10042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena reassigned YARN-10042: --- Assignee: liusheng > Upgrade grpc-xxx dependencies to 1.26.0 > -- > > Key: YARN-10042 > URL: https://issues.apache.org/jira/browse/YARN-10042 > Project: Hadoop YARN > Issue Type: Bug >Reporter: liusheng >Assignee: liusheng >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-10042.001.patch, > hadoop_build_aarch64_grpc_1.26.0.log, hadoop_build_x86_64_grpc_1.26.0.log, > yarn_csi_tests_aarch64_grpc_1.26.0.log, yarn_csi_tests_x86_64_grpc_1.26.0.log > > > Hadoop YARN currently uses grpc-context, grpc-core, grpc-netty, grpc-protobuf, > grpc-protobuf-lite, grpc-stub and protoc-gen-grpc-java at version 1.15.1, but > "protoc-gen-grpc-java" is not supported on the aarch64 platform. The grpc-java > repo now supports the aarch64 platform, released as 1.26.0 in Maven Central. > see: > [https://github.com/grpc/grpc-java/pull/6496] > [https://search.maven.org/search?q=g:io.grpc] > It is better to upgrade the grpc-xxx dependencies to version 1.26.0. Both x86_64 > and aarch64 servers build OK according to my testing; please see the attachments: > the build log on aarch64, the build log on x86_64, the YARN CSI test log on > aarch64, and the YARN CSI test log on x86_64. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10058) Capacity Scheduler dispatcher hangs when async thread crashes
tuyu created YARN-10058: --- Summary: Capacity Scheduler dispatcher hangs when async thread crashes Key: YARN-10058 URL: https://issues.apache.org/jira/browse/YARN-10058 Project: Hadoop YARN Issue Type: Bug Components: capacity scheduler Affects Versions: 3.2.1, 3.2.0 Reporter: tuyu Fix For: 3.2.1 When the Capacity Scheduler enables global scheduling and the global scheduler's AsyncScheduleThread crashes, the Capacity Scheduler dispatcher will hang for a long time. This behavior is unreasonable. If this situation happens in HA mode, the current RM should transition to standby. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10057) Upgrade the dependencies managed by yarnpkg
[ https://issues.apache.org/jira/browse/YARN-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-10057: - Summary: Upgrade the dependencies managed by yarnpkg (was: Upgrade dependencies in yarnpkg) > Upgrade the dependencies managed by yarnpkg > --- > > Key: YARN-10057 > URL: https://issues.apache.org/jira/browse/YARN-10057 > Project: Hadoop YARN > Issue Type: Bug > Components: build, yarn-ui-v2 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > > Run "yarn upgrade" to update the dependencies managed by yarnpkg. > Dependabot automatically created the following pull requests and this issue > is to close them. > * https://github.com/apache/hadoop/pull/1741 > * https://github.com/apache/hadoop/pull/1742 > * https://github.com/apache/hadoop/pull/1743 > * https://github.com/apache/hadoop/pull/1744 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10057) Upgrade dependencies in yarnpkg
[ https://issues.apache.org/jira/browse/YARN-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-10057: - Description: Run "yarn upgrade" to update the dependencies managed by yarnpkg. Dependabot automatically created the following pull requests and this issue is to close them. * https://github.com/apache/hadoop/pull/1741 * https://github.com/apache/hadoop/pull/1742 * https://github.com/apache/hadoop/pull/1743 * https://github.com/apache/hadoop/pull/1744 was: Run "yarn upgrade" to update the dependencies managed by yarnpkg. Dependabot created the following pull requests and this issue is to close them. * https://github.com/apache/hadoop/pull/1741 * https://github.com/apache/hadoop/pull/1742 * https://github.com/apache/hadoop/pull/1743 * https://github.com/apache/hadoop/pull/1744 > Upgrade dependencies in yarnpkg > --- > > Key: YARN-10057 > URL: https://issues.apache.org/jira/browse/YARN-10057 > Project: Hadoop YARN > Issue Type: Bug > Components: build, yarn-ui-v2 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > > Run "yarn upgrade" to update the dependencies managed by yarnpkg. > Dependabot automatically created the following pull requests and this issue > is to close them. > * https://github.com/apache/hadoop/pull/1741 > * https://github.com/apache/hadoop/pull/1742 > * https://github.com/apache/hadoop/pull/1743 > * https://github.com/apache/hadoop/pull/1744 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10057) Upgrade dependencies in yarnpkg
Akira Ajisaka created YARN-10057: Summary: Upgrade dependencies in yarnpkg Key: YARN-10057 URL: https://issues.apache.org/jira/browse/YARN-10057 Project: Hadoop YARN Issue Type: Bug Components: build, yarn-ui-v2 Reporter: Akira Ajisaka Assignee: Akira Ajisaka Run "yarn upgrade" to update the dependencies managed by yarnpkg. Dependabot created the following pull requests and this issue is to close them. * https://github.com/apache/hadoop/pull/1741 * https://github.com/apache/hadoop/pull/1742 * https://github.com/apache/hadoop/pull/1743 * https://github.com/apache/hadoop/pull/1744 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7672) hadoop-sls cannot simulate a huge-scale YARN cluster
[ https://issues.apache.org/jira/browse/YARN-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002610#comment-17002610 ] zhoukang commented on YARN-7672: Thanks for the patch [~yufeigu]. Do we have a patch for the metrics? Thanks. > hadoop-sls cannot simulate a huge-scale YARN cluster > -- > > Key: YARN-7672 > URL: https://issues.apache.org/jira/browse/YARN-7672 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: zhangshilong >Assignee: zhangshilong >Priority: Major > Attachments: YARN-7672.patch > > > Our YARN cluster has scaled to nearly 10 thousand nodes, and we need to run > scheduler pressure tests. > Using SLS, we start 2000+ threads to simulate NMs and AMs, but the CPU load > climbs very high, to 100+. I thought that would affect the performance > evaluation of the scheduler. > So I decided to separate the scheduler from the simulator: I start a real RM, > then SLS registers nodes to the RM and submits apps to the RM using RM RPC. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
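The approach described here — driving a real RM from the simulator over RPC — amounts to speaking the NM-side protocols directly. A hedged sketch of the node-registration half using the ResourceTracker protocol; the factory and newInstance signatures vary by Hadoop version, so treat them as assumptions rather than a definitive implementation:

{code:java}
// Sketch of registering a simulated NM against a real RM over RPC.
// Verify exact signatures against your Hadoop version before use.
ResourceTracker tracker =
    ServerRMProxy.createRMProxy(conf, ResourceTracker.class);

RegisterNodeManagerRequest req =
    Records.newRecord(RegisterNodeManagerRequest.class);
req.setNodeId(NodeId.newInstance("sim-node-0001", 45454));  // simulated host
req.setHttpPort(8042);
req.setResource(Resource.newInstance(192 * 1024, 48));      // 192 GB, 48 vcores
tracker.registerNodeManager(req);

// Afterwards the simulator sends periodic NodeHeartbeatRequests on a shared
// timer, so a handful of threads can drive thousands of simulated nodes
// instead of one thread per NM/AM.
{code}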
[jira] [Commented] (YARN-10053) Placement rules do not use correct group service init
[ https://issues.apache.org/jira/browse/YARN-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002581#comment-17002581 ] Hadoop QA commented on YARN-10053: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} branch-3.2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 45s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 46s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} branch-3.2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 447 unchanged - 5 fixed = 447 total (was 452) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 11s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 47s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}123m 26s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 | | | hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:0f25cbbb251 | | JIRA Issue | YARN-10053 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12989405/YARN-10053-branch-3.2.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 5428064fb46a 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-3.2 / 85da5cb | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_232 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/25315/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | |
[jira] [Commented] (YARN-10053) Placement rules do not use correct group service init
[ https://issues.apache.org/jira/browse/YARN-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002580#comment-17002580 ] Hadoop QA commented on YARN-10053: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 59s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} branch-3.1 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 57s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} branch-3.1 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 447 unchanged - 6 fixed = 447 total (was 453) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 53s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 70m 19s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}129m 53s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:70a0ef5d4a6 | | JIRA Issue | YARN-10053 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12989404/YARN-10053-branch-3.1.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 06f6d9cdb095 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-3.1 / 01edc65 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_232 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25314/testReport/ | | Max. process+thread count | 772 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output |
[jira] [Updated] (YARN-10053) Placement rules do not use correct group service init
[ https://issues.apache.org/jira/browse/YARN-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated YARN-10053: - Attachment: YARN-10053-branch-3.2.002.patch > Placement rules do not use correct group service init > - > > Key: YARN-10053 > URL: https://issues.apache.org/jira/browse/YARN-10053 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.1.3 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: YARN-10053-branch-3.1.001.patch, > YARN-10053-branch-3.1.002.patch, YARN-10053-branch-3.1.003.patch, > YARN-10053-branch-3.2.001.patch, YARN-10053-branch-3.2.002.patch, > YARN-10053.001.patch, YARN-10053.002.patch > > > The placement rules, CS and FS, all create a new group service instead of > using the shared group mapping service. This means that the cache for the > placement rules is not the same as for the ACLs and other parts of the RM service. > It could also cause an issue with the configuration that is passed in to > create the cache: the scheduler config might not have the same values as the > service config and could thus cause issues. This second issue seems to affect > only the CS, not the FS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
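The fix this issue points at is essentially a one-line change in how a placement rule obtains its group mapping: use the process-wide cached service instead of constructing a private instance. A before/after sketch using the org.apache.hadoop.security.Groups API; the surrounding placement-rule code and the variable names are elided/illustrative:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.Groups;

// Before: a private Groups instance with its own cache, built from the
// scheduler configuration.
// Groups groups = new Groups(schedulerConf);

// After: the shared, service-wide mapping, so placement rules see the same
// cache and service configuration as the ACLs and the rest of the RM.
Configuration conf = new Configuration();
Groups groups = Groups.getUserToGroupsMappingService(conf);
{code}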
[jira] [Comment Edited] (YARN-10053) Placement rules do not use correct group service init
[ https://issues.apache.org/jira/browse/YARN-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002545#comment-17002545 ] Wilfred Spiegelenburg edited comment on YARN-10053 at 12/23/19 11:49 PM: - The junit test failures are not related to my change. The tests pass locally. One failure is caused by a port bind exception; they seem environment-related or are already known (YARN-9740, YARN-9338). Uploading new patches to fix checkstyle issues in the 3.1 and 3.2 branches. was (Author: wilfreds): The junit test failures are not related to my change. The tests pass locally. One failure is caused by a port bind exception; they seem environment-related or are already known (YARN-9740, YARN-9338). In the 3.1 and 3.2 branches: uploading new patches to fix checkstyle issues > Placement rules do not use correct group service init > - > > Key: YARN-10053 > URL: https://issues.apache.org/jira/browse/YARN-10053 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.1.3 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: YARN-10053-branch-3.1.001.patch, > YARN-10053-branch-3.1.002.patch, YARN-10053-branch-3.1.003.patch, > YARN-10053-branch-3.2.001.patch, YARN-10053.001.patch, YARN-10053.002.patch > > > The placement rules, CS and FS, all create a new group service instead of > using the shared group mapping service. This means that the cache for the > placement rules is not the same as for the ACLs and other parts of the RM service. > It could also cause an issue with the configuration that is passed in to > create the cache: the scheduler config might not have the same values as the > service config and could thus cause issues. This second issue seems to affect > only the CS, not the FS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10053) Placement rules do not use correct group service init
[ https://issues.apache.org/jira/browse/YARN-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002545#comment-17002545 ] Wilfred Spiegelenburg commented on YARN-10053: -- The junit test failures are not related to my change. The tests pass locally. One failure is caused by a port bind exception; they seem environment-related or are already known (YARN-9740, YARN-9338). In the 3.1 and 3.2 branches: uploading new patches to fix checkstyle issues > Placement rules do not use correct group service init > - > > Key: YARN-10053 > URL: https://issues.apache.org/jira/browse/YARN-10053 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.1.3 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: YARN-10053-branch-3.1.001.patch, > YARN-10053-branch-3.1.002.patch, YARN-10053-branch-3.1.003.patch, > YARN-10053-branch-3.2.001.patch, YARN-10053.001.patch, YARN-10053.002.patch > > > The placement rules, CS and FS, all create a new group service instead of > using the shared group mapping service. This means that the cache for the > placement rules is not the same as for the ACLs and other parts of the RM service. > It could also cause an issue with the configuration that is passed in to > create the cache: the scheduler config might not have the same values as the > service config and could thus cause issues. This second issue seems to affect > only the CS, not the FS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10053) Placement rules do not use correct group service init
[ https://issues.apache.org/jira/browse/YARN-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated YARN-10053: - Attachment: YARN-10053-branch-3.1.003.patch > Placement rules do not use correct group service init > - > > Key: YARN-10053 > URL: https://issues.apache.org/jira/browse/YARN-10053 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.1.3 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: YARN-10053-branch-3.1.001.patch, > YARN-10053-branch-3.1.002.patch, YARN-10053-branch-3.1.003.patch, > YARN-10053-branch-3.2.001.patch, YARN-10053.001.patch, YARN-10053.002.patch > > > The placement rules, CS and FS, all create a new group service instead of > using the shared group mapping service. This means that the cache for the > placement rules is not the same as for the ACLs and other parts of the RM service. > It could also cause an issue with the configuration that is passed in to > create the cache: the scheduler config might not have the same values as the > service config and could thus cause issues. This second issue seems to affect > only the CS, not the FS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10000) Code cleanup in FSSchedulerConfigurationStore
[ https://issues.apache.org/jira/browse/YARN-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja reassigned YARN-10000: -- Assignee: Siddharth Ahuja (was: Szilard Nemeth) > Code cleanup in FSSchedulerConfigurationStore > - > > Key: YARN-10000 > URL: https://issues.apache.org/jira/browse/YARN-10000 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Siddharth Ahuja >Priority: Minor > > Some things could be improved: > * In initialize: PathFilter can be replaced with lambda > * initialize is long, could be split into smaller methods > * In method 'format': for-loop can be replaced with foreach > * There's a variable with a typo: lastestConfigPath > * Add explanation of unimplemented methods > * Abstract Filesystem operations away more: > * Bad logging: Format string is combined with exception logging. > {code:java} > LOG.info("Failed to write config version at {}", configVersionFile, e); > {code} > * Interestingly phrased log messages like "write temp capacity configuration > fail" "write temp capacity configuration successfully, schedulerConfigFile=" > * Method "writeConfigurationToFileSystem" could be private > * Any other code quality improvements -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
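For the first item in the list above: org.apache.hadoop.fs.PathFilter has a single accept(Path) method, so the anonymous class collapses to a lambda. A small illustration; the predicate body is made up for the example:

{code:java}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

// Before: anonymous inner class
PathFilter configFiles = new PathFilter() {
  @Override
  public boolean accept(Path path) {
    return path.getName().endsWith(".xml");  // illustrative predicate
  }
};

// After: equivalent lambda
PathFilter configFilesAsLambda = path -> path.getName().endsWith(".xml");
{code}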
[jira] [Assigned] (YARN-9996) Code cleanup in QueueAdminConfigurationMutationACLPolicy
[ https://issues.apache.org/jira/browse/YARN-9996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja reassigned YARN-9996: - Assignee: Siddharth Ahuja (was: Szilard Nemeth) > Code cleanup in QueueAdminConfigurationMutationACLPolicy > > > Key: YARN-9996 > URL: https://issues.apache.org/jira/browse/YARN-9996 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Siddharth Ahuja >Priority: Major > > Method 'isMutationAllowed' contains many uses of substring and lastIndexOf. > These could be extracted and simplified. > Also, some logging could be added as well. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
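One way to do the suggested extraction: pull the repeated substring/lastIndexOf pattern into a named helper so the intent ("parent of this queue path") is stated once. The helper below is hypothetical, not taken from the actual patch:

{code:java}
// Hypothetical helper; queue paths look like "root.parent.leaf".
private static String parentQueueOf(String queuePath) {
  int lastDot = queuePath.lastIndexOf('.');
  return lastDot == -1 ? queuePath : queuePath.substring(0, lastDot);
}
{code}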
[jira] [Assigned] (YARN-9999) TestFSSchedulerConfigurationStore: Extend from ConfigurationStoreBaseTest, general code cleanup
[ https://issues.apache.org/jira/browse/YARN-9999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja reassigned YARN-9999: - Assignee: Siddharth Ahuja (was: Szilard Nemeth) > TestFSSchedulerConfigurationStore: Extend from ConfigurationStoreBaseTest, > general code cleanup > --- > > Key: YARN-9999 > URL: https://issues.apache.org/jira/browse/YARN-9999 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Siddharth Ahuja >Priority: Minor > > All config store tests are extended from ConfigurationStoreBaseTest: > * TestInMemoryConfigurationStore > * TestLeveldbConfigurationStore > * TestZKConfigurationStore > TestFSSchedulerConfigurationStore should also extend from it. > Additionally, some general code cleanup can be applied as well. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-9998) Code cleanup in LeveldbConfigurationStore
[ https://issues.apache.org/jira/browse/YARN-9998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja reassigned YARN-9998: - Assignee: Siddharth Ahuja (was: Szilard Nemeth) > Code cleanup in LeveldbConfigurationStore > - > > Key: YARN-9998 > URL: https://issues.apache.org/jira/browse/YARN-9998 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Siddharth Ahuja >Priority: Minor > > Many things can be improved: > * Field compactionTimer could be a local variable > * Field versiondb should be camelcase > * initDatabase is a very long method: Initialize db / versionDb should be in > separate methods, split this method into smaller chunks > * Remove TODOs > * Remove duplicated code block in > LeveldbConfigurationStore.CompactionTimerTask > * Any other cleanup -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10001) Add explanation of unimplemented methods in InMemoryConfigurationStore
[ https://issues.apache.org/jira/browse/YARN-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja reassigned YARN-10001: -- Assignee: Siddharth Ahuja (was: Szilard Nemeth) > Add explanation of unimplemented methods in InMemoryConfigurationStore > -- > > Key: YARN-10001 > URL: https://issues.apache.org/jira/browse/YARN-10001 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Siddharth Ahuja >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10004) Javadoc of YarnConfigurationStore#initialize is not straightforward
[ https://issues.apache.org/jira/browse/YARN-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja reassigned YARN-10004: -- Assignee: Siddharth Ahuja (was: Szilard Nemeth) > Javadoc of YarnConfigurationStore#initialize is not straightforward > --- > > Key: YARN-10004 > URL: https://issues.apache.org/jira/browse/YARN-10004 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Siddharth Ahuja >Priority: Minor > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10047) Memory consumption of the process tree includes subprocesses, which may make containers exit unexpectedly
[ https://issues.apache.org/jira/browse/YARN-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002521#comment-17002521 ] Wilfred Spiegelenburg commented on YARN-10047: -- Accounting is not done by YARN; it is done by the OS. It sounds like you are saying that the OS is not accounting correctly. From a YARN perspective we consider any process that is forked from the container script to be part of the container. All subprocesses are therefore part of the container, and you must account for that inside the container. If you are referring to some other issue, or I am not understanding the problem you see correctly, then please show it with some logs or screenshots. > Memory consumption of the process tree includes subprocesses, which may make > containers exit unexpectedly > - > > Key: YARN-10047 > URL: https://issues.apache.org/jira/browse/YARN-10047 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > > As shown below, we have a case where the Spark driver executes some scripts. > Sometimes the driver is then killed. > {code:java} > yarn.174410.log.2019-12-17.02:2019-12-17,06:59:14,831 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: > Container > [pid=50529,containerID=container_e917_1576303656075_174957_01_003197] is > running beyond physical memory limits. Current usage: 50.28 GB of 5.25 GB > physical memory used; xxx. Killing container. > {code} > {code:java} > boolean isProcessTreeOverLimit(String containerId, > long currentMemUsage, > long curMemUsageOfAgedProcesses, > long vmemLimit) { > boolean isOverLimit = false; > // ... > if (currentMemUsage > (2 * vmemLimit)) { > LOG.warn("Process tree for container: " + containerId > + " running over twice " + "the configured limit. Limit=" + vmemLimit > + ", current usage = " + currentMemUsage); > isOverLimit = true; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
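To make the accounting point above concrete: the NM's procfs-based monitor sums resident memory over the whole process tree rooted at the container's launch script, so every subprocess counts against the container's limit. A simplified sketch of that walk; the real logic lives in ProcfsBasedProcessTree, and the names here are illustrative:

{code:java}
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Simplified sketch of process-tree RSS accounting; not the real
// ProcfsBasedProcessTree code.
class ProcessTreeSketch {
  static long treeRssBytes(int rootPid,
                           Map<Integer, List<Integer>> childrenOf,
                           Map<Integer, Long> rssOf) {
    long total = 0;
    Deque<Integer> stack = new ArrayDeque<>();
    stack.push(rootPid);
    while (!stack.isEmpty()) {
      int pid = stack.pop();
      total += rssOf.getOrDefault(pid, 0L);  // this process's resident memory
      // Descend into forked subprocesses: they all belong to the container.
      for (int child : childrenOf.getOrDefault(pid, Collections.emptyList())) {
        stack.push(child);
      }
    }
    return total;  // compared against the container's physical memory limit
  }
}
{code}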
[jira] [Commented] (YARN-10053) Placement rules do not use correct group service init
[ https://issues.apache.org/jira/browse/YARN-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002327#comment-17002327 ] Hadoop QA commented on YARN-10053: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 8m 43s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} branch-3.2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 26s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 52s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 29s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} branch-3.2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 39s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 3 new + 447 unchanged - 5 fixed = 450 total (was 452) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 47s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 59s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}145m 39s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 | | | hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:0f25cbbb251 | | JIRA Issue | YARN-10053 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12989382/YARN-10053-branch-3.2.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux c3391c5fb9c4 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-3.2 / 85da5cb | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_232 | | findbugs | v3.1.0-RC1 | | checkstyle |
[jira] [Commented] (YARN-10053) Placement rules do not use correct group service init
[ https://issues.apache.org/jira/browse/YARN-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002322#comment-17002322 ] Hadoop QA commented on YARN-10053: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 56s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} branch-3.1 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 42s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 8s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 25s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} branch-3.1 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 37s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 3 new + 447 unchanged - 6 fixed = 450 total (was 453) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 31s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 70m 26s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}133m 10s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:70a0ef5d4a6 | | JIRA Issue | YARN-10053 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12989380/YARN-10053-branch-3.1.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux fb90fad53870 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-3.1 / 01edc65 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_232 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/25312/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25312/testReport/ | | Max. process+thread count | 777 (vs. ulimit of 5500) | | modules | C:
[jira] [Commented] (YARN-10047) Memory consumption of the process tree includes subprocesses, which may make containers exit unexpectedly
[ https://issues.apache.org/jira/browse/YARN-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002273#comment-17002273 ] zhoukang commented on YARN-10047: - [~wilfreds] Thanks for your reply. In some cases the memory accounting includes subprocesses, which results in incorrect memory usage. > Memory consumption of the process tree includes subprocesses, which may make > containers exit unexpectedly > - > > Key: YARN-10047 > URL: https://issues.apache.org/jira/browse/YARN-10047 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > > As shown below, we have a case where the Spark driver executes some scripts. > Sometimes the driver is then killed. > {code:java} > yarn.174410.log.2019-12-17.02:2019-12-17,06:59:14,831 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: > Container > [pid=50529,containerID=container_e917_1576303656075_174957_01_003197] is > running beyond physical memory limits. Current usage: 50.28 GB of 5.25 GB > physical memory used; xxx. Killing container. > {code} > {code:java} > boolean isProcessTreeOverLimit(String containerId, > long currentMemUsage, > long curMemUsageOfAgedProcesses, > long vmemLimit) { > boolean isOverLimit = false; > // ... > if (currentMemUsage > (2 * vmemLimit)) { > LOG.warn("Process tree for container: " + containerId > + " running over twice " + "the configured limit. Limit=" + vmemLimit > + ", current usage = " + currentMemUsage); > isOverLimit = true; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10047) Memory consumption of the process tree includes subprocesses, which may make containers exit unexpectedly
[ https://issues.apache.org/jira/browse/YARN-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated YARN-10047: Summary: Memory consumption of the process tree includes subprocesses, which may make containers exit unexpectedly (was: The process tree considers the memory consumption of subprocesses, which may make containers exit unexpectedly) > Memory consumption of the process tree includes subprocesses, which may make > containers exit unexpectedly > - > > Key: YARN-10047 > URL: https://issues.apache.org/jira/browse/YARN-10047 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > > As shown below, we have a case where the Spark driver executes some scripts. > Sometimes the driver is then killed. > {code:java} > yarn.174410.log.2019-12-17.02:2019-12-17,06:59:14,831 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: > Container > [pid=50529,containerID=container_e917_1576303656075_174957_01_003197] is > running beyond physical memory limits. Current usage: 50.28 GB of 5.25 GB > physical memory used; xxx. Killing container. > {code} > {code:java} > boolean isProcessTreeOverLimit(String containerId, > long currentMemUsage, > long curMemUsageOfAgedProcesses, > long vmemLimit) { > boolean isOverLimit = false; > // ... > if (currentMemUsage > (2 * vmemLimit)) { > LOG.warn("Process tree for container: " + containerId > + " running over twice " + "the configured limit. Limit=" + vmemLimit > + ", current usage = " + currentMemUsage); > isOverLimit = true; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10047) The process tree considers the memory consumption of subprocesses, which may make containers exit unexpectedly
[ https://issues.apache.org/jira/browse/YARN-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated YARN-10047: Summary: The process tree considers the memory consumption of subprocesses, which may make containers exit unexpectedly (was: container memory monitor may make container exit) > The process tree considers the memory consumption of subprocesses, which may > make containers exit unexpectedly > > > Key: YARN-10047 > URL: https://issues.apache.org/jira/browse/YARN-10047 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > > As shown below, we have a case where the Spark driver executes some scripts. > Sometimes the driver is then killed. > {code:java} > yarn.174410.log.2019-12-17.02:2019-12-17,06:59:14,831 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: > Container > [pid=50529,containerID=container_e917_1576303656075_174957_01_003197] is > running beyond physical memory limits. Current usage: 50.28 GB of 5.25 GB > physical memory used; xxx. Killing container. > {code} > {code:java} > boolean isProcessTreeOverLimit(String containerId, > long currentMemUsage, > long curMemUsageOfAgedProcesses, > long vmemLimit) { > boolean isOverLimit = false; > // ... > if (currentMemUsage > (2 * vmemLimit)) { > LOG.warn("Process tree for container: " + containerId > + " running over twice " + "the configured limit. Limit=" + vmemLimit > + ", current usage = " + currentMemUsage); > isOverLimit = true; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10056) Log service may cause NM full GC since the filesystem is only closed when the app finishes
zhoukang created YARN-10056: --- Summary: Log service may cause NM full GC since the filesystem is only closed when the app finishes Key: YARN-10056 URL: https://issues.apache.org/jira/browse/YARN-10056 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: zhoukang Assignee: zhoukang -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10053) Placement rules do not use correct group service init
[ https://issues.apache.org/jira/browse/YARN-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated YARN-10053: - Attachment: YARN-10053-branch-3.2.001.patch > Placement rules do not use correct group service init > - > > Key: YARN-10053 > URL: https://issues.apache.org/jira/browse/YARN-10053 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.1.3 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: YARN-10053-branch-3.1.001.patch, > YARN-10053-branch-3.1.002.patch, YARN-10053-branch-3.2.001.patch, > YARN-10053.001.patch, YARN-10053.002.patch > > > The placement rules, CS and FS, all create a new group service instead of > using the shared group mapping service. This means that the cache for the > placement rules is not the same as for the ACLs and other parts of the RM service. > It could also cause an issue with the configuration that is passed in to > create the cache: the scheduler config might not have the same values as the > service config and could thus cause issues. This second issue seems to affect > only the CS, not the FS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10053) Placement rules do not use correct group service init
[ https://issues.apache.org/jira/browse/YARN-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated YARN-10053: - Attachment: YARN-10053-branch-3.1.002.patch > Placement rules do not use correct group service init > - > > Key: YARN-10053 > URL: https://issues.apache.org/jira/browse/YARN-10053 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.1.3 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: YARN-10053-branch-3.1.001.patch, > YARN-10053-branch-3.1.002.patch, YARN-10053.001.patch, YARN-10053.002.patch > > > The placement rules, CS and FS, all create a new group service instead of > using the shared group mapping service. This means that the cache used by the > placement rules is not the same as the one used for the ACL checks and other parts of the RM service. > It could also cause a problem with the configuration that is passed in to > create the cache: the scheduler config might not have the same values as the > service config. This second issue seems to affect only the CS, not the FS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10053) Placement rules do not use correct group service init
[ https://issues.apache.org/jira/browse/YARN-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002230#comment-17002230 ] Wilfred Spiegelenburg commented on YARN-10053: -- Similar update for the 3.1 branch; it now also needs a 3.2 branch change. > Placement rules do not use correct group service init > - > > Key: YARN-10053 > URL: https://issues.apache.org/jira/browse/YARN-10053 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.1.3 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: YARN-10053-branch-3.1.001.patch, > YARN-10053-branch-3.1.002.patch, YARN-10053.001.patch, YARN-10053.002.patch > > > The placement rules, CS and FS, all create a new group service instead of > using the shared group mapping service. This means that the cache used by the > placement rules is not the same as the one used for the ACL checks and other parts of the RM service. > It could also cause a problem with the configuration that is passed in to > create the cache: the scheduler config might not have the same values as the > service config. This second issue seems to affect only the CS, not the FS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
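To illustrate the distinction the description draws, here is a minimal sketch contrasting a privately constructed group service with the shared one. Groups, getUserToGroupsMappingService, and getGroups are real Hadoop APIs; the wrapper class and method names are hypothetical, and this is a sketch of the two patterns, not the patch itself.

{code:java}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.Groups;

// Sketch of the two initialization patterns. A private Groups instance has
// its own cache, seeded from whatever Configuration it is given (e.g. the
// scheduler config); the static factory returns the JVM-wide service that
// the ACL checks and the rest of the RM already share.
public class GroupLookupSketch {

  // Problematic pattern: each placement rule builds its own group service.
  static List<String> viaPrivateService(Configuration schedulerConf,
      String user) throws IOException {
    Groups privateGroups = new Groups(schedulerConf); // separate cache
    return privateGroups.getGroups(user);
  }

  // Shared pattern: one mapping service and one cache per process.
  static List<String> viaSharedService(Configuration serviceConf,
      String user) throws IOException {
    Groups shared = Groups.getUserToGroupsMappingService(serviceConf);
    return shared.getGroups(user);
  }
}
{code}

With the private instance, a group refresh or cache setting applied to the shared service never reaches the placement rules, which is exactly the inconsistency the description calls out.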
[jira] [Updated] (YARN-10055) bower install fails
[ https://issues.apache.org/jira/browse/YARN-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-10055: - Priority: Blocker (was: Major) > bower install fails > --- > > Key: YARN-10055 > URL: https://issues.apache.org/jira/browse/YARN-10055 > Project: Hadoop YARN > Issue Type: Bug > Components: build, yarn-ui-v2 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Blocker > > bower install is failing. > {noformat} > bower ENOTFOUND Package abdmob/x2js=abdmob/x2js not found > {noformat} > I ran the following commands: > {noformat} > $ ./start-build-env.sh > $ cd hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp > $ bower install > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10055) bower install fails
[ https://issues.apache.org/jira/browse/YARN-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002200#comment-17002200 ] Hudson commented on YARN-10055: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17788 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17788/]) YARN-10055. bower install fails. (#1778) (GitHub: rev 34ff7dbaf53cbbebcb163980f4b45221630c2882) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/ember-cli-build.js * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/bower.json > bower install fails > --- > > Key: YARN-10055 > URL: https://issues.apache.org/jira/browse/YARN-10055 > Project: Hadoop YARN > Issue Type: Bug > Components: build, yarn-ui-v2 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > > bower install is failing. > {noformat} > bower ENOTFOUND Package abdmob/x2js=abdmob/x2js not found > {noformat} > I ran the following commands: > {noformat} > $ ./start-build-env.sh > $ cd hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp > $ bower install > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10055) bower install fails
[ https://issues.apache.org/jira/browse/YARN-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002188#comment-17002188 ] Akira Ajisaka commented on YARN-10055: -- Merged the PR into trunk. > bower install fails > --- > > Key: YARN-10055 > URL: https://issues.apache.org/jira/browse/YARN-10055 > Project: Hadoop YARN > Issue Type: Bug > Components: build, yarn-ui-v2 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > > bower install is failing. > {noformat} > bower ENOTFOUND Package abdmob/x2js=abdmob/x2js not found > {noformat} > I ran the following commands: > {noformat} > $ ./start-build-env.sh > $ cd hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp > $ bower install > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10053) Placement rules do not use correct group service init
[ https://issues.apache.org/jira/browse/YARN-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002186#comment-17002186 ] Hadoop QA commented on YARN-10053:
--
| (/) *{color:green}+1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 29s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 237 unchanged - 1 fixed = 237 total (was 238) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 53s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 95m 12s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}156m 43s{color} | {color:black} {color} |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-10053 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12989372/YARN-10053.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux d87e0b77deba 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c44943d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25311/testReport/ |
| Max. process+thread count | 817 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25311/console |
| Powered by