[ https://issues.apache.org/jira/browse/YARN-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900285#comment-16900285 ]

Hudson commented on YARN-9667:
------------------------------

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17042 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17042/])
YARN-9667. Use setbuf with line buffer to reduce fflush complexity in (eyang: rev d6697da5e854355ac3718a85006b73315d0702aa)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/util.h
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/util.c
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/modules/devices/devices-module.c


> Container-executor.c duplicates messages to stdout
> --------------------------------------------------
>
>                 Key: YARN-9667
>                 URL: https://issues.apache.org/jira/browse/YARN-9667
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager, yarn
>    Affects Versions: 3.2.0
>            Reporter: Adam Antal
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-9667-001.patch
>
>
> When a container is killed by its AM we get an error message similar to this:
> {noformat}
> 2019-06-30 12:09:04,412 WARN
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
> Shell execution returned exit code: 143. Privileged Execution Operation
> Stderr:
> Stdout: main : command provided 1
> main : run as user is systest
> main : requested yarn user is systest
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file
> /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_000019/container_e84_1561921629886_0001_01_000019.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> {noformat}
> In container-executor.c the fork point is right after the "Creating script
> paths..." part, yet in the Stdout log we can clearly see that it has been
> written twice. After consulting with [~pbacsko], it seems that a flush is
> missing in container-executor.c before the fork, and that causes the
> duplication.
> I suggest adding a flush there so that the output is not duplicated: it is
> misleading that the child process appears to write "Getting exit code file"
> and "Creating script paths" even though it is clearly not doing that.
> A more appealing solution could be to revisit the fprintf-fflush pairs in the
> code and change them to a single call, so that the fflush calls cannot be
> forgotten accidentally. (A missing fflush can cause problems in every place
> where the pair is used.)
> Note: this issue probably affects every call to fork(), not just the one
> from {{launch_container_as_user}} in {{main.c}}.
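
For readers who want to reproduce the effect described above, here is a minimal sketch of it. It is not the committed patch. The function `copies_after_fork` and the `buffer_mode` enum are hypothetical names made up for this illustration, and the sketch writes to a temporary file instead of stdout so that the result does not depend on whether stdout is a terminal. It assumes a POSIX system. The line-buffered case uses `setvbuf` with `_IOLBF`, which is an assumption about what "setbuf with line buffer" in the commit title refers to.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical modes for this illustration only. */
enum buffer_mode { UNFLUSHED, FLUSH_BEFORE_FORK, LINE_BUFFERED };

/* Writes one message to a fresh temp file, forks, and returns how many
 * copies of the message end up in the file (-1 on error). */
int copies_after_fork(enum buffer_mode mode) {
  FILE *log = tmpfile();  /* regular file, so fully buffered by default */
  if (log == NULL) return -1;
  if (mode == LINE_BUFFERED) setvbuf(log, NULL, _IOLBF, BUFSIZ);

  fprintf(log, "Creating script paths...\n");
  if (mode == FLUSH_BEFORE_FORK) fflush(log);

  pid_t pid = fork();
  if (pid < 0) {
    fclose(log);
    return -1;
  }
  if (pid == 0) {
    /* Child: it inherited a copy of the stdio buffer. Flushing it writes
     * any text the parent had not flushed before the fork. */
    fflush(log);
    _exit(0);
  }
  waitpid(pid, NULL, 0);

  /* Parent: flush its own copy of the buffer, then count the lines. */
  fflush(log);
  rewind(log);
  int copies = 0;
  char line[128];
  while (fgets(line, sizeof line, log) != NULL) copies++;
  fclose(log);
  return copies;
}
```

Calling `copies_after_fork(UNFLUSHED)` should report two copies of the message, because parent and child each flush the same buffered text, while the other two modes should report one. Line buffering removes the need to remember an `fflush` after every `fprintf`, as long as each message ends with a newline.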