[jira] [Commented] (YARN-7037) Optimize data transfer with zero-copy approach for containerlogs REST API in NMWebServices
[ https://issues.apache.org/jira/browse/YARN-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146470#comment-16146470 ]

Tao Yang commented on YARN-7037:

Thanks [~djp] for the review and commit!

> Optimize data transfer with zero-copy approach for containerlogs REST API in NMWebServices
>
> Key: YARN-7037
> URL: https://issues.apache.org/jira/browse/YARN-7037
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager
> Affects Versions: 2.8.0
> Reporter: Tao Yang
> Assignee: Tao Yang
> Fix For: 2.9.0, 3.0.0-beta1, 2.8.3
>
> Attachments: YARN-7037.001.patch, YARN-7037.branch-2.8.001.patch
>
> Split this improvement from YARN-6259.
> It's useful to read container logs more efficiently. With a zero-copy approach,
> the data transfer pipeline (disk --> read buffer --> NM buffer --> socket buffer)
> can be shortened to (disk --> read buffer --> socket buffer).
> In my local test, the time to copy a 256MB file was reduced from
> 12 seconds to 2.5 seconds with zero-copy.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
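The pipeline change described in the issue can be illustrated with a minimal Java sketch. This is not the code from YARN-7037.001.patch: the class and method names (ZeroCopySketch, copyThroughBuffer, copyWithTransferTo) are hypothetical and chosen only for illustration. The first method stages every chunk in a user-space byte array, which plays the role of the "NM buffer"; the second hands the transfer to FileChannel.transferTo, which lets the operating system move the bytes directly when the target channel supports it (for example a socket channel). The in-memory sink used in main does not support that, so the example shows the API shape rather than the speedup.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; not the Hadoop implementation.
final class ZeroCopySketch {

  /** Conventional copy: each chunk passes through a user-space byte[] buffer. */
  static long copyThroughBuffer(Path file, OutputStream out) throws IOException {
    byte[] buf = new byte[64 * 1024];
    long total = 0;
    try (InputStream in = Files.newInputStream(file)) {
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
        total += n;
      }
    }
    return total;
  }

  /** Channel-based copy: transferTo lets the OS skip the user-space buffer when it can. */
  static long copyWithTransferTo(Path file, WritableByteChannel target) throws IOException {
    try (FileChannel src = FileChannel.open(file, StandardOpenOption.READ)) {
      long size = src.size();
      long pos = 0;
      // transferTo may move fewer bytes than requested, so loop until done.
      while (pos < size) {
        long sent = src.transferTo(pos, size - pos, target);
        if (sent <= 0) {
          break; // nothing more could be transferred; avoid spinning
        }
        pos += sent;
      }
      return pos;
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("container", ".log");
    try {
      Files.write(tmp, new byte[1 << 20]); // 1 MiB of zeros as stand-in log data
      ByteArrayOutputStream sink = new ByteArrayOutputStream();
      long buffered = copyThroughBuffer(tmp, sink);
      sink.reset();
      long transferred = copyWithTransferTo(tmp, Channels.newChannel(sink));
      System.out.println("buffered=" + buffered + " bytes, transferTo=" + transferred + " bytes");
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

Both methods deliver the same bytes; any difference in elapsed time depends on the target channel and the operating system, so the 12 s versus 2.5 s figures quoted in the issue should not be expected from this sketch.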
[jira] [Commented] (YARN-7037) Optimize data transfer with zero-copy approach for containerlogs REST API in NMWebServices
[ https://issues.apache.org/jira/browse/YARN-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146291#comment-16146291 ]

Hudson commented on YARN-7037:

ABORTED: Integrated in Jenkins build Hadoop-trunk-Commit #12264 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/12264/])
YARN-7037. Optimize data transfer with zero-copy approach for (junping_du: rev ad45d19998c1b0da25754d0016854046731fa623)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogToolUtils.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMWebServices.java
[jira] [Commented] (YARN-7037) Optimize data transfer with zero-copy approach for containerlogs REST API in NMWebServices
[ https://issues.apache.org/jira/browse/YARN-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144398#comment-16144398 ]

Junping Du commented on YARN-7037:

bq. LogToolUtils#outputContainerLog was used for both local log which can be optimized by FileInputStream and aggregated log which can't because it's transferred by DataInputStream from remote.

I see. That makes sense to me. +1 on the latest patch. I will commit it tomorrow if there are no further comments from others.
[jira] [Commented] (YARN-7037) Optimize data transfer with zero-copy approach for containerlogs REST API in NMWebServices
[ https://issues.apache.org/jira/browse/YARN-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141364#comment-16141364 ]

Tao Yang commented on YARN-7037:

Thanks [~djp] for looking into the issue.
I chose to add a new method because this optimization cannot cover all use cases: zero-copy is only suitable for local reads. LogToolUtils#outputContainerLog is used for both local logs, which can be optimized through FileInputStream, and aggregated logs, which can't because they are transferred through a DataInputStream from a remote source.
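The distinction drawn in this comment, a channel-backed path for local files and a buffered path for everything else, might be sketched as below. This is an assumption-laden illustration rather than the patch itself: LogOutputDispatchSketch and send are hypothetical names, and the returned label ("channel" or "buffer") exists only to make the chosen path visible. A FileInputStream exposes a FileChannel through getChannel(), whereas a DataInputStream wrapping remote data has no such channel and has to be drained through a buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical illustration; not the Hadoop implementation.
final class LogOutputDispatchSketch {

  /** Copies {@code in} to {@code out} and reports which path was taken. */
  static String send(InputStream in, OutputStream out) throws IOException {
    if (in instanceof FileInputStream) {
      // Local file: use the stream's channel so the OS can bypass the user-space buffer.
      FileChannel src = ((FileInputStream) in).getChannel();
      WritableByteChannel target = Channels.newChannel(out);
      long pos = src.position();
      long end = src.size();
      while (pos < end) {
        long sent = src.transferTo(pos, end - pos, target);
        if (sent <= 0) {
          break; // nothing more could be transferred
        }
        pos += sent;
      }
      return "channel";
    }
    // Any other stream (for example remote aggregated-log data): plain buffered copy.
    byte[] buf = new byte[64 * 1024];
    int n;
    while ((n = in.read(buf)) != -1) {
      out.write(buf, 0, n);
    }
    return "buffer";
  }

  public static void main(String[] args) throws IOException {
    byte[] remote = "aggregated log bytes".getBytes("UTF-8");
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    // A DataInputStream over in-memory bytes stands in for a remote stream here.
    String path = send(new DataInputStream(new ByteArrayInputStream(remote)), sink);
    System.out.println(path + " path copied " + sink.size() + " bytes");
  }
}
```

Keeping the two paths in one entry point is only one possible design; the thread itself settles on a separate method for the local case.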
[jira] [Commented] (YARN-7037) Optimize data transfer with zero-copy approach for containerlogs REST API in NMWebServices
[ https://issues.apache.org/jira/browse/YARN-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139197#comment-16139197 ]

Hadoop QA commented on YARN-7037:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 10s | Maven dependency ordering for branch |
| +1 | mvninstall | 13m 22s | trunk passed |
| +1 | compile | 9m 47s | trunk passed |
| +1 | checkstyle | 0m 50s | trunk passed |
| +1 | mvnsite | 1m 7s | trunk passed |
| -1 | findbugs | 0m 48s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 0m 58s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 55s | the patch passed |
| +1 | compile | 5m 54s | the patch passed |
| +1 | javac | 5m 54s | the patch passed |
| -0 | checkstyle | 0m 51s | hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 8 unchanged - 0 fixed = 9 total (was 8) |
| +1 | mvnsite | 1m 10s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 2m 11s | the patch passed |
| +1 | javadoc | 0m 56s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 29s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 13m 44s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 64m 59s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-7037 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12883301/YARN-7037.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 9d4e9588a6f4 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7e6463d |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-YARN-Build/17098/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/17098/artifact
[jira] [Commented] (YARN-7037) Optimize data transfer with zero-copy approach for containerlogs REST API in NMWebServices
[ https://issues.apache.org/jira/browse/YARN-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139117#comment-16139117 ]

Junping Du commented on YARN-7037:

Thanks [~Tao Yang] for delivering the patch. The approach here (bypassing the unnecessary buffer) makes sense to me.
For the trunk patch, instead of adding a new method, {{outputContainerLogThroughZeroCopy}}, maybe we should replace the original method, {{outputContainerLog}}, directly, so that other callers such as outputAggregatedContainerLog() can benefit from the performance improvement as well?