[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127070#comment-15127070 ] Hudson commented on HDFS-9494: -- FAILURE: Integrated in Hadoop-trunk-Commit #9217 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9217/]) HDFS-9494. Parallel optimization of (jing9: rev e30ce01ddce1cfd1e9d49c4784eb4a6bc87e36ca) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSStripedOutputStream.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch, HDFS-9494-origin-trunk.08.patch, > HDFS-9494-origin-trunk.08_testing.patch, HDFS-9494.08.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125975#comment-15125975 ] Hadoop QA commented on HDFS-9494: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 33s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 4s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 8s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 38s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 20s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 37s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 56s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s {color} | {color:green} hadoop-hdfs-project: patch generated 0 new + 30 unchanged - 1 fixed = 30 total (was 31) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 4s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 54s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 13s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 0s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 14s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 121m 17s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 3s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 262m 20s {color} | {color:black} {color} | \\
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125764#comment-15125764 ] Jing Zhao commented on HDFS-9494: - Yeah, you're right. Thanks for pointing this out, Nicholas! > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch, HDFS-9494-origin-trunk.08.patch, > HDFS-9494-origin-trunk.08_testing.patch, HDFS-9494.08.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125753#comment-15125753 ] Hadoop QA commented on HDFS-9494: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 42s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 46s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} hadoop-hdfs-project/hadoop-hdfs-client: patch generated 0 new + 30 unchanged - 1 fixed = 30 total (was 31) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 33s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 23s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 33m 40s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL |
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125588#comment-15125588 ] GAO Rui commented on HDFS-9494: --- Thank you all very much for your patient comments and kind help, [~jingzhao],[~szetszwo],[~rakeshr]. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch, HDFS-9494-origin-trunk.08.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125737#comment-15125737 ] Tsz Wo Nicholas Sze commented on HDFS-9494: --- > Looks like all the previous jenkins run only ran limited number of tests. ... The reason is that the patch only changes the files under hadoop-hdfs-client. If we want to run all the hdfs tests, we could change some code in hadoop-hdfs. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch, HDFS-9494-origin-trunk.08.patch, > HDFS-9494.08.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125222#comment-15125222 ] Rakesh R commented on HDFS-9494: +1. Thanks [~demongaorui] > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch, HDFS-9494-origin-trunk.08.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124011#comment-15124011 ] Jing Zhao commented on HDFS-9494: - The 08 patch looks good to me. +1 [~szetszwo], [~rakeshr], further comments? > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch, HDFS-9494-origin-trunk.08.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120515#comment-15120515 ] GAO Rui commented on HDFS-9494: --- Thanks for your comments [~jingzhao]. bq. In DFSStripedOutputStream, we do not need to move all the data fields to the beginning of the class. I move them to the beginning of the class body cause when doing the fields related coding, it had been a little trouble to find and edit(adding) fields to {{DFSStripedOutputStream}}. Would it be easier for future coding if we move the fields to the beginning? bq. flushAllExecutor.shutdownNow() can be called in the finally section bq. flushAllFuturesMap does not need to be a class level field. It can be a temporary variable defined and used in flushAllInternals. That makes a lot of sense. I will move {{flushAllExecutor.shutdownNow()}} to finally section and make {{flushAllFuturesMap}} an local variable in the next patch. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120540#comment-15120540 ] GAO Rui commented on HDFS-9494: --- BTW. will it reduce the resource cost to make {{flushAllExecutorCompletionService}} as a class level field, or it's better to make it as local variable in {{flushAllInternals()}} to limit unnecessary exposure as well as {{flushAllFuturesMap}}? > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120338#comment-15120338 ] Jing Zhao commented on HDFS-9494: - Thanks [~demongaorui]. The 07 patch looks good to me. Some minor comments: # In {{DFSStripedOutputStream}}, we do not need to move all the data fields to the beginning of the class. # {{flushAllExecutor.shutdownNow()}} can be called in the finally section. # flushAllFuturesMap does not need to be a class level field. It can be a temporary variable defined and used in {{flushAllInternals}}. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120633#comment-15120633 ] Jing Zhao commented on HDFS-9494: - I think we can still keep flushAllExecutorCompletionService as a class level field. In this way we can avoid shutting down the thread pool at the end of each flush and reuse threads. For data fields order, I remember we should place static fields/class definitions before non-static class fields. Also that change is not related to this jira. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120834#comment-15120834 ] GAO Rui commented on HDFS-9494: --- Oh, I see. I am agree with you. I have upload an new 08 patch to address previous mentioned issues. Thank you again for your detailed comments! > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch, HDFS-9494-origin-trunk.08.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120863#comment-15120863 ] Hadoop QA commented on HDFS-9494: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 39s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} hadoop-hdfs-project/hadoop-hdfs-client: patch generated 0 new + 30 unchanged - 1 fixed = 30 total (was 31) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 22s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 11s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 29m 17s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12784831/HDFS-9494-origin-trunk.08.patch | | JIRA Issue | HDFS-9494 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 3b591ef4c8d1 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118897#comment-15118897 ] Hadoop QA commented on HDFS-9494: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 52s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s {color} | {color:green} hadoop-hdfs-project/hadoop-hdfs-client: patch generated 0 new + 30 unchanged - 1 fixed = 30 total (was 31) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 54s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 53s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 21m 22s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12784589/HDFS-9494-origin-trunk.07.patch | | JIRA Issue | HDFS-9494 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 40dc27c2992b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117849#comment-15117849 ] Jing Zhao commented on HDFS-9494: - Thanks for updating the patch, [~demongaorui]! The latest patch looks good to me. Some minors: # We do not need to create a thread pool instance for each {{flushAllInternals}} call. It will be better if we can reuse. # In {{handleCurrentStreamerFailure}}, let's still set {{currentPacket}} to null. # should be "new ConcurrentHashMap<>()" {code} +final MapstreamersExceptionMap = new +ConcurrentHashMap(); {code} # We do not need to remove the entry from the temporary map here. {code} + iterator.remove(); +} +executor.shutdownNow(); {code} # The exception thrown by {{waitForAckedSeqno}} can be caught as {{ExecutionException}} while calling {{Future#get}}. Thus I think we do not need to use a concurrent hashmap to catch the exceptions. Instead, we can create a map to track the mapping between streamers and corresponding futures. In this way we can remove both {{healthyStreamerCount}} and {{streamersExceptionMap}}. Please see {{DFSStripedInputStream#readStripe}} as an example. bq. After checking the related codes, it seems that we haven't set a timeout for waitForAckedSeqno(). Maybe we could consider to set a timeout for it in another new Jira. It will be very hard to set this timeout value. Let's still depend on the read/write timeout set on the connection instead of adding a timeout for {{waitForAckedSeqno}}. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118504#comment-15118504 ] GAO Rui commented on HDFS-9494: --- Thank you very much for your great comment, [~jingzhao]. I have update a new 06 patch to address the comments. Please let me know if you have further advice. Thank you. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118539#comment-15118539 ] Hadoop QA commented on HDFS-9494: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 3m 58s {color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.8.0_66 with JDK v1.8.0_66 generated 1 new + 13 unchanged - 1 fixed = 14 total (was 14) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 4m 26s {color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.7.0_91 with JDK v1.7.0_91 generated 1 new + 13 unchanged - 1 fixed = 14 total (was 14) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client: patch generated 1 new + 30 unchanged - 1 fixed = 31 total (was 31) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 56s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 23m 3s {color} | {color:black}
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118559#comment-15118559 ] GAO Rui commented on HDFS-9494: --- For the javac and checkstyle issue, I would like to fix them as: {code} // Size of each striping cell, must be a multiple of bytesPerChecksum. {code} and {code} Future future = null; {code} > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116748#comment-15116748 ] GAO Rui commented on HDFS-9494: --- [~szetszwo],[~rakeshr], thanks a lot for your advice. For {{executor.shutdownNow()}}, if not all the tasks had been completed, we might not arrive to the end of {{flushAllInternals()}}. But there is no harmless to ensure executor shutdown, so I add {{executor.shutdownNow()}} as well. After checking the related codes, it seems that we haven't set a timeout for {{waitForAckedSeqno()}}. Maybe we could consider to set a timeout for it in another new Jira. I have updated the 05 patch. Could you kindly review it? Thank you very much. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116767#comment-15116767 ] Hadoop QA commented on HDFS-9494: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 5m 18s {color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.8.0_66 with JDK v1.8.0_66 generated 1 new + 13 unchanged - 1 fixed = 14 total (was 14) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 5m 56s {color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.7.0_91 with JDK v1.7.0_91 generated 1 new + 13 unchanged - 1 fixed = 14 total (was 14) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 19s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 37s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 33m 5s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115745#comment-15115745 ] Rakesh R commented on HDFS-9494: Thanks [~demongaorui] for the patch. I've a minor comment, please consider this also when preparing next patch. For every flushAllInternals(), it is creating {{ExecutorService executor = Executors.newFixedThreadPool(numAllBlocks);}}. Please do {{executor.shutdownNow();}} at the end of flushAllInternals() function. Otw there could be a chance of unnecessary {{Thread (pool-1-thread-1) (Running)}} reference leaving, right? > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115531#comment-15115531 ] Tsz Wo Nicholas Sze commented on HDFS-9494: --- In the last for-loop, the finally-block will be executed multiple times (healthyStreamerCount). It may not be intended. I think it is better to wait until all tasks have been completed. Then, process the exceptions if the map is non-empty. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114789#comment-15114789 ] GAO Rui commented on HDFS-9494: --- Hi [~szetszwo], Could you share your opinions about the last patch? Thank you very much. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067772#comment-15067772 ] GAO Rui commented on HDFS-9494: --- Hi [~szetszwo], I agree processing failures sequentially is more safer than trying to confirm {{checkStreamers()}}'s thread safety. I have updated a new patch. Could you review it, again? I use a {{ConcurrentHashMap}} to keep both the streamer and the related exception of this streamer inside of {{call()}}, so that we could provide more specific informations to {{handleStreamerFailure()}} later. For the handling codes: {code} Iterator> iterator = streamersExceptionMap.entrySet().iterator(); while (iterator.hasNext()) { Map.Entry entry = iterator.next(); StripedDataStreamer s = entry.getKey(); Exception e = entry.getValue(); handleStreamerFailure("flushInternal " + s, e, s); iterator.remove(); } {code} I think maybe we could move these codes to the position after this the for loop, so codes will become look like: {code} for (int i = 0; i < healthyStreamerCount; i++) { try { executorCompletionService.take().get(); } catch (InterruptedException ie) { throw DFSUtilClient.toInterruptedIOException( "Interrupted during waiting all streamer flush, ", ie); } catch (ExecutionException ee) { LOG.warn( "Caught ExecutionException while waiting all streamer flush, ", ee); } } Iterator > iterator = streamersExceptionMap.entrySet().iterator(); while (iterator.hasNext()) { Map.Entry entry = iterator.next(); StripedDataStreamer s = entry.getKey(); Exception e = entry.getValue(); handleStreamerFailure("flushInternal " + s, e, s); iterator.remove(); } {code} But for safe consideration, in the new 04 patch, I put these handling codes inside the for loop, as finally sub case. Could you share your opinions? Thank you. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067792#comment-15067792 ] Hadoop QA commented on HDFS-9494: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 10s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 5m 6s {color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.8.0_66 with JDK v1.8.0_66 generated 1 new issues (was 14, now 14). {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 5m 45s {color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.7.0_91 with JDK v1.7.0_91 generated 1 new issues (was 14, now 14). {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 21s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 11s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 30m 4s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL |
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067213#comment-15067213 ] Tsz Wo Nicholas Sze commented on HDFS-9494: --- Sure, we may keep the log message. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067217#comment-15067217 ] Tsz Wo Nicholas Sze commented on HDFS-9494: --- Actually, checkStreamers is not thread safe so that the current would have a race condition. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067363#comment-15067363 ] GAO Rui commented on HDFS-9494: --- Thanks Nicholas. I have considered {{checkStreamers()}} issue before too. Cause in {{checkStreamers()}}, we just read {{streamers}} and {{failedStreamers}}, no operation to change them. So, even different thread call {{checkStreamers()}}, there may not have a conflict I think. For race condition, could you share more details? > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066214#comment-15066214 ] GAO Rui commented on HDFS-9494: --- [~szetszwo], great catch! Sorry, I ignored that handleStreamerFailure itself might throw an IOException. I will upload a new patch to address this soon. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067615#comment-15067615 ] Tsz Wo Nicholas Sze commented on HDFS-9494: --- > ... read streamers and failedStreamers ... Since execution tasks do not have the lock, another thread may change the value of them. I think it is a problem. I suggest that we do not catch exception in call(). Let the exception, if there is any, propagates. We catch ExecutionException in flushAllInternals(). Then, there is no race condition. I believe it is very hard to handle multiple failures in parallel. We are better to call waitForAckedSeqno(..) in parallel but process failures sequentially. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066224#comment-15066224 ] GAO Rui commented on HDFS-9494: --- Additionally, I think we might encounter other thread related ExecutionException as well, right? So, I suggest to keep printing a warning while also re-throw the ExecutionException as an IOException. {code} } catch (ExecutionException ee) { LOG.warn( "Caught ExecutionException while waiting all streamer flush, ", ee); throw new IOException(ee); } {code} Do you think the above codes is OK? > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066196#comment-15066196 ] Tsz Wo Nicholas Sze commented on HDFS-9494: --- [~rakeshr], thanks for catching the bug. For the latest patch, we should re-throw the ExecutionException as an IOException instead of printing a warning. handleStreamerFailure may throw an exception when the failure is intolerable, i.e. when the number of remaining streamers is less than the number of data blocks. {code} + } catch (ExecutionException ee) { +LOG.warn( +"Caught ExecutionException while waiting all streamer flush, ", ee); + } {code} > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066009#comment-15066009 ] GAO Rui commented on HDFS-9494: --- [~rakeshr], thank you very much for your kind review and advice. Let's wait for [~szetszwo] and [~jingzhao]'s comment to see if it's OK to commit the latest patch :) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059672#comment-15059672 ] Hadoop QA commented on HDFS-9494: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 56s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 6s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 22m 43s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12777949/HDFS-9494-origin-trunk.03.patch | | JIRA Issue | HDFS-9494 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 096f88789b74 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059789#comment-15059789 ] Rakesh R commented on HDFS-9494: Thanks [~demongaorui] for the good work. Latest patch looks nice to me. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15058325#comment-15058325 ] Rakesh R commented on HDFS-9494: Thanks [~demongaorui]. Overall the latest patch looks good apart from one comment. I'm adding a thought about the new exception handling logic, sorry for not identifying this point in my previous review. With the proposed changes, it looks like not catching per streamer failure and handling that streamer. Instead it is throwing InterruptedIOException OR just warning. Is that intentionally implemented? {code} + try { +executorCompletionService.take().get(); + } catch (InterruptedException ie) { +throw DFSUtilClient.toInterruptedIOException( +"Interrupted during waiting all streamer flush, ", ie); + } catch (ExecutionException ee) { +LOG.warn( +"Caught ExecutionException while waiting all streamer flush, ", ee); + } {code} Existing logic: 1# Iterate each streamer. 2# If there is a failure in the current streamer then {{handleStreamerFailure("flushInternal " + s, e);}} New logic: 1# iterate each streamer 2# flush it and not waiting for the ack 3# Wait for any streamer completion. If one streamer failed then throwing IOException back to the caller OR just warn then continue. Here in 3# step, I think it would be good if we can handle each streamer exception separately. For this, one draft idea that comes in mind is to move the exception handling logic inside {{Callable#call()}} rather than throwing the IOException of this function {{s.waitForAckedSeqno(toWaitFor);}}. Also, you need to refactor {{handleStreamerFailure()}} function by passing extra parameter of {{StripedDataStreamer s}} like, {code} public Void call() throws Exception { try { s.waitForAckedSeqno(toWaitFor); catch(IOException ioe){ handleStreamerFailure(s, "flushInternal ", ioe); } return null; } {code} > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15058657#comment-15058657 ] Jing Zhao commented on HDFS-9494: - I agree with [~rakeshr]'s comment. Exceptions can happen in streamers after queueing the packet and while sending the remaining packets out. The exception can be thrown by {{waitForAckedSeqno}}. We should correctly handle these exceptions instead of only logging a warning msg. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057763#comment-15057763 ] Hadoop QA commented on HDFS-9494: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 0s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 23m 43s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12777682/HDFS-9494-origin-trunk.02.patch | | JIRA Issue | HDFS-9494 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux da0c9ad7d29a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059388#comment-15059388 ] GAO Rui commented on HDFS-9494: --- [~szetszwo], [~rakeshr], and [~jingzhao], thank you very much for your comments. I agree that refactor {{handleStreamerFailure()}} and handle exception inside {{call()}}, also make {{handleStreamerFailure()}} synchronized. I think in the previous 02 patch: {code} for (int i = 0; i < healthyStreamerCount; i++) { try { executorCompletionService.take().get(); } catch (InterruptedException ie) { throw DFSUtilClient.toInterruptedIOException( "Interrupted during waiting all streamer flush, ", ie); } catch (ExecutionException ee) { LOG.warn( "Caught ExecutionException while waiting all streamer flush, ", ee); } } {code} Exception which might be thrown by {{waitForAckedSeqno}} was not caught. I will draft a new patch soon. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15058060#comment-15058060 ] Tsz Wo Nicholas Sze commented on HDFS-9494: --- +1 the new patch looks good. (Need to board a flight now. Will commit later.) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059643#comment-15059643 ] GAO Rui commented on HDFS-9494: --- [~szetszwo], [~rakeshr] and [~jingzhao], 03 patch was attached. Streamer related exception is handled inside {{call()}}. ExecutorCompletionService related exceptions is handled as the same way with previous 02 patch. Please review the new patch and feel free to give further comments when you are available. Thank you very much. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057377#comment-15057377 ] GAO Rui commented on HDFS-9494: --- [~rakeshr], thank you very much for your detailed comments! I have addressed 1,3,4 in previous comment. For the second one, I drafted the following code: {code} for (int i = 0; i < healthyStreamerCount; i++) { try { executorCompletionService.take().get(); } catch (InterruptedException ie) { throw DFSUtilClient.toInterruptedIOException( "Interrupted during waiting all streamer flush. ", ie); } catch (ExecutionException ee) { LOG.warn("Caught ExecutionException during waiting all streamer " + "flush", ee); } } {code} I think it should be enough for handling {{ExecutionException}}. What do you think? > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056345#comment-15056345 ] Rakesh R commented on HDFS-9494: Thanks [~demongaorui], nice improvement. I've few comments, please see: # To be on the safe side, please do incr {{healthyStreamerCount++;}} after successfully submitting {{executorCompletionService.submit()}}. # The new logic is masking the execution exception of {{s.waitForAckedSeqno(toWaitFor);}}. Please catch ExecutionException. Probably you can use like, {code} try { Future f = executorCompletionService.take(); f.get(); } catch (InterruptedException ie) { // } catch (ExecutionException e) { // } {code} # For javac warning, please add {{}} as follows: {code} CompletionService executorCompletionService = new ExecutorCompletionService<>(executor); {code} # Nit: Please correct formatting. Give one space in between the operators like, {code} int healthyStreamerCount = 0; final long toWaitFor = flushInternalWithoutWaitingAck(); // // for (int i = 0; i < healthyStreamerCount; i++) {code} > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055603#comment-15055603 ] Hadoop QA commented on HDFS-9494: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 37s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 55s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 33s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 9m 11s {color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.8.0_66 with JDK v1.8.0_66 generated 1 new issues (was 14, now 14). {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 10m 17s {color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.7.0_91 with JDK v1.7.0_91 generated 1 new issues (was 14, now 14). {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 5s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 25s {color} | {color:red} Patch generated 3 new checkstyle issues in hadoop-hdfs-project/hadoop-hdfs-client (total was 31, now 34). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 14s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 43s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 58s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 58m 47s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055659#comment-15055659 ] GAO Rui commented on HDFS-9494: --- I have fixed the checkstyle issues found by HADOOP QA. But haven't get a idea about the javac issue. I'll upload the next patch which will not causing checkstyle issues along with addressing further commends. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1502#comment-1502 ] GAO Rui commented on HDFS-9494: --- Thank you very much [~szetszwo]. I have refactored the code and using {{ExecutorCompletionService.take()}} to waiting for all the streamer flushing. Because, {{ExecutorCompletionService}} could not tell us how many Future objects are still there waiting, so I use {{healthyStreamerCount}} to count how many times we should call take(). And {{Runnable.run()}} can not throw Exception, so I use {{Callable.call() throws Exception}} and return null. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052561#comment-15052561 ] GAO Rui commented on HDFS-9494: --- Hi [~szetszwo],[~zhz],[~walter.k.su],[~xiaochen], I have attached a patch to implement this optimization. Could you please review it. Any comments are welcome! Thank you very much. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052979#comment-15052979 ] Hadoop QA commented on HDFS-9494: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 4m 31s {color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.8.0_66 with JDK v1.8.0_66 generated 1 new issues (was 14, now 14). {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 5m 4s {color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.7.0_91 with JDK v1.7.0_91 generated 1 new issues (was 14, now 14). {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s {color} | {color:red} Patch generated 3 new checkstyle issues in hadoop-hdfs-project/hadoop-hdfs-client (total was 8, now 11). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 9s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 7s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 6s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || |
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054025#comment-15054025 ] Tsz Wo Nicholas Sze commented on HDFS-9494: --- > ... use an ExecutorCompletionService to poll() the tasks ... It should be take(), not poll(). > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053920#comment-15053920 ] Tsz Wo Nicholas Sze commented on HDFS-9494: --- - Let's refactor the following code to a separated method. Then, both DFSOutputStream and DFSStrippedOutputStream can use it. {code} synchronized (this) { dfsClient.checkOpen(); ... toWaitFor = getStreamer().getLastQueuedSeqno(); } {code} - It is better to use an ExecutorCompletionService to poll() the tasks, i.e. {code} for(Future f; (f = executor.poll()) != null; ); {code} > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)