[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16744299#comment-16744299 ]

Erik Krogen commented on HDFS-13688:

Hey [~hexiaoqiao], that's a great question. It is definitely on our roadmap to provide a mechanism for clients to use this without requiring application-side changes. We mentioned this in our design document but didn't create a detailed JIRA. I have just done so now at HDFS-14211. Please comment there if you have further questions.

> Introduce msync API call
>
> Key: HDFS-13688
> URL: https://issues.apache.org/jira/browse/HDFS-13688
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Chen Liang
> Assignee: Chen Liang
> Priority: Major
> Fix For: HDFS-12943, 3.3.0
>
> Attachments: HDFS-13688-HDFS-12943.001.patch,
> HDFS-13688-HDFS-12943.002.patch, HDFS-13688-HDFS-12943.002.patch,
> HDFS-13688-HDFS-12943.003.patch, HDFS-13688-HDFS-12943.004.patch,
> HDFS-13688-HDFS-12943.005.patch, HDFS-13688-HDFS-12943.WIP.002.patch,
> HDFS-13688-HDFS-12943.WIP.patch
>
>
> As mentioned in the design doc in HDFS-12943, to ensure consistent read, we
> need to introduce an RPC call {{msync}}. Specifically, client can issue a
> msync call to Observer node along with a transactionID. The msync will only
> return when the Observer's transactionID has caught up to the given ID. This
> JIRA is to add this API.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
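The issue description quoted above says that msync returns only once the Observer's transaction ID has caught up to the ID supplied by the client. A minimal Java sketch of that blocking behaviour follows. It is not the Hadoop implementation: the class {{ObserverSketch}}, its methods, and the timeout parameter are all hypothetical, and the timeout is added only so that the example always terminates.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for an Observer node; not a Hadoop class. */
final class ObserverSketch {
  // Highest transaction ID this simulated Observer has applied.
  private long lastAppliedTxId = 0L;

  /** Simulates the edit-log tailer applying transactions up to txId. */
  synchronized void applyUpTo(long txId) {
    if (txId > lastAppliedTxId) {
      lastAppliedTxId = txId;
      notifyAll(); // wake any msync callers waiting for this state
    }
  }

  /**
   * Blocks until lastAppliedTxId >= clientTxId or the timeout expires.
   * Returns true if the Observer caught up, false on timeout or interrupt.
   * The timeout is an assumption of this sketch, not part of the issue text.
   */
  synchronized boolean msync(long clientTxId, long timeoutMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    try {
      while (lastAppliedTxId < clientTxId) {
        long remainingNanos = deadline - System.nanoTime();
        if (remainingNanos <= 0) {
          return false;
        }
        TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    ObserverSketch observer = new ObserverSketch();
    // A background thread plays the role of the Observer tailing edits.
    Thread tailer = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      observer.applyUpTo(42L);
    });
    tailer.start();
    System.out.println("caught up: " + observer.msync(42L, 2000L));
    tailer.join();
  }
}
```

A caller whose ID is already covered returns immediately; a caller that is ahead of the Observer waits until the tailer advances the applied ID.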
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742910#comment-16742910 ]

He Xiaoqiao commented on HDFS-13688:

Thanks all for the great work here. I am confused about the new API #msync. According to the design doc, it introduces an RPC call, msync, to ensure consistent reads. IIUC, applications running on HDFS have to adopt the new API if they enable the 'Consistent Read' feature. That involves a lot of work, since more and more engines run on HDFS; I believe it would be a gigantic project for every compute engine to adopt this change. So my question is: is there any plan to confine the consistency checking to DFSClient only? If I missed something or misunderstood, please correct me. Thanks again.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728481#comment-16728481 ]

Hudson commented on HDFS-13688:

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #15662 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15662/])
HDFS-13688. [SBN read] Introduce msync API call. Contributed by Chen (shv: rev eae0a5d54a2b4f415ad12a3e1dcfde39b3b55a82)
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolTranslatorPB.java
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/protocol/TestReadOnly.java
* (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterClientProtocol.java
* (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolServerSideTranslatorPB.java
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563873#comment-16563873 ]

Erik Krogen commented on HDFS-13688:

Cool, v005 patch LGTM. I'll commit EOD today in case [~shv] wants to have a look first. Let's move the discussion of lock vs. unlocked to HDFS-13767.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563140#comment-16563140 ]

genericqa commented on HDFS-13688:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || HDFS-12943 Compile Tests ||
| 0 | mvndep | 0m 29s | Maven dependency ordering for branch |
| +1 | mvninstall | 24m 21s | HDFS-12943 passed |
| +1 | compile | 15m 57s | HDFS-12943 passed |
| +1 | checkstyle | 0m 12s | HDFS-12943 passed |
| +1 | mvnsite | 2m 14s | HDFS-12943 passed |
| +1 | shadedclient | 12m 14s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 20s | HDFS-12943 passed |
| +1 | javadoc | 1m 59s | HDFS-12943 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 9s | the patch passed |
| +1 | compile | 15m 47s | the patch passed |
| +1 | cc | 15m 47s | the patch passed |
| +1 | javac | 15m 47s | the patch passed |
| +1 | checkstyle | 0m 9s | the patch passed |
| +1 | mvnsite | 1m 58s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 17s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 28s | the patch passed |
| +1 | javadoc | 1m 33s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 30s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 102m 46s | hadoop-hdfs in the patch failed. |
| +1 | unit | 17m 25s | hadoop-hdfs-rbf in the patch passed. |
| +1 | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 220m 18s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
| | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:9b55946 |
| JIRA Issue | HDFS-13688 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933687/HDFS-13688-HDFS-12943.005.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc |
| uname | Linux e0f10e072c62 4.4.0-130
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562713#comment-16562713 ]

Chen Liang commented on HDFS-13688:

Thanks for the review [~xkrogen]! Posted v005 patch to remove the remaining server-side changes. As for {{getCorrectLastAppliedOrWrittenTxId()}}, I did consider using this synchronized version, but I was more concerned with the locking overhead than with returning a value that may not be the most recent, because every single read call would be grabbing the lock. Also, the current {{GlobalStateIdContext}} uses {{getLastWrittenTransactionId}}, which underneath also uses an unlocked version of the written txid.
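The trade-off discussed in the comment above, lock overhead on every read versus a value that may lag, can be illustrated with a small Java sketch. Everything here is hypothetical: {{TxIdReadSketch}} and its methods are invented for illustration and do not reflect how {{FSEditLog}} or {{GlobalStateIdContext}} are actually written.

```java
/** Hypothetical transaction-ID holder; not the actual Hadoop edit-log code. */
final class TxIdReadSketch {
  private volatile long txId = 0L;

  /**
   * Writer path: holds the monitor for the whole update. The callback runs
   * while the lock is held and before the new ID is published, standing in
   * for a reader that arrives in the middle of an update.
   */
  synchronized void writeNext(Runnable midUpdate) {
    midUpdate.run();
    txId = txId + 1;
  }

  /** Locked read: a caller on another thread waits for an in-flight writeNext. */
  synchronized long getLocked() {
    return txId;
  }

  /** Unlocked read: never blocks, but may return the pre-update value. */
  long getUnlocked() {
    return txId;
  }

  public static void main(String[] args) {
    TxIdReadSketch log = new TxIdReadSketch();
    long[] seenMidUpdate = new long[1];
    log.writeNext(() -> seenMidUpdate[0] = log.getUnlocked());
    System.out.println("unlocked read during update: " + seenMidUpdate[0]);
    System.out.println("locked read after update: " + log.getLocked());
  }
}
```

The unlocked getter costs a single volatile read per call, which is the overhead argument; the value it returns during an in-flight write is the older ID, which is the staleness argument.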
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562631#comment-16562631 ]

Erik Krogen commented on HDFS-13688:

Hey [~vagarychen], I think separating out into client- and server-side makes sense. However, this patch, which claims to be only client-side, is still making changes to server-side classes like {{o.a.h.ipc.Server}} and {{GlobalStateIdContext}}. Would it be better to have this as a part of the server-side change?

That being said, in {{GlobalStateIdContext}}, should we be using {{getCorrectLastAppliedOrWrittenTxId}} rather than {{getLastAppliedOrWrittenTxId}}, which gets the ID without a lock? This could result in an older txn ID value being returned. I think we need to take the lock here.

One minor nit: the Javadoc for {{DFSClient#msync()}} should have an empty line before the {{@throws}}.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556051#comment-16556051 ]

Chen Liang commented on HDFS-13688:

Two of the failed tests are actually related. Posted v004 patch to fix them, along with the findbugs warnings.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555108#comment-16555108 ]

genericqa commented on HDFS-13688:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 30s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || HDFS-12943 Compile Tests ||
| 0 | mvndep | 1m 36s | Maven dependency ordering for branch |
| +1 | mvninstall | 24m 32s | HDFS-12943 passed |
| +1 | compile | 27m 26s | HDFS-12943 passed |
| +1 | checkstyle | 0m 19s | HDFS-12943 passed |
| +1 | mvnsite | 3m 37s | HDFS-12943 passed |
| +1 | shadedclient | 14m 32s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 6m 24s | HDFS-12943 passed |
| +1 | javadoc | 3m 26s | HDFS-12943 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 53s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 57s | the patch passed |
| +1 | compile | 28m 12s | the patch passed |
| +1 | cc | 28m 12s | the patch passed |
| +1 | javac | 28m 12s | the patch passed |
| +1 | checkstyle | 0m 25s | the patch passed |
| +1 | mvnsite | 3m 50s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 4s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 57s | hadoop-hdfs-project/hadoop-hdfs-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 3m 26s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 8m 25s | hadoop-common in the patch passed. |
| -1 | unit | 1m 47s | hadoop-hdfs-client in the patch failed. |
| -1 | unit | 102m 33s | hadoop-hdfs in the patch failed. |
| +1 | unit | 17m 47s | hadoop-hdfs-rbf in the patch passed. |
| +1 | asflicense | 0m 37s | The patch does not generate ASF License warnings. |
| | | 268m 46s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client |
| | Dead store to e in org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.msync() At ClientNamenodeProtocolTranslatorPB.java:o
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555017#comment-16555017 ]

genericqa commented on HDFS-13688:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 34s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 1s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || HDFS-12943 Compile Tests ||
| 0 | mvndep | 2m 4s | Maven dependency ordering for branch |
| +1 | mvninstall | 29m 32s | HDFS-12943 passed |
| +1 | compile | 31m 21s | HDFS-12943 passed |
| +1 | checkstyle | 0m 25s | HDFS-12943 passed |
| +1 | mvnsite | 4m 9s | HDFS-12943 passed |
| +1 | shadedclient | 16m 15s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 6m 47s | HDFS-12943 passed |
| +1 | javadoc | 3m 17s | HDFS-12943 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 56s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 17s | the patch passed |
| +1 | compile | 29m 41s | the patch passed |
| +1 | cc | 29m 41s | the patch passed |
| +1 | javac | 29m 41s | the patch passed |
| +1 | checkstyle | 0m 24s | the patch passed |
| +1 | mvnsite | 3m 53s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | shadedclient | 10m 53s | patch has errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 59s | hadoop-hdfs-project/hadoop-hdfs-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| -1 | findbugs | 2m 17s | hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 3m 12s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 9m 17s | hadoop-common in the patch passed. |
| -1 | unit | 1m 47s | hadoop-hdfs-client in the patch failed. |
| -1 | unit | 32m 10s | hadoop-hdfs in the patch failed. |
| -1 | unit | 6m 4s | hadoop-hdfs-rbf in the patch failed. |
| +1 | asflicense | 0m 58s | The patch does not generate ASF License warnings. |
| | | 202m 13s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554874#comment-16554874 ]

Chen Liang commented on HDFS-13688:

Had an offline discussion with [~shv] and updated the patch to v003. Some notes/updates here:

1. We decided to break the msync implementation into two parts: client side and server side. This JIRA addresses only the client side, by introducing the msync API to {{ClientProtocol}}. The server side is completely missing. This is also why I did not include a unit test in the v003 patch: no meaningful test can be done for now, so I will follow up with tests in the next part. Please let me know if tests here are still preferred.

2. msync should not take {{syncTxid}} as a parameter, because it is already being passed around via {{AlignmentContext}}. Removing syncTxid significantly simplified the client compared to the v002 patch.

3. In addition to the explicit msync API call, msync should by default also be done implicitly for every single call to the Observer in order to maintain consistency. Using a deferred response can still achieve this, but it would mean using a deferred response for every single read API call, which is not ideal. An alternative is to leverage the callQueue, so that if the server hasn't caught up with the client ID, the call stays in the queue. The next JIRA will implement this approach.
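Note 3 in the comment above proposes leaving a call in the callQueue until the server has caught up with the client's state ID. The following Java sketch shows one way such a requeue pass could look. It is an illustration under stated assumptions, not the follow-up patch: {{CallQueueSketch}}, {{Call}}, {{enqueue}}, {{drain}} and {{pending}} are hypothetical names, and the real Hadoop RPC call queue is considerably more involved.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical call queue that defers reads the server cannot yet serve. */
final class CallQueueSketch {
  /** A queued read call tagged with the state ID the client has last seen. */
  static final class Call {
    final String name;
    final long clientStateId;

    Call(String name, long clientStateId) {
      this.name = name;
      this.clientStateId = clientStateId;
    }
  }

  private final Queue<Call> queue = new ArrayDeque<>();

  void enqueue(String name, long clientStateId) {
    queue.add(new Call(name, clientStateId));
  }

  /**
   * One pass over the queue: serve every call whose client state ID the
   * server has reached, and put the others back at the tail.
   */
  List<String> drain(long serverStateId) {
    List<String> served = new ArrayList<>();
    int n = queue.size();
    for (int i = 0; i < n; i++) {
      Call call = queue.poll();
      if (call.clientStateId <= serverStateId) {
        served.add(call.name);
      } else {
        queue.add(call); // server is behind this client; try again later
      }
    }
    return served;
  }

  int pending() {
    return queue.size();
  }

  public static void main(String[] args) {
    CallQueueSketch q = new CallQueueSketch();
    q.enqueue("getFileInfo", 5L);
    q.enqueue("listStatus", 12L);
    System.out.println("served at state 10: " + q.drain(10L));
    System.out.println("still pending: " + q.pending());
    System.out.println("served at state 12: " + q.drain(12L));
  }
}
```

Compared with a deferred response per read call, the caller-facing API stays unchanged here: a call that arrives too early is simply served on a later pass.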
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553716#comment-16553716 ]

genericqa commented on HDFS-13688:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || HDFS-12943 Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for branch |
| +1 | mvninstall | 24m 16s | HDFS-12943 passed |
| +1 | compile | 27m 19s | HDFS-12943 passed |
| +1 | checkstyle | 0m 25s | HDFS-12943 passed |
| +1 | mvnsite | 3m 41s | HDFS-12943 passed |
| +1 | shadedclient | 14m 41s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 6m 24s | HDFS-12943 passed |
| +1 | javadoc | 3m 21s | HDFS-12943 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 19s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 52s | the patch passed |
| +1 | compile | 27m 11s | the patch passed |
| +1 | cc | 27m 11s | the patch passed |
| +1 | javac | 27m 11s | the patch passed |
| +1 | checkstyle | 0m 26s | the patch passed |
| +1 | mvnsite | 4m 5s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 32s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 2m 10s | hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 3m 27s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 8m 26s | hadoop-common in the patch passed. |
| -1 | unit | 1m 47s | hadoop-hdfs-client in the patch failed. |
| -1 | unit | 104m 11s | hadoop-hdfs in the patch failed. |
| +1 | unit | 17m 50s | hadoop-hdfs-rbf in the patch passed. |
| +1 | asflicense | 0m 48s | The patch does not generate ASF License warnings. |
| | | 267m 44s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
| | Thread passed where Runnable expected in org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(RpcController, ClientNamenodeProtocolProtos$MsyncRequestProto) At ClientNamenodeProtocolServerSideTranslatorPB.java:in org.apache.hadoop.hdfs.protocolPB.Cl
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553676#comment-16553676 ] Konstantin Shvachko commented on HDFS-13688:

# As [previously discussed|https://issues.apache.org/jira/browse/HDFS-12977?focusedCommentId=16368689&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16368689] there is a constant {{HdfsServerConstants.INVALID_TXID}} for {{INVALID_STATEID}}. But I don't think you need it in {{AlignmentContext}} at all. If the server ever returns {{INVALID_TXID}}, you will catch it with the {{(txid < clientId)}} condition but not the other one, since it never equals {{Long.MIN_VALUE}}.
# On the client we should always address {{ClientGSIContext}} instead of {{AlignmentContext}}. The former already has {{getLastSeenStateId()}}, so you don't need to add it to {{AlignmentContext}}. On the server {{getLastSeenStateId()}} doesn't make sense.
# We should not just create a {{ClientGSIContext}} instance in DFSClient. It lives deep inside {{ObserverReadProxyProvider}} and is generally not exposed to DFSClient. You mostly use it for testing, but there should be another way. HDFS-13399 was dedicated to this. [This is the relevant comment|https://issues.apache.org/jira/browse/HDFS-13399?focusedCommentId=16459341&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16459341]. There was also discussion around it in HDFS-12976. I see you are passing it into {{ObserverReadProxyProvider}}, which we were trying to avoid in previous steps.

These are the main issues. I also see some unused imports and non-parameterized generics.
> Introduce msync API call > > > Key: HDFS-13688 > URL: https://issues.apache.org/jira/browse/HDFS-13688 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13688-HDFS-12943.001.patch, > HDFS-13688-HDFS-12943.002.patch, HDFS-13688-HDFS-12943.002.patch, > HDFS-13688-HDFS-12943.WIP.002.patch, HDFS-13688-HDFS-12943.WIP.patch > > > As mentioned in the design doc in HDFS-12943, to ensure consistent read, we > need to introduce an RPC call {{msync}}. Specifically, client can issue a > msync call to Observer node along with a transactionID. The msync will only > return when the Observer's transactionID has caught up to the given ID. This > JIRA is to add this API. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
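The first review point above can be illustrated with a small sketch. The class and method names are hypothetical, and the numeric value given to {{INVALID_TXID}} is an assumption made only for illustration; the point holds for any negative sentinel other than {{Long.MIN_VALUE}} and any non-negative client state id.

```java
// Hypothetical stand-ins for illustration; the real constants live in the Hadoop code base.
final class StateIdChecks {
  // Assumed value: a negative sentinel that is not Long.MIN_VALUE.
  static final long INVALID_TXID = -12345L;

  /** Equality test against Long.MIN_VALUE; it does not fire for INVALID_TXID. */
  static boolean matchesMinValueSentinel(long txid) {
    return txid == Long.MIN_VALUE;
  }

  /** Ordering test; true whenever the returned id is behind the client's id. */
  static boolean isBehindClient(long txid, long clientId) {
    return txid < clientId;
  }
}
```

With these definitions, an {{INVALID_TXID}} coming back from the server is caught by the ordering test but never by the equality test, which is why a separate {{Long.MIN_VALUE}} sentinel adds nothing here.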
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553483#comment-16553483 ] genericqa commented on HDFS-13688:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || HDFS-12943 Compile Tests ||
| 0 | mvndep | 1m 47s | Maven dependency ordering for branch |
| +1 | mvninstall | 28m 9s | HDFS-12943 passed |
| +1 | compile | 30m 28s | HDFS-12943 passed |
| +1 | checkstyle | 0m 21s | HDFS-12943 passed |
| +1 | mvnsite | 4m 13s | HDFS-12943 passed |
| +1 | shadedclient | 16m 11s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 6m 45s | HDFS-12943 passed |
| +1 | javadoc | 3m 51s | HDFS-12943 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 22s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 12s | the patch passed |
| +1 | compile | 38m 25s | the patch passed |
| +1 | cc | 38m 25s | the patch passed |
| +1 | javac | 38m 25s | the patch passed |
| +1 | checkstyle | 0m 30s | the patch passed |
| +1 | mvnsite | 4m 43s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 22s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 2m 37s | hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 3m 42s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 10m 55s | hadoop-common in the patch passed. |
| -1 | unit | 2m 20s | hadoop-hdfs-client in the patch failed. |
| -1 | unit | 98m 1s | hadoop-hdfs in the patch failed. |
| +1 | unit | 16m 58s | hadoop-hdfs-rbf in the patch passed. |
| +1 | asflicense | 0m 47s | The patch does not generate ASF License warnings. |
| | | 292m 25s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
| | Thread passed where Runnable expected in org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(RpcController, ClientNamenodeProtocolProtos$MsyncRequestProto) At ClientNamenodeProtocolServerSideTranslatorPB.java:in org.apache.hadoop.hdfs.protocolPB.Cl
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553457#comment-16553457 ] genericqa commented on HDFS-13688:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || HDFS-12943 Compile Tests ||
| 0 | mvndep | 2m 4s | Maven dependency ordering for branch |
| +1 | mvninstall | 24m 58s | HDFS-12943 passed |
| +1 | compile | 27m 21s | HDFS-12943 passed |
| +1 | checkstyle | 0m 22s | HDFS-12943 passed |
| +1 | mvnsite | 3m 51s | HDFS-12943 passed |
| +1 | shadedclient | 14m 55s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 6m 26s | HDFS-12943 passed |
| +1 | javadoc | 3m 24s | HDFS-12943 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 56s | the patch passed |
| +1 | compile | 26m 32s | the patch passed |
| +1 | cc | 26m 32s | the patch passed |
| +1 | javac | 26m 32s | the patch passed |
| +1 | checkstyle | 0m 25s | the patch passed |
| +1 | mvnsite | 3m 54s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 32s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 2m 11s | hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 3m 26s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 8m 24s | hadoop-common in the patch passed. |
| -1 | unit | 1m 41s | hadoop-hdfs-client in the patch failed. |
| -1 | unit | 103m 12s | hadoop-hdfs in the patch failed. |
| +1 | unit | 17m 0s | hadoop-hdfs-rbf in the patch passed. |
| +1 | asflicense | 0m 36s | The patch does not generate ASF License warnings. |
| | | 267m 55s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
| | Thread passed where Runnable expected in org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(RpcController, ClientNamenodeProtocolProtos$MsyncRequestProto) At ClientNamenodeProtocolServerSideTranslatorPB.java:in org.apache.hadoop.hdfs.protocolPB.Cl
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553152#comment-16553152 ] Chen Liang commented on HDFS-13688:

I thought HADOOP-15610 had been merged to the branch, but I guess I was wrong. I have now cherry-picked HADOOP-15610, so it should actually be fixed. Triggering another build.

> Introduce msync API call > > > Key: HDFS-13688 > URL: https://issues.apache.org/jira/browse/HDFS-13688 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13688-HDFS-12943.001.patch, > HDFS-13688-HDFS-12943.002.patch, HDFS-13688-HDFS-12943.WIP.002.patch, > HDFS-13688-HDFS-12943.WIP.patch > > > As mentioned in the design doc in HDFS-12943, to ensure consistent read, we > need to introduce an RPC call {{msync}}. Specifically, client can issue a > msync call to Observer node along with a transactionID. The msync will only > return when the Observer's transactionID has caught up to the given ID. This > JIRA is to add this API.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553125#comment-16553125 ] genericqa commented on HDFS-13688:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | docker | 0m 9s | Docker failed to build yetus/hadoop:abb62dd. |

|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13688 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12932448/HDFS-13688-HDFS-12943.002.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24637/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Introduce msync API call > > > Key: HDFS-13688 > URL: https://issues.apache.org/jira/browse/HDFS-13688 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13688-HDFS-12943.001.patch, > HDFS-13688-HDFS-12943.002.patch, HDFS-13688-HDFS-12943.WIP.002.patch, > HDFS-13688-HDFS-12943.WIP.patch > > > As mentioned in the design doc in HDFS-12943, to ensure consistent read, we > need to introduce an RPC call {{msync}}. Specifically, client can issue a > msync call to Observer node along with a transactionID. The msync will only > return when the Observer's transactionID has caught up to the given ID. This > JIRA is to add this API.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16551022#comment-16551022 ] genericqa commented on HDFS-13688:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | docker | 0m 7s | Docker failed to build yetus/hadoop:abb62dd. |

|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13688 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12932448/HDFS-13688-HDFS-12943.002.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24626/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Introduce msync API call > > > Key: HDFS-13688 > URL: https://issues.apache.org/jira/browse/HDFS-13688 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13688-HDFS-12943.001.patch, > HDFS-13688-HDFS-12943.002.patch, HDFS-13688-HDFS-12943.WIP.002.patch, > HDFS-13688-HDFS-12943.WIP.patch > > > As mentioned in the design doc in HDFS-12943, to ensure consistent read, we > need to introduce an RPC call {{msync}}. Specifically, client can issue a > msync call to Observer node along with a transactionID. The msync will only > return when the Observer's transactionID has caught up to the given ID. This > JIRA is to add this API.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16551018#comment-16551018 ] Chen Liang commented on HDFS-13688:

It seems HADOOP-15610 has resolved the Jenkins issue. Resubmitting the v001 patch as v002 to trigger another build.

> Introduce msync API call > > > Key: HDFS-13688 > URL: https://issues.apache.org/jira/browse/HDFS-13688 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13688-HDFS-12943.001.patch, > HDFS-13688-HDFS-12943.002.patch, HDFS-13688-HDFS-12943.WIP.002.patch, > HDFS-13688-HDFS-12943.WIP.patch > > > As mentioned in the design doc in HDFS-12943, to ensure consistent read, we > need to introduce an RPC call {{msync}}. Specifically, client can issue a > msync call to Observer node along with a transactionID. The msync will only > return when the Observer's transactionID has caught up to the given ID. This > JIRA is to add this API.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16548173#comment-16548173 ] Chen Liang commented on HDFS-13688:

I checked the failed builds. All of the most recent builds appear to have failed because the same {{pip2 install pylint}} command failed on the Jenkins nodes. Will keep an eye on this.

> Introduce msync API call > > > Key: HDFS-13688 > URL: https://issues.apache.org/jira/browse/HDFS-13688 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13688-HDFS-12943.001.patch, > HDFS-13688-HDFS-12943.WIP.002.patch, HDFS-13688-HDFS-12943.WIP.patch > > > As mentioned in the design doc in HDFS-12943, to ensure consistent read, we > need to introduce an RPC call {{msync}}. Specifically, client can issue a > msync call to Observer node along with a transactionID. The msync will only > return when the Observer's transactionID has caught up to the given ID. This > JIRA is to add this API.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547357#comment-16547357 ] genericqa commented on HDFS-13688:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | docker | 0m 7s | Docker failed to build yetus/hadoop:abb62dd. |

|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13688 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12932004/HDFS-13688-HDFS-12943.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24607/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Introduce msync API call > > > Key: HDFS-13688 > URL: https://issues.apache.org/jira/browse/HDFS-13688 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13688-HDFS-12943.001.patch, > HDFS-13688-HDFS-12943.WIP.002.patch, HDFS-13688-HDFS-12943.WIP.patch > > > As mentioned in the design doc in HDFS-12943, to ensure consistent read, we > need to introduce an RPC call {{msync}}. Specifically, client can issue a > msync call to Observer node along with a transactionID. The msync will only > return when the Observer's transactionID has caught up to the given ID. This > JIRA is to add this API.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547222#comment-16547222 ] genericqa commented on HDFS-13688:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | docker | 0m 7s | Docker failed to build yetus/hadoop:abb62dd. |

|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13688 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12932004/HDFS-13688-HDFS-12943.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24605/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Introduce msync API call > > > Key: HDFS-13688 > URL: https://issues.apache.org/jira/browse/HDFS-13688 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13688-HDFS-12943.001.patch, > HDFS-13688-HDFS-12943.WIP.002.patch, HDFS-13688-HDFS-12943.WIP.patch > > > As mentioned in the design doc in HDFS-12943, to ensure consistent read, we > need to introduce an RPC call {{msync}}. Specifically, client can issue a > msync call to Observer node along with a transactionID. The msync will only > return when the Observer's transactionID has caught up to the given ID. This > JIRA is to add this API.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547193#comment-16547193 ] Chen Liang commented on HDFS-13688:

Posting the v001 patch for a Jenkins run. The main difference from the WIP.002 patch is that the call to the ANN for a fresh client has been removed. Delegation token handling and failover logic still need to be added.

I was trying to move {{AlignmentContext}} out of {{DFSClient}}, but it is actually tricky: in the current patch, the msync API in {{ClientProtocol}} takes the state id as an argument. So DFSClient must present the state id when making the call, which requires it to access the state id maintained in {{AlignmentContext}}. I haven't come up with a non-hacky alternative. Suggestions are welcome.

> Introduce msync API call > > > Key: HDFS-13688 > URL: https://issues.apache.org/jira/browse/HDFS-13688 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13688-HDFS-12943.001.patch, > HDFS-13688-HDFS-12943.WIP.002.patch, HDFS-13688-HDFS-12943.WIP.patch > > > As mentioned in the design doc in HDFS-12943, to ensure consistent read, we > need to introduce an RPC call {{msync}}. Specifically, client can issue a > msync call to Observer node along with a transactionID. The msync will only > return when the Observer's transactionID has caught up to the given ID. This > JIRA is to add this API.
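The coupling described in the comment above can be sketched in a few lines. This is only an illustration under stated assumptions: {{MsyncProtocolSketch}}, {{StateIdHolderSketch}} and {{ClientSketch}} are hypothetical stand-ins, not the real {{ClientProtocol}}, {{AlignmentContext}} or {{DFSClient}}. It shows why a parameterless client-side msync needs access to whatever object tracks the last seen state id.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the RPC interface whose msync takes a state id. */
interface MsyncProtocolSketch {
  void msync(long stateId);
}

/** Hypothetical holder of the highest state id seen in any RPC response. */
final class StateIdHolderSketch {
  private final AtomicLong lastSeen = new AtomicLong(Long.MIN_VALUE);

  /** Records a state id from a response, keeping only the maximum. */
  void onResponse(long serverStateId) {
    lastSeen.accumulateAndGet(serverStateId, Math::max);
  }

  long getLastSeenStateId() {
    return lastSeen.get();
  }
}

/** Hypothetical client: its parameterless msync() must read the holder. */
final class ClientSketch {
  private final MsyncProtocolSketch namenode;
  private final StateIdHolderSketch context;

  ClientSketch(MsyncProtocolSketch namenode, StateIdHolderSketch context) {
    this.namenode = namenode;
    this.context = context;
  }

  /** Wrapper that supplies the state id so callers never handle one. */
  void msync() {
    namenode.msync(context.getLastSeenStateId());
  }
}
```

In this shape the layers above the client never see a state id, at the cost of the client holding a reference to the state-tracking object, which is the trade-off the comment is wrestling with.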
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545898#comment-16545898 ] Chen Liang commented on HDFS-13688:

While rebasing the patch, I found that one of my tests failed. After looking into it, the cause seems to be that {{isObserverState}} changed in the final committed version of HDFS-12976. It gets called on initialization of {{ObserverReadProxyProvider}}, and the new {{isObserverState}} makes an RPC call to every NameNode. As a side effect, this RPC call also sets up the client-side state id. So even a fresh client will have its state id set after {{ObserverReadProxyProvider}} is created, and there is no longer any need to explicitly make a call to set the state id for fresh clients. I will post another patch to reflect this change.

> Introduce msync API call > > > Key: HDFS-13688 > URL: https://issues.apache.org/jira/browse/HDFS-13688 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13688-HDFS-12943.WIP.002.patch, > HDFS-13688-HDFS-12943.WIP.patch > > > As mentioned in the design doc in HDFS-12943, to ensure consistent read, we > need to introduce an RPC call {{msync}}. Specifically, client can issue a > msync call to Observer node along with a transactionID. The msync will only > return when the Observer's transactionID has caught up to the given ID. This > JIRA is to add this API.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545720#comment-16545720 ] Chen Liang commented on HDFS-13688:

Thanks [~linyiqun], [~zero45] for the comments! Sorry for the late response, just got back from vacation.

On [~linyiqun]'s comment:

bq. the benefit of the second approach is that we can make client logic more simple and don't need to hold the state id. ...

Thanks for sharing the thoughts [~linyiqun]! I'm not sure, though, whether there currently is a protocol for the SbN to make RPC calls to the ANN, and it seems overkill to add a whole new protocol just for this. Even with the second approach, I think the client still needs to hold a state id, because the msync call lets the SbN catch up to a state id given by the client, not necessarily to the most recent ANN state id. So the client still needs to present a state id for the SbN to check how far it needs to catch up.

On [~zero45]'s comments:

bq. client should instead learn the stateID when it eventually decides to do something, like a read or a write.

You are right that when the fresh client does a read/write, the client will have a state id regardless. The targeted issue here was this: if the fresh client makes a write call, that is fine, because the client gets the ANN state id. But if the fresh client makes a read call, which goes to the observer, there is no guarantee on what this state id will be, potentially causing issues for the "Third-party communication" part mentioned in the design doc. So for a fresh client's read, we need a way for it to catch up to the ANN state id, and this is where msync comes in. Namely, the fresh client can make an msync call first, then proceed to read the recent update. This is the use case where a fresh client may start by making an msync call.

bq. Observer and SNN also provide stateIDs from reads. Is there a reason you need the stateID from the ANN?

Same as above: msync needs to catch up to the most recent ANN state id. Getting the state id from an Observer does not guarantee this.

bq. Have we considered having the txid to wait for be a parameter to msync? Something like msync(long txidToWaitFor)?...

In the current WIP patch, in the NameNode protocol, msync does take a parameter, which is the txid to sync on/wait for. It is DFSClient that has a wrapper msync without this parameter: DFSClient first gets the txid for a fresh client, then passes the id to the NameNode protocol's msync. It looks like removing the wrapper logic and exposing the parameterized msync would be what you suggested? It seems a good idea to separate out the issue of how to get the right state id. But it would make no sense to have DFSClient expose an msync API that requires a state id, as I don't think any layer above DFSClient would/should understand state ids and be able to provide one.

As for implementing the blocking wait, in the WIP patch this happens on the server side (i.e. it is a blocking call from the client's perspective). It currently uses the deferred response feature from HADOOP-11552 along with a dedicated thread pool. Fundamentally, this is the same as a naive blocking wait, except that the wait is handed over to the thread pool to release handler threads.

> Introduce msync API call > > > Key: HDFS-13688 > URL: https://issues.apache.org/jira/browse/HDFS-13688 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13688-HDFS-12943.WIP.002.patch, > HDFS-13688-HDFS-12943.WIP.patch > > > As mentioned in the design doc in HDFS-12943, to ensure consistent read, we > need to introduce an RPC call {{msync}}. Specifically, client can issue a > msync call to Observer node along with a transactionID. The msync will only > return when the Observer's transactionID has caught up to the given ID. This > JIRA is to add this API.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16530285#comment-16530285 ]

Plamen Jeliazkov commented on HDFS-13688:

Hi Chen,

A couple of questions for you here.

(1) I am a little confused about this part here: "fresh client who has no initial state id, the client needs to check AlignmentContext, to see if state id is set". Why do we need to check AlignmentContext if we are a fresh client? Given that it has no knowledge of the state of the cluster, the client should instead learn the stateID when it eventually decides to do something, like a read or a write. IMO, there should be no need for a fresh client to try to check the stateID. Maybe I am not understanding something.

(2) "see if state id is set, if not, DFSClient makes a RPC call to ANN". Observer and SNN also provide stateIDs from reads. Is there a reason you need the stateID from the ANN?

(3) Have we considered having the txid to wait for be a parameter to msync? Something like {{msync(long txidToWaitFor)}}? I can see a naive implementation being just to have the receiving server block until the expected txid value is reached, and then reply. This would allow us to focus on how to supply that txid as a separate issue and enable folks to write their own out-of-band, third-party stateID transfer mechanisms.

> Introduce msync API call
>
> Key: HDFS-13688
> URL: https://issues.apache.org/jira/browse/HDFS-13688
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Chen Liang
> Assignee: Chen Liang
> Priority: Major
> Attachments: HDFS-13688-HDFS-12943.WIP.002.patch, HDFS-13688-HDFS-12943.WIP.patch
>
> As mentioned in the design doc in HDFS-12943, to ensure consistent read, we need to introduce an RPC call {{msync}}. Specifically, client can issue a msync call to Observer node along with a transactionID. The msync will only return when the Observer's transactionID has caught up to the given ID. This JIRA is to add this API.
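The naive implementation suggested in question (3) could look something like the sketch below. It is a toy model under assumptions, not HDFS code: `NaiveBlockingMsync` and `applyUpTo` are hypothetical names, and the timeout parameter is an addition made here so that the example cannot hang.

```java
/** Hypothetical sketch of a naive blocking msync(long txidToWaitFor). */
final class NaiveBlockingMsync {
  private long lastAppliedTxId; // guarded by "this"

  /** Simulates the receiving server applying edits up to the given txid. */
  synchronized void applyUpTo(long txid) {
    if (txid > lastAppliedTxId) {
      lastAppliedTxId = txid;
      notifyAll();
    }
  }

  /**
   * Blocks the calling thread until lastAppliedTxId >= txidToWaitFor.
   * The timeout is not part of the proposal; it only bounds this sketch.
   *
   * @return true if the txid was reached, false on timeout or interrupt
   */
  synchronized boolean msync(long txidToWaitFor, long timeoutMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (lastAppliedTxId < txidToWaitFor) {
      long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
      if (remainingMillis <= 0) {
        return false;
      }
      try {
        wait(remainingMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }
}
```

Because the txid is an explicit argument, how the caller obtained it (from its own earlier write, or from another process out of band) stays outside this method, which is the separation the question is asking about.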
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16529539#comment-16529539 ]

Yiqun Lin commented on HDFS-13688:

Thanks for sharing, [~vagarychen]!

{quote}
1. it is preferred not to have AlignmentContext in DFSClient. The current reason AlignmentContext is in DFSClient, as in the WIP patch, is that for a fresh client who has no initial state id, the client needs to check AlignmentContext, to see if state id is set, if not, DFSClient makes a RPC call to ANN. Since DFSClient has to make this check, it needs to see the AlignmentContext instance. If there is an alternative way where DFSClient does not need to explicitly make this check, there is no need to have AlignmentContext in DFSClient. Still need to investigate if there is an alternative way though.
{quote}

Thinking about this, the key point now is that we should let the SBN catch up with the ANN. From my understanding, there are two approaches to achieve this.
* Hold the state id in the client (we assume this id is the latest tx id requested from the ANN), then pass it to the SBN and let the SBN reach this desired id.
* The client doesn't need to hold the state id and just makes the msync call to the SBN. But when handling the msync, the SBN makes one additional call to get the latest tx id from the ANN, then waits until it catches up with this id.

In the WIP patch, we use the first approach. Compared with the first approach, the benefit of the second approach is that we can make client logic more simple and don't need to hold the state id. Instead, the SBN will make one additional RPC call to get the latest tx id from the ANN every time. But if we often use a fresh client to do the msync, the cost should be almost the same, since the fresh client also needs to make an initial call to the ANN.

> Introduce msync API call
>
> Key: HDFS-13688
> URL: https://issues.apache.org/jira/browse/HDFS-13688
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Chen Liang
> Assignee: Chen Liang
> Priority: Major
> Attachments: HDFS-13688-HDFS-12943.WIP.002.patch, HDFS-13688-HDFS-12943.WIP.patch
>
> As mentioned in the design doc in HDFS-12943, to ensure consistent read, we need to introduce an RPC call {{msync}}. Specifically, client can issue a msync call to Observer node along with a transactionID. The msync will only return when the Observer's transactionID has caught up to the given ID. This JIRA is to add this API.
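To make the difference between the two approaches concrete, here is a small simulation. Everything in it is hypothetical: `MsyncApproaches`, `Standby` and `catchUpTo` are illustrative names, the `LongSupplier` stands in for an SBN-to-ANN RPC that may not exist today, and catching up is simulated by a simple assignment instead of a real wait.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch contrasting the two ways to pick the sync target. */
final class MsyncApproaches {

  /** Simulated standby/observer that can catch up to a target txid. */
  static final class Standby {
    private long appliedTxId;
    // Stands in for an SBN-to-ANN call; assumed for this sketch only.
    private final LongSupplier activeLatestTxId;

    Standby(LongSupplier activeLatestTxId) {
      this.activeLatestTxId = activeLatestTxId;
    }

    /** Approach 1: the client holds a state id and presents it. */
    long msync(long clientStateId) {
      return catchUpTo(clientStateId);
    }

    /** Approach 2: the standby asks the active for its latest txid each time. */
    long msync() {
      return catchUpTo(activeLatestTxId.getAsLong());
    }

    /** Simulated tailing of edits; a real node would wait here instead. */
    private long catchUpTo(long target) {
      appliedTxId = Math.max(appliedTxId, target);
      return appliedTxId;
    }
  }
}
```

In this model the first variant only advances as far as the id the client presents, while the second always advances to whatever the active reports, at the price of one extra lookup per call.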
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16529284#comment-16529284 ]

Chen Liang commented on HDFS-13688:

There has been some offline discussion on msync; sharing some notes here for reference.

1. it is preferred not to have AlignmentContext in DFSClient. The current reason AlignmentContext is in DFSClient, as in the WIP patch, is that for a fresh client who has no initial state id, the client needs to check AlignmentContext, to see if state id is set, if not, DFSClient makes a RPC call to ANN. Since DFSClient has to make this check, it needs to see the AlignmentContext instance. If there is an alternative way where DFSClient does not need to explicitly make this check, there is no need to have AlignmentContext in DFSClient. Still need to investigate if there is an alternative way though.
2. it is preferred not to check the method name for msync.
3. need to make sure the delegation token gets propagated to the Observer before the Observer node reacts to a msync call.
4. as mentioned as a TODO in the WIP patch, the logic to trigger the client to make a msync call when an Observer node failover happens is still missing. In the context of the current WIP patch, this can be done by resetting the AlignmentContext instance's state id when switching observers in the ProxyProvider.

These notes are based on discussions with [~shv], [~csun], [~zero45] and [~jnp]. Thanks for the feedback!

> Introduce msync API call
>
> Key: HDFS-13688
> URL: https://issues.apache.org/jira/browse/HDFS-13688
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Chen Liang
> Assignee: Chen Liang
> Priority: Major
> Attachments: HDFS-13688-HDFS-12943.WIP.002.patch, HDFS-13688-HDFS-12943.WIP.patch
>
> As mentioned in the design doc in HDFS-12943, to ensure consistent read, we need to introduce an RPC call {{msync}}. Specifically, client can issue a msync call to Observer node along with a transactionID. The msync will only return when the Observer's transactionID has caught up to the given ID. This JIRA is to add this API.
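Note 4 above (resetting the state id when the proxy provider switches observers) might be modelled as in the sketch below. This is an assumption-laden illustration, not the real AlignmentContext interface: `ClientStateSketch`, `onResponse`, `onObserverSwitch`, `needsMsync` and the `UNSET` sentinel are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical client-side holder for the last seen state id. */
final class ClientStateSketch {
  /** Sentinel meaning "no state id known yet"; assumed for this sketch. */
  static final long UNSET = Long.MIN_VALUE;

  private final AtomicLong lastSeenStateId = new AtomicLong(UNSET);

  /** Record the state id carried by an RPC response; never move backwards. */
  void onResponse(long stateId) {
    lastSeenStateId.accumulateAndGet(stateId, Math::max);
  }

  /** Called when the proxy provider switches to a different observer. */
  void onObserverSwitch() {
    lastSeenStateId.set(UNSET);
  }

  /** A client without a state id has to sync before its next observer read. */
  boolean needsMsync() {
    return lastSeenStateId.get() == UNSET;
  }
}
```

With this shape a fresh client and a client that has just failed over look the same to the caller, so both would go through the same "sync first" path.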
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522703#comment-16522703 ]

Chen Liang commented on HDFS-13688:

Thanks [~linyiqun] for the review, appreciate it!

bq. LastSeenId isn't tracked for both ANN and SBN.

Thanks for checking against the design doc! The idea in the patch was based on some offline discussion we had, so there is a bit of a difference there. After the client sees the ANN's state id, there were two ideas we evaluated. One is that the client keeps sending msync calls to the Observer, the Observer returns immediately with its state id, and the client only returns once the Observer's state id has caught up. A downside here is that multiple RPC calls are made, abusing the RPC queue and handler CPU time on the server side. The other approach is that the client makes one single call, and the Observer side blocks the call, returning only after the state id has caught up. Since it is the Observer side making sure the id catches up, the client side no longer needs to keep track of the observer id. A downside here is that the server needs more thread resources (i.e. the executor introduced), but I think this is a fair tradeoff compared to the other way.

bq. syncTnxId passed in msync call large than LastAppliedOrWrittenTxId in ANN. Need to throw the exception?

Fixed.

bq. The condition check should be HAServiceState.ACTIVE.toString().equals(namesystem.getHAState()?

This led me to think about in what situations msync will be called on a standby. It seems this happens only when some role transition is happening; I will need to think about whether all transition cases are properly handled here. Right now I'm inclined to believe the change you suggested should be sufficient. Fixed in the WIP.v002 patch. This is actually an interesting point, thanks for bringing it up!

bq. Why not just pass the msyncExecutor as null there?

Fixed.

> Introduce msync API call
>
> Key: HDFS-13688
> URL: https://issues.apache.org/jira/browse/HDFS-13688
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Chen Liang
> Assignee: Chen Liang
> Priority: Major
> Attachments: HDFS-13688-HDFS-12943.WIP.002.patch, HDFS-13688-HDFS-12943.WIP.patch
>
> As mentioned in the design doc in HDFS-12943, to ensure consistent read, we need to introduce an RPC call {{msync}}. Specifically, client can issue a msync call to Observer node along with a transactionID. The msync will only return when the Observer's transactionID has caught up to the given ID. This JIRA is to add this API.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522041#comment-16522041 ]

Yiqun Lin commented on HDFS-13688:

Hi [~vagarychen], just comparing the implementation details of the msync call with those in the design doc:

{noformat}
msync() implementation on the client should keep track of LastSeenId for both ANN and SBN:
* If c.LastSeenId.ANN <= c.LastSeenId.SBN then goto ANN and update c.LastSeenId.ANN
* Wait until SBN reaches c.LastSeenId.ANN
{noformat}

Some differences:
* LastSeenId isn't tracked for both ANN and SBN.
* For the corner case where the client sends the request to the ANN and the syncTnxId passed in the msync call is larger than {{LastAppliedOrWrittenTxId}} in the ANN, the current processing logic differs from the designed way.

Besides, for the following logic:

{code:java}
+if (!HAServiceState.OBSERVER.toString().equals(namesystem.getHAState())) {
+  LOG.warn("Calling msync on a non-observer node:"
+      + namesystem.getHAState());
+  return namesystem.getFSImage().getLastAppliedOrWrittenTxId();
+}
{code}

The condition check should be {{HAServiceState.ACTIVE.toString().equals(namesystem.getHAState())}}? This means that we return the current txid only when the request goes to the ANN. For the SBN/Observer node, we wait until it catches up.

The msync call is currently not supported in RBF. Why not just pass the msyncExecutor as null there? It isn't really used.

{code:java}
@@ -252,9 +257,11 @@ public RouterRpcServer(Configuration configuration, Router router,
     RPC.setProtocolEngine(this.conf, ClientNamenodeProtocolPB.class,
         ProtobufRpcEngine.class);

+    this.msyncExecutor = Executors.newFixedThreadPool(10);
     ClientNamenodeProtocolServerSideTranslatorPB
         clientProtocolServerTranslator =
-        new ClientNamenodeProtocolServerSideTranslatorPB(this);
+        new ClientNamenodeProtocolServerSideTranslatorPB(
+            this, msyncExecutor);
{code}

> Introduce msync API call
>
> Key: HDFS-13688
> URL: https://issues.apache.org/jira/browse/HDFS-13688
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Chen Liang
> Assignee: Chen Liang
> Priority: Major
> Attachments: HDFS-13688-HDFS-12943.WIP.patch
>
> As mentioned in the design doc in HDFS-12943, to ensure consistent read, we need to introduce an RPC call {{msync}}. Specifically, client can issue a msync call to Observer node along with a transactionID. The msync will only return when the Observer's transactionID has caught up to the given ID. This JIRA is to add this API.
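The effect of the suggested condition change can be seen in a small stand-alone comparison. The enum below is a simplified, hypothetical stand-in for HAServiceState with only three values, and `MsyncStateCheck` with its two methods is invented for this sketch; it only models which roles answer immediately.

```java
/** Hypothetical sketch comparing the two condition checks discussed above. */
final class MsyncStateCheck {
  /** Simplified stand-in for HAServiceState; an assumption of this sketch. */
  enum HAState { ACTIVE, STANDBY, OBSERVER }

  /** Check as quoted from the WIP patch: every non-observer returns at once. */
  static boolean returnsImmediatelyWip(HAState state) {
    return state != HAState.OBSERVER;
  }

  /** Suggested check: only the active returns at once; the others wait. */
  static boolean returnsImmediatelySuggested(HAState state) {
    return state == HAState.ACTIVE;
  }
}
```

In this simplified model the two checks agree for ACTIVE and OBSERVER and differ only for STANDBY, which is the role the review comment is concerned with.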
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520803#comment-16520803 ]

Chen Liang commented on HDFS-13688:

Posting a WIP patch for early review. This patch depends on HDFS-12976, and needs to be applied on top of the HDFS-12976 v002 patch. Some notes on the patch for reviewers; comments are welcome!

# introduced a per-DFSClient AlignmentContext instance, which gets passed to the proxy provider. Existing code ensures that all proxies created from this provider will have this alignment context instance.
# when the server sets the last seen id in the RPC response, changed from lastWritten id to lastAppliedOrWrittenID.
# currently, using a local spin loop with 1ms interval, at most 1000 loops, to wait for the observer to catch up.
# leverage deferred response and a dedicated new thread pool of 10 threads to handle all msync calls, such that handler threads will not be handling (and potentially blocking on) msync calls. 10 is hard coded; it can be made configurable if preferred.
# currently, this is a call exposed through DFSClient and DistributedFileSystem, and it still needs to be called explicitly. We will need to make it so that every single call to the Observer is somehow piggybacked with msync.
# for a client that already has a state id set in its alignmentContext, the msync call calls directly into the observer node to sync on this state id. But if there is no state id set in alignmentContext (e.g. a freshly started client), the client needs to first get the current state id from the active NN by making a "setup" call. Based on offline discussion with Konstantin, we may not have to introduce a new "setup" call. This can be done by making any call, as long as it goes to the active. Currently in ClientProtocol, there is getQuotaUsage, which is annotated with activeOnly = true. So the current patch makes a getQuotaUsage call on the root directory as the "setup" call.

> Introduce msync API call
>
> Key: HDFS-13688
> URL: https://issues.apache.org/jira/browse/HDFS-13688
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Chen Liang
> Assignee: Chen Liang
> Priority: Major
> Attachments: HDFS-13688-HDFS-12943.WIP.patch
>
> As mentioned in the design doc in HDFS-12943, to ensure consistent read, we need to introduce an RPC call {{msync}}. Specifically, client can issue a msync call to Observer node along with a transactionID. The msync will only return when the Observer's transactionID has caught up to the given ID. This JIRA is to add this API.
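The bounded wait mentioned in the third note (1 ms interval, at most 1000 loops) can be sketched as below. This is a guess at the general shape, not the patch's code: `BoundedSpinWait`, `waitForCatchUp` and the `LongSupplier` used to read the current txid are hypothetical.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch of a bounded polling wait for a txid to be reached. */
final class BoundedSpinWait {

  /**
   * Polls currentTxId until it reaches targetTxId, sleeping intervalMillis
   * between checks and giving up after maxLoops sleeps.
   *
   * @return true if the target was reached, false if we gave up
   */
  static boolean waitForCatchUp(LongSupplier currentTxId, long targetTxId,
      int maxLoops, long intervalMillis) {
    for (int i = 0; i < maxLoops; i++) {
      if (currentTxId.getAsLong() >= targetTxId) {
        return true;
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    // One last check after the final sleep.
    return currentTxId.getAsLong() >= targetTxId;
  }
}
```

With the values from the note, a call would look like `BoundedSpinWait.waitForCatchUp(supplier, txid, 1000, 1L)`, which bounds the wait to roughly one second before the caller has to decide what to do next.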