[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352653#comment-17352653 ]

Konstantin Shvachko commented on HDFS-15915:

[~daryn] would appreciate your review.

> Race condition with async edits logging due to updating txId outside of the namesystem log
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15915
>                 URL: https://issues.apache.org/jira/browse/HDFS-15915
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Major
>             Fix For: 3.4.0, 3.1.5, 2.10.2, 3.2.3, 3.3.2
>
>         Attachments: HDFS-15915-01.patch, HDFS-15915-02.patch, HDFS-15915-03.patch, HDFS-15915-04.patch, HDFS-15915-05.patch, testMkdirsRace.patch
>
>
> {{FSEditLogAsync}} creates an {{FSEditLogOp}} and populates its fields inside {{FSNamesystem.writeLock}}. But one essential field, the transaction id of the edits op, remains unset until the time when the operation is scheduled for syncing. At that time {{beginTransaction()}} will set the {{FSEditLogOp.txid}} and increment the global transaction count. On a busy NameNode this event can fall outside the write lock.
> This causes problems for Observer reads. It can also potentially reshuffle transactions, so that the Standby will apply them in the wrong order.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
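The issue description above can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not Hadoop code: the class `TxIdAssignmentSketch`, its `Op` type, and the `assignUnderLock` switch are hypothetical names chosen for illustration, and `syncOne()` merely stands in for the sync thread picking up a queued edit outside the namesystem lock.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model; names only mirror the issue description.
final class TxIdAssignmentSketch {
  /** A queued edit; txid stays -1 until one is assigned. */
  static final class Op {
    final String name;
    volatile long txid = -1;
    Op(String name) { this.name = name; }
  }

  private final ReentrantLock namesystemLock = new ReentrantLock();
  private final Queue<Op> pending = new ConcurrentLinkedQueue<>();
  private final AtomicLong lastWrittenTxId = new AtomicLong(0);
  private final boolean assignUnderLock;

  TxIdAssignmentSketch(boolean assignUnderLock) {
    this.assignUnderLock = assignUnderLock;
  }

  /** Applies an edit under the lock and queues it for syncing. */
  Op logEdit(String name) {
    namesystemLock.lock();
    try {
      Op op = new Op(name);
      if (assignUnderLock) {
        // Patched behaviour in this model: the id is fixed while the lock is held.
        op.txid = lastWrittenTxId.incrementAndGet();
      }
      pending.add(op);
      return op;
    } finally {
      namesystemLock.unlock();
    }
  }

  /** Stands in for the sync thread taking one queued edit, outside the lock. */
  void syncOne() {
    Op op = pending.remove();
    if (!assignUnderLock) {
      // Unpatched behaviour in this model: the id is assigned only now.
      op.txid = lastWrittenTxId.incrementAndGet();
    }
  }

  long getLastWrittenTxId() {
    return lastWrittenTxId.get();
  }

  public static void main(String[] args) {
    TxIdAssignmentSketch late = new TxIdAssignmentSketch(false);
    late.logEdit("mkdir /a");
    System.out.println("late assignment, before sync: " + late.getLastWrittenTxId());

    TxIdAssignmentSketch early = new TxIdAssignmentSketch(true);
    early.logEdit("mkdir /a");
    System.out.println("early assignment, before sync: " + early.getLastWrittenTxId());
  }
}
```

In the late-assignment mode the counter does not move when `logEdit` returns, even though the edit has already been applied under the lock; in the early-assignment mode the counter and the op's `txid` are both set before the lock is released.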
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352565#comment-17352565 ]

Daryn Sharp commented on HDFS-15915:

I'm very nervous about this patch and need to thoroughly reacquaint myself with the code. Skimming the patch, I'm initially very worried about the added synchronization and the potential for deadlock, particularly during an edit log roll. We're in the midst of an upgrade cycle so I likely won't have time to review till early next but in the meantime we will internally revert due to risk...
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352023#comment-17352023 ]

Konstantin Shvachko commented on HDFS-15915:

Ran unit tests that failed on Jenkins. All passing locally. Will be committing this shortly.
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351431#comment-17351431 ]

Hadoop QA commented on HDFS-15915:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 2m 51s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 4 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 23m 42s | | trunk passed |
| +1 | compile | 1m 23s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 1m 15s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 1m 3s | | trunk passed |
| +1 | mvnsite | 1m 18s | | trunk passed |
| +1 | shadedclient | 17m 42s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 53s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 1m 23s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 23m 13s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 3m 14s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 1m 12s | | the patch passed |
| +1 | compile | 1m 18s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| -1 | javac | 1m 18s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/612/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt | hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 501 unchanged - 1 fixed = 502 total (was 502) |
| +1 | compile | 1m 8s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| -1 | javac | 1m 8s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/612/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt | hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 485 unchanged - 1 fixed = 486 total (was 486) |
| +1 | checkstyle | 0m 57s | | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 204 unchanged - 1 fixed = 204 total (was 205) |
| +1 | mvnsite | 1m 15s
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351284#comment-17351284 ]

Konstantin Shvachko commented on HDFS-15915:

Thanks for the thorough review [~virajith]. BTW, this {{logEdit()}} method is only used in BackupNode, so it doesn't matter much. But I swapped the two lines in the v05 patch.
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351140#comment-17351140 ]

Virajith Jalaparti commented on HDFS-15915:

Thanks for making the changes [~shv]. Minor comment: The {{long start = monotonicNow();}} should be done after the call to {{beginTransaction(null);}} in lines 1637-1638 in the {{logEdit}} method, to keep it consistent with other places (and the previous implementation) where {{endTransaction}} is called with the start time.

Verified that the added test fails if the changes in the patch are removed.

+1 on committing the patch once this minor comment is addressed.
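The ordering requested in the review comment above can be sketched as follows. This is a minimal illustration, not the patched Hadoop method: `LogEditTimingSketch` and its counters are hypothetical, and `System.nanoTime()` is used here in place of Hadoop's `monotonicNow()`.

```java
// Hypothetical model of the call order: begin the transaction first,
// then take the start timestamp that is later handed to endTransaction.
final class LogEditTimingSketch {
  private long txid = 0;
  private long transactions = 0;
  private long totalNanos = 0;

  /** Assigns the next transaction id (void, as in the updated patch). */
  synchronized void beginTransaction() {
    txid++;
  }

  /** Accounts for the time elapsed since the caller-supplied start. */
  synchronized void endTransaction(long startNanos) {
    transactions++;
    totalNanos += System.nanoTime() - startNanos;
  }

  /** Writes one edit; the timer starts only after beginTransaction(). */
  void logEdit(Runnable writeOp) {
    beginTransaction();
    long start = System.nanoTime(); // taken after beginTransaction()
    writeOp.run();
    endTransaction(start);
  }

  synchronized long getTxId() { return txid; }
  synchronized long getTransactions() { return transactions; }
  synchronized long getTotalNanos() { return totalNanos; }
}
```

With this order the measured interval covers only the write itself, so every caller of `endTransaction` in the sketch accounts for time in the same way.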
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347691#comment-17347691 ]

Hadoop QA commented on HDFS-15915:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 41s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 1s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 4 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 25m 52s | | trunk passed |
| +1 | compile | 1m 35s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 1m 23s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 1m 8s | | trunk passed |
| +1 | mvnsite | 1m 24s | | trunk passed |
| +1 | shadedclient | 19m 20s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 1s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 1m 30s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 25m 14s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 3m 23s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 1m 28s | | the patch passed |
| +1 | compile | 1m 28s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| -1 | javac | 1m 28s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/607/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt | hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 502 unchanged - 1 fixed = 503 total (was 503) |
| +1 | compile | 1m 19s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| -1 | javac | 1m 19s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/607/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt | hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 486 unchanged - 1 fixed = 487 total (was 487) |
| +1 | checkstyle | 0m 57s | | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 204 unchanged - 1 fixed = 204 total (was 205) |
| +1 | mvnsite | 1m 23s
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347336#comment-17347336 ]

Konstantin Shvachko commented on HDFS-15915:

Updated the patch per [~virajith]'s suggestions. Thanks.
# The default implementation of {{EditLogOutputStream.getLastJournalledTxId()}} returns {{INVALID_TXID}} rather than {{0}}.
# Changed the {{beginTransaction()}} return type to void.

??This change forces the txid to be assigned when the operation takes place under the FSN lock.??
Exactly right. The advantage of this in the non-Observer case is verifiability and proper enforcement. When you merely rely on placing operations into the queue in the right order, you cannot verify that ordering, for example by writing unit tests or setting asserts. And it is hard to detect a bug, if there is one, in this heavily multi-threaded code. With the patch the txId is generated when the operation is queued, so I could add asserts to ensure operations are queued and synced in the order they were applied on the active NN.
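The verifiability argument in the comment above can be sketched in a few lines. The class below is a hypothetical model, not the asserts added by the patch: once an id is assigned at enqueue time, the syncing side can check that each edit it receives carries exactly the next expected id.

```java
// Hypothetical model: ids are handed out at enqueue time, so the sync
// side can detect any reordering between queueing and syncing.
final class SyncOrderCheckSketch {
  private long nextTxId = 1;
  private long lastSyncedTxId = 0;

  /** Called when an edit is queued; returns the id assigned to it. */
  synchronized long enqueue() {
    return nextTxId++;
  }

  /** Called when an edit is synced; rejects anything out of order. */
  synchronized void sync(long txid) {
    long expected = lastSyncedTxId + 1;
    if (txid != expected) {
      throw new IllegalStateException(
          "Out-of-order sync: expected txid " + expected + " but got " + txid);
    }
    lastSyncedTxId = txid;
  }

  synchronized long getLastSyncedTxId() {
    return lastSyncedTxId;
  }
}
```

If ids were assigned only at sync time, as in the behaviour the issue describes, a check like this would be vacuous, because whichever edit arrived first would simply receive the next id.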
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346606#comment-17346606 ]

Virajith Jalaparti commented on HDFS-15915:

Looking at this again and answering my question (3): when {{FSEditLog#getLastWrittenTxIdWithoutLock()}} is called with the current implementation (without the patch here), it may or may not return the txid of the latest operation that was actually applied, as the txid may not have been assigned to the operation at all. It is assigned only when {{beginTransaction}} is called and {{txid}} is incremented in {{FSEditLog}}. Hence, even though the operations are assigned the correct txids in the current implementation, they might be assigned at a later time than when the operations are applied to the namesystem. This change forces the txid to be assigned when the operation takes place under the FSN lock.
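One way to picture why a lagging value from {{FSEditLog#getLastWrittenTxIdWithoutLock()}} matters for Observer reads is the sketch below. It rests on an assumption about the consistency rule: that an Observer serves a read only once its applied txid has reached the state id the client last saw from the active NameNode. The class and method names are hypothetical and the numbers are made up for illustration.

```java
// Hypothetical model of the Observer read gate; not Hadoop code.
final class ObserverReadSketch {
  /** Assumed rule: serve the read once the Observer has caught up. */
  static boolean canServe(long observerAppliedTxId, long clientLastSeenStateId) {
    return observerAppliedTxId >= clientLastSeenStateId;
  }

  public static void main(String[] args) {
    long txIdBeforeMkdir = 10;       // last id written before the client's mkdir
    long observerAppliedTxId = 10;   // Observer has not yet seen the mkdir

    // Late assignment: the response may still carry the old id.
    long staleStateId = txIdBeforeMkdir;
    System.out.println("late assignment, read served: "
        + canServe(observerAppliedTxId, staleStateId));

    // Assignment under the lock: the response carries the mkdir's own id.
    long freshStateId = txIdBeforeMkdir + 1;
    System.out.println("early assignment, read served: "
        + canServe(observerAppliedTxId, freshStateId));
  }
}
```

In the first case the Observer would answer without having applied the client's own mkdir; in the second it would hold the read until it catches up.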
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346591#comment-17346591 ]

Virajith Jalaparti commented on HDFS-15915:

Thanks for finding this and providing a fix [~shv]. A few questions:
# Nit: Should the default implementation of {{EditLogOutputStream#getLastJournalledTxId}} return -1 instead of 0, since 0 can be a valid txid?
# Nit: In the current implementation, the return value of {{beginTransaction}} is used to get the start time in one place but ignored in other places. Should we just make it return void and force the caller to track the start time?
# Without this change, the previous implementation seems to have relied on the ordering within the queue ({{FSEditLogAsync#editPendingQ}}, whose elements are added under the FSN lock) to ensure that the order in which edits are assigned txids is the same as the order in which they are processed. Why is that not sufficient?
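The first question above, about the default return value, can be sketched as follows. This is an illustration only: the nested classes are hypothetical stand-ins for an output stream hierarchy, and the numeric value of `INVALID_TXID` used here is a placeholder, not necessarily the constant Hadoop defines.

```java
// Hypothetical model of a sentinel default; not the Hadoop classes.
final class JournalledTxIdSketch {
  /** Placeholder sentinel; the real constant's value may differ. */
  static final long INVALID_TXID = -1L;

  /** Base stream that does not track what it has journalled. */
  static class BaseStream {
    long getLastJournalledTxId() {
      return INVALID_TXID; // cannot be mistaken for a real txid such as 0
    }
  }

  /** Stream that remembers the last txid written to it. */
  static final class TrackingStream extends BaseStream {
    private long lastJournalled = INVALID_TXID;

    void write(long txid) {
      lastJournalled = txid;
    }

    @Override
    long getLastJournalledTxId() {
      return lastJournalled;
    }
  }

  /** True only when the stream reports a real journalled txid. */
  static boolean hasJournalled(BaseStream stream) {
    return stream.getLastJournalledTxId() != INVALID_TXID;
  }
}
```

With a default of 0, a caller in this sketch could not tell "nothing journalled yet" apart from "txid 0 was journalled"; the sentinel removes that ambiguity.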
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335498#comment-17335498 ]

Hadoop QA commented on HDFS-15915:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 0m 45s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 1s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 4 new or modified test files. |
|| || || || trunk Compile Tests || ||
| -1 | mvninstall | 6m 34s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/595/artifact/out/branch-mvninstall-root.txt | root in trunk failed. |
| +1 | compile | 1m 17s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 1m 13s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 1m 0s | | trunk passed |
| +1 | mvnsite | 1m 19s | | trunk passed |
| +1 | shadedclient | 15m 30s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 49s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 1m 22s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 20m 44s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 3m 3s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 1m 10s | | the patch passed |
| +1 | compile | 1m 19s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| -1 | javac | 1m 19s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/595/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt | hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 471 unchanged - 1 fixed = 472 total (was 472) |
| +1 | compile | 1m 9s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| -1 | javac | 1m 9s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/595/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt | hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 455 unchanged - 1 fixed = 456 total (was 456) |
| +1 | checkstyle | 0m 58s | | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 204 unchanged - 1 fixed = 204 total (was 205) |
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335215#comment-17335215 ]

Hadoop QA commented on HDFS-15915:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 32s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 1s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 4s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 6s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 41s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 24s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 59s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 24m 22s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 20s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 32m 57s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 4m 40s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 33s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 37s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 37s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/593/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 generated 1 new + 471 unchanged - 1 fixed = 472 total (was 472) {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 21s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/593/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 generated 1 new + 455 unchanged - 1 fixed = 456 total (was 456) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 0s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/593/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 204 unchanged - 1
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335052#comment-17335052 ]

Konstantin Shvachko commented on HDFS-15915:

Patch v.02 fixes the findbugs and whitespace warnings. I checked the test failures:
* {{TestOfflineEditsViewer}} fails on trunk the same way as with the patch. Filed HDFS-16001 for it.
* {{TestDirectoryScanner}} fails intermittently because of HDFS-11045.
* All other tests passed locally.

> Race condition with async edits logging due to updating txId outside of the namesystem log
>
> Key: HDFS-15915
> URL: https://issues.apache.org/jira/browse/HDFS-15915
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs, namenode
> Reporter: Konstantin Shvachko
> Assignee: Konstantin Shvachko
> Priority: Major
> Attachments: HDFS-15915-01.patch, HDFS-15915-02.patch, testMkdirsRace.patch
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334679#comment-17334679 ]

Hadoop QA commented on HDFS-15915:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 0s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 31s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 5s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 21s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 34s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 20m 55s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 3m 3s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 13s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 12s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/590/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 471 unchanged - 1 fixed = 472 total (was 472) {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 6s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/590/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 455 unchanged - 1 fixed = 456 total (was 456) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 56s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/590/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 10 new + 204 unchanged -
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334483#comment-17334483 ]

Konstantin Shvachko commented on HDFS-15915:

Attaching a patch to fix the problem. There are a lot of moving parts in asynchronous journal logging, so it took me a while to get it working, although the actual fix doesn't look complex.
# The main idea is that a new txId is assigned to the journal transaction when it is logged by {{logEdit(op)}}, while the call is still under {{fsn.writeLock}}, rather than later in {{logSync()}} as it is now. I think this is the right way to _*guarantee that all transactions are journalled in the same order as they were applied on Active NameNode*_.
# Currently we do not have checks or tests against a mismatch in the order of transactions. This would have been a problem for regular HA with or without Observer. I could not build a test showing that the order of transactions can be tampered with, but I couldn't convince myself it is impossible either. The patch adds asserts to guarantee that the journal txIds are in the same order as the transactions were applied on the ANN.
# I had to rework {{TestEditLogRace.testDeadlock()}}. Changed it to mock {{doEditTransaction()}} instead of {{setTransactionId()}} for the "blocker thread". Also, with {{FSEditLogAsync}} we cannot really reuse the same operation instance for different transactions any more, as each op now has its txid set before syncing. This is [~daryn]'s creation. Would appreciate it if you could take a look.

> Race condition with async edits logging due to updating txId outside of the namesystem log
>
> Key: HDFS-15915
> URL: https://issues.apache.org/jira/browse/HDFS-15915
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs, namenode
> Reporter: Konstantin Shvachko
> Priority: Major
> Attachments: testMkdirsRace.patch
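The main idea in item 1 of the comment above (assign the txId in {{logEdit(op)}} while still under the write lock, then check the order when the op is journalled) can be illustrated with a small self-contained sketch. This is not the HDFS-15915 patch: {{OrderedJournalSketch}}, its {{Op}} class, {{syncOne()}} and {{run()}} are hypothetical names, a plain {{ReentrantReadWriteLock}} stands in for {{fsn.writeLock}}, and the order check is only an assumed stand-in for the asserts the patch adds.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy model only; none of these types are real Hadoop classes. */
class OrderedJournalSketch {
  /** Hypothetical edit op whose txid is fixed at creation time. */
  static final class Op {
    final long txid;
    Op(long txid) { this.txid = txid; }
  }

  // Stand-in for the namesystem lock.
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
  private final BlockingQueue<Op> queue = new LinkedBlockingQueue<>();
  private long lastTxId = 0;        // guarded by fsLock.writeLock()
  private long lastSyncedTxId = 0;  // touched only by the single sync thread

  /** Handler side: assign the txid and enqueue while holding the write lock. */
  long logEdit() {
    fsLock.writeLock().lock();
    try {
      Op op = new Op(++lastTxId);
      queue.add(op);  // queue order now matches txid order
      return op.txid;
    } finally {
      fsLock.writeLock().unlock();
    }
  }

  /** Sync side: dequeue one op and verify that txids arrive without gaps or swaps. */
  long syncOne() {
    final Op op;
    try {
      op = queue.take();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    if (op.txid != lastSyncedTxId + 1) {
      throw new IllegalStateException(
          "txid " + op.txid + " journalled after " + lastSyncedTxId);
    }
    lastSyncedTxId = op.txid;
    return op.txid;
  }

  /** Runs several handler threads against one sync loop; returns the last synced txid. */
  static long run(int handlers, int editsPerHandler) {
    OrderedJournalSketch log = new OrderedJournalSketch();
    Thread[] threads = new Thread[handlers];
    for (int i = 0; i < handlers; i++) {
      threads[i] = new Thread(() -> {
        for (int j = 0; j < editsPerHandler; j++) {
          log.logEdit();
        }
      });
      threads[i].start();
    }
    long last = 0;
    for (int n = 0; n < handlers * editsPerHandler; n++) {
      last = log.syncOne();
    }
    for (Thread t : threads) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return last;
  }

  public static void main(String[] args) {
    System.out.println("last synced txid = " + run(4, 1000));
  }
}
```

Because the txid assignment and the enqueue happen under the same lock in this sketch, the order check in {{syncOne()}} should never fire, no matter how the handler threads interleave.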
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17307463#comment-17307463 ]

Konstantin Shvachko commented on HDFS-15915:

Attached the test reproducing the bug. Looks like [~zero45] warned about this problem in [his comment|https://issues.apache.org/jira/browse/HDFS-13399?focusedCommentId=16454623&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16454623]. I don't remember, though, what the resolution was back then.

> Race condition with async edits logging due to updating txId outside of the namesystem log
>
> Key: HDFS-15915
> URL: https://issues.apache.org/jira/browse/HDFS-15915
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs, namenode
> Reporter: Konstantin Shvachko
> Priority: Major
> Attachments: testMkdirsRace.patch
[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log
[ https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17307458#comment-17307458 ]

Konstantin Shvachko commented on HDFS-15915:

# Suppose two {{mkdirs}} for the same path are running on the Active NameNode at the same time. Assume that the path does not exist yet and that the two RPCs are coming from two clients c1 and c2.
# Then one of them, e.g. c1, will create the directory in memory and generate the respective transaction {{MkdirOp}}, which has all the fields set except for {{txid}}. Then it will enqueue the transaction in {{FSEditLogAsync.logEdit(op)}} for further asynchronous processing. The handler thread processing this RPC from c1 is now free to release the write lock and give control to other threads.
# {{FSEditLogAsync.run()}} will asynchronously process the transaction when it dequeues it. At that time it will assign the {{txid}} for the transaction, see {{logEdit() -> doEditTransaction() -> beginTransaction()}}, and increment the global transaction count {{FSEditLog.txid}}. This can happen either inside or outside of the namesystem lock. Under heavy load (a rare event) the call to {{logEdit()}} can happen outside the lock. And that causes the problem.
# Now suppose that {{MkdirOp}} has not been processed yet, but the second {{mkdirs()}} from client c2 has started executing. It can proceed because the write lock has been released. The c2 call will find that the directory already exists and will return to the client without generating any transactions. In the reply it will populate {{lastSeenStateId}}. But the stateId will be less than the txId of the {{MkdirOp}} that client c2 has just seen, because this transaction has not been processed yet and the global tx count {{FSEditLog.txid}} did not advance.
# Then, of course, going to the ObserverNode with that transaction id can cause a stale read if the client reaches the Observer before it tails the {{MkdirOp}} edit from the journal.

I managed to reproduce this in a unit test. Attaching. The test spawns a bunch of {{mkdirs()}} calls on the same path. Then it mocks {{doEditTransaction()}} to delay async processing of the mkdir transaction on the Active NN. The delay is sufficient for another {{mkdirs()}} call to pass through and obtain the wrong {{lastSeenStateId}}. Then one can see a {{FileNotFoundException}}, which indicates a stale read from the Observer.

_Seems like a straightforward solution is to assign the transaction id at the time of its creation, before it is enqueued. The queue order should guarantee the same result of the assignment as now, but will avoid the race._

> Race condition with async edits logging due to updating txId outside of the namesystem log
>
> Key: HDFS-15915
> URL: https://issues.apache.org/jira/browse/HDFS-15915
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs, namenode
> Reporter: Konstantin Shvachko
> Priority: Major
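The interleaving described in the numbered steps of the comment above can be replayed deterministically in a toy model. The sketch below is not the attached {{testMkdirsRace.patch}} and uses no Hadoop classes: {{TxIdRaceSketch}}, {{ToyEditLog}}, {{drainOne()}} and the {{assignOnEnqueue}} flag are hypothetical names chosen for illustration, and the three calls in {{run()}} simply stand in for "c1 enqueues and releases the lock", "c2 replies with its state id" and "the async thread processes the op".

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy single-threaded replay of the race; not real Hadoop code. */
class TxIdRaceSketch {
  /** Hypothetical edit op; txid stays 0 until it is assigned. */
  static final class Op {
    final String path;
    long txid;
    Op(String path) { this.path = path; }
  }

  /** Hypothetical edit log with a switch for when the txid is assigned. */
  static final class ToyEditLog {
    private final boolean assignOnEnqueue;
    private final Queue<Op> pending = new ArrayDeque<>();
    private long lastTxId = 0;

    ToyEditLog(boolean assignOnEnqueue) {
      this.assignOnEnqueue = assignOnEnqueue;
    }

    /** Handler side: called while the (imaginary) write lock is held. */
    void logEdit(Op op) {
      if (assignOnEnqueue) {
        op.txid = ++lastTxId;  // proposed behaviour: txid set before enqueueing
      }
      pending.add(op);
    }

    /** Async side: called later, possibly after the lock was released. */
    void drainOne() {
      Op op = pending.remove();
      if (!assignOnEnqueue) {
        op.txid = ++lastTxId;  // current behaviour: txid set at processing time
      }
    }

    /** What a handler would report back to the client as its state id. */
    long lastSeenStateId() {
      return lastTxId;
    }
  }

  /** Returns {state id seen by c2, txid of c1's mkdir op}. */
  static long[] run(boolean assignOnEnqueue) {
    ToyEditLog log = new ToyEditLog(assignOnEnqueue);
    Op mkdir = new Op("/a");
    log.logEdit(mkdir);                      // c1: create dir, enqueue, release lock
    long c2StateId = log.lastSeenStateId();  // c2: dir exists, reply with state id
    log.drainOne();                          // async thread processes the op afterwards
    return new long[] {c2StateId, mkdir.txid};
  }

  public static void main(String[] args) {
    long[] late = run(false);
    long[] early = run(true);
    System.out.println("deferred txid:   c2 stateId=" + late[0] + ", mkdir txid=" + late[1]);
    System.out.println("txid at enqueue: c2 stateId=" + early[0] + ", mkdir txid=" + early[1]);
  }
}
```

With deferred assignment c2's state id (0) is behind the mkdir's txid (1), which is the window that lets an Observer serve a stale read; with assignment at enqueue time the two values match.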