[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607968#comment-16607968 ]

Hadoop QA commented on HDFS-12136:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} HDFS-12136 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-12136 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877142/HDFS-12136.trunk.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/25007/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> BlockSender performance regression due to volume scanner edge case
> --
>
> Key: HDFS-12136
> URL: https://issues.apache.org/jira/browse/HDFS-12136
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.8.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Priority: Critical
> Attachments: HDFS-12136.branch-2.patch, HDFS-12136.trunk.patch
>
>
> HDFS-11160 attempted to fix a volume scan race for a file appended mid-scan
> by reading the last checksum of finalized blocks within the {{BlockSender}}
> ctor. Unfortunately it's holding the exclusive dataset lock to open and read
> the metafile multiple times. Block sender instantiation becomes serialized.
> Performance completely collapses under heavy disk i/o utilization or high
> xceiver activity. Ex. lost node replication, balancing, or decommissioning.
> The xceiver threads congest creating block senders and impair the heartbeat
> processing that is contending for the same lock. Combined with other lock
> contention issues, pipelines break and nodes sporadically go dead.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
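The issue description's central point, that file I/O performed while an exclusive lock is held serializes every caller, can be illustrated with a minimal Java sketch. This is not the HDFS-12136 patch and not HDFS code: the class `LockScopeSketch`, its fields, and both method names are hypothetical, and the `ReentrantLock` merely stands in for the dataset lock. The first method shows the problematic shape (disk read inside the lock); the second copies the in-memory state under the lock and performs the read after releasing it.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; names do not correspond to HDFS classes.
class LockScopeSketch {
  // Stands in for the exclusive dataset lock.
  private final ReentrantLock datasetLock = new ReentrantLock();
  // In-memory replica state guarded by the lock (assumed fields).
  private final Path metaFile;
  private final long genStamp;

  LockScopeSketch(Path metaFile, long genStamp) {
    this.metaFile = metaFile;
    this.genStamp = genStamp;
  }

  // Immutable copy of the guarded state, taken while holding the lock.
  static final class Snapshot {
    final Path metaFile;
    final long genStamp;
    Snapshot(Path metaFile, long genStamp) {
      this.metaFile = metaFile;
      this.genStamp = genStamp;
    }
  }

  // Problematic shape: the disk read happens while the lock is held,
  // so every other caller waits for this thread's I/O to finish.
  byte[] readLastChecksumUnderLock(int checksumSize) throws IOException {
    datasetLock.lock();
    try {
      return readTail(metaFile, checksumSize);
    } finally {
      datasetLock.unlock();
    }
  }

  // Alternative shape: only the in-memory copy is made under the lock;
  // the disk read runs after the lock has been released.
  byte[] readLastChecksumOutsideLock(int checksumSize) throws IOException {
    Snapshot snap;
    datasetLock.lock();
    try {
      snap = new Snapshot(metaFile, genStamp);
    } finally {
      datasetLock.unlock();
    }
    return readTail(snap.metaFile, checksumSize);
  }

  // Reads the last n bytes of a file (or the whole file if it is shorter).
  static byte[] readTail(Path file, int n) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
      long len = raf.length();
      int toRead = (int) Math.min(n, len);
      byte[] buf = new byte[toRead];
      raf.seek(len - toRead);
      raf.readFully(buf);
      return buf;
    }
  }
}
```

Reading outside the lock means the file may change between the snapshot and the read, so a caller using this shape would still need some way to detect a concurrent append (for example, by comparing the snapshotted generation stamp afterwards); how the real patch handles that is not shown here.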
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466776#comment-16466776 ]

Junping Du commented on HDFS-12136:
---

Looks like HDFS-11187 may not cover all of the fixes here. [~jojochuang], do you have further comments? Moving it to 2.8.5, as we need more discussion and 2.8.4 is in the RC stage.
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459058#comment-16459058 ]

Daryn Sharp commented on HDFS-12136:

Did a cursory scan of HDFS-11187 and I'm not sure if it supersedes this one. We're still running with my patch internally. In addition to eliminating i/o in the lock, this patch tweaked the volume scanner to intelligently handle checksum errors due to genstamp updates. I didn't look that hard into the other patch. Is that scenario an impossible condition now?
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459023#comment-16459023 ]

Daryn Sharp commented on HDFS-12136:

I'll study the other Jira to see if it's a dup.
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457443#comment-16457443 ]

genericqa commented on HDFS-12136:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} HDFS-12136 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-12136 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877142/HDFS-12136.trunk.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24104/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457439#comment-16457439 ]

Junping Du commented on HDFS-12136:
---

Thanks [~jojochuang] for the comments. I agree with you that the previous lock issue was resolved in HDFS-11187. [~daryn], I will go ahead and resolve this as a duplicate of HDFS-11187 if you also agree.
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437193#comment-16437193 ]

Wei-Chiu Chuang commented on HDFS-12136:

I believe this issue no longer exists after HDFS-11187 (fixed in 2.8.4).
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437109#comment-16437109 ]

Junping Du commented on HDFS-12136:
---

Hi [~daryn], I see this issue has been quiet for a while. Can we move it to the next release, given that 2.8.4 is on the release track?
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354973#comment-16354973 ]

genericqa commented on HDFS-12136:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} HDFS-12136 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-12136 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877142/HDFS-12136.trunk.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/22971/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16145433#comment-16145433 ]

Brahma Reddy Battula commented on HDFS-12136:
-

bq. so we could target 2.8.3 for the fix.

Updated the target version accordingly.
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142118#comment-16142118 ]

Kihwal Lee commented on HDFS-12136:
---

I think the performance impact is less severe after HDFS-12157, so we could target 2.8.3 for the fix.
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16097190#comment-16097190 ]

Wei-Chiu Chuang commented on HDFS-12136:

HDFS-6804 is another facet of the same race condition. [~brahmareddy] made a unit test that reliably reproduces the bug if HDFS-11160 is reverted.
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096405#comment-16096405 ]

Daryn Sharp commented on HDFS-12136:

[~jojochuang], do you have a unit test that can reliably expose the issue? We run lots of Spark apps and didn't see the problem in 2.7, or maybe it happens so infrequently that nobody notices.

The basic problem is that doing IO in the lock is completely unacceptable. Basic things like a failing drive, a heavily utilized drive, or high replication from decommissioning or a lost node will jam up the DN. Just one slow drive will ruin the DN.

Here's how that becomes catastrophic: Under sufficiently high load, DNs congest with BlockSenders serially computing checksums. Start decomming, and the heartbeat thread receives replication commands. Instantiating BlockSenders contends with the backlogged xceiver threads. Heartbeats are now delayed. Meanwhile, clients time out and pipelines collapse after 45s. Clients reconstruct the pipeline, but the prior xceivers may be blocked creating the BlockSender, not knowing the client disconnected. Blocked threads go up, eventually hitting the xceiver thread limit (4k for us). Surprisingly, the heartbeat thread can eventually receive such a small share of the lock that 10 mins elapse and the node goes "dead". Now the replication monitor issues even more replications, causing even more congestion in other nodes. In a real incident, we had ~6 random nodes flapping for hours, causing sporadic missing blocks, and jobs were taking forever.

Kihwal and I think we might know a few approaches to further fix the original issue. However, at this time I'd argue that undoing the cluster-damaging performance is more important than completely fixing the reader/append race.
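As a rough back-of-the-envelope check on the cascade described above, the following Java sketch estimates how long each lock holder would need to spend on I/O for a newly queued thread to wait ten minutes behind a full xceiver pool. It is a simplified model, not a measurement: it assumes strict first-in, first-out hand-off of the lock (the stack traces in this thread show a nonfair `ReentrantLock`, which behaves differently), it reads "4k" as 4096 threads, and the class and method names are hypothetical.

```java
// Simplified FIFO model of lock queueing; all names are illustrative.
class HeartbeatDelayEstimate {

  // Wait (ms) for a thread that queues behind `queuedThreads` holders,
  // each of which keeps the lock for `holdMillis` of I/O.
  static long waitMillis(int queuedThreads, long holdMillis) {
    return (long) queuedThreads * holdMillis;
  }

  // Smallest per-holder hold time (ms, rounded up) at which the wait
  // reaches `targetDelayMillis`.
  static long holdMillisToReach(long targetDelayMillis, int queuedThreads) {
    return (targetDelayMillis + queuedThreads - 1) / queuedThreads;
  }

  public static void main(String[] args) {
    int xceivers = 4096;                 // "4k" limit, taken as 4096 (assumption)
    long deadAfterMillis = 10 * 60 * 1000L; // the 10 minutes mentioned above
    long hold = holdMillisToReach(deadAfterMillis, xceivers);
    System.out.println("per-holder I/O time needed: " + hold + " ms");
    System.out.println("resulting wait: " + waitMillis(xceivers, hold) + " ms");
  }
}
```

Under these assumptions the required per-holder time is well within what a single slow or saturated drive can produce per metafile read, which is consistent with the observation that one bad disk can stall the whole DataNode.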
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091766#comment-16091766 ]

Wei-Chiu Chuang commented on HDFS-12136:

Hi [~daryn], sorry to come to this late. Thanks for the patch, and thanks [~kihwal] for the stack trace. No doubt this is a serious performance regression.

I want to emphasize that the initial (false positive) corruption is due to a race condition between concurrent readers and writers. While HDFS-11160 made it seem like it only happens to VolumeScanner, the thing is that this can happen to any reader. When a reader thinks it has hit a checksum corruption, it reports it to the NN, which removes the block replica. This happens very frequently for a Spark Streaming application, where data is being read in real time while data is being ingested.

If you want to go this route, please add the same check on the dfsclient reader side. For example, when it receives a checksum error, read again to weed out false positives caused by the race condition.
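The suggestion above, re-reading after a checksum error before treating the replica as corrupt, could be sketched roughly as follows. This is a generic Java illustration, not DFSClient code: `Chunk`, `crcOf`, and `isCorrupt` are hypothetical names, and `java.util.zip.CRC32` is used only as a convenient stand-in for whatever checksum the block actually carries.

```java
import java.util.function.Supplier;
import java.util.zip.CRC32;

// Hypothetical sketch of "re-read before reporting corruption".
class RereadOnMismatchSketch {

  // A data chunk together with the checksum stored for it.
  static final class Chunk {
    final byte[] data;
    final long storedCrc;
    Chunk(byte[] data, long storedCrc) {
      this.data = data;
      this.storedCrc = storedCrc;
    }
  }

  // Computes the CRC32 of a byte array.
  static long crcOf(byte[] data) {
    CRC32 crc = new CRC32();
    crc.update(data, 0, data.length);
    return crc.getValue();
  }

  // Returns true only if the chunk fails verification on every attempt.
  // Each attempt calls the reader again, so a mismatch caused by reading
  // data and checksum at different moments of a concurrent append gets a
  // chance to resolve itself before anything is reported.
  static boolean isCorrupt(Supplier<Chunk> reader, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      Chunk c = reader.get();
      if (crcOf(c.data) == c.storedCrc) {
        return false; // verified on this attempt; treat earlier failures as transient
      }
    }
    return true; // consistently mismatched; only now worth reporting
  }
}
```

A caller would report the replica to the NameNode only when `isCorrupt` returns true. The number of attempts and any delay between them are tuning choices this sketch leaves open.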
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088124#comment-16088124 ]

Kihwal Lee commented on HDFS-12136:
---

[~jojochuang], we started seeing significant performance regression after increased I/O activities. Jstacking has revealed that DataXceiver threads are all waiting for the dataset impl lock. When the I/O load is reasonable, this might not be visible.

{noformat}
"org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@61a9d939" #351184 daemon prio=5 os_prio=0 tid=0x7f94ddf0a000 nid=0xafef waiting on condition [0x7f94c1d4f000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for <0xd55efd28> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.apache.hadoop.hdfs.InstrumentedLock.lock(InstrumentedLock.java:102)
	at org.apache.hadoop.util.AutoCloseableLock.acquire(AutoCloseableLock.java:67)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.acquireDatasetLock(FsDatasetImpl.java:3274)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:252)
	at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2348)
	at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
	- None
{noformat}

{noformat}
"DataXceiver for client DFSClient_xxx [Sending block xxx]" #351183 daemon prio=5 os_prio=0 tid=0x0409b000 nid=0xafee waiting on condition [0x7f94c9f49000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for <0xd55efd28> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.apache.hadoop.hdfs.InstrumentedLock.lock(InstrumentedLock.java:102)
	at org.apache.hadoop.util.AutoCloseableLock.acquire(AutoCloseableLock.java:67)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.acquireDatasetLock(FsDatasetImpl.java:3274)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:252)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:580)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:145)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:100)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:288)
	at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
	- None
{noformat}
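When a thread dump contains thousands of entries like the two above, it can help to count how many threads are parked on each lock address. The following Java sketch does that for the text of a jstack dump. It is a small standalone helper written for illustration, not part of Hadoop; the class name `ParkedThreadCounter` is hypothetical, and it relies only on the `parking to wait for <address>` line format visible in the traces above.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper: groups parked threads in a jstack dump by lock address.
class ParkedThreadCounter {

  // Matches lines such as "- parking to wait for <0xd55efd28> (a ...)".
  private static final Pattern WAIT =
      Pattern.compile("parking to wait for\\s+<(0x[0-9a-fA-F]+)>");

  // Returns lock address -> number of threads parked waiting for it.
  static Map<String, Integer> countWaiters(String dump) {
    Map<String, Integer> counts = new TreeMap<>();
    Matcher m = WAIT.matcher(dump);
    while (m.find()) {
      counts.merge(m.group(1), 1, Integer::sum);
    }
    return counts;
  }
}
```

An address with a disproportionately large count, as `0xd55efd28` would have here, points at the lock that most threads are queued on.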
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087778#comment-16087778 ]

Hadoop QA commented on HDFS-12136:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 34s | trunk passed |
| +1 | compile | 0m 52s | trunk passed |
| +1 | checkstyle | 0m 36s | trunk passed |
| +1 | mvnsite | 0m 52s | trunk passed |
| -1 | findbugs | 1m 39s | hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. |
| +1 | javadoc | 0m 40s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 0m 45s | the patch passed |
| +1 | javac | 0m 45s | the patch passed |
| +1 | checkstyle | 0m 33s | the patch passed |
| +1 | mvnsite | 0m 51s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 46s | the patch passed |
| +1 | javadoc | 0m 39s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 65m 46s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 93m 15s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12136 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877142/HDFS-12136.trunk.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 2376c0df5944 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 75c0220 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/20277/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/20277/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/20277/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20277/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087608#comment-16087608 ]

Daryn Sharp commented on HDFS-12136:

The failed tests either aren't failing for me, or are failing with or without this patch. The volume failure tests are very flaky. Apparently races let volume references leak, so the cluster can't shut down and timeouts occur. Will kick the build again to cross-compare test failures.

> BlockSender performance regression due to volume scanner edge case
> --
>
> Key: HDFS-12136
> URL: https://issues.apache.org/jira/browse/HDFS-12136
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.8.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Priority: Critical
> Attachments: HDFS-12136.branch-2.patch, HDFS-12136.trunk.patch
>
>
> HDFS-11160 attempted to fix a volume scan race for a file appended mid-scan
> by reading the last checksum of finalized blocks within the {{BlockSender}}
> ctor. Unfortunately it's holding the exclusive dataset lock to open and read
> the metafile multiple times. Block sender instantiation becomes serialized.
> Performance completely collapses under heavy disk i/o utilization or high
> xceiver activity. Ex. lost node replication, balancing, or decommissioning.
> The xceiver threads congest creating block senders and impair the heartbeat
> processing that is contending for the same lock. Combined with other lock
> contention issues, pipelines break and nodes sporadically go dead.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
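The serialization described in the issue text can be illustrated with a small, self-contained sketch. It is not the HDFS code and not necessarily what the attached patches do: `LockScopeSketch`, `DATASET_LOCK`, `maxConcurrentIo`, and `simulatedRead` are hypothetical names, and `Thread.sleep` stands in for opening and reading a metafile. The sketch only shows that when simulated I/O runs while an exclusive lock is held, at most one "sender construction" can be in its I/O phase at a time, whereas keeping the locked region to an in-memory step lets the I/O phases overlap.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration only; not BlockSender or FsDatasetImpl. */
final class LockScopeSketch {
  // Stands in for the exclusive dataset lock (assumption for illustration).
  private static final ReentrantLock DATASET_LOCK = new ReentrantLock();

  /**
   * Runs {@code threads} simulated sender constructions and returns the peak
   * number of simulated metafile reads that were in flight at the same time.
   */
  static int maxConcurrentIo(boolean ioUnderLock, int threads, long ioMillis) {
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    CountDownLatch start = new CountDownLatch(1);
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      for (int i = 0; i < threads; i++) {
        pool.execute(() -> {
          try {
            start.await(); // release all workers together
            if (ioUnderLock) {
              // Contended shape: the slow read happens while the lock is held.
              DATASET_LOCK.lock();
              try {
                simulatedRead(inFlight, peak, ioMillis);
              } finally {
                DATASET_LOCK.unlock();
              }
            } else {
              // Narrow shape: only a quick in-memory step is done under the lock.
              DATASET_LOCK.lock();
              try {
                // e.g. look up replica state (no I/O here)
              } finally {
                DATASET_LOCK.unlock();
              }
              simulatedRead(inFlight, peak, ioMillis);
            }
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        });
      }
      start.countDown();
      pool.shutdown();
      if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdownNow();
    }
    return peak.get();
  }

  // Sleeps to imitate disk latency and records how many reads overlap.
  private static void simulatedRead(AtomicInteger inFlight, AtomicInteger peak,
      long ioMillis) throws InterruptedException {
    int now = inFlight.incrementAndGet();
    peak.accumulateAndGet(now, Math::max);
    try {
      Thread.sleep(ioMillis);
    } finally {
      inFlight.decrementAndGet();
    }
  }

  public static void main(String[] args) {
    System.out.println("peak overlap, I/O under lock:   " + maxConcurrentIo(true, 8, 50));
    System.out.println("peak overlap, I/O outside lock: " + maxConcurrentIo(false, 8, 50));
  }
}
```

With the I/O under the lock the peak overlap is 1 by construction, so eight 50 ms reads take roughly eight times as long as one. With the I/O outside the lock the observed peak depends on scheduling but is normally well above 1. Whether the read can safely be moved out of the lock in the real datanode depends on the race HDFS-11160 was guarding against, which this sketch does not model.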
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086435#comment-16086435 ]

Hadoop QA commented on HDFS-12136:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 14m 15s | trunk passed |
| +1 | compile | 0m 47s | trunk passed |
| +1 | checkstyle | 0m 36s | trunk passed |
| +1 | mvnsite | 0m 52s | trunk passed |
| -1 | findbugs | 1m 39s | hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. |
| +1 | javadoc | 0m 38s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 0m 45s | the patch passed |
| +1 | javac | 0m 45s | the patch passed |
| +1 | checkstyle | 0m 33s | the patch passed |
| +1 | mvnsite | 0m 51s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 46s | the patch passed |
| +1 | javadoc | 0m 37s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 82m 31s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 108m 39s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
| | hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
| Timed out junit tests | org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12136 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877142/HDFS-12136.trunk.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux f92003b5745c 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 945c095 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/20263/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/20263/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results |
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086236#comment-16086236 ]

Wei-Chiu Chuang commented on HDFS-12136:

For more context, there is a Spark Streaming use case that would see very frequent HDFS file corruption without HDFS-11160 (several corrupt files in a day). So I would hope you are not considering reverting HDFS-11160.
[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case
[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086233#comment-16086233 ]

Wei-Chiu Chuang commented on HDFS-12136:

Hi [~daryn], thanks for reporting this. I am aware that the fix in HDFS-11160 could potentially have the problem that you described. But choosing between correctness and performance, I would choose the former.

That said, HDFS-11187 is a possible optimization; perhaps you can consider it? I did not push HDFS-11187 forward because HDFS-11160 has been running internally at our end for the past few months, has been distributed in a number of CDH releases, and we have not yet received any reports of performance anomalies.