[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-30 Thread genericqa (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273631#comment-16273631 ] genericqa commented on HDFS-12638: (x) -1 overall || Vote

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-30 Thread Zhe Zhang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273621#comment-16273621 ] Zhe Zhang commented on HDFS-12638: Thanks [~shv]. +1 on the v3 patch (pretty clear fix t

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-30 Thread Konstantin Shvachko (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273614#comment-16273614 ] Konstantin Shvachko commented on HDFS-12638: Filed HDFS-12880 to restore the B

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-29 Thread Erik Krogen (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272083#comment-16272083 ] Erik Krogen commented on HDFS-12638: I investigated this further and think that Konsta

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-28 Thread Junping Du (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16269808#comment-16269808 ] Junping Du commented on HDFS-12638: Ping... [~szetszwo] and [~jingzhao], do you agree w

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-22 Thread Junping Du (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16263649#comment-16263649 ] Junping Du commented on HDFS-12638: Should be a blocker for 2.9.1 and 3.0.0 as well. Ad

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-22 Thread Konstantin Shvachko (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16263561#comment-16263561 ] Konstantin Shvachko commented on HDFS-12638: I looked more closely through the

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-22 Thread Tsz Wo Nicholas Sze (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16263553#comment-16263553 ] Tsz Wo Nicholas Sze commented on HDFS-12638: Since this is about NameNode fail

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-22 Thread Junping Du (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16263522#comment-16263522 ] Junping Du commented on HDFS-12638: [~jingzhao] and [~szetszwo], any comments on Konsta

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-16 Thread Konstantin Shvachko (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16256293#comment-16256293 ] Konstantin Shvachko commented on HDFS-12638: I think it's a blocker for all br

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-08 Thread Weiwei Yang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245161#comment-16245161 ] Weiwei Yang commented on HDFS-12638: Hi [~shv] Thanks, I see your point. I have incre

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-08 Thread Konstantin Shvachko (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245134#comment-16245134 ] Konstantin Shvachko commented on HDFS-12638: [~wweic] {{addDeleteBlock()}} is

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-30 Thread Weiwei Yang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226111#comment-16226111 ] Weiwei Yang commented on HDFS-12638: Hi [~shv] Thanks for the new patch. One question

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-24 Thread Daryn Sharp (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217402#comment-16217402 ] Daryn Sharp commented on HDFS-12638: Might want to investigate the lifecycle handling

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-23 Thread Weiwei Yang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16215008#comment-16215008 ] Weiwei Yang commented on HDFS-12638: Hi [~shv] I don't think what you suggested could

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-21 Thread Konstantin Shvachko (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214120#comment-16214120 ] Konstantin Shvachko commented on HDFS-12638: It looks to me that the main prob

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-19 Thread Daryn Sharp (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211413#comment-16211413 ] Daryn Sharp commented on HDFS-12638: bq. Yes, I think our code should bear with such o

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-18 Thread Weiwei Yang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210498#comment-16210498 ] Weiwei Yang commented on HDFS-12638: Hi [~daryn] bq. it sounds like you want to avoid

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-18 Thread Daryn Sharp (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209604#comment-16209604 ] Daryn Sharp commented on HDFS-12638: [~cheersyang], maybe I'm misinterpreting the prop

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-17 Thread Konstantin Shvachko (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208104#comment-16208104 ] Konstantin Shvachko commented on HDFS-12638: Hey [~cheersyang], blocks deletio

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-16 Thread Weiwei Yang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206921#comment-16206921 ] Weiwei Yang commented on HDFS-12638: Hi [~shv], the actual deletion of those blocks ar

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206786#comment-16206786 ] Hadoop QA commented on HDFS-12638: (x) -1 overall || Vote

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-16 Thread Weiwei Yang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205861#comment-16205861 ] Weiwei Yang commented on HDFS-12638: Hi [~yangjiandan] Thanks for narrowing down the

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-16 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205739#comment-16205739 ] Jiandan Yang commented on HDFS-12638: Doing truncate & delete files when cluster is

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-13 Thread Daryn Sharp (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203585#comment-16203585 ] Daryn Sharp commented on HDFS-12638: [~shv], you worked on truncate, any further insig

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-13 Thread Daryn Sharp (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203577#comment-16203577 ] Daryn Sharp commented on HDFS-12638: There are 2 likely scenarios: * The block is added

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-13 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203501#comment-16203501 ] Jiandan Yang commented on HDFS-12638: I find another block with the same problem,

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-12 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203022#comment-16203022 ] Jiandan Yang commented on HDFS-12638: [~daryn] NN audit log lost some, and we did n

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-12 Thread Daryn Sharp (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202357#comment-16202357 ] Daryn Sharp commented on HDFS-12638: The issues actually appear unrelated. In our cas

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-12 Thread Daryn Sharp (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202092#comment-16202092 ] Daryn Sharp commented on HDFS-12638: It does differ from our case but likely has the s

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-12 Thread Daryn Sharp (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202079#comment-16202079 ] Daryn Sharp commented on HDFS-12638: This is bad. The fsck NPE means the block _is_ i

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-12 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201540#comment-16201540 ] Jiandan Yang commented on HDFS-12638: datanode recover failed because new blocksize

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-11 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201413#comment-16201413 ] Jiandan Yang commented on HDFS-12638: We found missing blockId by metasave, and did

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-11 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201343#comment-16201343 ] Jiandan Yang commented on HDFS-12638: [~daryn] There is no snapshot directory in cl

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-11 Thread Daryn Sharp (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200469#comment-16200469 ] Daryn Sharp commented on HDFS-12638: [~yangjiandan] do you have any further info you c

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-11 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200361#comment-16200361 ] Kihwal Lee commented on HDFS-12638: We have also seen blocks with "null" bc staying in