[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2020-12-10 Thread Hemanth Boyina (Jira)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17247202#comment-17247202 ] Hemanth Boyina commented on HDFS-13314: --- Thanks for the discussions here {quote}*bq. WithCount

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-30 Thread Yongjun Zhang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420978#comment-16420978 ] Yongjun Zhang commented on HDFS-13314: -- {quote} Hi Yongjun, thanks for looking at the Jira! Please

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-30 Thread Yongjun Zhang (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420973#comment-16420973 ] Yongjun Zhang commented on HDFS-13314: -- I had a couple of email exchanges with [~arpitagarwal] {quote}

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-29 Thread Tsz Wo Nicholas Sze (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419417#comment-16419417 ] Tsz Wo Nicholas Sze commented on HDFS-13314: {quote} The test case would protect this feature

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-29 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419307#comment-16419307 ] Arpit Agarwal commented on HDFS-13314: -- bq. In most of the patches that I have submitted, writing

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-29 Thread Rushabh S Shah (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419273#comment-16419273 ] Rushabh S Shah commented on HDFS-13314: --- bq. We test this code path since we have many unit tests

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418042#comment-16418042 ] Hudson commented on HDFS-13314: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13896 (See

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-27 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416260#comment-16416260 ] Arpit Agarwal commented on HDFS-13314: -- bq. if numErrors == 0 then namenode should not exit. We test

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-27 Thread Tsz Wo Nicholas Sze (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416209#comment-16416209 ] Tsz Wo Nicholas Sze commented on HDFS-13314: More details: The suggested unit test sounds like

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-27 Thread Tsz Wo Nicholas Sze (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416199#comment-16416199 ] Tsz Wo Nicholas Sze commented on HDFS-13314: [~shahrs87], imho, the unit test you suggested

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-27 Thread Rushabh S Shah (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416046#comment-16416046 ] Rushabh S Shah commented on HDFS-13314: --- bq. perhaps we could do some ugly fault injection to create

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-27 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415919#comment-16415919 ] Arpit Agarwal commented on HDFS-13314: -- Thanks for reviewing the patch. I don't see an easy way to

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-27 Thread Rushabh S Shah (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415743#comment-16415743 ] Rushabh S Shah commented on HDFS-13314: --- Overall, the changes look good. I would like to see a test

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-27 Thread Rushabh S Shah (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415676#comment-16415676 ] Rushabh S Shah commented on HDFS-13314: --- I will review it today. > NameNode should optionally exit

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-26 Thread Tsz Wo Nicholas Sze (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414882#comment-16414882 ] Tsz Wo Nicholas Sze commented on HDFS-13314: +1 the 05 patch looks good. > NameNode should

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-22 Thread genericqa (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410615#comment-16410615 ] genericqa commented on HDFS-13314: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-22 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410236#comment-16410236 ] Arpit Agarwal commented on HDFS-13314: -- v05 patch: Remove the config key which made this behavior

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-22 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409920#comment-16409920 ] Arpit Agarwal commented on HDFS-13314: -- bq. not halting the NN risks removing the only good image.

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-22 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409895#comment-16409895 ] Arpit Agarwal commented on HDFS-13314: -- bq. Yes, no config option. Detected corruption =

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-22 Thread Daryn Sharp (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409860#comment-16409860 ] Daryn Sharp commented on HDFS-13314: bq. Is there anything you suggest doing differently? Yes, no

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-22 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409657#comment-16409657 ] Arpit Agarwal commented on HDFS-13314: -- Thanks [~szetszwo]. I'll hold off committing in case [~daryn]

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-22 Thread genericqa (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409463#comment-16409463 ] genericqa commented on HDFS-13314: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-21 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408443#comment-16408443 ] Arpit Agarwal commented on HDFS-13314: -- Thanks [~szetszwo]. The v4 patch removes savedImage and

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-20 Thread Tsz Wo Nicholas Sze (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407400#comment-16407400 ] Tsz Wo Nicholas Sze commented on HDFS-13314: [~arpitagarwal], thanks for the update. I have

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-20 Thread genericqa (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407364#comment-16407364 ] genericqa commented on HDFS-13314: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-20 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407196#comment-16407196 ] Arpit Agarwal commented on HDFS-13314: -- bq. How is the "safe" choice to knowingly write a corrupt

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-20 Thread Daryn Sharp (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407124#comment-16407124 ] Daryn Sharp commented on HDFS-13314: I think Rushabh thought the "don't exit" option didn't delete

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-20 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406982#comment-16406982 ] Arpit Agarwal commented on HDFS-13314: -- v03 patch addresses feedback from [~szetszwo]. > NameNode

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-20 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406819#comment-16406819 ] Arpit Agarwal commented on HDFS-13314: -- [~shahrs87] I am unsure how your question relates to this

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-20 Thread Rushabh S Shah (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406798#comment-16406798 ] Rushabh S Shah commented on HDFS-13314: --- bq. Checkpointing is done by the standby. But I don't need

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-20 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406753#comment-16406753 ] Arpit Agarwal commented on HDFS-13314: -- Thanks for the look [~xiaochen]. bq. I'm inclined to agree

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-20 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406741#comment-16406741 ] Arpit Agarwal commented on HDFS-13314: -- bq. I don't understand why it is impossible. Why do I need to

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-20 Thread Rushabh S Shah (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406429#comment-16406429 ] Rushabh S Shah commented on HDFS-13314: --- bq. Impossible, as you will need to restart the standby to

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-20 Thread Xiao Chen (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405890#comment-16405890 ] Xiao Chen commented on HDFS-13314: -- Thanks [~arpitagarwal] and all for the effort here. Also ping

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-19 Thread genericqa (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405647#comment-16405647 ] genericqa commented on HDFS-13314: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-19 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405556#comment-16405556 ] Arpit Agarwal commented on HDFS-13314: -- Hi Rushabh, bq. You need to change the namenode code and

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-19 Thread Rushabh S Shah (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405506#comment-16405506 ] Rushabh S Shah commented on HDFS-13314: --- bq. In the cases we ran into, the corrupted image was

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-19 Thread Tsz Wo Nicholas Sze (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405475#comment-16405475 ] Tsz Wo Nicholas Sze commented on HDFS-13314: Thanks [~arpitagarwal], some comments on the

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-19 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405451#comment-16405451 ] Arpit Agarwal commented on HDFS-13314: -- Thanks for the look Rushabh. In the cases we ran into, the

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-19 Thread Rushabh S Shah (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405429#comment-16405429 ] Rushabh S Shah commented on HDFS-13314: --- Just curious why we want to go ahead and still write the

[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

2018-03-19 Thread Arpit Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405373#comment-16405373 ] Arpit Agarwal commented on HDFS-13314: -- We've seen two FsImage corruption symptoms correlated with