[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-15 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546568#comment-14546568 ] Lavkesh Lahngir commented on YARN-3591: --- [~vinodkv]: The concern here is, If a resour

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-15 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546590#comment-14546590 ] zhihai xu commented on YARN-3591: - [~lavkesh], Currently DirectoryCollection supports {{ful

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-19 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550566#comment-14550566 ] Hadoop QA commented on YARN-3591: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vo

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-20 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551976#comment-14551976 ] Hadoop QA commented on YARN-3591: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ ||

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-21 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553777#comment-14553777 ] zhihai xu commented on YARN-3591: - [~lavkesh], thanks for the new patch. It looks like your

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-21 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553981#comment-14553981 ] Lavkesh Lahngir commented on YARN-3591: --- The code shows that full dirs are both reada

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-21 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555187#comment-14555187 ] zhihai xu commented on YARN-3591: - Calling checkLocalizedResources() on both goodirs and fu

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-22 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555819#comment-14555819 ] Lavkesh Lahngir commented on YARN-3591: --- Hm.. Got your point. Is DirectoryCollection c

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-22 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555961#comment-14555961 ] Lavkesh Lahngir commented on YARN-3591: --- typo: cleanUpLocalDir(lfs, del, newRepaired

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-22 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555996#comment-14555996 ] Lavkesh Lahngir commented on YARN-3591: --- For adding newErrorDirs do we have to create

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-25 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14558727#comment-14558727 ] zhihai xu commented on YARN-3591: - Yes, I think we can get newErrorDirs and newRepairedDirs

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-02 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568834#comment-14568834 ] Lavkesh Lahngir commented on YARN-3591: --- [~zxu] :Can we get away without storing into

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-02 Thread Sunil G (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569480#comment-14569480 ] Sunil G commented on YARN-3591: --- If we have a new api which returns the present set of error

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-02 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569727#comment-14569727 ] zhihai xu commented on YARN-3591: - Hi [~lavkesh], I think we can create a separate JIRA for

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-03 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570652#comment-14570652 ] Lavkesh Lahngir commented on YARN-3591: --- Thanks [~sunilg] and [~zxu] for comments and

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-08 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576761#comment-14576761 ] zhihai xu commented on YARN-3591: - Hi [~lavkesh], thanks for the update. IMHO, although sto

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-08 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14577124#comment-14577124 ] Lavkesh Lahngir commented on YARN-3591: --- [~zxu]: Thanks for the review and comments.

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-12 Thread Varun Vasudev (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583524#comment-14583524 ] Varun Vasudev commented on YARN-3591: - Sorry for the late response. In my opinion, ther

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-12 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583925#comment-14583925 ] zhihai xu commented on YARN-3591: - Hi [~vvasudev], thanks for the suggestion. It looks like

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-16 Thread Varun Vasudev (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587682#comment-14587682 ] Varun Vasudev commented on YARN-3591: - Lavkesh's original patch did the test regardless

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-17 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590484#comment-14590484 ] zhihai xu commented on YARN-3591: - Hi [~vvasudev], thanks for the explanation. IMHO, if we

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-17 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590576#comment-14590576 ] Hadoop QA commented on YARN-3591: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vo

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-18 Thread Varun Vasudev (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591612#comment-14591612 ] Varun Vasudev commented on YARN-3591: - [~zxu] can you explain how using onChange will h

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-18 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592560#comment-14592560 ] zhihai xu commented on YARN-3591: - Hi [~vvasudev], bq. can you explain how using onChange w

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-19 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593472#comment-14593472 ] Jason Lowe commented on YARN-3591: -- One potential issue with that approach is long-running

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-21 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595250#comment-14595250 ] zhihai xu commented on YARN-3591: - Hi [~jlowe], thanks for the thorough analysis. My assump

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-06-30 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608089#comment-14608089 ] Lavkesh Lahngir commented on YARN-3591: --- Thanks [~jlowe] and [~zxu] for detailed anal

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-07-21 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634635#comment-14634635 ] Lavkesh Lahngir commented on YARN-3591: --- Hi [~jlowe], Can we get some input on the pr

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-07-21 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635292#comment-14635292 ] Jason Lowe commented on YARN-3591: -- Sorry for the delay, as I was on vacation and am still

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-07-21 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635499#comment-14635499 ] zhihai xu commented on YARN-3591: - +1 for [~jlowe]'s comment. Yes, it fixes some problems w

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-07 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532567#comment-14532567 ] Lavkesh Lahngir commented on YARN-3591: --- example: >>stat /data/d3/yarn/local File: `

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-07 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532675#comment-14532675 ] Hadoop QA commented on YARN-3591: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vo

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-12 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540715#comment-14540715 ] zhihai xu commented on YARN-3591: - Hi [~lavkesh], thanks for working on this issue. It look

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-12 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541464#comment-14541464 ] Lavkesh Lahngir commented on YARN-3591: --- Thanks [~zxu] for comments. added a null ch

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-13 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541525#comment-14541525 ] Hadoop QA commented on YARN-3591: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vo

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-13 Thread Varun Vasudev (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541705#comment-14541705 ] Varun Vasudev commented on YARN-3591: - [~zxu], [~lavkesh] - instead of checking listing

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-13 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542291#comment-14542291 ] Lavkesh Lahngir commented on YARN-3591: --- [~vvasudev] Thanks for the review: This is a

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-13 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542347#comment-14542347 ] zhihai xu commented on YARN-3591: - [~vvasudev], that is a good suggestion, which will give

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-14 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543304#comment-14543304 ] Lavkesh Lahngir commented on YARN-3591: --- Thanks for the comment [~zxu] and [~vvasudev

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-14 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543886#comment-14543886 ] Lavkesh Lahngir commented on YARN-3591: --- LocalResourcesTrackerImpl keeps a ref count

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-14 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544708#comment-14544708 ] zhihai xu commented on YARN-3591: - I think the current code calls {{removeResource}} instead

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-15 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545250#comment-14545250 ] Lavkesh Lahngir commented on YARN-3591: --- What about zombie files lying in the various

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-15 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546213#comment-14546213 ] Vinod Kumar Vavilapalli commented on YARN-3591: --- Essentially keeping the owne

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-15 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546461#comment-14546461 ] zhihai xu commented on YARN-3591: - [~vinodkv], yes, keeping the ownership of turning disks

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-08-07 Thread Lavkesh Lahngir (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661606#comment-14661606 ] Lavkesh Lahngir commented on YARN-3591: --- Marking sub-tasks to be invalid. > Resource

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-08-07 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661674#comment-14661674 ] Hadoop QA commented on YARN-3591: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vo

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-08-07 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661836#comment-14661836 ] Hadoop QA commented on YARN-3591: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vo

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-09-02 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727155#comment-14727155 ] Hadoop QA commented on YARN-3591: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vo

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-09-02 Thread Varun Vasudev (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727698#comment-14727698 ] Varun Vasudev commented on YARN-3591: - Thanks for the latest patch Lavkesh! Couple of c

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-09-03 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729674#comment-14729674 ] Hadoop QA commented on YARN-3591: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vo

[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-09-03 Thread Varun Vasudev (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730374#comment-14730374 ] Varun Vasudev commented on YARN-3591: - +1 for the latest patch. I'll commit this tomorr