[jira] [Updated] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing

2016-08-19 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4837:
--
Attachment: YARN-4837-branch-2.8.1.txt

Updated patch fixing the javadoc issue and the checkstyle issues that can be 
addressed.

The unit test failures are unrelated: they pass on my local box and are tracked 
at YARN-5208 / HADOOP-12687.

> User facing aspects of 'AM blacklisting' feature need fixing
> 
>
> Key: YARN-4837
> URL: https://issues.apache.org/jira/browse/YARN-4837
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Vinod Kumar Vavilapalli
> Priority: Critical
> Attachments: YARN-4837-20160515.txt, YARN-4837-20160520.1.txt, 
> YARN-4837-20160520.txt, YARN-4837-20160527.txt, YARN-4837-20160604.txt, 
> YARN-4837-branch-2.005.patch, YARN-4837-branch-2.8.1.txt, 
> YARN-4837-branch-2.8.txt
>
>
> Was reviewing the user-facing aspects that we are releasing as part of 2.8.0.
> Looking at the 'AM blacklisting feature', I see several things to be fixed 
> before we release it in 2.8.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing

2016-08-18 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4837:
--
Attachment: YARN-4837-branch-2.8.txt

Uploading a 2.8 patch that fixes conflicts, test issues, etc.




[jira] [Updated] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing

2016-06-07 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4837:
-
Attachment: YARN-4837-branch-2.005.patch

Attached patch to branch-2 to trigger Jenkins build.




[jira] [Updated] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing

2016-06-04 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4837:
--
Attachment: YARN-4837-20160604.txt

Updated patch against the latest trunk.

[~kasha]
bq. RMAppAttemptImpl#shouldCountTowardsNodeBlacklisting: For 
ContainerExitStatus.DISKS_FAILED, doesn't that mean at least one disk failed? 
And, the NM can continue running with remaining disks right? Is the idea that 
even if we schedule it to the same node, the NM wouldn't give the same local 
directory? If yes, should we clarify the comment accordingly?
No. A container being marked with ContainerExitStatus.DISKS_FAILED means that 
the node has already been marked unhealthy because most of its disks have 
failed. So, no more containers will be scheduled on that node.

Edited the comment for more clarity to reflect the same.
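
To illustrate the point about ContainerExitStatus.DISKS_FAILED, here is a 
minimal, self-contained sketch of an exit-status check in the spirit of 
RMAppAttemptImpl#shouldCountTowardsNodeBlacklisting. It is not the code from 
the patch: the class name ExitStatusSketch, the locally defined constants 
(assumed to mirror the values in ContainerExitStatus), and the treatment of 
every status other than DISKS_FAILED are assumptions made for illustration.

```java
/** Illustrative sketch only; not the code from the YARN-4837 patch. */
final class ExitStatusSketch {
  // Local stand-ins, assumed to mirror constants in Hadoop's
  // ContainerExitStatus. They are defined here so the sketch compiles alone.
  static final int SUCCESS = 0;
  static final int ABORTED = -100;
  static final int DISKS_FAILED = -101;
  static final int PREEMPTED = -102;

  /** Returns true if an AM container exit should count against its node. */
  static boolean shouldCountTowardsNodeBlacklisting(int exitStatus) {
    switch (exitStatus) {
      case DISKS_FAILED:
        // Per the discussion above: the node is already marked unhealthy,
        // so no more containers go there and no explicit blacklisting is
        // needed.
        return false;
      case SUCCESS:
      case ABORTED:
      case PREEMPTED:
        // Assumption for this sketch: these exits are not the node's fault.
        return false;
      default:
        // Assumption for this sketch: be conservative for anything else.
        return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(shouldCountTowardsNodeBlacklisting(DISKS_FAILED));
    System.out.println(shouldCountTowardsNodeBlacklisting(1));
  }
}
```

Encoding the statuses in an explicit switch keeps the decision for each exit 
status visible in one place, which is the intent described for the patch.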




[jira] [Updated] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing

2016-05-27 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4837:
--
Attachment: YARN-4837-20160527.txt

Updated patch fixing the conflicts.

I can't fix the remaining checkstyle warning (method length is 174 lines), and 
the test failures are unrelated.




[jira] [Updated] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing

2016-05-20 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4837:
--
Attachment: YARN-4837-20160520.1.txt

Updated patch fixing the checkstyle issues and the two tests broken by the 
patch: TestAppSchedulingInfo and TestFSAppAttempt.




[jira] [Updated] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing

2016-05-20 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4837:
--
Attachment: YARN-4837-20160520.txt

Here's an updated patch with fixes and a new test.

[~rohithsharma]
bq. One suggestion is can default value for threshold reduce to less than 50%?
bq. 1) Should we disable am-blacklisting by default?
I would like to tackle both of these as part of YARN-4685 so that others can 
also see the discussion.

bq. 2. I am little bit confused with naming convention for blacklist with 
placesBlacklist. And is there any plan to support blacklist racks in the future?
Yes, that's the idea. In other parts of the code, this list gets passed along 
to filter both nodes and racks.

[~leftnoteasy]
Regarding renames, I've included the ones you pointed out. There are a lot more 
to be done, but I deliberately avoided them given the current size of the patch.

Addressed other comments.

bq. 6) ResourceBlacklistRequest -> (Resource)Place(ment)BlacklistRequest?
This is a public API, so it cannot be renamed now.

Created a new TestNodeBlacklistingOnAMFailures and moved the existing tests 
from TestAMRestart to this new class file. 
testAMBlacklistPreventsRestartOnSameNodeForMinicluster() is a bogus test, so I 
removed it.

[~sunilg]
bq. 1. yarn.resourcemanager.am-scheduling.node-blacklisting-enabled and 
yarn.resourcemanager.am-scheduling.node-blacklisting-disable-threshold to be 
added in yarn-default.xml.
Again, I deliberately deleted them for now. I'd like to discuss their 
re-addition as part of the outcome for YARN-4685.
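
For readers unfamiliar with the two properties named in the quoted comment, the 
following sketch shows one way an enable flag and a disable threshold could 
interact. It assumes the threshold is a fraction of cluster nodes at or above 
which blacklisting is switched off. That reading, the class 
BlacklistThresholdSketch, the method effectiveBlacklist, and the sample values 
are all illustrative assumptions, not the behaviour or defaults of the patch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; not taken from the YARN code base. */
final class BlacklistThresholdSketch {
  // Property names as quoted in the comment above.
  static final String ENABLED_KEY =
      "yarn.resourcemanager.am-scheduling.node-blacklisting-enabled";
  static final String DISABLE_THRESHOLD_KEY =
      "yarn.resourcemanager.am-scheduling.node-blacklisting-disable-threshold";

  /**
   * Returns the nodes to avoid for AM placement. Assumption: once the
   * fraction of failed nodes reaches the threshold, blacklisting is turned
   * off so that the application is not starved of places to run.
   */
  static Set<String> effectiveBlacklist(boolean enabled,
      double disableThreshold, Set<String> failedNodes, int clusterNodes) {
    if (!enabled || clusterNodes <= 0) {
      return Collections.emptySet();
    }
    double fraction = (double) failedNodes.size() / clusterNodes;
    if (fraction >= disableThreshold) {
      return Collections.emptySet();
    }
    return Collections.unmodifiableSet(new TreeSet<>(failedNodes));
  }

  public static void main(String[] args) {
    Set<String> failed = new TreeSet<>(Arrays.asList("node1", "node2"));
    // Hypothetical values: 2 failed nodes out of 10, then out of 3.
    System.out.println(effectiveBlacklist(true, 0.5, failed, 10));
    System.out.println(effectiveBlacklist(true, 0.5, failed, 3));
  }
}
```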




[jira] [Updated] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing

2016-05-15 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4837:
--
Attachment: YARN-4837-20160515.txt

Here's a (longish) patch that addresses some of my proposal above:
 - This still doesn't solve the other bugs like YARN-4685 that can happen with 
the blacklisting logic. There are a bunch of UI bugs too (callers of 
BlacklistManager.getBlacklistUpdates() only look at the additions), sigh.
 - Renamed the configuration names, but marked all the new configurations 
private until we figure out the rest of the story with the bugs.
 - Removed the user facing APIs and related classes: AMBlackListingRequest, 
AMBlackListingRequestPBImpl.java, BlacklistUpdates.java (instead using 
ResourceBlacklistRequest directly), AMBlackListingRequestInfo.java
 - Removed the references to above from ApplicationSubmissionContext, 
yarn_protos.proto, ApplicationSubmissionContextPBImpl.java, RMAppImpl.java, 
ApplicationSubmissionContextInfo.java
 - The above two essentially revert YARN-4389.
 - Changed RMAppAttemptImpl to explicitly encode container exit-statuses in 
deciding when to blacklist nodes
 - Made a bunch of internal renames to better reflect what they are doing.
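
The UI bug mentioned in the first bullet (callers reading only the additions) 
can be illustrated with a small sketch. The BlacklistUpdate class below is a 
hypothetical stand-in for an update that carries both additions and removals. 
It is not the Hadoop ResourceBlacklistRequest class, and the names 
BlacklistViewSketch and applyUpdate are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; not taken from the YARN code base. */
final class BlacklistViewSketch {

  /** Hypothetical stand-in for an update with additions and removals. */
  static final class BlacklistUpdate {
    final List<String> additions;
    final List<String> removals;

    BlacklistUpdate(List<String> additions, List<String> removals) {
      this.additions = additions;
      this.removals = removals;
    }
  }

  /**
   * Applies both halves of the update. A caller that only applied the
   * additions would keep showing nodes that are no longer blacklisted.
   */
  static Set<String> applyUpdate(Set<String> current, BlacklistUpdate update) {
    Set<String> next = new TreeSet<>(current);
    next.addAll(update.additions);
    next.removeAll(update.removals);
    return next;
  }

  public static void main(String[] args) {
    Set<String> current = new TreeSet<>(Arrays.asList("node1", "node2"));
    BlacklistUpdate update = new BlacklistUpdate(
        Arrays.asList("node3"), Arrays.asList("node1"));
    System.out.println(applyUpdate(current, update));
  }
}
```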




[jira] [Updated] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing

2016-05-15 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4837:
--
Target Version/s: 2.8.0
Priority: Critical  (was: Major)

This must go into 2.8.0; marking it so.
