Akshay Agarwal created YARN-8671:
------------------------------------

             Summary: Container Launch failed stating "TaskAttempt killed because it ran on unusable node , Container released on a *lost* node"
                 Key: YARN-8671
                 URL: https://issues.apache.org/jira/browse/YARN-8671
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn
    Affects Versions: 3.1.1
            Reporter: Akshay Agarwal

Pre-requisites:
{code:java}
1. Install HA cluster.
2. Set yarn.nodemanager.opportunistic-containers-max-queue-length=(positive integer value) [NodeManager->yarn-site.xml]
3. Set yarn.resourcemanager.opportunistic-container-allocation.enabled=true [ResourceManager->yarn-site.xml]
{code}

Steps to reproduce:
{code:java}
1. Keep all NodeManagers up.
2. Stop 2 NodeManagers and immediately follow step 3.
3. Submit a job with -Dmapreduce.job.num-opportunistic-maps-percent="100" and run it with 50 mappers.
{code}

Expected Result:
{code:java}
Job should be successful.
{code}

Actual Result:
{code:java}
Job succeeds, but some containers fail stating "TaskAttempt killed because it ran on unusable node, Container released on a *lost* node"
{code}

Log Details:
{code:java}
TaskAttempt killed because it ran on unusable node
Container released on a *lost* node
Container launch failed for container_1534149133116_0019_01_000006 : java.net.ConnectException: Call From hostname/IP to hostname:portNumber failed on connection exception: java.net.ConnectException:
{code}
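
For anyone reproducing this, the two prerequisite settings can be written out as yarn-site.xml property entries. The sketch below is only an illustration: it renders the entries with Python's standard library, the queue length of 10 is an assumed placeholder for "positive integer value", and the helper names (build_properties, render) are hypothetical and not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Assumed example value; the report only requires a positive integer.
NM_PROPS = {
    "yarn.nodemanager.opportunistic-containers-max-queue-length": 10,
}
RM_PROPS = {
    "yarn.resourcemanager.opportunistic-container-allocation.enabled": "true",
}


def build_properties(props):
    """Return a <configuration> element with one <property> per entry."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return root


def render(props):
    """Serialize the properties as an XML string for yarn-site.xml."""
    root = build_properties(props)
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print("NodeManager yarn-site.xml:")
    print(render(NM_PROPS))
    print("ResourceManager yarn-site.xml:")
    print(render(RM_PROPS))
```

The printed property elements would be merged into the existing configuration element of each daemon's yarn-site.xml, followed by a restart of that daemon.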