[jira] [Created] (YARN-1720) QueuePlacementRule.SecondaryGroupExistingQueue should not be terminal
Aditya Acharya created YARN-1720:

Summary: QueuePlacementRule.SecondaryGroupExistingQueue should not be terminal
Key: YARN-1720
URL: https://issues.apache.org/jira/browse/YARN-1720
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Affects Versions: 2.2.0
Reporter: Aditya Acharya
Assignee: Sandy Ryza

The isTerminal() method of the SecondaryGroupExistingQueue QueuePlacementRule should always return false, rather than the value of create, because the rule never actually creates a queue.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
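The behavior requested in YARN-1720 can be sketched roughly as follows. This is a minimal illustration, not the actual Hadoop source: the Rule interface, the assign signature, and the "root." prefix are simplified assumptions made for the example. The point it shows is that a rule which only matches queues that already exist can always fall through, so it reports itself as non-terminal.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PlacementRuleSketch {
    /** Hypothetical, simplified stand-in for a queue placement rule. */
    interface Rule {
        /** Returns a queue name, or "" to fall through to the next rule. */
        String assign(List<String> groups, Set<String> existingQueues);

        /** True only if this rule is guaranteed to place every application. */
        boolean isTerminal();
    }

    /** Places the app in the first existing queue named after a secondary group. */
    static class SecondaryGroupExistingQueue implements Rule {
        @Override
        public String assign(List<String> groups, Set<String> existingQueues) {
            // Index 0 is treated as the primary group, so start at 1.
            for (int i = 1; i < groups.size(); i++) {
                String candidate = "root." + groups.get(i);
                if (existingQueues.contains(candidate)) {
                    return candidate;
                }
            }
            return ""; // no match: let the next rule decide
        }

        @Override
        public boolean isTerminal() {
            // The rule never creates a queue, so it can always fall through.
            return false;
        }
    }

    public static void main(String[] args) {
        Rule rule = new SecondaryGroupExistingQueue();
        Set<String> queues = new HashSet<>(Arrays.asList("root.ops"));
        System.out.println(rule.assign(Arrays.asList("staff", "eng", "ops"), queues));
        System.out.println(rule.isTerminal());
    }
}
```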
[jira] [Updated] (YARN-1630) Unbounded waiting for response in YarnClientImpl.java causes thread to hang forever
[ https://issues.apache.org/jira/browse/YARN-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Acharya updated YARN-1630:
Attachment: diff-1.txt
[jira] [Commented] (YARN-1630) Unbounded waiting for response in YarnClientImpl.java causes thread to hang forever
[ https://issues.apache.org/jira/browse/YARN-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883242#comment-13883242 ]

Aditya Acharya commented on YARN-1630:

Added updated diff with requested changes.
[jira] [Updated] (YARN-1630) Unbounded waiting for response in YarnClientImpl.java causes thread to hang forever
[ https://issues.apache.org/jira/browse/YARN-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Acharya updated YARN-1630:
Attachment: (was: diff-1.txt)
[jira] [Updated] (YARN-1630) Unbounded waiting for response in YarnClientImpl.java causes thread to hang forever
[ https://issues.apache.org/jira/browse/YARN-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Acharya updated YARN-1630:
Attachment: diff-1.txt
[jira] [Commented] (YARN-1630) Unbounded waiting for response in YarnClientImpl.java causes thread to hang forever
[ https://issues.apache.org/jira/browse/YARN-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883585#comment-13883585 ]

Aditya Acharya commented on YARN-1630:

Updated patch, including a unit test this time.
[jira] [Updated] (YARN-1630) Unbounded waiting for response in YarnClientImpl.java causes thread to hang forever
[ https://issues.apache.org/jira/browse/YARN-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Acharya updated YARN-1630:
Attachment: (was: diff.txt)
[jira] [Updated] (YARN-1630) Unbounded waiting for response in YarnClientImpl.java causes thread to hang forever
[ https://issues.apache.org/jira/browse/YARN-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Acharya updated YARN-1630:
Attachment: diff.txt
[jira] [Updated] (YARN-1630) Unbounded waiting for response in YarnClientImpl.java causes thread to hang forever
[ https://issues.apache.org/jira/browse/YARN-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Acharya updated YARN-1630:
Attachment: diff.txt
[jira] [Created] (YARN-1630) Unbounded waiting for response in YarnClientImpl.java causes thread to hang forever
Aditya Acharya created YARN-1630:

Summary: Unbounded waiting for response in YarnClientImpl.java causes thread to hang forever
Key: YARN-1630
URL: https://issues.apache.org/jira/browse/YARN-1630
Project: Hadoop YARN
Issue Type: Bug
Components: client
Affects Versions: 2.2.0
Reporter: Aditya Acharya
Assignee: Aditya Acharya
Attachments: diff.txt

I ran an MR2 application that would have been long running, and killed it programmatically using a YarnClient. The app was killed, but the client hung forever. The message that I saw, which spammed the logs, was:

Watiting for application application_1389036507624_0018 to be killed.

The RM log indicated that the app had indeed transitioned from RUNNING to KILLED, but for some reason subsequent responses to the kill RPC did not indicate that the app had been terminated. I tracked this down to YarnClientImpl.java. Although I was unable to reproduce the bug, I wrote a patch that bounds the number of times YarnClientImpl retries the RPC before giving up.
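The idea of bounding the retries can be sketched as below. This is only an illustration of the technique, not the attached patch: the AppState enum, the waitForKill helper, and its maxAttempts and sleepMillis parameters are hypothetical names, and the Supplier stands in for the RPC that YarnClientImpl repeats.

```java
import java.util.function.Supplier;

class BoundedKillWait {
    /** Hypothetical stand-in for the application state returned by the RPC. */
    enum AppState { RUNNING, KILLED }

    /**
     * Polls until the supplier reports KILLED or maxAttempts polls have been made.
     * Returns true if the kill was observed, false if the caller should give up.
     */
    static boolean waitForKill(Supplier<AppState> poll, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (poll.get() == AppState.KILLED) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false; // bound reached: report failure instead of hanging
    }

    public static void main(String[] args) {
        int[] calls = {0};
        boolean killed = waitForKill(
            () -> ++calls[0] >= 3 ? AppState.KILLED : AppState.RUNNING, 10, 1L);
        System.out.println("killed=" + killed + ", polls=" + calls[0]);

        // A server that never reports KILLED no longer blocks the caller forever.
        boolean neverKilled = waitForKill(() -> AppState.RUNNING, 5, 1L);
        System.out.println("killed=" + neverKilled);
    }
}
```

Returning a boolean leaves the decision about what to do after giving up (throw, log, retry later) to the caller; a real client might prefer to throw an exception instead.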
[jira] [Created] (YARN-1624) QueuePlacementPolicy format is not easily readable via a JAXB parser
Aditya Acharya created YARN-1624:

Summary: QueuePlacementPolicy format is not easily readable via a JAXB parser
Key: YARN-1624
URL: https://issues.apache.org/jira/browse/YARN-1624
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Affects Versions: 2.2.0
Reporter: Aditya Acharya

The current format for specifying queue placement rules in the fair scheduler allocations file does not lend itself to easy parsing via a JAXB parser. In particular, relying on the tag name to encode which rule to use makes it very difficult for an xsd-based JAXB parser to preserve the order of the rules, which is essential.
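One way to avoid encoding the rule type in the tag name is to use a single repeated element and carry the rule type in an attribute, so that a schema can describe the rules as one ordered list. The sketch below assumes such a format; the rule element, its name attribute, and the sample rule names are illustrative assumptions, since the issue text does not spell out the new format. It uses the JDK's DOM parser rather than JAXB, purely to show that document order is trivially preserved when every rule shares one element name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class OrderedRuleParsing {
    /** Returns the name attribute of each rule child element, in document order. */
    static List<String> ruleNames(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse placement policy", e);
        }
        List<String> names = new ArrayList<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes and any non-rule elements.
            if (child instanceof Element && "rule".equals(child.getNodeName())) {
                names.add(((Element) child).getAttribute("name"));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<queuePlacementPolicy>"
            + "<rule name=\"specified\"/>"
            + "<rule name=\"user\"/>"
            + "<rule name=\"default\"/>"
            + "</queuePlacementPolicy>";
        System.out.println(ruleNames(xml));
    }
}
```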
[jira] [Updated] (YARN-1624) QueuePlacementPolicy format is not easily readable via a JAXB parser
[ https://issues.apache.org/jira/browse/YARN-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Acharya updated YARN-1624:
Attachment: diff.txt

Patch that solves the problem.