[ https://issues.apache.org/jira/browse/YARN-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143523#comment-15143523 ]
Eric Payne commented on YARN-1515:
----------------------------------

Hi [~jira.shegalov]. I would like to see this functionality implemented. We occasionally see containers time out, and it would be good if users could have direct feedback in the form of a jstack to help them debug their applications.

I have been coming up to speed on the work that's already been committed in this area under YARN-445 and its children.

IIUC, YARN-445 and its children put in place the infrastructure for a {{Client -> RM -> NM -> Container}} signal path. On the other hand, this JIRA (along with YARN-1515) implements an {{AM -> NM -> Container}} signal path and the ability to send multiple signals per call. It seems that these pieces could possibly be split into separate JIRAs.

Either way, I think that a lot of what has been done in this JIRA could be used to add the interface to {{ContainerManagementProtocol}} that would allow the AM to prompt the NM to signal the container to dump its stack prior to killing the container on a timeout.

Is there a possibility that this JIRA will move forward? Ideally, we would like it all ported back to 2.7.

> Provide ContainerManagementProtocol#signalContainer processing a batch of signals
> ----------------------------------------------------------------------------------
>
> Key: YARN-1515
> URL: https://issues.apache.org/jira/browse/YARN-1515
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: api, nodemanager
> Reporter: Gera Shegalov
> Assignee: Gera Shegalov
> Attachments: YARN-1515.v01.patch, YARN-1515.v02.patch,
> YARN-1515.v03.patch, YARN-1515.v04.patch, YARN-1515.v05.patch,
> YARN-1515.v06.patch, YARN-1515.v07.patch, YARN-1515.v08.patch
>
>
> This is needed to implement MAPREDUCE-5044 to enable thread diagnostics for timed-out task attempts.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)