[ https://issues.apache.org/jira/browse/UIMA-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Challenger closed UIMA-2593.
--------------------------------
> RM: Resource Manager mishandling dead node with Work Items in Limbo
> -------------------------------------------------------------------
>
> Key: UIMA-2593
> URL: https://issues.apache.org/jira/browse/UIMA-2593
> Project: UIMA
> Issue Type: Bug
> Components: DUCC
> Reporter: Jim Challenger
> Assignee: Jim Challenger
> Fix For: 1.0-Ducc
>
>
> If a node dies while a work item on it is starting but not yet confirmed, the
> work item goes into Limbo, and RM then keeps allocating new nodes until the
> pool is exhausted. The correct behavior is for RM to allocate only enough
> nodes to make up for the dead one, based on the remaining work.
>
> To reproduce, start a small cluster and submit a job with a couple hundred
> short (5-10 second) work items. Once all nodes are full, send SIGSTOP to one
> agent and JP. This should cause at least one WI to go into Limbo. Once the
> heartbeat counter marks the node as dead, the errant behavior should start.
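
The expected behavior described in the report can be sketched as a bounded calculation: request only the shortfall between the nodes the remaining work needs and the nodes still alive, and never more than the pool has free. This is a minimal illustration, not DUCC's actual RM code. The class name, the method name, and all four parameters are hypothetical, and it assumes that work items in Limbo are counted as remaining work.

```java
public class ReplacementSketch {

    /**
     * Hypothetical helper: how many nodes to request after a node is
     * declared dead.
     *
     * @param remainingWorkItems work items not yet completed (assumed to
     *                           include any items in Limbo)
     * @param slotsPerNode       work items one node can run at once (assumed)
     * @param liveNodes          nodes still allocated to the job and alive
     * @param freeNodes          unallocated nodes left in the pool
     */
    static int nodesToAllocate(int remainingWorkItems, int slotsPerNode,
                               int liveNodes, int freeNodes) {
        if (slotsPerNode <= 0) {
            throw new IllegalArgumentException("slotsPerNode must be positive");
        }
        // Nodes needed to run all remaining work at once (ceiling division).
        int needed = (remainingWorkItems + slotsPerNode - 1) / slotsPerNode;
        // Request only the shortfall; zero if live nodes already cover it.
        int shortfall = Math.max(0, needed - liveNodes);
        // Never request more than the pool can supply.
        return Math.min(shortfall, freeNodes);
    }

    public static void main(String[] args) {
        // 40 items at 4 per node need 10 nodes; 9 are alive, so request 1.
        System.out.println(nodesToAllocate(40, 4, 9, 20));
        // 8 items need 2 nodes; 9 are alive, so request none.
        System.out.println(nodesToAllocate(8, 4, 9, 20));
    }
}
```

Because the result is recomputed from the remaining work each time, repeated evaluation while a work item sits in Limbo yields the same small number rather than growing until the pool is exhausted.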