[ 
https://issues.apache.org/jira/browse/TEZ-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Zhurakousky updated TEZ-1206:
----------------------------------

    Attachment: TEZ-1206.patch

- Refactored DAGAppMaster to simplify main(..) and improve lifecycle testing
- Eliminated the dependency on _System.exit_, while keeping the call in place
as a fallback. In time it could/should be removed.
- Removed manual Thread creation in favor of ScheduledExecutorService. This
executor should be shared with other services of DAGAppMaster.
- Added Semaphore to control the lifecycle of DAGAppMaster
- Refactored service startup to aggregate all of the exceptions thrown during
startup instead of reporting only the first one. Added ExceptionAggregator to
collect multiple exceptions.
- Fixed TaskScheduler to ensure it properly propagates InterruptedException to
avoid deadlocks during shutdown (e.g., appClientDelegate.getFinalAppStatus()
could result in InterruptedException, so stop is never called). The deadlock
issue was reported on the mailing list.
- Added a DAGAppMaster life-cycle test that exercises all possible stop
conditions, ensuring it always stops naturally and not via System.exit.
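
For illustration only, here is a minimal sketch of two of the ideas above:
aggregating every startup failure instead of surfacing just the first, and
using a Semaphore as the stop signal so the process can end naturally rather
than via System.exit. All names below (LifecycleSketch, SimpleService,
AggregatedException, Master) are hypothetical and are not taken from the patch;
the actual ExceptionAggregator and DAGAppMaster code may look quite different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class LifecycleSketch {

    /** Hypothetical stand-in for a service; not the Tez or Hadoop Service API. */
    interface SimpleService {
        void start() throws Exception;
    }

    /** Hypothetical aggregate: carries every startup failure as a suppressed exception. */
    static class AggregatedException extends Exception {
        AggregatedException(List<Exception> causes) {
            super(causes.size() + " service(s) failed to start");
            for (Exception e : causes) {
                addSuppressed(e); // keep each failure so none is lost
            }
        }
    }

    /** Attempts to start every service, then throws once with all collected failures. */
    static void startAll(List<SimpleService> services) throws AggregatedException {
        List<Exception> failures = new ArrayList<>();
        for (SimpleService service : services) {
            try {
                service.start();
            } catch (Exception e) {
                if (e instanceof InterruptedException) {
                    // Restore the interrupt flag so callers can still observe it.
                    Thread.currentThread().interrupt();
                }
                failures.add(e);
            }
        }
        if (!failures.isEmpty()) {
            throw new AggregatedException(failures);
        }
    }

    /** Semaphore-gated lifecycle: awaitStop blocks until stop() releases a permit. */
    static class Master {
        private final Semaphore stopSignal = new Semaphore(0);

        /** Returns true if a stop was signalled within the timeout; interruption propagates. */
        boolean awaitStop(long timeout, TimeUnit unit) throws InterruptedException {
            return stopSignal.tryAcquire(timeout, unit);
        }

        /** Signals that the master should shut down. */
        void stop() {
            stopSignal.release();
        }
    }
}
```

In this sketch awaitStop declares InterruptedException rather than swallowing
it, so an interrupt during shutdown reaches the caller instead of leaving a
thread blocked.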

> Lifecycle issues with DAGAppMaster
> ----------------------------------
>
>                 Key: TEZ-1206
>                 URL: https://issues.apache.org/jira/browse/TEZ-1206
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Oleg Zhurakousky
>            Assignee: Oleg Zhurakousky
>         Attachments: TEZ-1206.patch
>
>
> This is an umbrella issue to document and address issues with DAGAppMaster 
> lifecycle



--
This message was sent by Atlassian JIRA
(v6.2#6252)
