[ 
https://issues.apache.org/jira/browse/YARN-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797804#comment-13797804
 ] 

Steve Loughran commented on YARN-1139:
--------------------------------------

# You don't need to convert any exceptions now, because the inner 
{{serviceStart()/serviceStop()}} methods can throw exceptions. Just pass them up. 
The only reason the existing services didn't have their exception catch/wrap 
logic changed as part of YARN-117 is that I didn't want to add extra changes.

# AbstractService catches a failure and relays it to {{noteFailure()}}, which 
saves the first exception caught; {{getFailureCause()}} and 
{{getFailureState()}} return that exception and the state the service was in 
when it happened.
# When an exception is caught during a state change, it triggers a 
{{Service.stop()}} action, which is why stop is required to be a best-effort 
operation that does its best even when stopping a partially inited or started 
service.
# It then calls {{ServiceStateException.convert(e);}} to convert the exception 
into a RuntimeException; if it already is one it is left alone, otherwise it is 
wrapped in a ServiceStateException.
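
The record/stop/convert sequence in the points above can be sketched roughly as follows. This is a simplified stand-in written for illustration, not the Hadoop source: {{SketchService}}, {{FlakyStore}} and {{LifecycleDemo}} are hypothetical names, the {{ServiceStateException}} here is a local toy class, and locking, listeners and state-transition validation are all left out.

```java
// Simplified stand-in for the lifecycle described above; not the Hadoop source.
class ServiceStateException extends RuntimeException {
    ServiceStateException(Throwable cause) { super(cause); }

    // RuntimeExceptions are returned as-is; anything else gets wrapped.
    static RuntimeException convert(Throwable fault) {
        return (fault instanceof RuntimeException)
                ? (RuntimeException) fault
                : new ServiceStateException(fault);
    }
}

abstract class SketchService {
    enum State { NOTINITED, INITED, STARTED, STOPPED }

    private State state = State.NOTINITED;
    private Throwable failureCause;   // only the first failure is kept
    private State failureState;

    // Subclasses may throw anything; the base class records and wraps it.
    protected void serviceStart() throws Exception {}
    protected void serviceStop() throws Exception {}

    public final void init() { state = State.INITED; }

    public final void start() {
        state = State.STARTED;
        try {
            serviceStart();
        } catch (Exception e) {
            noteFailure(e);                          // save cause and state
            stopQuietly();                           // best-effort stop
            throw ServiceStateException.convert(e);  // always a RuntimeException
        }
    }

    public final void stop() {
        if (state == State.STOPPED) return;
        state = State.STOPPED;
        try {
            serviceStop();
        } catch (Exception e) {
            noteFailure(e);
            throw ServiceStateException.convert(e);
        }
    }

    private void stopQuietly() {
        try { stop(); } catch (RuntimeException ignored) { /* first failure wins */ }
    }

    protected final void noteFailure(Exception e) {
        if (failureCause == null) {
            failureCause = e;
            failureState = state;
        }
    }

    public Throwable getFailureCause() { return failureCause; }
    public State getFailureState() { return failureState; }
}

// Hypothetical service: throws a checked exception and does no wrapping itself.
class FlakyStore extends SketchService {
    boolean stopCalled;

    @Override protected void serviceStart() throws Exception {
        throw new java.io.IOException("cannot open store");
    }

    @Override protected void serviceStop() { stopCalled = true; }
}

class LifecycleDemo {
    public static void main(String[] args) {
        FlakyStore store = new FlakyStore();
        store.init();
        try {
            store.start();
        } catch (ServiceStateException e) {
            System.out.println("caller saw: " + e.getCause().getMessage());
        }
        System.out.println("failure state: " + store.getFailureState());
        System.out.println("stop attempted: " + store.stopCalled);
    }
}
```

The subclass only declares and throws; recording the cause, attempting the stop and converting to an unchecked exception all happen once, in the base class.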

# The composite service runs through its children, starting each one in turn. 
The first one that fails by throwing a runtime exception will trigger the 
{{noteFailure()}} operation on the parent, then the composite service's {{stop()}} 
operation, which walks back through all inited services (but not the 
NOTINITED ones; things failed when we tried that), stopping them in turn.

What that means is that if a child service fails, the composite should pick 
that up and save it as its own failure cause. 
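
A rough model of that parent behaviour, for illustration only: {{Child}}, {{SketchComposite}} and {{CompositeDemo}} are hypothetical names, not the Hadoop {{CompositeService}} code, and the children here are plain objects rather than real services.

```java
// Simplified model of a composite service; not the Hadoop CompositeService source.
class Child {
    enum State { NOTINITED, INITED, STARTED, STOPPED }

    final String name;
    final boolean failOnStart;
    State state = State.NOTINITED;

    Child(String name, boolean failOnStart) {
        this.name = name;
        this.failOnStart = failOnStart;
    }

    void init() { state = State.INITED; }

    void start() {
        // A real service would already have converted its failure to a RuntimeException.
        if (failOnStart) throw new IllegalStateException(name + " failed to start");
        state = State.STARTED;
    }

    void stop() { state = State.STOPPED; }
}

class SketchComposite {
    private final java.util.List<Child> children = new java.util.ArrayList<>();
    final java.util.List<String> stopOrder = new java.util.ArrayList<>();
    private Throwable failureCause;

    void addChild(Child c) { children.add(c); }

    void init() { for (Child c : children) c.init(); }

    void start() {
        try {
            for (Child c : children) c.start();   // first failure aborts the loop
        } catch (RuntimeException e) {
            if (failureCause == null) failureCause = e; // parent adopts the child's failure
            stop();
            throw e;   // already unchecked: relay it, no second wrapper
        }
    }

    void stop() {
        // Walk backwards, skipping children that never got as far as init.
        for (int i = children.size() - 1; i >= 0; i--) {
            Child c = children.get(i);
            if (c.state == Child.State.INITED || c.state == Child.State.STARTED) {
                c.stop();
                stopOrder.add(c.name);
            }
        }
    }

    Throwable getFailureCause() { return failureCause; }
}

class CompositeDemo {
    public static void main(String[] args) {
        SketchComposite parent = new SketchComposite();
        parent.addChild(new Child("a", false));
        parent.addChild(new Child("b", true));
        parent.addChild(new Child("c", false));
        parent.init();
        try {
            parent.start();
        } catch (RuntimeException e) {
            System.out.println("start failed: " + e.getMessage());
        }
        System.out.println("stopped in order: " + parent.stopOrder);
    }
}
```

The exception the caller sees is the same object the parent recorded as its failure cause, so nothing is lost or double-wrapped on the way up.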

I've actually done a couple more child-holding services for my own work, which 
I'd happily push back into trunk/2.3.

[https://github.com/hortonworks/hoya/tree/develop/hoya-core/src/main/java/org/apache/hadoop/hoya/yarn/service]

* The 
[SequenceService|https://github.com/hortonworks/hoya/blob/develop/hoya-core/src/main/java/org/apache/hadoop/hoya/yarn/service/SequenceService.java]
 runs its children in sequence, failing when one fails
* The 
[CompoundService|https://github.com/hortonworks/hoya/blob/develop/hoya-core/src/main/java/org/apache/hadoop/hoya/yarn/service/CompoundService.java]
 stops as soon as any one of its children fails, again propagating any faults up

These both implement a [Parent interface| Parent.java] so that they can be 
treated uniformly, and allow other bits of the code to add children.

Alongside that:
* [EventNotifyingService| 
https://github.com/hortonworks/hoya/blob/develop/hoya-core/src/main/java/org/apache/hadoop/hoya/yarn/service/EventNotifyingService.java]
 : sleeps, notifies a callback, stops
* 
[ForkedProcessService|https://github.com/hortonworks/hoya/blob/develop/hoya-core/src/main/java/org/apache/hadoop/hoya/yarn/service/ForkedProcessService.java]:
 forks off a native process, stops when the process stops, kills the process 
when it itself is stopped, and forwards up exceptions on a process failure

These let me build up more complex workflows, like [this one to start 
accumulo|https://github.com/hortonworks/hoya/blob/develop/hoya-core/src/main/java/org/apache/hadoop/hoya/providers/accumulo/AccumuloProviderService.java#L331], 
which runs a sequence of "accumulo init" (if needed), followed by, in parallel, 
"accumulo start" and a delayed event callback. That callback will, if accumulo 
start hasn't failed in the meantime, trigger the request for containers for 
whatever other accumulo roles have been added.

Anyway, the services will catch, record, wrap and relay exceptions; the parents 
just need to handle the fact that it will be a RuntimeException that 
comes back, and there is no need to catch and wrap it again if you want to pass 
it upstream.


> [Umbrella] Convert all RM components to Services
> ------------------------------------------------
>
>                 Key: YARN-1139
>                 URL: https://issues.apache.org/jira/browse/YARN-1139
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.1.0-beta
>            Reporter: Karthik Kambatla
>            Assignee: Tsuyoshi OZAWA
>
> Some of the RM components - state store, scheduler etc. are not services. 
> Converting them to services goes well with the "Always On" and "Active" 
> service separation proposed on YARN-1098.
> Given that some of them already have start(), stop() methods, it should not 
> be too hard to convert them to services.
> That would also be a cleaner way of addressing YARN-1125.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
