[ https://issues.apache.org/jira/browse/TEZ-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649667#comment-14649667 ]
Siddharth Seth commented on TEZ-2003:
-------------------------------------

Thanks for taking a look. Responses inline. One note: the structure of the APIs is now in place. However, the APIs could use several enhancements, which I think can go in after the merge (more in the mail I had sent out to the dev list). There are jiras open for some of these enhancements.

bq. there is not really a clear ownership doc on what the framework owns/controls as compared to what is handed off to the plugins.

Improved docs on this front would help. The framework owns the co-ordination of the DAG, fault tolerance, recovery, etc. Plugins come in to allow for execution / sourcing from different external services. There are hooks for plugins to report failures of tasks / containers - which then translate into the framework controlling the re-execution/completion. Are there specific details that you're looking for?

bq. Plugins seem to be hardwired at the vertex level and cannot be changed for different tasks. What happens if an external service is unavailable? Does everything fail?

If an external service is unavailable, yes - the job should ideally fail. This also depends on how the plugins are written to detect these failures. Eventually, as has been discussed, a policy-based approach could be taken where different attempts of the same task could be run via different services - e.g. the 2nd attempt always falls back to containers if the first one failed, or the 2nd attempt always runs in containers if the first took too long. This is a future enhancement though.

bq. All classes in serviceplugins/api need lots of docs

They absolutely do. Created a jira and will add docs over the next few days.

bq. All TODOs need a jira especially ones that will block a final release

Already done.

bq. Code in some files exceeds 100 chars

Will fix this, along with other jenkins errors - likely before the merge, to avoid merge/rebase complications.

bq. Unused imports, order of imports at times is not alphabetical

I don't think we use alphabetical imports. Order of imports is whatever IntelliJ sets up. Will remove unused imports as I see them.

bq. Spell checker needed on various API typos - Curretn, localicty

Will fix.

bq. TezClient

The builder is much more flexible than the current set of APIs. There are five different create methods, all of which can be replaced by the builder. I'd actually deprecate the creates. Will add a null check for the descriptor setup. Uber/Container can be set up via a ServiceDescriptor. In its absence, containers is the default.

bq. ContainerEndReason

Termination has a negative connotation. Could be ContainerEndReason or ContainerCompletionReason. A diagnostic message is required along with this wherever it is used - if it isn't already there. FRAMEWORK_ERROR really boils down to the Container state machine - which just tracks errors instead of failing the entire DAG. This can be removed if that behaviour is changed. Alternately, it can be reported as OTHER - which provides even less info.

bq. ContainerLauncherContext

NumNodes is available for a specific scheduling source, and can be used if required - e.g. Tez using this to set up the number of threads. LaunchFailure vs StopFailure - they're different operations, and are treated differently by the framework. TaskEndReason in ContainerCompleted is the result of an existing leak in the Container and Task state machines. Created TEZ-2676. ApplicationAttemptId - and the underlying appId / attemptId - could potentially be useful; used by the Tez launcher, if I'm not mistaken. The plugin does not tell the app that a container stop is required - it's the other way round. What it does indicate is that an attempt has been made to stop the container, and subsequently whether this was successful or not.

bq. ContainerLauncherOperationBase

A token is probably not always needed. It is, however, needed for YARN-allocated containers. This could be abstracted out.
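To illustrate the TezClient point above - a single builder subsuming several create overloads, with a null check on the descriptor and containers as the default when no service descriptor is given - here is a minimal, self-contained sketch. All names here are hypothetical stand-ins, not the actual TezClient API:

```java
// Hypothetical sketch only - names do not match the real TezClient API.
// Shows how one builder can replace multiple create(...) overloads.
public class ClientBuilderSketch {

  static final class Client {
    final String name;
    final boolean isSession;
    final String servicePluginsDescriptor; // stand-in for ServicePluginsDescriptor

    private Client(Builder b) {
      this.name = b.name;
      this.isSession = b.isSession;
      this.servicePluginsDescriptor = b.descriptor;
    }

    static Builder newBuilder(String name) {
      return new Builder(name);
    }
  }

  static final class Builder {
    private final String name;
    private boolean isSession = false;
    private String descriptor = "containers"; // containers are the default

    private Builder(String name) {
      if (name == null) {
        throw new NullPointerException("name");
      }
      this.name = name;
    }

    Builder setIsSession(boolean isSession) {
      this.isSession = isSession;
      return this;
    }

    Builder setServicePluginDescriptor(String descriptor) {
      if (descriptor == null) {
        // the null check on descriptor setup mentioned above
        throw new NullPointerException("descriptor");
      }
      this.descriptor = descriptor;
      return this;
    }

    Client build() {
      return new Client(this);
    }
  }

  public static void main(String[] args) {
    Client c = Client.newBuilder("wordcount").setIsSession(true).build();
    System.out.println(c.name + " session=" + c.isSession
        + " plugins=" + c.servicePluginsDescriptor);
  }
}
```

Each existing create overload would become one chain of setter calls, which is why the creates could be deprecated rather than multiplied further.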
This falls under future API changes.

bq. ContainerLaunchRequest/ContainerStopRequest

Node/Container mappings are tracked by the framework. May need to expose APIs with this information if it isn't already available. A Scheduler/TaskCommunicator plugin is required for some co-ordination - e.g. numNodes, or getting port information from the communicator. Mostly a 1:1:1 mapping, but it is possible for these to be re-used - e.g. a TaskCommunicator can be shared by uber / local / YARN.

bq. ServicePluginsDescriptor

enableContainer / uber mode is for completeness and to work with DAG-level defaults. Uber doesn't always need to be enabled. It's possible to specify DAG-level defaults for the execution context to be used, and then override at a vertex level to run in containers. Defaults are intentionally set to containers enabled, uber disabled. enableUber here means setting up an AM to run with uberized containers. That could be controlled via config - but programmatic seems like a better approach. In terms of the AM deciding what gets uberized - I think that's a follow-up enhancement. That needs to go along with "this vertex cannot be uberized". For now, usage of uber is explicit, and vertices which should run in the AM have to be explicitly marked as such.

bq. TaskAttemptEndReason

The enums are primarily a separation between the API and internal usage. Will add the bits which are lossy compared to YARN. They should be extendable. Any suggestions on how to do this, other than going to using Strings instead of a pre-defined enum?

bq. ServicePluginLifecycle

One extra place to do work which may not be required in the constructor. Also follows the lifecycle used by YARN's AbstractService and, in turn, the AM components.

bq. TaskScheduler

availableResources and totalResources are used to provide information to VertexPlugins and InputInitializers on what's available. dagComplete is a signal to components to indicate that a dag is complete. Useful for book-keeping within plugins.
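On the extensibility question above (TaskAttemptEndReason), one alternative to raw Strings is having both framework and plugin enums implement a small shared interface, so the framework only depends on the interface. A hedged sketch - none of these names are from the actual Tez API:

```java
// Hedged sketch: extendable end-reasons without falling back to raw Strings.
// Both the framework enum and a plugin-defined enum implement one interface.
public class EndReasonSketch {

  enum Category { SUCCESS, FAILURE, KILLED, OTHER }

  interface EndReason {
    String reasonName();
    Category category(); // lets the framework act on plugin-defined reasons
  }

  // Framework-defined reasons.
  enum FrameworkEndReason implements EndReason {
    SUCCEEDED(Category.SUCCESS),
    NODE_FAILED(Category.FAILURE),
    PREEMPTED(Category.KILLED);

    private final Category category;
    FrameworkEndReason(Category c) { this.category = c; }
    public String reasonName() { return name(); }
    public Category category() { return category; }
  }

  // A plugin can add its own reasons without touching the framework enum.
  enum ExternalServiceEndReason implements EndReason {
    SERVICE_BUSY(Category.OTHER),
    SERVICE_DOWN(Category.FAILURE);

    private final Category category;
    ExternalServiceEndReason(Category c) { this.category = c; }
    public String reasonName() { return name(); }
    public Category category() { return category; }
  }

  // The framework codes against the interface, so both enums work.
  static boolean shouldRetry(EndReason reason) {
    return reason.category() == Category.FAILURE;
  }

  public static void main(String[] args) {
    System.out.println(shouldRetry(FrameworkEndReason.NODE_FAILED));        // true
    System.out.println(shouldRetry(ExternalServiceEndReason.SERVICE_BUSY)); // false
  }
}
```

The coarse Category is what keeps the framework's handling well-defined even for reasons it has never seen; the fine-grained value stays type-safe on the plugin side.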
Plan to introduce a dagStarted(UserPayload payload) in a follow-up jira. getClusterNodeCount - the number of nodes available to this source. deallocateContainer - a signal to the scheduler that the container has been released for some reason.

bq. TaskSchedulerContext

AppFinalStatus and a couple of methods are YARN-specific - register / deregister. This is captured in the TaskScheduler API enhancements jira. However, some specific APIs may just be needed - it may be possible to generalize them, though. I didn't understand the question about taskAllocated - that's the mechanism for the scheduler to tell the system that a previous request has been allocated to a container.

bq. TaskCommunicatorContext / TaskHeartbeatRequest

firstAttemptStartTime, dagStartTime - for scheduling decisions within the daemon, and aging of tasks. This could be replaced by a getAttemptStartTime(index). dagStarted(new payload) - the addition of this method will require fixes for recovery. The framework tracks liveness - which is why the *Alive APIs exist. This needs to be consolidated with TaskHeartbeatRequest. preRoutedStartIndex - similar to the variable in TezHeartbeatRequest - tracks events.

bq. TaskAttemptEventAttemptFailed

Legacy.

bq. DAGImpl

Will cache the context.

bq. TaskAttemptImpl

Would make sense to move to KILLED directly. I'll look into why it wasn't done this way. Will look into adding scheduleTime to a history event - possibly in a follow-up for history enhancements for this change.

bq. TaskImpl

scheduleTime is returned as 0 in such cases, which would indicate no attempts scheduled yet.

bq. VertexImpl

There's a jira open to propagate such errors to the user. Will make a note there.

bq. ContainerLauncherRouter

isPureLocalMode indicates local mode only. The components are shared between local mode and uber mode - which is where the variable re-use may get confusing. All configured launchers do need to initialize correctly. Is that not happening?
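The routing idea above - several configured launchers behind one router, with launcher ids fixed by the initial setup order - can be sketched as follows. These are hypothetical types for illustration, not the actual ContainerLauncherRouter code:

```java
// Hedged sketch: launch requests carry the id of the launcher that should
// handle them, and the router dispatches on that id. Hypothetical types only.
import java.util.HashMap;
import java.util.Map;

public class LauncherRouterSketch {

  interface Launcher {
    String launch(String containerId);
  }

  static final class Router {
    private final Map<Integer, Launcher> launchers = new HashMap<>();

    // Ordering is fixed at setup time, so ids stay stable for the AM's lifetime.
    void register(int launcherId, Launcher launcher) {
      launchers.put(launcherId, launcher);
    }

    String launch(int launcherId, String containerId) {
      Launcher l = launchers.get(launcherId);
      if (l == null) {
        throw new IllegalArgumentException("No launcher registered for id " + launcherId);
      }
      return l.launch(containerId);
    }
  }

  public static void main(String[] args) {
    Router router = new Router();
    router.register(0, id -> "yarn:" + id);  // e.g. a YARN container launcher
    router.register(1, id -> "local:" + id); // shared by local and uber modes

    System.out.println(router.launch(0, "c_01")); // yarn:c_01
    System.out.println(router.launch(1, "c_02")); // local:c_02
  }
}
```

Registering the local launcher once and reusing it for both local and uber mode mirrors the component sharing described above, which is where the isPureLocalMode variable re-use gets confusing.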
I believe the exceptions vary in where they're created - ctor vs init. Will look into whether they need to be checked exceptions.

bq. AMContainerEventLaunchRequest

Not at the moment. I don't think this is a bug, though, since the launcher id is made available when the container is created.

bq. SchedulerId/TaskLauncherId/TaskCommId

Ordering will be maintained, since the ordering is maintained in the initial setup, as is the setup in code. More details on the specific docs, please?

bq. ContainerMatcher

Containers managed by different schedulers are not re-used. Not yet, anyway - and that's likely some way away. I don't think id matching is explicitly required, since containers are localized to schedulers at the moment.

bq. AMNodeEvent

It can be schedulerId. This will get more complicated in the future, though, if a single scheduler can use different launchers. For the most part, this is linked to a scheduler as a source - since that's where a Node can fail.

bq. AMNodeTracker

Blacklisting needs to be at a scheduler level. There may be a use case to blacklist entire nodes across sources. Will have to figure out how to get this working, and the specific use case. getAndCreateIfNeededPerSourceTracker - doesn't need to be; it is expected to be invoked via a single thread.

Another bit to consider is how uber mode should be configured in terms of task sizes, number of threads, etc.

> [Umbrella] Allow Tez to co-ordinate execution to external services
> ------------------------------------------------------------------
>
> Key: TEZ-2003
> URL: https://issues.apache.org/jira/browse/TEZ-2003
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Siddharth Seth
> Attachments: 2003_20150728.1.txt, Tez With External Services.pdf
>
> The Tez engine itself takes care of co-ordinating execution - controlling how data gets routed (different connection patterns), fault tolerance, scheduling of work, etc.
> This is currently tied to TaskSpecs defined within Tez and to containers launched by Tez itself (TezChild).
> The proposal is to allow Tez to work with external services instead of just containers launched by Tez. This involves several more pluggable layers to work with alternate Task Specifications, custom launch and task allocation mechanics, as well as custom scheduling sources.
> A simple example would be a process with the capability to execute multiple Tez TaskSpecs as threads. In such a case, a container launch isn't really needed and can be mocked. Sourcing / scheduling containers would need to be pluggable.
> A more advanced example would be LLAP (HIVE-7926; https://issues.apache.org/jira/secure/attachment/12665704/LLAPdesigndocument.pdf). This works with custom interfaces - which would need to be supported by Tez, along with a custom event model which would need translation hooks.
> Tez should be able to work with a combination of certain vertices running in external services and others running in regular Tez containers.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
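The simple example from the issue description - one process executing multiple Tez TaskSpecs as threads, with the container launch effectively mocked - can be sketched as below. TaskSpec here is a stand-in string, not the real Tez TaskSpec:

```java
// Hedged sketch of the in-process execution example: task specs run as
// threads in one process instead of in separately launched containers.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class InProcessExecutorSketch {

  public static void main(String[] args) throws Exception {
    // One thread per "container slot"; no real container launch takes place.
    ExecutorService pool = Executors.newFixedThreadPool(2);

    List<String> taskSpecs = List.of("map_0", "map_1", "reduce_0");
    List<Future<String>> results = new ArrayList<>();
    for (String spec : taskSpecs) {
      // Stand-in for actually running the TaskSpec.
      results.add(pool.submit(() -> "done:" + spec));
    }
    for (Future<String> f : results) {
      System.out.println(f.get()); // printed in submission order
    }
    pool.shutdown();
  }
}
```

In this setup, the pluggable pieces the proposal calls for are exactly the parts stubbed here: where "containers" come from (the pool), and how a TaskSpec is handed to the executing process.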