[
https://issues.apache.org/jira/browse/TEZ-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041229#comment-14041229
]
Siddharth Seth commented on TEZ-692:
------------------------------------
The patch mostly looks good to me. Couple of comments, some of which aren't
directly related to this jira - and are likely future jiras.
bq. The name TezSession and config name tez.am.shared have room for improvement.
That was probably the main comment on the patch :). Should we just call
TezSession TezClient ? - that's more intuitive when trying to figure out how to
get started, and rename the tez.am.shared to tez.am.session.mode
{code}
if (!isSession) {
throw new SessionNotRunning("Invalid without session mode");
}
{code}
IllegalStateException instead of SessionNotRunning. SessionNotRunning is a
misleading exception name for this case.
- getSessionStatus working in non session mode and returning instances of
TezSessionStatus is also a little misleading. Don't think throwing an Exception
in non-session mode is a better alternative though. Maybe rename to
TezClientStatus ?
- Unrelated to this jira, but the method public DAGClient
submitDAGApplication(ApplicationId appId, DAG dag) - which is private to Tez,
should ideally not be exposed at all. Keeping the client APIs as clean as
possible, without depending on annotations, would be useful. Will create a
follow up for this.
{code}
public synchronized DAGClient submitDAG(DAG dag, Map<String, LocalResource>
additionalAmResources)
throws TezException, IOException, InterruptedException {
Preconditions.checkState(isSession == true,
"submitDAG with additional resources applies to only session mode. " +
"In non-session mode please specify all resources in the initial
configuration");
{code}
Since the intent is to allow user code to work whether using session mode or
not, this method can continue to work in non session mode, and just add the
resources specified in the list to the ones in the DAG ?
- In non session mode (public DAGClient submitDAGApplication(ApplicationId
appId, DAG dag) - the yarn client can be shut down after the DAG is submitted,
since DAGClient creates it's own YARNClient. In a subsequent jira, we could
look at using the yarnClient instance created by TezSession for the DAGClient.
- AMConfiguration.tezConf - we end up using this for all task related
configuration as well. e.g. the RPC timeouts, whether to scale memory, etc.
These are really framework level confs for the AM or TezChild. Could be moved
out of AMConfiguration, or rename AMConfiguration to something like
DAGConfiguration. This is also a follow up jira item.
> Unify job submission in either TezClient or TezSession
> ------------------------------------------------------
>
> Key: TEZ-692
> URL: https://issues.apache.org/jira/browse/TEZ-692
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: TEZ-692.1.patch
>
>
> Its confusing to have 2 ways to create and submit a tez job. The developer
> has to spend time thinking about and deciding which method to use.
--
This message was sent by Atlassian JIRA
(v6.2#6252)