[jira] [Updated] (TEZ-2692) bugfixes enhancements related to job parser and analyzer

2015-08-10 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2692:
--
Attachment: TEZ-2692.3.patch

- Fixed getTaskRuntime() in SlowTaskAnalyzer.  It should be firstTaskToStart.
- Fixed concurrency calculator logic: it now uses a sorted multiset of 
timestamps, since different tasks can start at the same time, and walks 
through the set to determine concurrency, as suggested.
- Merged test with TestATSFileParser and renamed TestATSFileParser to 
TestHistoryParser

Will commit once the pre-commit passes.

 bugfixes  enhancements related to job parser and analyzer
 --

 Key: TEZ-2692
 URL: https://issues.apache.org/jira/browse/TEZ-2692
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-2692.1.patch, TEZ-2692.2.patch, TEZ-2692.3.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-2692) bugfixes enhancements related to job parser and analyzer

2015-08-10 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2692:
--
Attachment: TEZ-2692.2.patch

Attaching revised patch to address review comments.

 bugfixes  enhancements related to job parser and analyzer
 --

 Key: TEZ-2692
 URL: https://issues.apache.org/jira/browse/TEZ-2692
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-2692.1.patch, TEZ-2692.2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TEZ-2704) Fix version on tez job analyzer

2015-08-10 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-2704:
---

 Summary: Fix version on tez job analyzer
 Key: TEZ-2704
 URL: https://issues.apache.org/jira/browse/TEZ-2704
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: TEZ-2003
Reporter: Siddharth Seth






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats

2015-08-10 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680412#comment-14680412
 ] 

TezQA commented on TEZ-2658:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12749617/TEZ-2658.2.patch
  against master revision eadbfec.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/972//console

This message is automatically generated.

 Create a CLI utility tool to track Tez DAG/Application Stats
 

 Key: TEZ-2658
 URL: https://issues.apache.org/jira/browse/TEZ-2658
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Saikat
Assignee: Saikat
 Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-2618) In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before returning a failure

2015-08-10 Thread Saikat (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saikat updated TEZ-2618:

Attachment: TEZ-2618.1.patch

rebased patch on top of TEZ-2172

 In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before 
 returning a failure
 

 Key: TEZ-2618
 URL: https://issues.apache.org/jira/browse/TEZ-2618
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Saikat
Assignee: Saikat
 Attachments: TEZ-2618.1.patch, TEZ-2618.patch


 In the setupLocalDiskFetch() method (invoked when the fetcher is on the 
 same host as the target map host), first check whether the target spill 
 file can be opened using localDirAllocator.getLocalPathToRead(). The 
 localDirAllocator searches through the list of configured dirs for the file. 
 In disk-full scenarios, if the path is not found, the fetcher should try an 
 http fetch.
 Proposed solution:
 In local fetch mode, the fetcher should first try getLocalPathToRead() for 
 all the pending maps. Local fetch then gets divided into 2 stages: first, 
 fetch the maps for which a path was found via LocalDirAllocator; second, 
 construct an http fallback fetch list for the maps that couldn't be found via 
 LocalDirAllocator.getLocalPathToRead() and do an http fetch.
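
 The two-stage split proposed above can be sketched roughly as follows. This is 
 only an illustration, not the patch: PathLookup, Plan and plan() are 
 hypothetical names, and PathLookup stands in for a LocalDirAllocator-style 
 lookup that is assumed to throw an IOException when no readable path exists.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LocalFetchFallbackSketch {
  /** Hypothetical stand-in for a local path lookup; throws if no readable path exists. */
  public interface PathLookup {
    String localPathFor(String mapId) throws IOException;
  }

  /** Outcome of stage one: maps readable from local disk, and maps left for HTTP. */
  public static final class Plan {
    public final Map<String, String> localPaths = new LinkedHashMap<>();
    public final List<String> httpFallback = new ArrayList<>();
  }

  /** Tries the local lookup for every pending map; failures go to the HTTP fallback list. */
  public static Plan plan(List<String> pendingMaps, PathLookup lookup) {
    Plan plan = new Plan();
    for (String mapId : pendingMaps) {
      try {
        plan.localPaths.put(mapId, lookup.localPathFor(mapId));
      } catch (IOException e) {
        // Path not found locally (e.g. disk full): defer this map to an HTTP fetch.
        plan.httpFallback.add(mapId);
      }
    }
    return plan;
  }

  public static void main(String[] args) {
    // Assumed lookup for the demo: map_1 has no readable local path.
    PathLookup lookup = mapId -> {
      if (mapId.equals("map_1")) {
        throw new IOException("no local path for " + mapId);
      }
      return "/local/dir/" + mapId + "/file.out";
    };
    Plan p = plan(Arrays.asList("map_0", "map_1", "map_2"), lookup);
    System.out.println("local: " + p.localPaths.keySet());
    System.out.println("http fallback: " + p.httpFallback);
  }
}
```

 With the demo lookup, map_0 and map_2 end up in the local stage and map_1 in 
 the http fallback list.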



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats

2015-08-10 Thread Saikat (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saikat updated TEZ-2658:

Attachment: TEZ-2658.3.patch

 Create a CLI utility tool to track Tez DAG/Application Stats
 

 Key: TEZ-2658
 URL: https://issues.apache.org/jira/browse/TEZ-2658
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Saikat
Assignee: Saikat
 Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (TEZ-2704) Fix version on tez job analyzer

2015-08-10 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved TEZ-2704.
-
Resolution: Done

Looks like this was already fixed by [~zjffdu]

 Fix version on tez job analyzer
 ---

 Key: TEZ-2704
 URL: https://issues.apache.org/jira/browse/TEZ-2704
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: TEZ-2003
Reporter: Siddharth Seth





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes

2015-08-10 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680555#comment-14680555
 ] 

TezQA commented on TEZ-2300:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12749649/TEZ-2300.2.patch
  against master revision eadbfec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.client.TestTezClient

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/974//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/974//artifact/patchprocess/newPatchFindbugsWarningstez-api.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/974//console

This message is automatically generated.

 TezClient.stop() takes a lot of time or does not work sometimes
 ---

 Key: TEZ-2300
 URL: https://issues.apache.org/jira/browse/TEZ-2300
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rohini Palaniswamy
Assignee: Jonathan Eagles
 Attachments: TEZ-2300.1.patch, TEZ-2300.2.patch, 
 syslog_dag_1428329756093_325099_1_post 


   Noticed this with a couple of pig scripts which were not behaving well (AM 
 close to OOM, etc) and even with some that were running fine. Pig calls 
 TezClient.stop() in a shutdown hook. Ctrl+C to the pig script either exits 
 immediately or hangs. In both cases it takes a long time for the 
 yarn application to go to KILLED state. Many times I just end up calling yarn 
 application -kill separately after waiting for 5 mins or more for it to get 
 killed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes

2015-08-10 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-2300:
-
Attachment: TEZ-2300.2.patch

 TezClient.stop() takes a lot of time or does not work sometimes
 ---

 Key: TEZ-2300
 URL: https://issues.apache.org/jira/browse/TEZ-2300
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rohini Palaniswamy
Assignee: Jonathan Eagles
 Attachments: TEZ-2300.1.patch, TEZ-2300.2.patch, 
 syslog_dag_1428329756093_325099_1_post 


   Noticed this with a couple of pig scripts which were not behaving well (AM 
 close to OOM, etc) and even with some that were running fine. Pig calls 
 TezClient.stop() in a shutdown hook. Ctrl+C to the pig script either exits 
 immediately or hangs. In both cases it takes a long time for the 
 yarn application to go to KILLED state. Many times I just end up calling yarn 
 application -kill separately after waiting for 5 mins or more for it to get 
 killed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats

2015-08-10 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680676#comment-14680676
 ] 

TezQA commented on TEZ-2658:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12749655/TEZ-2658.3.patch
  against master revision eadbfec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/975//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/975//artifact/patchprocess/newPatchFindbugsWarningstez-cli-tools.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/975//console

This message is automatically generated.

 Create a CLI utility tool to track Tez DAG/Application Stats
 

 Key: TEZ-2658
 URL: https://issues.apache.org/jira/browse/TEZ-2658
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Saikat
Assignee: Saikat
 Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats

2015-08-10 Thread Saikat (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saikat updated TEZ-2658:

Attachment: TEZ-2658.4.patch

Fixed the Findbugs warning for VA_FORMAT_STRING_USES_NEWLINE.

 Create a CLI utility tool to track Tez DAG/Application Stats
 

 Key: TEZ-2658
 URL: https://issues.apache.org/jira/browse/TEZ-2658
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Saikat
Assignee: Saikat
 Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch, 
 TEZ-2658.4.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TEZ-2003) [Umbrella] Allow Tez to co-ordinate execution to external services

2015-08-10 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680721#comment-14680721
 ] 

Siddharth Seth edited comment on TEZ-2003 at 8/10/15 9:09 PM:
--

bq. logErrorIngored, hearbeats, getCurretnDagName
bq. - remove “*” e,g, import org.apache.tez.common.asterisk;

Captured in TEZ-2678

bq. abortTask vs close/cleanup
Will check the code. abortTask should try cleaning up in both of them.

bq. TezTaskRunner2
killTask isn't used yet within Tez, which is why it's not informing the AM. 
When task preemption comes in - the flow is likely to be a killTask invoked as 
a result of an RPC, at which point the AM already knows that the task is killed 
since it took the decision.

On the various atomic gets - there are separate variables to track what states 
have been set, and these are used in the return result. Atomicity of the entire 
operation is handled via synchronization blocks.

TaskRunner handling containerStop is a result of containerStop coming over a 
shared Task/Container protocol - which is linked to the running task. It could 
be separated, but I think that'll need the protocols to be separated as well.

canCommit during a shutdown - will change this. I'll also verify what the 
TaskRunner behaviour was. TEZ-2678

bq. TaskReporter
I don't think shutdown needs synchronization. It modifies a final variable. 
Whether it's implemented correctly needs more investigation. It's the same as 
what exists on master.

bq. ShuffleHandler
This is essentially the shuffle handler that is used in regular clusters. It's 
not meant as a benchmark tool. Using the current shuffle mechanics seems like 
the simplest mechanism to have jobs work with the standard set of 
Inputs/Outputs which write to disk.

bq. ext-service-tests
Agree with making this a reference for ext services. It would need to implement 
the APIs better, and be documented a lot better to serve this purpose. Creating 
a new jira to track this - TEZ-2705. Post merge ?

bq. JoinValidate
The changes are for private use, to be able to re-use the example in testing. 
Will add docs to mention this.

bq. TezTaskCommunicatorImpl
Using payloads wherever possible - including internal plugins. Avoided in 
LocalContainerLauncher only at the moment, where a lot of runtime AM 
information is used.

Will fix isKnownContainer and containerAlive to be based on specific 
communicator.

Renaming methods in TaskComm - tracked in the TaskComm enhancements jira

getDagName null - will try improving this.
getVertexName - I'm not sure there's a lot that can be done. TezException 
instead of NPE ? Eventually this will lead to an error in the plugin, which 
needs to be handled better. There's a jira to track such error handling.

onStateUpdated - is the AM telling the TaskCommunicator plugin that a vertex 
has changed state. Similar to what is done elsewhere - like the 
InputInitializers.

dagCompleteStart - couldn't find this. Maybe I removed it at some point for the 
same reason - it is a very confusing name.

bq. Is there a need for the framework to make updates into the Context object? 
If yes, should the Context implement 2 interfaces? Should the internal objects 
just bind to the internal Impl objects or are they bound to the public plugin 
interfaces to catch compat errors? Binding to Impls directly may mean a smaller 
public API interface.
Need more clarification on this comment.

bq. ctor.setAccessible(true);
Will do. 



[jira] [Commented] (TEZ-2003) [Umbrella] Allow Tez to co-ordinate execution to external services

2015-08-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680784#comment-14680784
 ] 

Bikas Saha commented on TEZ-2003:
-

Some initial comments on the modified existing code. Not yet seen the newly 
added code. 
One repeated item in the comments is the special casing of uber/yarn mode in 
different places. I would expect plugins to come in from the user when the AM 
is created and either the client/AppMaster would create default plugins for 
uber/yarn. Thereafter dag/vertex/routers should not need to have special casing 
for any plugin (like the uber/yarn special casing that exists in all these 
places in the branch). e.g. the vertexmanager only uses 
VertexManagerPluginDescriptor - even for the built-in plugins. Similarly, I would 
expect the communicator/scheduler/launcher wrappers to work only with plugin 
descriptors.
Also creating a ServicePlugin class will help in reducing code duplication and 
make maintenance easier instead of having scheduler id, launcherId and commId 
everywhere.
ContainerSignatureMatcher - ExecutorSignatureMatcher ?
{code}
+public interface ContainerSignatureMatcher {
{code}
Why is this here?

ServicePluginLifecycle etc. in tez-runtime-api like 
Inputs/Output/InputInitializer etc. ?
Typically we say start() - stop() instead of shutdown
{code}+public interface ServicePluginLifecycle {
+  void start() throws Exception;
+  void shutdown() throws Exception;
{code}

Why are executeInAm and executeInContainers there?
{code}
+  public static class VertexExecutionContext {
+final boolean executeInAm;
+final boolean executeInContainers;
+final String taskSchedulerName;{code}

Rename to ExecutorEndReason ? Also, how can an error in the AM be caused by a 
container running a task?
{code}
+public enum ContainerEndReason {
+  NODE_FAILED, // Completed because the node running the container was marked as dead
+  APPLICATION_ERROR, // An error in the AM caused by user code
+  FRAMEWORK_ERROR, // An error in the AM - likely a bug.
+  LAUNCH_FAILED, // Failure to launch the container
+}{code}

Why does this have schedulerName and taskCommName ?
{code}
+public class ContainerLaunchRequest extends ContainerLauncherOperationBase {
+
+  private final ContainerLaunchContext clc;
+  private final Container container;
+} {code}

Why enableContainers and enableUber?
{code}
+public class ServicePluginsDescriptor {
+
+  private final boolean enableContainers;
+  private final boolean enableUber;
+ {code}

Has the internal one been replaced by this?
{code}
+public enum TaskAttemptEndReason {
+  NODE_FAILED, // Completed because the node running the container was marked as dead
+}{code}

Rename to ExecutorBusy ?
{code}
+  COMMUNICATION_ERROR, // Equivalent to a launch failure
+  SERVICE_BUSY, // Service rejected the task
+  INTERRUPTED_BY_SYSTEM, // Interrupted by the system. e.g. Pre-emption
+  INTERRUPTED_BY_USER, // Interrupted by the user
+
 }{code}

Why does the isLocal flag need to be passed to the Scheduler/Launcher/Communicator 
routers, instead of having a service plugin for local?

Is it ensured that the integer for a service plugin will turn out to be the 
same after AM restart?

Why is yarn scheduler special cased? Launcher/Communicator don't have the 
special casing ?
{code}
+  static void processSchedulerDescriptors(List<NamedEntityDescriptor> descriptors, boolean isLocal,
+                                          UserPayload defaultPayload,
+                                          BiMap<String, Integer> schedulerPluginMap) {
.
+  if (!foundYarn) {
+    NamedEntityDescriptor yarnDescriptor =
+        new NamedEntityDescriptor(TezConstants.getTezYarnServicePluginName(), null)
+            .setUserPayload(defaultPayload);
+    addDescriptor(descriptors, schedulerPluginMap, yarnDescriptor);
+  }{code}

Why use a different code path for uber/default? They should just work when 
instantiated the same way as a custom plugin.
{code}
+  TaskCommunicator createTaskCommunicator(NamedEntityDescriptor taskCommDescriptor,
+                                          int taskCommIndex) {
+    if (taskCommDescriptor.getEntityName().equals(TezConstants.getTezYarnServicePluginName())) {
+      return createDefaultTaskCommunicator(taskCommunicatorContexts[taskCommIndex]);
+    } else if (taskCommDescriptor.getEntityName()
+        .equals(TezConstants.getTezUberServicePluginName())) {
+      return createUberTaskCommunicator(taskCommunicatorContexts[taskCommIndex]);
+    } else {
+      return createCustomTaskCommunicator(taskCommunicatorContexts[taskCommIndex],
+          taskCommDescriptor);
 }{code}
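
One possible reading of this suggestion, sketched with hypothetical types that 
are not Tez classes: the built-in yarn/uber communicators are registered under 
their names in the same table as custom ones, so creation needs no branching on 
the plugin name. The Communicator interface, the String payload and the 
register/create methods are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class UniformPluginSketch {
  /** Hypothetical plugin type used only for this sketch. */
  public interface Communicator {
    String describe();
  }

  // Maps a plugin name to a factory that builds the plugin from its payload.
  private final Map<String, Function<String, Communicator>> factories = new HashMap<>();

  /** Built-in and custom plugins are registered through the same call. */
  public void register(String name, Function<String, Communicator> factory) {
    factories.put(name, factory);
  }

  /** Single creation path: look up the factory by name, no special cases. */
  public Communicator create(String name, String payload) {
    Function<String, Communicator> factory = factories.get(name);
    if (factory == null) {
      throw new IllegalArgumentException("Unknown plugin: " + name);
    }
    return factory.apply(payload);
  }

  public static void main(String[] args) {
    UniformPluginSketch registry = new UniformPluginSketch();
    // Defaults are registered up front, the same way a user plugin would be.
    registry.register("yarn", payload -> () -> "default communicator, payload=" + payload);
    registry.register("uber", payload -> () -> "uber communicator, payload=" + payload);
    registry.register("custom", payload -> () -> "custom communicator, payload=" + payload);
    System.out.println(registry.create("uber", "p1").describe());
  }
}
```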

Are this and other methods thread-safe with respect to callbacks from multiple plugins?
{code}
+  public TaskHeartbeatResponse heartbeat(TaskHeartbeatRequest request)
+  throws IOException, TezException {
{code}
Also in heartbeat(), the following code has been lost during 

Success: TEZ-2658 PreCommit Build #976

2015-08-10 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2658
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/976/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3340 lines...]
[INFO] Final Memory: 87M/1386M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12749680/TEZ-2658.4.patch
  against master revision eadbfec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/976//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/976//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
11df6f4d94223ca30700d446da3bf500189ebab2 logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #973
Archived 53 artifacts
Archive block size is 32768
Received 4 blocks and 2983826 bytes
Compression is 4.2%
Took 0.68 sec
Description set: TEZ-2658
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats

2015-08-10 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680858#comment-14680858
 ] 

TezQA commented on TEZ-2658:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12749680/TEZ-2658.4.patch
  against master revision eadbfec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/976//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/976//console

This message is automatically generated.

 Create a CLI utility tool to track Tez DAG/Application Stats
 

 Key: TEZ-2658
 URL: https://issues.apache.org/jira/browse/TEZ-2658
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Saikat
Assignee: Saikat
 Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch, 
 TEZ-2658.4.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2003) [Umbrella] Allow Tez to co-ordinate execution to external services

2015-08-10 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680781#comment-14680781
 ] 

Siddharth Seth commented on TEZ-2003:
-

Haven't thought much about work preserving restart. That'll need to be 
considered at some point when we start supporting this. It'll depend on the 
mechanism that is used for running tasks to reconnect to the AM. That's where 
the TaskCommunicator comes in - and may need to provide additional information 
for recovery. A push based mechanism to communicate with executors will make 
work preserving recovery a lot simpler. The communicator protocol - which is 
now plugin code - would need to handle recovery - with appropriate timeouts and 
retry policies in place from the task side, as well as some re-discovery and 
reconnection mechanics. The framework can help by providing this sub-system 
with relevant information after a restart occurs.

 [Umbrella] Allow Tez to co-ordinate execution to external services
 --

 Key: TEZ-2003
 URL: https://issues.apache.org/jira/browse/TEZ-2003
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
 Attachments: 2003_20150728.1.txt, 2003_20150807.1.txt, 
 2003_20150807.2.txt, Tez With External Services.pdf


 The Tez engine itself takes care of co-ordinating execution - controlling how 
 data gets routed (different connection patterns), fault tolerance, scheduling 
 of work, etc.
 This is currently tied to TaskSpecs defined within Tez and on containers 
 launched by Tez itself (TezChild).
 The proposal is to allow Tez to work with external services instead of just 
 containers launched by Tez. This involves several more pluggable layers to 
 work with alternate Task Specifications, custom launch and task allocation 
 mechanics, as well as custom scheduling sources.
 A simple example would be a process with the capability to execute 
 multiple Tez TaskSpecs as threads. In such a case, a container launch isn't 
 really needed and can be mocked. Sourcing / scheduling containers would need to 
 be pluggable.
 A more advanced example would be LLAP (HIVE-7926; 
 https://issues.apache.org/jira/secure/attachment/12665704/LLAPdesigndocument.pdf).
 This works with custom interfaces - which would need to be supported by Tez, 
 along with a custom event model which would need translation hooks.
 Tez should be able to work with a combination of certain vertices running in 
 external services and others running in regular Tez containers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2692) bugfixes enhancements related to job parser and analyzer

2015-08-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680817#comment-14680817
 ] 

Bikas Saha commented on TEZ-2692:
-

Do we need firstTaskToFinish or firstTaskToStart? If the latter, then should we 
be using dag.getStartTime() or vertex.getStartTime() ?
{code}+  private long getTaskRuntime(VertexInfo vertexInfo) {
+TaskInfo firstTaskToFinish = vertexInfo.getFirstTaskToStart();
+TaskInfo lastTaskToFinish = vertexInfo.getLastTaskToFinish();
+
+DagInfo dagInfo = vertexInfo.getDagInfo();
+long totalTime = ((lastTaskToFinish == null) ?
+dagInfo.getFinishTime() : lastTaskToFinish.getFinishTime()) -
+((firstTaskToFinish == null) ? dagInfo.getFinishTime() : firstTaskToFinish.getFinishTime());
+return totalTime;
   }{code}

The concurrency calculator logic could be improved a bit. E.g. if we arrange 
all start and stop timestamps in sorted order as St1, St2, Et3, Et4, then 
we can walk this list to produce concurrency as (t1, 1), (t2, 2), (t3, 1), 
(t4, 0). If this logic is correct, we could do it here or in a follow up jira.
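
A minimal sketch of that walk, for illustration only and not the code in the 
patch: it keeps a sorted map from timestamp to net change (+1 per start, -1 per 
finish) instead of a multiset, so tasks that start or finish at the same 
instant collapse into one point. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ConcurrencySketch {
  /**
   * Each interval is {startTime, finishTime}. Returns {timestamp, concurrency}
   * points in increasing time order.
   */
  public static List<long[]> concurrencyTimeline(long[][] intervals) {
    // Sorted map of timestamp -> net change in running tasks at that instant.
    TreeMap<Long, Integer> deltas = new TreeMap<>();
    for (long[] interval : intervals) {
      deltas.merge(interval[0], 1, Integer::sum);   // task starts
      deltas.merge(interval[1], -1, Integer::sum);  // task finishes
    }
    List<long[]> timeline = new ArrayList<>();
    int running = 0;
    for (Map.Entry<Long, Integer> entry : deltas.entrySet()) {
      running += entry.getValue();
      timeline.add(new long[] {entry.getKey(), running});
    }
    return timeline;
  }

  public static void main(String[] args) {
    // Two overlapping tasks: one runs from t=1 to t=3, the other from t=2 to t=4.
    long[][] tasks = {{1, 3}, {2, 4}};
    for (long[] point : concurrencyTimeline(tasks)) {
      System.out.println("(" + point[0] + ", " + point[1] + ")");
    }
  }
}
```

For the two tasks in main, the walk yields (1, 1), (2, 2), (3, 1), (4, 0), 
matching the sequence described above.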

If possible, can the new test be merged into the existing ATS parser test? This 
would reuse code and also reduce test run time by reusing the same mini cluster.

Rest looks good!

 bugfixes  enhancements related to job parser and analyzer
 --

 Key: TEZ-2692
 URL: https://issues.apache.org/jira/browse/TEZ-2692
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-2692.1.patch, TEZ-2692.2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-2687) ATS History shutdown happens before the min-held containers are released

2015-08-10 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated TEZ-2687:
-
Assignee: (was: Gopal V)

 ATS History shutdown happens before the min-held containers are released
 

 Key: TEZ-2687
 URL: https://issues.apache.org/jira/browse/TEZ-2687
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.2, 0.8.0, 0.7.1
Reporter: Gopal V
 Attachments: TEZ-2687.1.patch


 When ATS goes into a GC pause under heavy loads and while it recovers, each 
 Tez AM holds onto a few containers even though it is shutting down and will 
 never accept any more DAGs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2687) ATS History shutdown happens before the min-held containers are released

2015-08-10 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680881#comment-14680881
 ] 

Gopal V commented on TEZ-2687:
--

Deleting the bad patch attached to the JIRA and leaving the issue as unresolved.

 ATS History shutdown happens before the min-held containers are released
 

 Key: TEZ-2687
 URL: https://issues.apache.org/jira/browse/TEZ-2687
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.2, 0.8.0, 0.7.1
Reporter: Gopal V

 When ATS goes into a GC pause under heavy loads and while it recovers, each 
 Tez AM holds onto a few containers even though it is shutting down and will 
 never accept any more DAGs.





[jira] [Updated] (TEZ-2687) ATS History shutdown happens before the min-held containers are released

2015-08-10 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated TEZ-2687:
-
Attachment: (was: TEZ-2687.1.patch)

 ATS History shutdown happens before the min-held containers are released
 

 Key: TEZ-2687
 URL: https://issues.apache.org/jira/browse/TEZ-2687
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.2, 0.8.0, 0.7.1
Reporter: Gopal V

 When ATS goes into a GC pause under heavy loads and while it recovers, each 
 Tez AM holds onto a few containers even though it is shutting down and will 
 never accept any more DAGs.





[jira] [Commented] (TEZ-2692) bugfixes & enhancements related to job parser and analyzer

2015-08-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680574#comment-14680574
 ] 

Bikas Saha commented on TEZ-2692:
-

bq.  fixing it here might not be helpful for older releases
I did not mean fixing it here. I meant fixing them separately so that downstream 
clients (whether the ATS parser or something else) can read identical information 
from both. Since you have identified what the differences are, could you please 
open jiras to track them? We may end up creating a Tee that always does simple 
history logging, so having the correct information in both of them may be 
essential. For now, working around it in the ATS parser is ok.

 bugfixes & enhancements related to job parser and analyzer
 --

 Key: TEZ-2692
 URL: https://issues.apache.org/jira/browse/TEZ-2692
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-2692.1.patch, TEZ-2692.2.patch








Failed: TEZ-2300 PreCommit Build #974

2015-08-10 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2300
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/974/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2125 lines...]
{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12749649/TEZ-2300.2.patch
  against master revision eadbfec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.client.TestTezClient

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/974//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/974//artifact/patchprocess/newPatchFindbugsWarningstez-api.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/974//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
b25e046d6e739ede0628b2529fe75016ec99a1bc logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #965
Archived 50 artifacts
Archive block size is 32768
Received 6 blocks and 2806437 bytes
Compression is 6.5%
Took 2.7 sec
[description-setter] Could not determine description.
Recording test results
Publish JUnit test result report is waiting for a checkpoint on 
PreCommit-TEZ-Build #973
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
4 tests failed.
REGRESSION:  org.apache.tez.client.TestTezClient.testTezclientSession

Error Message:
test timed out after 5000 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 5000 milliseconds
at java.lang.Thread.sleep(Native Method)
at org.apache.tez.client.TezClient.stop(TezClient.java:518)
at 
org.apache.tez.client.TestTezClient.testTezClient(TestTezClient.java:240)
at 
org.apache.tez.client.TestTezClient.testTezclientSession(TestTezClient.java:135)


REGRESSION:  org.apache.tez.client.TestTezClient.testWaitTillReady_Interrupt

Error Message:
test timed out after 5000 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 5000 milliseconds
at java.lang.Thread.sleep(Native Method)
at org.apache.tez.client.TezClient.stop(TezClient.java:518)
at 
org.apache.tez.client.TestTezClient.testWaitTillReady_Interrupt(TestTezClient.java:334)


REGRESSION:  org.apache.tez.client.TestTezClient.testPreWarm

Error Message:
test timed out after 5000 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 5000 milliseconds
at java.lang.Thread.sleep(Native Method)
at org.apache.tez.client.TezClient.stop(TezClient.java:518)
at 
org.apache.tez.client.TestTezClient.testPreWarm(TestTezClient.java:268)


REGRESSION:  org.apache.tez.client.TestTezClient.testMultipleSubmissions

Error Message:
test timed out after 1 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 1 milliseconds
at java.lang.Thread.sleep(Native Method)
at org.apache.tez.client.TezClient.stop(TezClient.java:518)
at 
org.apache.tez.client.TestTezClient.testMultipleSubmissionsJob(TestTezClient.java:308)
at 
org.apache.tez.client.TestTezClient.testMultipleSubmissions(TestTezClient.java:273)




Success: TEZ-2618 PreCommit Build #973

2015-08-10 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2618
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/973/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3198 lines...]
[INFO] Final Memory: 86M/1465M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12749643/TEZ-2618.1.patch
  against master revision eadbfec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/973//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/973//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
de8404d740241e1a641049c7a51412a893e2b675 logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #965
Archived 50 artifacts
Archive block size is 32768
Received 4 blocks and 2946162 bytes
Compression is 4.3%
Took 0.67 sec
Description set: TEZ-2618
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2618) In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before returning a failure

2015-08-10 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680621#comment-14680621
 ] 

TezQA commented on TEZ-2618:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12749643/TEZ-2618.1.patch
  against master revision eadbfec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/973//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/973//console

This message is automatically generated.

 In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before 
 returning a failure
 

 Key: TEZ-2618
 URL: https://issues.apache.org/jira/browse/TEZ-2618
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Saikat
Assignee: Saikat
 Attachments: TEZ-2618.1.patch, TEZ-2618.patch


 In the setupLocalDiskFetch() method (invoked when the fetcher is on the 
 same host as the target map host), first check whether the target spill file 
 can be opened using localDirAllocator.getLocalPathToRead(). The 
 localDirAllocator searches through the list of configured dirs for the file. 
 In disk-full scenarios, if the path is not found, the fetcher should try an 
 HTTP fetch.
 Proposed solution:
 In local fetch mode, the fetcher should first try getLocalPathToRead() for 
 all the pending maps. Local fetch is thus divided into 2 stages: first, 
 fetch the maps for which a path was found via LocalDirAllocator; second, 
 construct an HTTP fallback fetch list for the maps which couldn't be found via 
 LocalDirAllocator.getLocalPathToRead() and do an HTTP fetch.
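
The two-stage split proposed above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: LocalFetchPlanner, Plan, and the resolver function are hypothetical names, and the resolver merely stands in for a lookup such as LocalDirAllocator.getLocalPathToRead().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from the Tez code base.
final class LocalFetchPlanner {

  /** Result of stage one: what can be read locally and what must fall back to HTTP. */
  static final class Plan {
    final Map<String, String> localPaths = new LinkedHashMap<>();
    final List<String> httpFallback = new ArrayList<>();
  }

  /**
   * Splits the pending map outputs into a local list and an HTTP fallback list.
   * The resolver is an assumed stand-in for the local path lookup; it returns
   * Optional.empty() when no configured directory holds the file.
   */
  static Plan plan(List<String> pendingMaps,
                   Function<String, Optional<String>> resolver) {
    Plan plan = new Plan();
    for (String mapId : pendingMaps) {
      Optional<String> path = resolver.apply(mapId);
      if (path.isPresent()) {
        plan.localPaths.put(mapId, path.get());   // stage 1: read from local disk
      } else {
        plan.httpFallback.add(mapId);             // stage 2: fetch over HTTP
      }
    }
    return plan;
  }

  public static void main(String[] args) {
    Map<String, String> onDisk = Map.of("map_0", "/data1/map_0.out");
    Plan p = plan(List.of("map_0", "map_1"),
        id -> Optional.ofNullable(onDisk.get(id)));
    System.out.println("local=" + p.localPaths.keySet()
        + " http=" + p.httpFallback);
  }
}
```

Resolving every pending map first means the HTTP fallback list is known before any data is read, so a missing local file no longer has to fail the whole fetch.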





[jira] [Created] (TEZ-2705) Add a reference implementation for ext services

2015-08-10 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-2705:
---

 Summary: Add a reference implementation for ext services
 Key: TEZ-2705
 URL: https://issues.apache.org/jira/browse/TEZ-2705
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: TEZ-2003
Reporter: Siddharth Seth
Assignee: Siddharth Seth


Potentially convert tez-ext-service-tests into this reference.





[jira] [Commented] (TEZ-2692) bugfixes & enhancements related to job parser and analyzer

2015-08-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681179#comment-14681179
 ] 

Bikas Saha commented on TEZ-2692:
-

Please commit the next patch with fix, if needed, for private long 
getTaskRuntime(VertexInfo vertexInfo).

 bugfixes & enhancements related to job parser and analyzer
 --

 Key: TEZ-2692
 URL: https://issues.apache.org/jira/browse/TEZ-2692
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-2692.1.patch, TEZ-2692.2.patch








Failed: TEZ-2658 PreCommit Build #975

2015-08-10 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2658
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/975/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3342 lines...]




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12749655/TEZ-2658.3.patch
  against master revision eadbfec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/975//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/975//artifact/patchprocess/newPatchFindbugsWarningstez-cli-tools.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/975//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
c8c6a894d833ded5bf1a55689b69e09833fffe7b logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #973
Archived 53 artifacts
Archive block size is 32768
Received 4 blocks and 2998631 bytes
Compression is 4.2%
Took 0.67 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2003) [Umbrella] Allow Tez to co-ordinate execution to external services

2015-08-10 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680721#comment-14680721
 ] 

Siddharth Seth commented on TEZ-2003:
-

bq. logErrorIngored, hearbeats, getCurretnDagName
bq. - remove “*”, e.g., import org.apache.tez.common.asterisk;

Captured in TEZ-2678

bq. abortTask vs close/cleanup
Will check the code. abortTask should try cleaning up in both of them.

bq. TezTaskRunner2
killTask isn't used yet within Tez, which is why it's not informing the AM. 
When task preemption comes in - the flow is likely to be a killTask invoked as 
a result of an RPC, at which point the AM already knows that the task is killed 
since it took the decision.

On the various atomic gets - there are separate variables to track which states 
have been set, and these are used in the return result. Atomicity of the entire 
operation is handled via synchronization blocks.

TaskRunner handling containerStop is a result of containerStop coming over a 
shared Task/Container protocol - which is linked to the running task. It could 
be separated, but I think that'll need the protocols to be separated as well.

canCommit during a shutdown - will change this. I'll also verify what the 
TaskRunner behaviour was. TEZ-2678

bq. TaskReporter
I don't think shutdown needs synchronization. It modifies a final variable. 
Whether it's implemented correctly needs more investigation. It's the same as 
what exists on master.

bq. ShuffleHandler
This is essentially the shuffle handler that is used in regular clusters. It's 
not meant as a benchmark tool. Using the current shuffle mechanics seems like 
the simplest mechanism to have jobs work with the standard set of 
Inputs/Outputs which write to disk.

bq. ext-service-tests
Agree with making this a reference for ext services. It would need to implement 
the APIs better, and be documented a lot better to serve this purpose. Creating 
a new jira to track this - TEZ-2705. Post merge ?

bq. JoinValidate
The changes are for private use, to be able to re-use the example in testing. 
Will add docs to mention this.

bq. TezTaskCommunicatorImpl
Using payloads wherever possible - including internal plugins. Avoided in 
LocalContainerLauncher only at the moment, where a lot of runtime AM 
information is used.

Will fix isKnownContainer and containerAlive to be based on the specific 
communicator.

Renaming methods in TaskComm - tracked in the TaskComm enhancements jira

getDagName null - will try improving this.
getVertexName - I'm not sure there's a lot that can be done. TezException 
instead of NPE ? Eventually this will lead to an error in the plugin, which 
needs to be handled better. There's a jira to track such error handling.

onStateUpdated - is the AM telling the TaskCommunicator plugin that a vertex 
has changed state. Similar to what is done elsewhere - like the 
InputInitializers.

dagCompleteStart - couldn't find this. Maybe I removed it at some point for the 
same reason - it is a very confusing name.

bq. Is there a need for the framework to make updates into the Context object? 
If yes, should the Context implement 2 interfaces? Should the internal objects 
just bind to the internal Impl objects or are they bound to the public plugin 
interfaces to catch compat errors? Binding to Impls directly may mean a smaller 
public API interface.
Need more clarification on this comment.

bq. Is there a need for the framework to make updates into the Context object? 
If yes, should the Context implement 2 interfaces? Should the internal objects 
just bind to the internal Impl objects or are they bound to the public plugin 
interfaces to catch compat errors? Binding to Impls directly may mean a smaller 
public API interface.
Will do. 

 [Umbrella] Allow Tez to co-ordinate execution to external services
 --

 Key: TEZ-2003
 URL: https://issues.apache.org/jira/browse/TEZ-2003
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
 Attachments: 2003_20150728.1.txt, 2003_20150807.1.txt, 
 2003_20150807.2.txt, Tez With External Services.pdf


 The Tez engine itself takes care of co-ordinating execution - controlling how 
 data gets routed (different connection patterns), fault tolerance, scheduling 
 of work, etc.
 This is currently tied to TaskSpecs defined within Tez and on containers 
 launched by Tez itself (TezChild).
 The proposal is to allow Tez to work with external services instead of just 
 containers launched by Tez. This involves several more pluggable layers to 
 work with alternate Task Specifications, custom launch and task allocation 
 mechanics, as well as custom scheduling sources.
 A simple example would be a process with the capability to execute 
 multiple Tez TaskSpecs as threads. In such a case, a container launch isn't 
 really 

[jira] [Commented] (TEZ-2678) Fix comments from reviews - part 1

2015-08-10 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680725#comment-14680725
 ] 

Siddharth Seth commented on TEZ-2678:
-

Verify abortTask/cleanup are used correctly in TezTaskRunner

why would a task call canCommit while shutting down? Shouldn’t we throw an 
exception anyway as it is not meant to be called during shutdown?

Add docs to joinValidate explaining extensions are for private use.

TaskCommunicatorContextImpl: Shouldn’t each plugin manage its own containers? 
Or at least shouldn’t this query be done based on which launcher plugin was 
being used for the given container? Likewise for containerAlive(). | Try fixing 
this to be specific to the communicator.

setAccessible not required during construction of plugins.

remove “*”, e.g., import org.apache.tez.common.asterisk;


typos: logErrorIngored, hearbeats, getCurretnDagName

 Fix comments from reviews - part 1
 --

 Key: TEZ-2678
 URL: https://issues.apache.org/jira/browse/TEZ-2678
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: TEZ-2003
Reporter: Siddharth Seth
Assignee: Siddharth Seth

 Typos in API - Curretn, localicty, others
 Add diagnostic string wherever ContainerEndReason is used.
 TODO in ContainerLauncherContext - TEZ-2676
 TaskEndReason lossy compared to YARN.
 Cache the context in DAGImpl.getDefaultExecutionContext
 TaskAttempt. TA_KILLED moves to KILL_IN_PROGRESS instead of KILLED
 TaskAttempt - add scheduleTime to history event
 Exception propagation in ContainerLauncherRouter
 AMNodeTracker calls super(AMNodeMap);
 ContainerLauncherOperationBase - token abstraction





[jira] [Commented] (TEZ-2003) [Umbrella] Allow Tez to co-ordinate execution to external services

2015-08-10 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681128#comment-14681128
 ] 

Siddharth Seth commented on TEZ-2003:
-

bq. should not need to have special casing for any plugin
Special casing is in place primarily for LocalContainerExecutor, which requires 
a bunch of information at runtime - which isn't needed in the context 
otherwise. There's a jira to provide such information via runtime binding in 
the payload. For the other cases, it's mainly used to make it simpler to write 
tests - where the default executor can be easily overwritten for the tests. The 
construction, along with the payload, remains the same - except it's direct 
instead of using reflection.

bq. Also creating a ServicePlugin class will help in reducing code duplication 
and make maintenance easier instead of having scheduler id, launcherId and 
commId everywhere.
The 3 constructs are not used together everywhere. There are multiple events and 
other classes which only use a subset of these. A single class won't really 
help there.

bq. ContainerSignatureMatcher - ExecutorSignatureMatcher ?
Tracked in 2708.

bq. ServicePluginLifecyle etc. in tez-runtime-api like 
Inputs/Output/InputInitializer etc
shutdown would make more sense for a service.

bq. Why are executedInAm and executeInContainers there
executeInAm and executeInContainers are in Contexts to specify whether a task 
runs in a service or in the AM. It's possible to set a DAG level default to run 
everything in an external service, and some vertices either in containers or in 
the AM.
Similarly for the ServiceDescriptor - decide whether the AM runs containers or 
uber-mode during setup.
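
As a rough sketch of the defaulting described above (a DAG-level default with per-vertex overrides): the ExecutionTargets class and its Target enum are hypothetical stand-ins for illustration and are not the Tez API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in types; they only mirror the idea, not the Tez classes.
final class ExecutionTargets {
  enum Target { CONTAINERS, AM, EXTERNAL_SERVICE }

  private final Target dagDefault;
  private final Map<String, Target> vertexOverrides = new HashMap<>();

  ExecutionTargets(Target dagDefault) {
    this.dagDefault = dagDefault;
  }

  /** Records a per-vertex choice that takes precedence over the DAG default. */
  ExecutionTargets override(String vertexName, Target target) {
    vertexOverrides.put(vertexName, target);
    return this;
  }

  /** Returns the vertex-specific target if one was set, else the DAG default. */
  Target resolve(String vertexName) {
    return vertexOverrides.getOrDefault(vertexName, dagDefault);
  }

  public static void main(String[] args) {
    ExecutionTargets targets = new ExecutionTargets(Target.EXTERNAL_SERVICE)
        .override("small-vertex", Target.AM)
        .override("io-vertex", Target.CONTAINERS);
    for (String v : new String[] {"small-vertex", "io-vertex", "join-vertex"}) {
      System.out.println(v + " -> " + targets.resolve(v));
    }
  }
}
```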

bq. Rename to ExecutorEndReason ?
Given that the abstraction that exists is containers (an executor could be 
confused for a service daemon), ContainerEndReason seems fine. This can change 
when Tez introduces its own version of 'Containers' instead of relying on the 
YARN abstraction.

bq. Also, how can An error in the AM be caused by a container running a task?
Not sure what you mean by this.  An error in the AM caused by user code - 
implies an error which occurred in the AM process as a result of a plugin.

bq. Why does this have schedulerName and taskCommName ?
It's used for the startRequest.

bq. Has the internal one been replaced by this?
No but there's a jira open to consolidate the two.

bq. Rename to ExecutorBusy ?
tracked in TEZ-2707

bq. Why isLocal flag needs to be passed to Scheduler/Launcher/Communicator 
routers? Instead of a service plugin for local
There's certain operations which are performed differently for local mode. Also 
used to indicate to internal plugins whether they're running in local / uber 
mode.

bq. Is it ensured that the integer for a service plugin will turn out to be the 
same after AM restart?
Yes

bq. Why is yarn scheduler special cased? Launcher/Communicator dont have the 
special casing ?
To always run the YARNScheduler (i.e. register with YARN) if running in 
non-local mode. If we were to support alternate frameworks, this could be 
removed.

bq. Why use different code path for uber/default. They should just work when 
instantiated the same way as a custom plugin.
Primarily for testing. First part of this comment.

bq. Are this and other methods threadsafe wrt callback from multiple plugins?
They should be. I'll scan through them. Would appreciate if you do the same to 
identify issues.

bq. Also in heartbeat(), the following code has been lost during merge.
Tracked in TEZ-2707

bq. Why are the contextImpls not directing invoking/handling the plugins 
instead of going through the router?
They don't need to. ContextImpls are primarily for communication from the 
plugins to the framework. The routers should handle framework to plugins.

bq. Why are the contextImpls not directing invoking/handling the plugins 
instead of going through the router?
This avoids some race between dag transitions.

bq. Why has the synchronization been removed. I remember this being a subtle 
race condition.
sync on containerInfo is no longer required since there's a new entry inserted 
into the structure each time.
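
One way to read the answer above is that entries are treated as immutable and replaced wholesale, so readers never see a half-updated object. The sketch below illustrates that pattern under this assumption; the ContainerRegistry and ContainerInfo shown here are hypothetical and do not reproduce the Tez classes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of "insert a new entry instead of locking the old one".
final class ContainerRegistry {

  /** Immutable snapshot: fields never change after construction. */
  static final class ContainerInfo {
    final String host;
    final String runningAttempt; // null when the container is idle

    ContainerInfo(String host, String runningAttempt) {
      this.host = host;
      this.runningAttempt = runningAttempt;
    }
  }

  private final ConcurrentMap<String, ContainerInfo> containers =
      new ConcurrentHashMap<>();

  void register(String containerId, String host) {
    containers.put(containerId, new ContainerInfo(host, null));
  }

  /**
   * Replaces the entry with a fresh object rather than mutating the old one,
   * so no lock on the ContainerInfo instance is needed.
   * Returns false if the container is unknown.
   */
  boolean assign(String containerId, String attemptId) {
    ContainerInfo updated = containers.computeIfPresent(
        containerId, (id, old) -> new ContainerInfo(old.host, attemptId));
    return updated != null;
  }

  ContainerInfo get(String containerId) {
    return containers.get(containerId);
  }

  public static void main(String[] args) {
    ContainerRegistry registry = new ContainerRegistry();
    registry.register("container_1", "host-a");
    ContainerInfo before = registry.get("container_1");
    registry.assign("container_1", "attempt_0");
    // The old snapshot is untouched; the map now holds a new object.
    System.out.println(before.runningAttempt + " / "
        + registry.get("container_1").runningAttempt);
  }
}
```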

bq. The dagCompleteStart/End logic is either broken or unnecessary because the 
correct dag seems to be always received from appContext.getCurrentDAG().
This is again for transitions between DAGs. A new dag is received when a dag is 
submitted - the context update needs to be factored out. dagComplete is sent to 
a plugin - which can take an arbitrary time to process. During this time, any 
lookups it does will be from the last dag - instead of a possible new dag, 
which could be submitted anytime.

bq. Why not keep a cached copy instead of converting each time?
Fixed in TEZ-2678

bq. There is a scheduledTime on master that this is duplicating.
Will create a nice conflict when I rebase the branch next. Will resolve it then.

bq. What is the 

[jira] [Commented] (TEZ-2708) renames for tez-2003 changes

2015-08-10 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681127#comment-14681127
 ] 

Siddharth Seth commented on TEZ-2708:
-

ContainerSignatureMatcher - ExecutorSignatureMatcher


 renames for tez-2003 changes
 

 Key: TEZ-2708
 URL: https://issues.apache.org/jira/browse/TEZ-2708
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: TEZ-2003
Reporter: Siddharth Seth
Assignee: Siddharth Seth

 This jira is to track some class renames which are required. TBD just before 
 merging or right after the merge.
 -  ContainerLauncherImpl to TezContainerLauncherImpl ? Make all the default 
 implementation with prefix Tez.
 - TaskAttemptListenerImpTezDag to TaskCommunicatorManager
 - Likewise for tests.
 - Remove TezTaskRunner
 - Rename TaskSchedulerEventHandler





[jira] [Updated] (TEZ-2678) Fix comments from reviews - part 1

2015-08-10 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-2678:

Attachment: TEZ-2678.1.txt

 Fix comments from reviews - part 1
 --

 Key: TEZ-2678
 URL: https://issues.apache.org/jira/browse/TEZ-2678
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: TEZ-2003
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-2678.1.txt


 Typos in API - Curretn, localicty, others
 Add diagnostic string wherever ContainerEndReason is used.
 TODO in ContainerLauncherContext - TEZ-2676
 TaskEndReason lossy compared to YARN.
 Cache the context in DAGImpl.getDefaultExecutionContext
 TaskAttempt. TA_KILLED moves to KILL_IN_PROGRESS instead of KILLED
 TaskAttempt - add scheduleTime to history event
 Exception propagation in ContainerLauncherRouter
 AMNodeTracker calls super(AMNodeMap);
 ContainerLauncherOperationBase - token abstraction





[jira] [Commented] (TEZ-2707) Fix comments from reviews - part 2

2015-08-10 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681126#comment-14681126
 ] 

Siddharth Seth commented on TEZ-2707:
-

Rename endReason.SERVICE_BUSY to EXECUTOR_BUSY
Also in heartbeat(), the following code has been lost during merge.
TaskAttempt.scheduleTime recently added to master
Base class for schedulerEvents with schedulerId
Similarly for AMContainerEvents
NodeId ref sourceId - rename to schedulerId
Remove TODO TEZ-2124 from AMNodeImpl
Remove commented code in MockDAGAppMaster
TestTaskAttempt - taskComm setup into a method
ExecutionContextTestInfoHolder - try re-using logic from AM


 Fix comments from reviews - part 2
 --

 Key: TEZ-2707
 URL: https://issues.apache.org/jira/browse/TEZ-2707
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: TEZ-2003
Reporter: Siddharth Seth
Assignee: Siddharth Seth







[jira] [Created] (TEZ-2707) Fix comments from reviews - part 2

2015-08-10 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-2707:
---

 Summary: Fix comments from reviews - part 2
 Key: TEZ-2707
 URL: https://issues.apache.org/jira/browse/TEZ-2707
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: TEZ-2003
Reporter: Siddharth Seth
Assignee: Siddharth Seth








[jira] [Created] (TEZ-2708) renames for tez-2003 changes

2015-08-10 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-2708:
---

 Summary: renames for tez-2003 changes
 Key: TEZ-2708
 URL: https://issues.apache.org/jira/browse/TEZ-2708
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: TEZ-2003
Reporter: Siddharth Seth
Assignee: Siddharth Seth


This jira is to track some class renames which are required. TBD just before 
merging or right after the merge.

-  ContainerLauncherImpl to TezContainerLauncherImpl ? Make all the default 
implementation with prefix Tez.
- TaskAttemptListenerImpTezDag to TaskCommunicatorManager
- Likewise for tests.
- Remove TezTaskRunner
- Rename TaskSchedulerEventHandler





[jira] [Created] (TEZ-2709) Enhancement to history events for external services

2015-08-10 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-2709:
---

 Summary: Enhancement to history events for external services
 Key: TEZ-2709
 URL: https://issues.apache.org/jira/browse/TEZ-2709
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: TEZ-2003
Reporter: Siddharth Seth
Assignee: Siddharth Seth


- Log the scheduler, launcher and task comm for an attempt. Also for containers 
where relevant.
- scheduleTime in TaskAttempt needs to be logged.





[jira] [Commented] (TEZ-2678) Fix comments from reviews - part 1

2015-08-10 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681069#comment-14681069
 ] 

Siddharth Seth commented on TEZ-2678:
-

bq. Typos in API - Curretn, localicty, others
Fixed

bq. Add diagnostic string wherever ContainerEndReason is used.
Done. Also for TaskEndReason

bq. TODO in ContainerLauncherContext - TEZ-2676
Likely done elsewhere. Consolidation of TODOs and jiras

bq. TaskEndReason lossy compared to YARN.
[~hitesh] - could you please elaborate on this. (Was this meant to be YARN or 
some internal error reporting?)

bq. Cache the context in DAGImpl.getDefaultExecutionContext
Done

bq. TaskAttempt. TA_KILLED moves to KILL_IN_PROGRESS instead of KILLED
Moving to KILLED now

bq. TaskAttempt - add scheduleTime to history event
Deferring to TEZ-2709, which will include additional history fixes such as 
recording where a task is executing.

bq. Exception propagation in ContainerLauncherRouter
Converted UnknownHost to an unchecked exception as well.

bq. AMNodeTracker calls super(AMNodeMap);
Fixed

bq. ContainerLauncherOperationBase - token abstraction
Tracked in TEZ-2702

bq. TaskCommunicator.java Rename unregisterRunningTaskAttempt to 
registerTaskAttemptEnd (make it more consistent)
Tracked in TEZ-2678

bq. Post merge / just before merge: Rename ContainerLauncherImpl to 
TezContainerLauncherImpl ? Make all the default implementation with prefix Tez.
Tracked in TEZ-2708

bq. Typo DagTypeConverters.convertServicePluginDescriptoToProto -- 
DagTypeConverters.convertServicePluginDescriptorToProto (miss r)
Fixed

bq. Verify VertexExecutionContext matches against the ServicePluginDescriptor 
setup for the TezClient
Fixed

bq. Verify abortTask/cleanup are used correctly in TezTaskRunner
cleanup is always called, which is the correct thing to do (in both 
TezTaskRunner and TezTaskRunner2).

bq. why would a task call canCommit while shutting down? Shouldn’t we throw an 
exception anyway as it is not meant to be called during shutdown?
Looked at this some more. The shutdown could be a result of anything including 
preemption. The task doesn't necessarily know that it has been asked to die 
(race with canCommit invocations or whatever the task is doing). Sending back 
an exception results in an unnecessary exception from the task. Returning false 
seems much safer, and is the approach we've used for TaskRunner as well.
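
The behaviour described above can be sketched roughly as follows. This is a minimal illustration only, not the TezTaskRunner code: the class name CanCommitSketch, the CommitArbiter interface and the shutdownRequested flag are all hypothetical names introduced for the example.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch; the names here are assumptions, not Tez classes. */
class CanCommitSketch {

    /** Assumed stand-in for whatever normally answers the commit question. */
    interface CommitArbiter {
        boolean canCommit(String attemptId);
    }

    // Set once a shutdown has been requested (e.g. by preemption).
    private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);
    private final CommitArbiter arbiter;

    CanCommitSketch(CommitArbiter arbiter) {
        this.arbiter = arbiter;
    }

    void shutdown() {
        shutdownRequested.set(true);
    }

    /**
     * The task may race with shutdown and still ask to commit. Answering
     * false avoids surfacing a spurious exception from the task.
     */
    boolean canCommit(String attemptId) {
        if (shutdownRequested.get()) {
            return false;
        }
        return arbiter.canCommit(attemptId);
    }
}
```

With this shape, a task that calls canCommit after shutdown simply sees a refusal and can exit normally.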

bq. TaskCommunicatorContextImpl: Shouldn’t each plugin manage its own 
containers? Or at least shouldn’t this query be done based on which launcher 
plugin was being used for the given container? Likewise for containerAlive(). | 
Try fixing this to be specific to the communicator.
Fixed

bq. setAccessible not required during construction of plugins.
Fixed

bq. remove “*” e.g. import org.apache.tez.common.asterisk;
Fixed

bq. typos: logErrorIngored, hearbeats, getCurretnDagName
Fixed



 Fix comments from reviews - part 1
 --

 Key: TEZ-2678
 URL: https://issues.apache.org/jira/browse/TEZ-2678
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: TEZ-2003
Reporter: Siddharth Seth
Assignee: Siddharth Seth

 Typos in API - Curretn, localicty, others
 Add diagnostic string wherever ContainerEndReason is used.
 TODO in ContainerLauncherContext - TEZ-2676
 TaskEndReason lossy compared to YARN.
 Cache the context in DAGImpl.getDefaultExecutionContext
 TaskAttempt. TA_KILLED moves to KILL_IN_PROGRESS instead of KILLED
 TaskAttempt - add scheduleTime to history event
 Exception propagation in ContainerLauncherRouter
 AMNodeTracker calls super(AMNodeMap);
 ContainerLauncherOperationBase - token abstraction





[jira] [Resolved] (TEZ-2703) TEZ-2003 build fails

2015-08-10 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang resolved TEZ-2703.
-
   Resolution: Fixed
 Assignee: Jeff Zhang
Fix Version/s: TEZ-2003

 TEZ-2003 build fails
 

 Key: TEZ-2703
 URL: https://issues.apache.org/jira/browse/TEZ-2703
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Jeff Zhang
Assignee: Jeff Zhang
 Fix For: TEZ-2003

 Attachments: TEZ-2703-1.patch


 {code}
 [ERROR]
 [ERROR]   The project org.apache.tez:tez-job-analyzer:0.8.0-SNAPSHOT 
 (/Users/jzhang/github/tez/tez-tools/analyzers/job-analyzer/pom.xml) has 1 
 error
 [ERROR] 'dependencies.dependency.version' for 
 io.dropwizard.metrics:metrics-core:jar is missing. @ line 28, column 17
 {code}





[jira] [Commented] (TEZ-2703) TEZ-2003 build fails

2015-08-10 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14679624#comment-14679624
 ] 

Jeff Zhang commented on TEZ-2703:
-

Committed to TEZ-2003

 TEZ-2003 build fails
 

 Key: TEZ-2703
 URL: https://issues.apache.org/jira/browse/TEZ-2703
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Jeff Zhang
 Attachments: TEZ-2703-1.patch


 {code}
 [ERROR]
 [ERROR]   The project org.apache.tez:tez-job-analyzer:0.8.0-SNAPSHOT 
 (/Users/jzhang/github/tez/tez-tools/analyzers/job-analyzer/pom.xml) has 1 
 error
 [ERROR] 'dependencies.dependency.version' for 
 io.dropwizard.metrics:metrics-core:jar is missing. @ line 28, column 17
 {code}





[jira] [Comment Edited] (TEZ-2004) Define basic interface for pluggable ContainerLaunchers

2015-08-10 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659692#comment-14659692
 ] 

Jeff Zhang edited comment on TEZ-2004 at 8/10/15 10:39 AM:
---

Comments:
* Is ContainerOp necessary? It seems ContainerLauncherOperationBase can be 
used instead; we would just need to move OPType into 
ContainerLauncherOperationBase. How about renaming it to ContainerOperationBase?
* TaskCommunicator.java: rename unregisterRunningTaskAttempt to 
registerTaskAttemptEnd (to make it more consistent).
* Rename ContainerLauncherImpl to TezContainerLauncherImpl? Give all the 
default implementations the Tez prefix?
* Typo: DagTypeConverters.convertServicePluginDescriptoToProto -> 
DagTypeConverters.convertServicePluginDescriptorToProto (missing r).
* Need to verify that the DAG's defaultExecutionContext and each Vertex's 
ExecutionContext exist in TezClient.servicePluginsDescriptor when 
submitting the DAG.
* Need to verify that VertexExecutionContext's executeInAm and 
executeInContainers are supported in TezClient's ServicePluginsDescriptor.
* It seems there is currently only a programmatic way to specify TaskScheduler, 
ContainerLauncher and TaskCommunicator, through TezClient's 
ServicePluginsDescriptor. Is that expected? It would mean that a third party 
wanting to introduce a new external service to Hive must not only implement a 
new TaskScheduler, ContainerLauncher and TaskCommunicator but also change 
Hive code and rebuild.
* ServicePluginsDescriptor: as I understand it, 
TaskScheduler/TaskCommunicator/ContainerLauncher are used together and cannot 
be combined arbitrarily. So should we use a single ExecutionContextDescriptor 
to replace the three separate descriptors?
* TaskAttempt#scheduleTime may need to be put into the history event 
TaskAttemptStartedEvent so that it can be used by Tez-UI.
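
The two verification items above could look something like the sketch below. It is only an illustration under assumptions: ExecutionContextCheck, Plugins and Context are hypothetical stand-ins for the client-side descriptor and the per-DAG/per-vertex execution context, not the real Tez types, and plugin names are modelled as plain strings.

```java
import java.util.Set;

/** Hypothetical sketch; none of these types are the actual Tez API. */
class ExecutionContextCheck {

    /** Assumed stand-in for the plugin names registered on the client. */
    static final class Plugins {
        final Set<String> schedulers;
        final Set<String> launchers;
        final Set<String> communicators;

        Plugins(Set<String> schedulers, Set<String> launchers,
                Set<String> communicators) {
            this.schedulers = schedulers;
            this.launchers = launchers;
            this.communicators = communicators;
        }
    }

    /** Assumed stand-in for a DAG-level or vertex-level execution context. */
    static final class Context {
        final String scheduler;
        final String launcher;
        final String communicator;

        Context(String scheduler, String launcher, String communicator) {
            this.scheduler = scheduler;
            this.launcher = launcher;
            this.communicator = communicator;
        }
    }

    /** Rejects a context that names a plugin the client never registered. */
    static void validate(String owner, Context ctx, Plugins plugins) {
        if (ctx == null) {
            return; // nothing requested, so there is nothing to check
        }
        check(owner, "TaskScheduler", ctx.scheduler, plugins.schedulers);
        check(owner, "ContainerLauncher", ctx.launcher, plugins.launchers);
        check(owner, "TaskCommunicator", ctx.communicator, plugins.communicators);
    }

    private static void check(String owner, String kind, String name,
                              Set<String> known) {
        if (!known.contains(name)) {
            throw new IllegalArgumentException(
                owner + " references unknown " + kind + ": " + name);
        }
    }
}
```

Running such a check at DAG submission time would surface a misconfigured vertex on the client rather than later in the AM.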











 Define basic interface for pluggable ContainerLaunchers
 ---

 Key: TEZ-2004
 URL: https://issues.apache.org/jira/browse/TEZ-2004
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Fix For: TEZ-2003

 Attachments: TEZ-2004.1.txt





