[jira] [Commented] (TEZ-2003) [Umbrella] Allow Tez to co-ordinate execution to external services
[ https://issues.apache.org/jira/browse/TEZ-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662669#comment-14662669 ] TezQA commented on TEZ-2003: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12749362/2003_20150807.2.txt against master revision eadbfec. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 52 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 165 javac compiler warnings (more than the master's current 158 warnings). {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 3 warning messages. See https://builds.apache.org/job/PreCommit-TEZ-Build/970//artifact/patchprocess/diffJavadocWarnings.txt for details. {color:red}-1 findbugs{color}. The patch appears to cause Findbugs (version 3.0.1) to fail. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/970//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/970//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/970//console This message is automatically generated. 
> [Umbrella] Allow Tez to co-ordinate execution to external services > -- > > Key: TEZ-2003 > URL: https://issues.apache.org/jira/browse/TEZ-2003 > Project: Apache Tez > Issue Type: Improvement >Reporter: Siddharth Seth > Attachments: 2003_20150728.1.txt, 2003_20150807.1.txt, > 2003_20150807.2.txt, Tez With External Services.pdf > > > The Tez engine itself takes care of co-ordinating execution - controlling how > data gets routed (different connection patterns), fault tolerance, scheduling > of work, etc. > This is currently tied to TaskSpecs defined within Tez and on containers > launched by Tez itself (TezChild). > The proposal is to allow Tez to work with external services instead of just > containers launched by Tez. This involves several more pluggable layers to > work with alternate Task Specifications, custom launch and task allocation > mechanics, as well as custom scheduling sources. > A simple example would be a process with the capability to execute > multiple Tez TaskSpecs as threads. In such a case, a container launch isn't > really needed and can be mocked. Sourcing / scheduling containers would need to > be pluggable. > A more advanced example would be LLAP (HIVE-7926; > https://issues.apache.org/jira/secure/attachment/12665704/LLAPdesigndocument.pdf). > This works with custom interfaces - which would need to be supported by Tez, > along with a custom event model which would need translation hooks. > Tez should be able to work with a combination of certain vertices running in > external services and others running in regular Tez containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Failed: TEZ-2003 PreCommit Build #970
Jira: https://issues.apache.org/jira/browse/TEZ-2003 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/970/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 3536 lines...] {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12749362/2003_20150807.2.txt against master revision eadbfec. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 52 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 165 javac compiler warnings (more than the master's current 158 warnings). {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 3 warning messages. See https://builds.apache.org/job/PreCommit-TEZ-Build/970//artifact/patchprocess/diffJavadocWarnings.txt for details. {color:red}-1 findbugs{color}. The patch appears to cause Findbugs (version 3.0.1) to fail. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/970//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/970//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/970//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. b09736ab858213c011f80376828d9e8cc909dff0 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts Sending artifact delta relative to PreCommit-TEZ-Build #965 Archived 12 artifacts Archive block size is 32768 Received 0 blocks and 2054698 bytes Compression is 0.0% Took 0.73 sec [description-setter] Could not determine description. 
Recording test results Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Updated] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-2300: - Attachment: TEZ-2300.1.patch Putting up a starter patch to start discussion. * Should the stop API be blocking or should a new killApplication API be exposed? * Should I be asking the AM directly and then falling back to the RM to get the application status? > TezClient.stop() takes a lot of time or does not work sometimes > --- > > Key: TEZ-2300 > URL: https://issues.apache.org/jira/browse/TEZ-2300 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Jonathan Eagles > Attachments: TEZ-2300.1.patch, syslog_dag_1428329756093_325099_1_post > > > Noticed this with a couple of pig scripts which were not behaving well (AM > close to OOM, etc) and even with some that were running fine. Pig calls > TezClient.stop() in shutdown hook. Ctrl+C to the pig script either exits > immediately or is hung. In both cases it takes a long time for the > yarn application to go to KILLED state. Many times I just end up calling yarn > application -kill separately after waiting for 5 mins or more for it to get > killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2003) [Umbrella] Allow Tez to co-ordinate execution to external services
[ https://issues.apache.org/jira/browse/TEZ-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated TEZ-2003: Attachment: 2003_20150807.2.txt > [Umbrella] Allow Tez to co-ordinate execution to external services > -- > > Key: TEZ-2003 > URL: https://issues.apache.org/jira/browse/TEZ-2003 > Project: Apache Tez > Issue Type: Improvement >Reporter: Siddharth Seth > Attachments: 2003_20150728.1.txt, 2003_20150807.1.txt, > 2003_20150807.2.txt, Tez With External Services.pdf > > > The Tez engine itself takes care of co-ordinating execution - controlling how > data gets routed (different connection patterns), fault tolerance, scheduling > of work, etc. > This is currently tied to TaskSpecs defined within Tez and on containers > launched by Tez itself (TezChild). > The proposal is to allow Tez to work with external services instead of just > containers launched by Tez. This involves several more pluggable layers to > work with alternate Task Specifications, custom launch and task allocation > mechanics, as well as custom scheduling sources. > A simple example would be a process with the capability to execute > multiple Tez TaskSpecs as threads. In such a case, a container launch isn't > really needed and can be mocked. Sourcing / scheduling containers would need to > be pluggable. > A more advanced example would be LLAP (HIVE-7926; > https://issues.apache.org/jira/secure/attachment/12665704/LLAPdesigndocument.pdf). > This works with custom interfaces - which would need to be supported by Tez, > along with a custom event model which would need translation hooks. > Tez should be able to work with a combination of certain vertices running in > external services and others running in regular Tez containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (TEZ-2675) Add javadocs for new pluggable components, fix problems reported by jenkins
[ https://issues.apache.org/jira/browse/TEZ-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved TEZ-2675. - Resolution: Fixed Fix Version/s: TEZ-2003 > Add javadocs for new pluggable components, fix problems reported by jenkins > --- > > Key: TEZ-2675 > URL: https://issues.apache.org/jira/browse/TEZ-2675 > Project: Apache Tez > Issue Type: Sub-task >Affects Versions: TEZ-2003 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Fix For: TEZ-2003 > > Attachments: TEZ-2675.1.txt > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2675) Add javadocs for new pluggable components, fix problems reported by jenkins
[ https://issues.apache.org/jira/browse/TEZ-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated TEZ-2675: Summary: Add javadocs for new pluggable components, fix problems reported by jenkins (was: Add javadocs for new pluggable components) > Add javadocs for new pluggable components, fix problems reported by jenkins > --- > > Key: TEZ-2675 > URL: https://issues.apache.org/jira/browse/TEZ-2675 > Project: Apache Tez > Issue Type: Sub-task >Affects Versions: TEZ-2003 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: TEZ-2675.1.txt > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2675) Add javadocs for new pluggable components
[ https://issues.apache.org/jira/browse/TEZ-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated TEZ-2675: Attachment: TEZ-2675.1.txt Also fixes some of the errors reported by jenkins. > Add javadocs for new pluggable components > - > > Key: TEZ-2675 > URL: https://issues.apache.org/jira/browse/TEZ-2675 > Project: Apache Tez > Issue Type: Sub-task >Affects Versions: TEZ-2003 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: TEZ-2675.1.txt > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2003) [Umbrella] Allow Tez to co-ordinate execution to external services
[ https://issues.apache.org/jira/browse/TEZ-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662509#comment-14662509 ] TezQA commented on TEZ-2003: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12749323/2003_20150807.1.txt against master revision eadbfec. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 52 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/969//console This message is automatically generated. > [Umbrella] Allow Tez to co-ordinate execution to external services > -- > > Key: TEZ-2003 > URL: https://issues.apache.org/jira/browse/TEZ-2003 > Project: Apache Tez > Issue Type: Improvement >Reporter: Siddharth Seth > Attachments: 2003_20150728.1.txt, 2003_20150807.1.txt, Tez With > External Services.pdf > > > The Tez engine itself takes care of co-ordinating execution - controlling how > data gets routed (different connection patterns), fault tolerance, scheduling > of work, etc. > This is currently tied to TaskSpecs defined within Tez and on containers > launched by Tez itself (TezChild). > The proposal is to allow Tez to work with external services instead of just > containers launched by Tez. This involves several more pluggable layers to > work with alternate Task Specifications, custom launch and task allocation > mechanics, as well as custom scheduling sources. > A simple example would be a process with the capability to execute > multiple Tez TaskSpecs as threads. In such a case, a container launch isn't > really needed and can be mocked. Sourcing / scheduling containers would need to > be pluggable. 
> A more advanced example would be LLAP (HIVE-7926; > https://issues.apache.org/jira/secure/attachment/12665704/LLAPdesigndocument.pdf). > This works with custom interfaces - which would need to be supported by Tez, > along with a custom event model which would need translation hooks. > Tez should be able to work with a combination of certain vertices running in > external services and others running in regular Tez containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Failed: TEZ-2003 PreCommit Build #969
Jira: https://issues.apache.org/jira/browse/TEZ-2003 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/969/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 269 lines...] == == Determining number of patched javac warnings. == == /home/jenkins/tools/maven/latest/bin/mvn clean test -DskipTests -Ptest-patch > /home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/../patchprocess/patchJavacWarnings.txt 2>&1 {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12749323/2003_20150807.1.txt against master revision eadbfec. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 52 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/969//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 4fd4d156e435c5cbbd46aa737e45b01d48afa038 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts Sending artifact delta relative to PreCommit-TEZ-Build #965 Archived 3 artifacts Archive block size is 32768 Received 0 blocks and 955359 bytes Compression is 0.0% Took 0.39 sec [description-setter] Could not determine description. Recording test results Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## No tests ran.
[jira] [Commented] (TEZ-2666) Enhancements to TaskCommunicator and TaskCommunicatorContext interface
[ https://issues.apache.org/jira/browse/TEZ-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662458#comment-14662458 ] Siddharth Seth commented on TEZ-2666: - Renames registerRunningContainer - registerNewContainer registerRunningTaskAttempt - registerTaskAttempt unregisterRunningTaskAttempt - registerTaskAttemptEnd Provide an API for task preemption. Differentiate between task already completed vs task needs to be completed Get rid of getAddress() > Enhancements to TaskCommunicator and TaskCommunicatorContext interface > -- > > Key: TEZ-2666 > URL: https://issues.apache.org/jira/browse/TEZ-2666 > Project: Apache Tez > Issue Type: Sub-task >Affects Versions: TEZ-2003 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > > TaskCommunicator > // - registerContainerEnd should provide the end reason / possible rename > // - get rid of getAddress > // - Add methods to support task preemption > // - Add a dagStarted notification, along with a payload > TaskCommunicatorContext > // - Consolidate usage of IDs > // - Split the heartbeat API to a liveness check and a status update > // - Rename and consolidate TaskHeartbeatResponse and TaskHeartbeatRequest > // - Fix taskStarted needs to be invoked before launching the actual task. > // - Potentially add methods to report availability stats to the scheduler > // - Report taskSuccess via a method instead of the heartbeat > // - Add methods to signal container / task state changes > // - Maybe add book-keeping as a helper library, instead of each impl > tracking container to task etc. > // - Handling of containers / tasks which no longer exist in the system > (formalized interface instead of a shouldDie notification) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2702) Enhancements to ContainerLauncher and ContainerLauncherContext
[ https://issues.apache.org/jira/browse/TEZ-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662302#comment-14662302 ] Siddharth Seth commented on TEZ-2702: - containerStopped - needs to be cleaned up. Remove TaskEndReason > Enhancements to ContainerLauncher and ContainerLauncherContext > -- > > Key: TEZ-2702 > URL: https://issues.apache.org/jira/browse/TEZ-2702 > Project: Apache Tez > Issue Type: Sub-task >Affects Versions: TEZ-2003 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > > Get rid of YARN constructs. > Add methods to provide the state of the DAG / AM > ContainerToken not always required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2702) Enhancements to ContainerLauncher and ContainerLauncherContext
Siddharth Seth created TEZ-2702: --- Summary: Enhancements to ContainerLauncher and ContainerLauncherContext Key: TEZ-2702 URL: https://issues.apache.org/jira/browse/TEZ-2702 Project: Apache Tez Issue Type: Sub-task Affects Versions: TEZ-2003 Reporter: Siddharth Seth Assignee: Siddharth Seth Get rid of YARN constructs. Add methods to provide the state of the DAG / AM ContainerToken not always required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2668) Enhancements to TaskScheduler and TaskSchedulerContext APIs
[ https://issues.apache.org/jira/browse/TEZ-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662281#comment-14662281 ] Siddharth Seth commented on TEZ-2668: - Also. // appShutdownRequested not relevant to non YARN schedulers / non centrally managed schedulers // secretKey in setApplicationRegistrationData not always relevant. // getAppTrackingURL // containerBeingReleased - should not be required // Move hasUnregistered from TaskScheduler to TaskSchedulerContext as a setHasUnregistered > Enhancements to TaskScheduler and TaskSchedulerContext APIs > --- > > Key: TEZ-2668 > URL: https://issues.apache.org/jira/browse/TEZ-2668 > Project: Apache Tez > Issue Type: Sub-task >Affects Versions: TEZ-2003 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > > Some of the required enhancements > TaskScheduler > // - Should setRegister / unregister be part of APIs when not YARN specific > ? > // - Include vertex / task information in the request so that the scheduler > can make decisions > // around prioritizing tasks in the same vertex when others exist at the > same priority. > TaskSchedulerContext > // - setApplicationRegistrationData may not be relevant to non YARN clusters > // - getAppFinalStatus may not be relevant to non YARN clusters -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2692) bugfixes & enhancements related to job parser and analyzer
[ https://issues.apache.org/jira/browse/TEZ-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662256#comment-14662256 ] Bikas Saha commented on TEZ-2692: -
There is Guava usage but there is no Guava in the pom file?
We need a test case where we parse the SimpleHistory log and the ATS log and verify that both DAGInfo's are the same. Else we cannot be sure that going forward things won't break.
Should we fix SimpleHistory logging to make sure that the reader does not have workarounds like these? Or is the workaround for an organizational difference between ATS format and SimpleHistory format that cannot be fixed in the product?
{code}
+ long totalTime = vertexInfo.getLastTaskFinishTimeInterval() - vertexInfo
+ .getFirstTaskStartTimeInterval();
{code}
Why are we subtracting time intervals? If the first and last tasks run for equal times then totalTime == 0. Is that the intention? If not, then should we be doing lastTaskFinishTime() - firstTaskStartTime()?
TaskConcurrency could be figured out by sorting attempt events by startTime and stopTime for all attempts that actually ran, and then walking that sorted list: increment a counter for every startEvent and decrement it for every stopEvent. This would create 2X points in the timeline (where X is the number of attempts that actually ran) vs the current artificial 5 second boundary that may be too small or too large depending on the job. Thoughts?
> bugfixes & enhancements related to job parser and analyzer > -- > > Key: TEZ-2692 > URL: https://issues.apache.org/jira/browse/TEZ-2692 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-2692.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
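The event-walking idea in the TEZ-2692 comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the patch: the class `TaskConcurrencySketch` and its `Attempt`, `Point`, `timeline` and `peak` names are hypothetical, times are assumed to be epoch milliseconds, and every attempt is assumed to finish strictly after it starts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from the Tez code base.
final class TaskConcurrencySketch {

    /** One attempt that actually ran (assumes finishTime > startTime). */
    static final class Attempt {
        final long startTime;
        final long finishTime;
        Attempt(long startTime, long finishTime) {
            this.startTime = startTime;
            this.finishTime = finishTime;
        }
    }

    /** A point on the timeline: from 'time' onwards, 'running' attempts are active. */
    static final class Point {
        final long time;
        final int running;
        Point(long time, int running) {
            this.time = time;
            this.running = running;
        }
    }

    /** Builds the 2X-point concurrency timeline for X attempts. */
    static List<Point> timeline(List<Attempt> attempts) {
        // Each event is {time, delta}: delta is +1 for a start, -1 for a stop.
        List<long[]> events = new ArrayList<>(attempts.size() * 2);
        for (Attempt a : attempts) {
            events.add(new long[] {a.startTime, +1});
            events.add(new long[] {a.finishTime, -1});
        }
        // Sort by time; on ties, stops come before starts so that
        // back-to-back attempts are not counted as overlapping.
        events.sort((x, y) -> x[0] != y[0]
                ? Long.compare(x[0], y[0])
                : Long.compare(x[1], y[1]));

        List<Point> points = new ArrayList<>(events.size());
        int running = 0;
        for (long[] e : events) {
            running += (int) e[1];
            points.add(new Point(e[0], running));
        }
        return points;
    }

    /** Highest number of attempts running at the same time. */
    static int peak(List<Attempt> attempts) {
        int max = 0;
        for (Point p : timeline(attempts)) {
            max = Math.max(max, p.running);
        }
        return max;
    }

    public static void main(String[] args) {
        List<Attempt> attempts = Arrays.asList(
                new Attempt(0, 10), new Attempt(5, 15), new Attempt(10, 20));
        for (Point p : timeline(attempts)) {
            System.out.println(p.time + " ms: " + p.running + " running");
        }
        System.out.println("peak concurrency: " + peak(attempts));
    }
}
```

The timeline has exactly two points per attempt, so its resolution follows the job rather than a fixed sampling boundary. Processing stops before starts at equal timestamps is a design choice of this sketch; the comment itself does not say how ties should be handled.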
[jira] [Updated] (TEZ-2003) [Umbrella] Allow Tez to co-ordinate execution to external services
[ https://issues.apache.org/jira/browse/TEZ-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated TEZ-2003: Attachment: 2003_20150807.1.txt Another patch for jenkins. > [Umbrella] Allow Tez to co-ordinate execution to external services > -- > > Key: TEZ-2003 > URL: https://issues.apache.org/jira/browse/TEZ-2003 > Project: Apache Tez > Issue Type: Improvement >Reporter: Siddharth Seth > Attachments: 2003_20150728.1.txt, 2003_20150807.1.txt, Tez With > External Services.pdf > > > The Tez engine itself takes care of co-ordinating execution - controlling how > data gets routed (different connection patterns), fault tolerance, scheduling > of work, etc. > This is currently tied to TaskSpecs defined within Tez and on containers > launched by Tez itself (TezChild). > The proposal is to allow Tez to work with external services instead of just > containers launched by Tez. This involves several more pluggable layers to > work with alternate Task Specifications, custom launch and task allocation > mechanics, as well as custom scheduling sources. > A simple example would be a process with the capability to execute > multiple Tez TaskSpecs as threads. In such a case, a container launch isn't > really needed and can be mocked. Sourcing / scheduling containers would need to > be pluggable. > A more advanced example would be LLAP (HIVE-7926; > https://issues.apache.org/jira/secure/attachment/12665704/LLAPdesigndocument.pdf). > This works with custom interfaces - which would need to be supported by Tez, > along with a custom event model which would need translation hooks. > Tez should be able to work with a combination of certain vertices running in > external services and others running in regular Tez containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660847#comment-14660847 ] Saikat edited comment on TEZ-2658 at 8/7/15 5:29 PM: - Approach: create an instance of DagClientImpl to get the dag status/progress/counters etc. Limitation: For listing all dags for an appid, DagClient does not expose any API. Hence hooked directly into DAGClientRPCImpl to get all dags. A similar API is needed for DAGClientTimelineImpl to fetch all dags from the timeline server. Need a jira for this. Because of this limitation: tez.sh dag -status can only fetch all dags for a live AM (since this command talks via the RPC layer to fetch the list of dags from the live AM). Added a README.txt to list current capabilities. was (Author: saikatr): Approach: create an instance of DagClientImpl to get the dag status/progress/counters etc. Limitation: For listing all dags for an appid, DagClient does not expose any API. Hence hooked directly into DAGClientRPCImpl to get all dags. A similar API is needed for DAGClientTimelineImpl to fetch all dags from the timeline server. Need a jira for this. Because of this limitation: tez.sh dag -status can only fetch all dags for a live AM. Added a README.txt to list current capabilities. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662116#comment-14662116 ] Saikat edited comment on TEZ-2658 at 8/7/15 5:27 PM: - [~rohini] request you to give some feedback on the tool and what other additional options will be useful (for future versions). I have tested the tool for running dag status/progress/counters on my local setup. There is a README.txt in tez-cli-tools project folder on instructions about how to use the tool was (Author: saikatr): [~rohini] request you to give some feedback on the tool and what other additional options will be useful (for future versions). I have tested the tool for running dag status/progress/counters on my local setup. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662116#comment-14662116 ] Saikat commented on TEZ-2658: - [~rohini] request you to give some feedback on the tool and what other additional options will be useful (for future versions). I have tested the tool for running dag status/progress/counters on my local setup. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles reassigned TEZ-2300: Assignee: Jonathan Eagles > TezClient.stop() takes a lot of time or does not work sometimes > --- > > Key: TEZ-2300 > URL: https://issues.apache.org/jira/browse/TEZ-2300 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Jonathan Eagles > Attachments: syslog_dag_1428329756093_325099_1_post > > > Noticed this with a couple of pig scripts which were not behaving well (AM > close to OOM, etc) and even with some that were running fine. Pig calls > TezClient.stop() in shutdown hook. Ctrl+C to the pig script either exits > immediately or is hung. In both cases it takes a long time for the > yarn application to go to KILLED state. Many times I just end up calling yarn > application -kill separately after waiting for 5 mins or more for it to get > killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662011#comment-14662011 ] Jonathan Eagles commented on TEZ-2300: --
The MapReduce implementation works in a blocking manner, giving sufficient time for a normal shutdown followed by a forced kill, as Rohini suggests. This modification will give Pig/Hive/others a more functional API.
{code:title=YARNRunner.java}
@Override
public void killJob(JobID arg0) throws IOException, InterruptedException {
  /* check if the status is not running, if not send kill to RM */
  JobStatus status = clientCache.getClient(arg0).getJobStatus(arg0);
  ApplicationId appId = TypeConverter.toYarn(arg0).getAppId();

  // get status from RM and return
  if (status == null) {
    killUnFinishedApplication(appId);
    return;
  }

  if (status.getState() != JobStatus.State.RUNNING) {
    killApplication(appId);
    return;
  }

  try {
    /* send a kill to the AM */
    clientCache.getClient(arg0).killJob(arg0);
    long currentTimeMillis = System.currentTimeMillis();
    long timeKillIssued = currentTimeMillis;
    long killTimeOut = conf.getLong(MRJobConfig.MR_AM_HARD_KILL_TIMEOUT_MS,
        MRJobConfig.DEFAULT_MR_AM_HARD_KILL_TIMEOUT_MS);
    while ((currentTimeMillis < timeKillIssued + killTimeOut)
        && !isJobInTerminalState(status)) {
      try {
        Thread.sleep(1000L);
      } catch (InterruptedException ie) {
        /** interrupted, just break */
        break;
      }
      currentTimeMillis = System.currentTimeMillis();
      status = clientCache.getClient(arg0).getJobStatus(arg0);
      if (status == null) {
        killUnFinishedApplication(appId);
        return;
      }
    }
  } catch(IOException io) {
    LOG.debug("Error when checking for application status", io);
  }
  if (status != null && !isJobInTerminalState(status)) {
    killApplication(appId);
  }
}
{code}
> TezClient.stop() takes a lot of time or does not work sometimes > --- > > Key: TEZ-2300 > URL: https://issues.apache.org/jira/browse/TEZ-2300 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy > Attachments: syslog_dag_1428329756093_325099_1_post > > > Noticed this with a couple of pig scripts which were not behaving well (AM > close to OOM, etc) and even with some that were running fine. Pig calls > TezClient.stop() in shutdown hook. Ctrl+C to the pig script either exits > immediately or is hung. In both cases it takes a long time for the > yarn application to go to KILLED state. Many times I just end up calling yarn > application -kill separately after waiting for 5 mins or more for it to get > killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
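Stripped of the MapReduce specifics, the blocking pattern in the TEZ-2300 comment above (ask for a normal shutdown, poll until a timeout, then force a kill) might look like the sketch below. It is not the TEZ-2300 patch and uses no Tez or YARN API: `StopWithFallbackSketch`, `AppHandle` and its three methods are hypothetical stand-ins for "ask the AM", "check status" and "kill via the RM".

```java
// Hypothetical sketch: AppHandle is not a Tez or YARN type.
final class StopWithFallbackSketch {

    /** Stand-in for a client-side handle on a running application. */
    interface AppHandle {
        void requestGracefulStop();  // assumed: ask the AM to shut down
        boolean isTerminal();        // assumed: status poll (AM first, RM as fallback)
        void forceKill();            // assumed: ask the RM to kill the application
    }

    /**
     * Blocks until the application is terminal or the timeout expires.
     * Returns true if it stopped on its own, false if a forced kill was issued.
     */
    static boolean stop(AppHandle app, long timeoutMs, long pollMs) {
        app.requestGracefulStop();
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!app.isTerminal() && System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException ie) {
                // Preserve the interrupt for the caller and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
        }
        if (app.isTerminal()) {
            return true;
        }
        app.forceKill();
        return false;
    }

    public static void main(String[] args) {
        // An application that never reaches a terminal state on its own.
        AppHandle stuck = new AppHandle() {
            public void requestGracefulStop() { System.out.println("graceful stop requested"); }
            public boolean isTerminal() { return false; }
            public void forceKill() { System.out.println("forced kill issued"); }
        };
        System.out.println("stopped gracefully: " + stop(stuck, 50, 10));
    }
}
```

Unlike the quoted `killJob`, this sketch restores the thread's interrupt flag when the wait is interrupted, which tends to matter when the call runs inside a shutdown hook; whether the Tez API should block at all is the open question raised on the issue.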