[jira] [Commented] (HIVE-10959) Templeton launcher job should reconnect to the running child job on task retry

2015-06-08 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14577914#comment-14577914
 ] 

Ivan Mitic commented on HIVE-10959:
---

Review request (RR): https://reviews.apache.org/r/35226/

> Templeton launcher job should reconnect to the running child job on task retry
> --
>
> Key: HIVE-10959
> URL: https://issues.apache.org/jira/browse/HIVE-10959
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.15.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: HIVE-10959.patch
>
>
> Currently, Templeton launcher kills all child jobs (jobs tagged with the 
> parent job's id) upon task retry. 
> Upon templeton launcher task retry, templeton should reconnect to the running 
> job and continue tracking its progress that way. 
> This logic cannot be used for all job kinds (e.g. for jobs that are driven by 
> the client side like regular hive). However, for MapReduceV2, and possibly 
> Tez and HiveOnTez, this should be the default.
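
The retry behavior requested in the description can be sketched as a small decision function. This is a minimal illustration, not the code in HIVE-10959.patch: the class, the enums, and the job id are hypothetical names introduced here. Only MapReduceV2 is treated as reconnectable, because the description lists Tez and HiveOnTez only as "possibly" eligible.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of the retry decision from the issue description; not the actual patch. */
final class LauncherRetrySketch {

    /** Job kinds named in the issue; this enum is an assumption made for illustration. */
    enum JobKind { MAPREDUCE_V2, TEZ, HIVE_ON_TEZ, HIVE }

    /** What a retried launcher task could do. */
    enum RetryAction { LAUNCH_NEW, RECONNECT, KILL_AND_RELAUNCH }

    /**
     * True when the child job keeps running without a client-side driver.
     * Only MapReduceV2 is enabled here; the issue lists Tez and HiveOnTez as "possibly".
     */
    static boolean supportsReconnect(JobKind kind) {
        return kind == JobKind.MAPREDUCE_V2;
    }

    /** Chooses an action from the job kind and the child jobs that are still running. */
    static RetryAction decide(JobKind kind, List<String> runningChildJobIds) {
        if (runningChildJobIds.isEmpty()) {
            // Nothing survived the failed attempt, so there is nothing to reconnect to.
            return RetryAction.LAUNCH_NEW;
        }
        return supportsReconnect(kind) ? RetryAction.RECONNECT : RetryAction.KILL_AND_RELAUNCH;
    }

    public static void main(String[] args) {
        List<String> running = Collections.singletonList("job_0001"); // made-up job id
        System.out.println(decide(JobKind.MAPREDUCE_V2, running));
        System.out.println(decide(JobKind.HIVE, running));
    }
}
```

With one running child job, the sketch chooses RECONNECT for MapReduceV2 and KILL_AND_RELAUNCH for a client-driven Hive job, which matches the current behavior for the latter.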



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10959) Templeton launcher job should reconnect to the running child job on task retry

2015-06-08 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578212#comment-14578212
 ] 

Thejas M Nair commented on HIVE-10959:
--

The jar command executes a Java function in the specified class. It is possible 
that the command runs more than one MR job (let's say n). If the launcher job 
gets killed during the (n-x)th job, the remaining x jobs will not be executed.
But I guess the way things are written in WebHCat (progress notification etc.), 
it is not written to work with more than one MR job being run by the function 
in the specified class, so this patch should be fine. Any thoughts on that, 
[~ivanmi]?
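
The scenario above can be made concrete with a small counting sketch. It assumes a driver class that submits its n MR jobs strictly one after another, and a retry that only reconnects to the job that is already running. The class and method names are hypothetical, and no Hadoop API is involved.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a jar-command driver that runs several MR jobs in sequence. */
final class MultiJobDriverSketch {

    /**
     * Returns the 1-based indexes of the jobs that are never submitted when the
     * launcher dies while job number killedDuring is running and the retry only
     * reconnects to that job instead of re-running the driver.
     */
    static List<Integer> jobsNeverSubmitted(int totalJobs, int killedDuring) {
        if (totalJobs < 1 || killedDuring < 1 || killedDuring > totalJobs) {
            throw new IllegalArgumentException("need 1 <= killedDuring <= totalJobs");
        }
        List<Integer> skipped = new ArrayList<>();
        // The driver process is gone, so nothing submits the jobs after the running one.
        for (int job = killedDuring + 1; job <= totalJobs; job++) {
            skipped.add(job);
        }
        return skipped;
    }

    public static void main(String[] args) {
        // n = 5, launcher killed during job 3 (n - x with x = 2).
        System.out.println(jobsNeverSubmitted(5, 3));
    }
}
```

For n = 5 and a kill during job 3, jobs 4 and 5 are never submitted, which is the "remaining x jobs" case with x = 2.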




[jira] [Commented] (HIVE-10959) Templeton launcher job should reconnect to the running child job on task retry

2015-06-08 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578231#comment-14578231
 ] 

Ivan Mitic commented on HIVE-10959:
---

Thanks for reviewing Thejas!

Child jobs are tagged with the parent's job ID. So even if there is more than one 
job, we should be able to find them when we query for all child jobs (I know 
this works for hive/pig jobs which spawn more than one MR job; I tested this). 
I assume a user can do the wrong thing here by not carrying the tag explicitly, 
but I would argue this is not supported. 

In this patch I log a warning if we detect more than one child job in the case of 
MR. Another, possibly better, way to handle this is to say that reconnect is not 
supported in this case, and let the regular code path handle it (kill all 
child jobs and relaunch). Let me know what you think.
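
The tag-based lookup and the more conservative alternative mentioned above might look roughly like this. It is a sketch under stated assumptions: `JobInfo`, `findChildJobs`, and `reconnectTarget` are hypothetical names, and the in-memory list stands in for whatever cluster API the patch actually queries. It implements the "reconnect is not supported for more than one child" variant rather than the log-a-warning behavior of the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of finding child jobs by the parent's job id tag. */
final class ChildJobLookupSketch {

    /** Minimal stand-in for a running job and the tags it carries. */
    static final class JobInfo {
        final String jobId;
        final Set<String> tags;

        JobInfo(String jobId, String... tags) {
            this.jobId = jobId;
            this.tags = new HashSet<>(Arrays.asList(tags));
        }
    }

    /** Returns the ids of all running jobs tagged with the given parent job id. */
    static List<String> findChildJobs(List<JobInfo> runningJobs, String parentJobId) {
        List<String> children = new ArrayList<>();
        for (JobInfo job : runningJobs) {
            if (job.tags.contains(parentJobId)) {
                children.add(job.jobId);
            }
        }
        return children;
    }

    /**
     * Conservative variant: reconnect only when exactly one child job is found.
     * An empty result means the caller falls back to kill-all-and-relaunch.
     */
    static Optional<String> reconnectTarget(List<String> childJobIds) {
        return childJobIds.size() == 1
                ? Optional.of(childJobIds.get(0))
                : Optional.empty();
    }
}
```

A job that does not carry the parent's tag is simply not found by `findChildJobs`, which corresponds to the unsupported case described in the comment.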



[jira] [Commented] (HIVE-10959) Templeton launcher job should reconnect to the running child job on task retry

2015-06-09 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579845#comment-14579845
 ] 

Thejas M Nair commented on HIVE-10959:
--

+1

> Templeton launcher job should reconnect to the running child job on task retry
> --
>
> Key: HIVE-10959
> URL: https://issues.apache.org/jira/browse/HIVE-10959
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.15.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: HIVE-10959.2.patch, HIVE-10959.3.patch, HIVE-10959.patch


[jira] [Commented] (HIVE-10959) Templeton launcher job should reconnect to the running child job on task retry

2015-06-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580058#comment-14580058
 ] 

Hive QA commented on HIVE-10959:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12738697/HIVE-10959.3.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4235/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4235/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4235/

Messages:
{noformat}
This message was trimmed, see log for full details
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-hcatalog-server-extensions ---
[INFO] Executing tasks

main:
[mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/tmp
[mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/warehouse
[mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/tmp/conf
[copy] Copying 11 files to /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/tmp/conf
[INFO] Executed tasks
[INFO]
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hive-hcatalog-server-extensions ---
[INFO] Compiling 2 source files to /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/test-classes
[INFO]
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-hcatalog-server-extensions ---
[INFO] Tests are skipped.
[INFO]
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-hcatalog-server-extensions ---
[INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/hive-hcatalog-server-extensions-2.0.0-SNAPSHOT.jar
[INFO]
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-hcatalog-server-extensions ---
[INFO]
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-hcatalog-server-extensions ---
[INFO] Installing /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/hive-hcatalog-server-extensions-2.0.0-SNAPSHOT.jar to /home/hiveptest/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-server-extensions/2.0.0-SNAPSHOT/hive-hcatalog-server-extensions-2.0.0-SNAPSHOT.jar
[INFO] Installing /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/pom.xml to /home/hiveptest/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-server-extensions/2.0.0-SNAPSHOT/hive-hcatalog-server-extensions-2.0.0-SNAPSHOT.pom
[INFO]
[INFO]
[INFO] Building Hive HCatalog Webhcat Java Client 2.0.0-SNAPSHOT
[INFO]
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-webhcat-java-client ---
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/hcatalog/webhcat/java-client/target
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/hcatalog/webhcat/java-client (includes = [datanucleus.log, derby.log], excludes = [])
[INFO]
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-webhcat-java-client ---
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-webhcat-java-client ---
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-webhcat-java-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-github-source-source/hcatalog/webhcat/java-client/src/main/resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-webhcat-java-client ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-webhcat-java-client ---
[INFO] Compiling 36 source files to /data/hive-ptest/working/apache-github-source-source/hcatalog/webhcat/java-client/target/classes
[WARNING] /data/hive-ptest/working/apache-github-source-source/hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatClientHMSImpl.java: /data/hive-ptest/working/apache-github-source-source/hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatClientHMSImpl.java uses or overrides a deprecated API.
[WARNING] /data/hive-ptest/working/apache-github-source-source/hcatalog/webh

[jira] [Commented] (HIVE-10959) Templeton launcher job should reconnect to the running child job on task retry

2015-06-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581433#comment-14581433
 ] 

Hive QA commented on HIVE-10959:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12738939/HIVE-10959.4.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9007 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4243/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4243/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4243/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12738939 - PreCommit-HIVE-TRUNK-Build

> Templeton launcher job should reconnect to the running child job on task retry
> --
>
> Key: HIVE-10959
> URL: https://issues.apache.org/jira/browse/HIVE-10959
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.15.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: HIVE-10959.2.patch, HIVE-10959.3.patch, 
> HIVE-10959.4.patch, HIVE-10959.patch


[jira] [Commented] (HIVE-10959) Templeton launcher job should reconnect to the running child job on task retry

2015-06-10 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581550#comment-14581550
 ] 

Thejas M Nair commented on HIVE-10959:
--

The test failures are unrelated.

