[jira] [Commented] (HIVE-5857) Reduce tasks do not work in uber mode in YARN

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116962#comment-14116962
 ] 

Ashutosh Chauhan commented on HIVE-5857:


+1
LGTM, unless [~appodictic] has some suggestion on how to achieve what he 
suggested.

> Reduce tasks do not work in uber mode in YARN
> -
>
> Key: HIVE-5857
> URL: https://issues.apache.org/jira/browse/HIVE-5857
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.13.1
>Reporter: Adam Kawa
>Assignee: Adam Kawa
>Priority: Critical
>  Labels: plan, uber-jar, uberization, yarn
> Fix For: 0.13.0
>
> Attachments: HIVE-5857.1.patch.txt, HIVE-5857.2.patch, 
> HIVE-5857.3.patch, HIVE-5857.4.patch
>
>
> A Hive query fails when it tries to run a reduce task in uber mode in YARN.
> A NullPointerException is thrown in the ExecReducer.configure method 
> because the plan file (reduce.xml) for the reduce task is not found.
> The Utilities.getBaseWork method is expected to return a BaseWork object, but 
> it returns null due to a FileNotFoundException.
> {code}
> // org.apache.hadoop.hive.ql.exec.Utilities
> public static BaseWork getBaseWork(Configuration conf, String name) {
>   ...
>   try {
>     ...
>     if (gWork == null) {
>       Path localPath;
>       if (ShimLoader.getHadoopShims().isLocalMode(conf)) {
>         localPath = path;
>       } else {
>         localPath = new Path(name);
>       }
>       InputStream in = new FileInputStream(localPath.toUri().getPath());
>       BaseWork ret = deserializePlan(in);
>
>     }
>     return gWork;
>   } catch (FileNotFoundException fnf) {
>     // happens. e.g.: no reduce work.
>     LOG.debug("No plan file found: "+path);
>     return null;
>   } ...
> }
> {code}
> This happens because ShimLoader.getHadoopShims().isLocalMode(conf) returns 
> true: immediately before running a reduce task, 
> org.apache.hadoop.mapred.LocalContainerLauncher switches the task's 
> configuration to local mode ("mapreduce.framework.name" is changed from 
> "yarn" to "local"). Map tasks, on the other hand, run successfully because 
> their configuration is not changed and remains "yarn".
> {code}
> // org.apache.hadoop.mapred.LocalContainerLauncher
> private void runSubtask(..) {
>   ...
>   conf.set(MRConfig.FRAMEWORK_NAME, MRConfig.LOCAL_FRAMEWORK_NAME);
>   conf.set(MRConfig.MASTER_ADDRESS, "local");  // bypass shuffle
>   ReduceTask reduce = (ReduceTask)task;
>   reduce.setConf(conf);  
>   reduce.run(conf, umbilical);
> }
> {code}
> A very quick fix could be an additional if-branch that checks whether we are 
> running a reduce task in uber mode and, if so, looks for the plan file in a 
> different location.
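
The mode check described above can be illustrated with a minimal, self-contained sketch. It is not Hive code: a plain Map stands in for the Hadoop Configuration, isLocalMode is a simplified stand-in for the shim call, and the class name, method names, and sample HDFS path are assumptions made for illustration. It only shows how switching "mapreduce.framework.name" to "local" changes where the plan file is looked up.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a plain Map stands in for org.apache.hadoop.conf.Configuration.
final class PlanLocationSketch {
    static final String FRAMEWORK_NAME = "mapreduce.framework.name";

    // Simplified stand-in for ShimLoader.getHadoopShims().isLocalMode(conf).
    static boolean isLocalMode(Map<String, String> conf) {
        return "local".equals(conf.get(FRAMEWORK_NAME));
    }

    // Mirrors the branch quoted from getBaseWork: the full plan path is used in
    // local mode, the bare plan name otherwise. Only the path component is kept,
    // as with localPath.toUri().getPath() in the quoted code.
    static String resolvePlanFile(Map<String, String> conf, String planPath, String name) {
        String chosen = isLocalMode(conf) ? planPath : name;
        return URI.create(chosen).getPath();
    }

    public static void main(String[] args) {
        // Hypothetical plan location, not taken from the issue.
        String planPath = "hdfs://namenode:8020/tmp/hive-plan/reduce.xml";
        Map<String, String> conf = new HashMap<>();

        conf.put(FRAMEWORK_NAME, "yarn");
        System.out.println(resolvePlanFile(conf, planPath, "reduce.xml")); // prints reduce.xml

        // Same change that runSubtask applies before running the reduce task.
        conf.put(FRAMEWORK_NAME, "local");
        System.out.println(resolvePlanFile(conf, planPath, "reduce.xml")); // prints /tmp/hive-plan/reduce.xml
    }
}
```

In the "local" case the scheme and authority of the HDFS URI are dropped, so the remaining path is opened on the local filesystem of the uber container, which is consistent with the FileNotFoundException reported in the issue.
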
> *Java stacktrace*
> {code}
> 2013-11-20 00:50:56,862 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.Utilities: No plan file found: hdfs://namenode.c.lon.spotify.net:54310/var/tmp/kawaa/hive_2013-11-20_00-50-43_888_3938384086824086680-2/-mr-10003/e3caacf6-15d6-4987-b186-d2906791b5b0/reduce.xml
> 2013-11-20 00:50:56,862 WARN [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Exception running local (uberized) 'child' : java.lang.RuntimeException: Error in configuring object
>   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
>   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:427)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>   at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.runSubtask(LocalContainerLauncher.java:340)
>   at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.run(LocalContainerLauncher.java:225)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>   ... 7 more
> Caused by: java.lang.NullPointerException
>   at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:116)
>   ... 12 more
> 2013-11-20 00:50:56,862 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Statu

[jira] [Commented] (HIVE-5857) Reduce tasks do not work in uber mode in YARN

2014-08-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116205#comment-14116205
 ] 

Hive QA commented on HIVE-5857:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12665524/HIVE-5857.4.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6133 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/571/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/571/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-571/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12665524

> Reduce tasks do not work in uber mode in YARN
> -
>
> Key: HIVE-5857
> URL: https://issues.apache.org/jira/browse/HIVE-5857
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.13.1
>Reporter: Adam Kawa
>Assignee: Adam Kawa
>Priority: Critical
>  Labels: plan, uber-jar, uberization, yarn
> Fix For: 0.13.0
>
> Attachments: HIVE-5857.1.patch.txt, HIVE-5857.2.patch, 
> HIVE-5857.3.patch, HIVE-5857.4.patch
>
>

[jira] [Commented] (HIVE-5857) Reduce tasks do not work in uber mode in YARN

2014-06-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030054#comment-14030054
 ] 

Hive QA commented on HIVE-5857:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12649989/HIVE-5857.3.patch

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 5610 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_binary_storage_queries
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_load_dyn_part1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/448/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/448/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-448/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12649989

> Reduce tasks do not work in uber mode in YARN
> -
>
> Key: HIVE-5857
> URL: https://issues.apache.org/jira/browse/HIVE-5857
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.13.1
>Reporter: Adam Kawa
>Assignee: Adam Kawa
>Priority: Critical
>  Labels: plan, uber-jar, uberization, yarn
> Fix For: 0.13.0
>
> Attachments: HIVE-5857.1.patch.txt, HIVE-5857.2.patch, 
> HIVE-5857.3.patch
>
>

[jira] [Commented] (HIVE-5857) Reduce tasks do not work in uber mode in YARN

2014-06-12 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029325#comment-14029325
 ] 

Edward Capriolo commented on HIVE-5857:
---

{code}
} catch (FileNotFoundException fnf) {
  // happens. e.g.: no reduce work.
  LOG.debug("No plan file found: "+path);
  return null;
} ...
{code}

Can we remove this code? This bothers me. It is not self-documenting at all. Can 
we use if statements to determine when the file should be there and when it 
should not?

Something like:
if (job.hasNoReduceWork()) {
  return null;
} else {
  throw new RuntimeException("work should be found but was not: " + expectedPathToFile);
}
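
The suggestion above could be sketched as a small standalone helper. This is only an illustration under stated assumptions, not the Hive implementation: the class PlanLookupSketch, the method locatePlan, and the hasReduceWork flag are hypothetical names, and java.nio.file.Path is used in place of Hadoop's Path so the example runs on its own.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: a missing plan file is only tolerated when no reduce
// work is expected; otherwise it is reported as an error instead of hidden.
final class PlanLookupSketch {
    static Path locatePlan(boolean hasReduceWork, Path expectedPathToFile) {
        if (Files.exists(expectedPathToFile)) {
            return expectedPathToFile;
        }
        if (!hasReduceWork) {
            return null; // legitimately absent: the job has no reduce phase
        }
        throw new RuntimeException(
            "work should be found but was not: " + expectedPathToFile);
    }

    public static void main(String[] args) {
        // Hypothetical path that is assumed not to exist on the local disk.
        Path missing = Paths.get("no-such-dir-for-sketch", "reduce.xml");

        // No reduce work expected, so a missing file yields null.
        System.out.println(locatePlan(false, missing));

        // Reduce work expected, so a missing file fails loudly.
        try {
            locatePlan(true, missing);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this shape, the uber-mode failure from this issue would surface as an explicit "work should be found but was not" error at plan lookup rather than as a later NullPointerException in ExecReducer.configure.
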

> Reduce tasks do not work in uber mode in YARN
> -
>
> Key: HIVE-5857
> URL: https://issues.apache.org/jira/browse/HIVE-5857
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.13.1
>Reporter: Adam Kawa
>Assignee: Adam Kawa
>Priority: Critical
>  Labels: plan, uber-jar, uberization, yarn
> Fix For: 0.13.0
>
> Attachments: HIVE-5857.1.patch.txt, HIVE-5857.2.patch, 
> HIVE-5857.3.patch
>
>

[jira] [Commented] (HIVE-5857) Reduce tasks do not work in uber mode in YARN

2014-06-11 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028759#comment-14028759
 ] 

Jason Dere commented on HIVE-5857:
--

Question: your fix only kicks in for REDUCE_PLAN_NAME. Would we ever need to 
worry about MAP_PLAN_NAME?

> Reduce tasks do not work in uber mode in YARN
> -
>
> Key: HIVE-5857
> URL: https://issues.apache.org/jira/browse/HIVE-5857
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.13.1
>Reporter: Adam Kawa
>Assignee: Adam Kawa
>Priority: Critical
>  Labels: plan, uber-jar, uberization, yarn
> Fix For: 0.13.0
>
> Attachments: HIVE-5857.1.patch.txt, HIVE-5857.2.patch, 
> HIVE-5857.3.patch
>
>

[jira] [Commented] (HIVE-5857) Reduce tasks do not work in uber mode in YARN

2014-05-30 Thread Adam Kawa (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013514#comment-14013514
 ] 

Adam Kawa commented on HIVE-5857:
-

[~cwsteinbach] Thanks for having a look at this patch. I will prepare tests 
next week.

> Reduce tasks do not work in uber mode in YARN
> -
>
> Key: HIVE-5857
> URL: https://issues.apache.org/jira/browse/HIVE-5857
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.13.1
>Reporter: Adam Kawa
>Assignee: Adam Kawa
>Priority: Critical
>  Labels: plan, uber-jar, uberization, yarn
> Attachments: HIVE-5857.1.patch.txt
>
>

[jira] [Commented] (HIVE-5857) Reduce tasks do not work in uber mode in YARN

2013-11-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827950#comment-13827950
 ] 

Hive QA commented on HIVE-5857:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12614761/HIVE-5857.1.patch.txt

{color:green}SUCCESS:{color} +1 4665 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/374/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/374/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12614761

> Reduce tasks do not work in uber mode in YARN
> -
>
> Key: HIVE-5857
> URL: https://issues.apache.org/jira/browse/HIVE-5857
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0
>Reporter: Adam Kawa
>Priority: Critical
>  Labels: plan, uber-jar, yarn
> Attachments: HIVE-5857.1.patch.txt
>
>