[ https://issues.apache.org/jira/browse/HIVE-8545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181543#comment-14181543 ]

Chao commented on HIVE-8545:
----------------------------

OK, I managed to find the log file with help from Brock. I think the test 
failure is caused by this error:

{noformat}
2014-10-22 20:44:05,883 INFO  exec.Utilities (Utilities.java:getBaseWork(425)) - File not found: File file:/home/hiveptest/54.177.19.115-hiveptest-0/apache-svn-spark-source/itests/qtest-spark/target/tmp/scratchdir/hiveptest/e91b92c0-6d0b-49a7-b7aa-5cb7ccb9e7da/hive_2014-10-22_20-44-05_260_8919336312617093153-1/-mr-10002/275fcb87-b139-4069-a544-b55d6c28ea44/reduce.xml does not exist
2014-10-22 20:44:05,883 INFO  exec.Utilities (Utilities.java:getBaseWork(426)) - No plan file found: file:/home/hiveptest/54.177.19.115-hiveptest-0/apache-svn-spark-source/itests/qtest-spark/target/tmp/scratchdir/hiveptest/e91b92c0-6d0b-49a7-b7aa-5cb7ccb9e7da/hive_2014-10-22_20-44-05_260_8919336312617093153-1/-mr-10002/275fcb87-b139-4069-a544-b55d6c28ea44/reduce.xml
2014-10-22 20:44:05,883 INFO  mr.ObjectCache (ObjectCache.java:cache(36)) - Ignoring cache key: __REDUCE_PLAN__
2014-10-22 20:44:05,883 DEBUG parse.SemanticAnalyzer (SemanticAnalyzer.java:genSelectPlan(3488)) - genSelectPlan: input = null{(key,_col0: string)(value,_col1: string)} {((tok_table_or_col key),_col0: string)((tok_table_or_col value),_col1: string)}
2014-10-22 20:44:05,884 DEBUG parse.SemanticAnalyzer (SemanticAnalyzer.java:genSelectPlan(3643)) - Created Select Plan row schema: {((tok_table_or_col key),_col0: string)((tok_table_or_col value),_col1: string)}
2014-10-22 20:44:05,884 DEBUG parse.SemanticAnalyzer (SemanticAnalyzer.java:genSelectPlan(3370)) - Created Select Plan for clause: insclause-3
2014-10-22 20:44:05,884 ERROR executor.Executor (Logging.scala:logError(96)) - Exception in task 0.0 in stage 135.0 (TID 283)
java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:120)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:44)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:27)
  at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:167)
  at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:167)
  at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:599)
  at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:599)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
  at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:86)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
  at org.apache.spark.scheduler.Task.run(Task.scala:56)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:181)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:744)
{noformat}

This seems like a strange error; from the log messages alone I couldn't figure out why it happened.
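
For illustration, here is a minimal sketch of the failure pattern the log suggests: a getBaseWork-style lookup that returns null when reduce.xml is absent, and an init() that dereferences the result without a null check. The classes below are stand-ins I wrote for this comment, not the actual Hive code:

{code:java}
// Illustrative stand-ins only: simplified versions of what
// Utilities.getBaseWork() and SparkReduceRecordHandler.init() appear
// to do based on the log above, not the real Hive implementation.
import java.io.File;

public class MissingPlanSketch {

  // Stand-in for the ReduceWork plan deserialized from reduce.xml.
  static class ReduceWork {
    String reducerTree = "GBY-SEL-FS";
  }

  // Returns null when the plan file does not exist, matching the
  // "No plan file found" log line.
  static ReduceWork getReduceWork(File planFile) {
    if (!planFile.exists()) {
      System.out.println("No plan file found: " + planFile);
      return null; // caller must handle the missing plan
    }
    return new ReduceWork();
  }

  // Dereferences the plan without a null check, which reproduces the
  // NullPointerException in the stack trace above.
  static void init(ReduceWork reduceWork) {
    System.out.println("Initializing: " + reduceWork.reducerTree);
  }

  public static void main(String[] args) {
    ReduceWork work = getReduceWork(new File("/tmp/scratchdir/missing/reduce.xml"));
    init(work); // throws java.lang.NullPointerException when the file is absent
  }
}
{code}

If that reading is right, the NPE itself is probably just a symptom; the real question is why reduce.xml was never written (or was written somewhere else) for this task.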

> Exception when casting Text to BytesWritable [Spark Branch]
> -----------------------------------------------------------
>
>                 Key: HIVE-8545
>                 URL: https://issues.apache.org/jira/browse/HIVE-8545
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Chao
>            Assignee: Chao
>         Attachments: HIVE-8545.1-spark.patch, HIVE-8545.2-spark.patch, HIVE-8545.3-spark.patch, HIVE-8545.4-spark.patch, HIVE-8545.5-spark.patch
>
>
> With the current multi-insertion implementation, when caching is enabled for the input RDD, a query may fail with the following exception:
> {noformat}
> 2014-10-21 13:57:34,742 WARN  [task-result-getter-0]: scheduler.TaskSetManager (Logging.scala:logWarning(71)) - Lost task 0.0 in stage 1.0 (TID 1, localhost): java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.BytesWritable
>         org.apache.hadoop.hive.ql.exec.spark.MapInput$CopyFunction.call(MapInput.java:67)
>         org.apache.hadoop.hive.ql.exec.spark.MapInput$CopyFunction.call(MapInput.java:61)
>         org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002)
>         org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002)
>         scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>         org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:234)
>         org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:163)
>         org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:70)
>         org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
>         org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>         org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>         org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         org.apache.spark.scheduler.Task.run(Task.scala:56)
>         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:181)
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         java.lang.Thread.run(Thread.java:745)
> {noformat}
> The fix should be easy. Interestingly, however, this error doesn't show up when caching is turned off. We need to find out why.
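
A note on the shape a fix could take: one defensive option is to copy the value based on its runtime type instead of casting unconditionally to BytesWritable. The sketch below is my assumption about what a CopyFunction-style pair function needs to handle; it is not the code from the attached patches:

{code:java}
// A hedged sketch, not the actual MapInput$CopyFunction from the patches.
// Assumes values may arrive as either Text or BytesWritable depending on
// the underlying input format.
import java.util.Arrays;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

public class CopyValueSketch {

  // Copies the value into a fresh BytesWritable. Copying matters because
  // Hadoop RecordReaders reuse Writable instances between rows, so a cached
  // RDD must not keep references to the reused objects.
  static BytesWritable copyValue(Writable value) {
    if (value instanceof BytesWritable) {
      BytesWritable bw = (BytesWritable) value;
      return new BytesWritable(Arrays.copyOf(bw.getBytes(), bw.getLength()));
    }
    if (value instanceof Text) {
      Text t = (Text) value;
      return new BytesWritable(Arrays.copyOf(t.getBytes(), t.getLength()));
    }
    // An unconditional (BytesWritable) cast here is what raised the
    // ClassCastException in the trace above when the value was a Text.
    throw new IllegalArgumentException("Unexpected value type: " + value.getClass());
  }
}
{code}

This would also be consistent with caching being the trigger: if the copy is only applied on the cached path, the bad cast is simply never exercised when caching is off. That is a guess, though, and still needs to be confirmed against the code.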



