Re: shark queries failed

2015-02-15 Thread Grandl Robert
Thanks for the reply, Akhil. I cannot update the Spark version and run Spark SQL, due to
some old dependencies and a specific project I want to run.

I was wondering if you have any clue why that exception might be triggered, or if you
have seen it before.
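
For reference, the cast that fails in the trace below is made inside Hive's
WritableFloatObjectInspector, which assumes every value it receives for a FLOAT column
is a FloatWritable. A minimal, self-contained Scala sketch of the same failure
(illustrative only, not Shark's actual code path; the object name is made up):

import org.apache.hadoop.io.{FloatWritable, NullWritable, Writable}

object CastFailureSketch {
  def main(args: Array[String]): Unit = {
    // The value as the SerDe actually receives it: a NullWritable,
    // e.g. a SQL NULL or a column whose declared type does not match the data.
    val value: Writable = NullWritable.get()
    // Hive's WritableFloatObjectInspector.get(Object) effectively does
    // ((FloatWritable) o).get(); this cast reproduces that step.
    val f = value.asInstanceOf[FloatWritable].get() // throws java.lang.ClassCastException
    println(f)
  }
}

So the error usually points to a mismatch between the declared column type and the data
actually being serialized, rather than to the query text itself.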

Thanks,
Robert

On Sunday, February 15, 2015 9:18 AM, Akhil Das wrote:

I'd suggest updating your Spark to the latest version and trying Spark SQL instead of Shark.
Thanks
Best Regards

Re: shark queries failed

2015-02-15 Thread Akhil Das
I'd suggest updating your Spark to the latest version and trying Spark SQL
instead of Shark.
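
For completeness, on a newer Spark release (1.x) the same HiveQL can be submitted through
Spark SQL's HiveContext rather than through Shark. A rough sketch, assuming the spark-hive
module is on the classpath and the TPC-DS tables are registered in the Hive metastore
(the query string is only a placeholder, not one of the impala-tpcds-kit queries):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object TpcdsViaSparkSql {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("tpcds-spark-sql"))
    val hive = new HiveContext(sc) // picks up hive-site.xml from the classpath
    // Placeholder TPC-DS-style aggregation; substitute the real query text.
    val result = hive.sql(
      "SELECT ss_item_sk, SUM(ss_net_paid) AS total_paid " +
      "FROM store_sales GROUP BY ss_item_sk LIMIT 10")
    result.collect().foreach(println)
    sc.stop()
  }
}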

Thanks
Best Regards


shark queries failed

2015-02-15 Thread Grandl Robert
Hi guys,
I deployed BlinkDB (built atop Shark), running on Spark 0.9.
I tried to run several TPC-DS Shark queries taken from
https://github.com/cloudera/impala-tpcds-kit/tree/master/queries-sql92-modified/queries/shark.
However, the following exceptions were encountered. Do you have any idea why that might happen?

Thanks,
Robert

2015-02-14 17:58:29,358 WARN  util.NativeCodeLoader (NativeCodeLoader.java:(52)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-02-14 17:58:29,360 WARN  snappy.LoadSnappy (LoadSnappy.java:(46)) - Snappy native library not loaded
2015-02-14 17:58:34,963 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 6 (task 5.0:2)
2015-02-14 17:58:34,970 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Loss was due to java.lang.ClassCastException
java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.FloatWritable
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableFloatObjectInspector.get(WritableFloatObjectInspector.java:35)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:331)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:257)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:204)
    at shark.execution.ReduceSinkOperator$$anonfun$processPartitionNoDistinct$1.apply(ReduceSinkOperator.scala:188)
    at shark.execution.ReduceSinkOperator$$anonfun$processPartitionNoDistinct$1.apply(ReduceSinkOperator.scala:153)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
    at org.apache.spark.scheduler.Task.run(Task.scala:53)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
2015-02-14 17:58:34,983 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 8 (task 5.0:4)
2015-02-14 17:58:35,075 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 12 (task 5.0:8)
2015-02-14 17:58:35,119 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 15 (task 5.0:2)
2015-02-14 17:58:35,134 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 9 (task 5.0:5)
2015-02-14 17:58:35,187 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 16 (task 5.0:4)
2015-02-14 17:58:35,203 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 11 (task 5.0:7)
2015-02-14 17:58:35,214 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 13 (task 5.0:9)
2015-02-14 17:58:35,265 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 4 (task 5.0:0)
2015-02-14 17:58:35,274 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 18 (task 5.0:2)
2015-02-14 17:58:35,304 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 17 (task 5.0:8)
2015-02-14 17:58:35,330 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 5 (task 5.0:1)
2015-02-14 17:58:35,354 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 20 (task 5.0:4)
2015-02-14 17:58:35,387 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 19 (task 5.0:5)
2015-02-14 17:58:35,430 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 7 (task 5.0:3)
2015-02-14 17:58:35,432 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 24 (task 5.0:2)
2015-02-14 17:58:35,433 ERROR scheduler.TaskSetManager (Logging.scala:logError(65)) - Task 5.0:2 failed 4 times; aborting job
2015-02-14 17:58:35,438 ERROR ql.Driver (SessionState.java:printError(400)) - FAILED: Execution Error, return code -101 from shark.execution.SparkTask
2015-02-14 17:58:35,552 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Lost TID 30 (task 6.0:0)
2015-02-14 17:58:35,565 WARN  scheduler.TaskSetManager (Logging.scala:logWarning(61)) - Loss was due to java.io.FileNotFoundException
java.io.FileNotFoundException: http://10.200.146.12:46812/broadcast_4
    at sun.net.www.protoc