Hi Sofia.

 

I don’t think version 1.5.2 of Spark can be used as the Hive execution engine. I tried it 
many times. 

 

What works is to download Spark 1.3.1 and build it as you did.

 

You then take the resulting spark-assembly-1.3.1-hadoop2.4.0.jar (after unzipping and 
untarring the build output) and put it in $HIVE_HOME/lib.
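
A minimal sketch of that build step (the source directory and tarball name here are assumptions based on my setup; adjust them to yours):

# assumes the Spark 1.3.1 sources are unpacked under ~/spark-1.3.1 and HIVE_HOME is set
cd ~/spark-1.3.1
./make-distribution.sh --name "hadoop2-without-hive" --tgz -Pyarn,hadoop-provided,hadoop-2.4
# the build produces spark-1.3.1-bin-hadoop2-without-hive.tgz; unpack it and copy the assembly jar
tar -xzf spark-1.3.1-bin-hadoop2-without-hive.tgz
cp spark-1.3.1-bin-hadoop2-without-hive/lib/spark-assembly-1.3.1-hadoop2.4.0.jar $HIVE_HOME/lib/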

 

Then download the pre-built version of Spark 1.3.1 and install it as usual. You do 
not need to start the master, slaves, etc.
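
A minimal sketch of that step (the mirror URL and the /usr/lib install directory are assumptions; change them as needed):

# download the pre-built Spark 1.3.1 (hadoop2.6 build) and unpack it under /usr/lib
wget https://archive.apache.org/dist/spark/spark-1.3.1/spark-1.3.1-bin-hadoop2.6.tgz
sudo tar -xzf spark-1.3.1-bin-hadoop2.6.tgz -C /usr/lib
# no need to run sbin/start-master.sh or sbin/start-slaves.sh for this setup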

 

So far so good.

 

In the directory from which you want to start Spark, do:

unset SPARK_HOME

Then log in to Hive (via beeline) and do as follows:

 

set spark.home=/usr/lib/spark-1.3.1-bin-hadoop2.6;   -- change this to your own Spark installation directory

set hive.execution.engine=spark;

set spark.master=yarn-client;

set spark.eventLog.enabled=true;

set spark.eventLog.dir=/usr/lib/spark-1.3.1-bin-hadoop2.6/logs;

set spark.executor.memory=512m;

set spark.serializer=org.apache.spark.serializer.KryoSerializer;

set hive.spark.client.server.connect.timeout=220000ms;

set spark.io.compression.codec=org.apache.spark.io.LZFCompressionCodec;
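
If you prefer to script these rather than type them interactively, a rough equivalent from the shell would be something like the following (the JDBC URL is the one from my environment; pass whatever credentials your HiveServer2 expects):

beeline -u jdbc:hive2://rhes564:10010/default \
  --hiveconf hive.execution.engine=spark \
  --hiveconf spark.master=yarn-client \
  --hiveconf spark.home=/usr/lib/spark-1.3.1-bin-hadoop2.6 \
  -e "select count(1) from t;"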

 

 

0: jdbc:hive2://rhes564:10010/default> select count(1) from t;

INFO  :

Query Hive on Spark job[1] stages:

INFO  : 2

INFO  : 3

INFO  :

Status: Running (Hive on Spark job[1])

INFO  : Job Progress Format

CurrentTime StageId_StageAttemptId: 
SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount 
[StageCost]

INFO  : 2015-12-24 17:47:15,781 Stage-2_0: 0(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:17,790 Stage-2_0: 1(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:18,794 Stage-2_0: 2(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:19,798 Stage-2_0: 4(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:20,802 Stage-2_0: 5(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:21,807 Stage-2_0: 6(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:22,823 Stage-2_0: 8(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:23,830 Stage-2_0: 9(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:24,835 Stage-2_0: 10(+2)/256   Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:25,838 Stage-2_0: 12(+2)/256   Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:26,842 Stage-2_0: 13(+2)/256   Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:27,847 Stage-2_0: 15(+2)/256   Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:28,856 Stage-2_0: 26(+3)/256   Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:29,862 Stage-2_0: 66(+2)/256   Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:30,867 Stage-2_0: 107(+2)/256  Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:31,871 Stage-2_0: 154(+2)/256  Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:32,875 Stage-2_0: 206(+2)/256  Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:33,879 Stage-2_0: 256/256 Finished     Stage-3_0: 
0(+1)/1

INFO  : 2015-12-24 17:47:34,882 Stage-2_0: 256/256 Finished     Stage-3_0: 1/1 
Finished

INFO  : Status: Finished successfully in 20.12 seconds

+----------+--+

|   _c0    |

+----------+--+

| 2074897  |

+----------+--+

1 row selected (20.247 seconds)

 

 

 

 

Mich Talebzadeh

 

Sybase ASE 15 Gold Medal Award 2008

A Winning Strategy: Running the most Critical Financial Data on ASE 15

http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf

Author of the books "A Practitioner’s Guide to Upgrading to Sybase ASE 15", 
ISBN 978-0-9563693-0-7. 

co-author "Sybase Transact SQL Guidelines Best Practices", ISBN 
978-0-9759693-0-4

Publications due shortly:

Complex Event Processing in Heterogeneous Environments, ISBN: 978-0-9563693-3-8

Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one 
out shortly

 

http://talebzadehmich.wordpress.com

 

NOTE: The information in this email is proprietary and confidential. This 
message is for the designated recipient only, if you are not the intended 
recipient, you should destroy it immediately. Any information in this message 
shall not be understood as given or endorsed by Peridale Technology Ltd, its 
subsidiaries or their employees, unless expressly so stated. It is the 
responsibility of the recipient to ensure that this email is virus free, 
therefore neither Peridale Ltd, its subsidiaries nor their employees accept any 
responsibility.

 

From: Sofia [mailto:sofia.panagiot...@taiger.com] 
Sent: 24 December 2015 16:25
To: user@hive.apache.org
Subject: Executor getting killed when running Hive on Spark

 

Hello and happy holiday to those who are already enjoying it!

 

 

I am still having trouble running Hive with Spark. I downloaded Spark 1.5.2 and 
built it like this (my Hadoop is version 2.7.1):

 

./make-distribution.sh --name "hadoop2-without-hive" --tgz 
"-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"

 

When trying to run it with Hive 1.2.1 (a simple command that creates a Spark 
job, like 'select count(*) from userstweetsdailystatistics;') I get the 
following error:

 

15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:54 INFO log.PerfLogger: <PERFLOG method=SparkBuildPlan 
from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>

15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:54 INFO log.PerfLogger: <PERFLOG method=SparkCreateTran.Map 1 
from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>

15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:54 INFO Configuration.deprecation: mapred.task.is.map is deprecated. 
Instead, use mapreduce.task.ismap

15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:54 INFO exec.Utilities: Processing alias userstweetsdailystatistics

15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:54 INFO exec.Utilities: Adding input file 
hdfs://hadoop-master:8020/user/ubuntu/hive/warehouse/userstweetsdailystatistics

15/12/24 17:12:55 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:55 INFO log.PerfLogger: <PERFLOG method=serializePlan 
from=org.apache.hadoop.hive.ql.exec.Utilities>

15/12/24 17:12:55 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:55 INFO exec.Utilities: Serializing MapWork via kryo

15/12/24 17:12:56 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:56 INFO log.PerfLogger: </PERFLOG method=serializePlan 
start=1450973575887 end=1450973576279 duration=392 
from=org.apache.hadoop.hive.ql.exec.Utilities>

15/12/24 17:12:57 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:57 INFO storage.MemoryStore: ensureFreeSpace(572800) called with 
curMem=0, maxMem=556038881

15/12/24 17:12:57 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:57 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory 
(estimated size 559.4 KB, free 529.7 MB)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO storage.MemoryStore: ensureFreeSpace(43075) called with 
curMem=572800, maxMem=556038881

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in 
memory (estimated size 42.1 KB, free 529.7 MB)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 
192.168.1.64:49690 (size: 42.1 KB, free: 530.2 MB)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 ERROR util.Utils: uncaught error in thread SparkListenerBus, stopping 
SparkContext

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 
java.lang.AbstractMethodError

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:62)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:56)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:79)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1136)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/metrics/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages/stage/kill,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/api,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/static,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/executors/threadDump,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/executors/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/executors,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/environment/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/environment,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/storage/rdd/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/storage/rdd,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/storage/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/storage,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages/pool/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages/pool,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages/stage/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages/stage,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/jobs/job/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/jobs/job,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/jobs/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/jobs,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO spark.SparkContext: Created broadcast 0 from hadoopRDD at 
SparkPlanGenerator.java:188

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.1.64:4040

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO scheduler.DAGScheduler: Stopping DAGScheduler

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO cluster.SparkDeploySchedulerBackend: Shutting down all executors

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO cluster.SparkDeploySchedulerBackend: Asking each executor to shut 
down

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO log.PerfLogger: </PERFLOG method=SparkCreateTran.Map 1 
start=1450973574712 end=1450973578874 duration=4162 
from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO log.PerfLogger: <PERFLOG method=SparkCreateTran.Reducer 2 
from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO log.PerfLogger: <PERFLOG method=serializePlan 
from=org.apache.hadoop.hive.ql.exec.Utilities>

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:58 INFO exec.Utilities: Serializing ReduceWork via kryo

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:59 INFO log.PerfLogger: </PERFLOG method=serializePlan 
start=1450973578926 end=1450973579000 duration=74 
from=org.apache.hadoop.hive.ql.exec.Utilities>

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:59 INFO log.PerfLogger: </PERFLOG method=SparkCreateTran.Reducer 2 
start=1450973578874 end=1450973579073 duration=199 
from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:59 INFO log.PerfLogger: </PERFLOG method=SparkBuildPlan 
start=1450973574707 end=1450973579074 duration=4367 
from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:59 INFO log.PerfLogger: <PERFLOG method=SparkBuildRDDGraph 
from=org.apache.hadoop.hive.ql.exec.spark.SparkPlan>

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:59 WARN remote.ReliableDeliverySupervisor: Association with remote system 
[akka.tcp://sparkExecutor@192.168.1.64:35089] has failed, address is now gated 
for [5000] ms. Reason: [Disassociated] 

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:59 INFO log.PerfLogger: </PERFLOG method=SparkBuildRDDGraph 
start=1450973579074 end=1450973579273 duration=199 
from=org.apache.hadoop.hive.ql.exec.spark.SparkPlan>

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 
17:12:59 INFO client.RemoteDriver: Failed to run job 
d3746d11-eac8-4bf9-9897-bef27fd0423e

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:104)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.SparkContext.submitJob(SparkContext.scala:1981)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1.apply(AsyncRDDActions.scala:118)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1.apply(AsyncRDDActions.scala:116)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.rdd.RDD.withScope(RDD.scala:310)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.rdd.AsyncRDDActions.foreachAsync(AsyncRDDActions.scala:116)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.api.java.JavaRDDLike$class.foreachAsync(JavaRDDLike.scala:690)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.spark.api.java.AbstractJavaRDDLike.foreachAsync(JavaRDDLike.scala:47)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:257)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
java.util.concurrent.FutureTask.run(FutureTask.java:262)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at 
java.lang.Thread.run(Thread.java:745)

15/12/24 17:12:59 [RPC-Handler-3]: INFO client.SparkClientImpl: Received result 
for d3746d11-eac8-4bf9-9897-bef27fd0423e

Status: Failed

15/12/24 17:12:59 [Thread-8]: ERROR status.SparkJobMonitor: Status: Failed

15/12/24 17:12:59 [Thread-8]: INFO log.PerfLogger: </PERFLOG method=SparkRunJob 
start=1450973569576 end=1450973579584 duration=10008 
from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>

FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask

15/12/24 17:13:01 [main]: ERROR ql.Driver: FAILED: Execution Error, return code 
3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=Driver.execute 
start=1450973565261 end=1450973581307 duration=16046 
from=org.apache.hadoop.hive.ql.Driver>

15/12/24 17:13:01 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks 
from=org.apache.hadoop.hive.ql.Driver>

15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks 
start=1450973581308 end=1450973581308 duration=0 
from=org.apache.hadoop.hive.ql.Driver>

15/12/24 17:13:01 [main]: INFO exec.ListSinkOperator: 7 finished. closing... 

15/12/24 17:13:01 [main]: INFO exec.ListSinkOperator: 7 Close done

15/12/24 17:13:01 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks 
from=org.apache.hadoop.hive.ql.Driver>

15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks 
start=1450973581362 end=1450973581362 duration=0 
from=org.apache.hadoop.hive.ql.Driver>

 

 

The only useful thing I can find on the Spark side is in the worker log:

 

15/12/24 17:12:53 INFO worker.Worker: Asked to launch executor 
app-20151224171253-0000/0 for Hive on Spark

15/12/24 17:12:53 INFO spark.SecurityManager: Changing view acls to: ubuntu

15/12/24 17:12:53 INFO spark.SecurityManager: Changing modify acls to: ubuntu

15/12/24 17:12:53 INFO spark.SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(ubuntu); users 
with modify permissions: Set(ubuntu)

15/12/24 17:12:53 INFO worker.ExecutorRunner: Launch command: 
"/usr/lib/jvm/java-7-openjdk-amd64/bin/java" "-cp" 
"/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/sbin/../conf/:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar"
 "-Xms1024M" "-Xmx1024M" "-Dspark.driver.port=44858" 
"-Dhive.spark.log.dir=/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/logs/"
 "-XX:MaxPermSize=256m" 
"org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" 
"akka.tcp://sparkDriver@192.168.1.64:44858/user/CoarseGrainedScheduler" 
"--executor-id" "0" "--hostname" "192.168.1.64" "--cores" "3" "--app-id" 
"app-20151224171253-0000" "--worker-url" 
"akka.tcp://sparkWorker@192.168.1.64:54209/user/Worker"

15/12/24 17:12:58 INFO worker.Worker: Asked to kill executor 
app-20151224171253-0000/0

15/12/24 17:12:58 INFO worker.ExecutorRunner: Runner thread for executor 
app-20151224171253-0000/0 interrupted

15/12/24 17:12:58 INFO worker.ExecutorRunner: Killing process!

15/12/24 17:12:58 ERROR logging.FileAppender: Error writing stream to file 
/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/work/app-20151224171253-0000/0/stderr

java.io.IOException: Stream closed

            at 
java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162)

            at java.io.BufferedInputStream.read1(BufferedInputStream.java:272)

            at java.io.BufferedInputStream.read(BufferedInputStream.java:334)

            at java.io.FilterInputStream.read(FilterInputStream.java:107)

            at 
org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)

            at 
org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)

            at 
org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)

            at 
org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)

            at 
org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)

            at 
org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)

15/12/24 17:12:59 INFO worker.Worker: Executor app-20151224171253-0000/0 
finished with state KILLED exitStatus 143

15/12/24 17:12:59 INFO worker.Worker: Cleaning up local directories for 
application app-20151224171253-0000

15/12/24 17:12:59 INFO shuffle.ExternalShuffleBlockResolver: Application 
app-20151224171253-0000 removed, cleanupLocalDirs = true

 

Here is my Spark configuration

 

export HADOOP_HOME=/usr/local/hadoop

export PATH=$PATH:$HADOOP_HOME/bin

export SPARK_DIST_CLASSPATH=`hadoop classpath`

 

 

Any hints as to what could be going wrong? Why is the executor getting killed? 
Have I built Spark wrongly? I have tried building it in several different ways 
and I keep failing.

I must admit I am confused by the information I find online on how to 
build/use Spark with Hive and which version goes with which.

Can I download a pre-built version of Spark that would work with my existing 
Hadoop 2.7.1 and Hive 1.2.1?

This error has been baffling me for weeks...

 

 

More than grateful for any help!

Sofia

 

 
