Re: Spark Streaming RDD to Shark table
Hi, I'm running into an identical issue running Spark 1.0.0 on Mesos 0.19. Were you able to get it sorted? There's no real documentation for spark.httpBroadcast.uri beyond what's in the code. Is this config setting required when running on a Mesos cluster?

I'm running this in a dev environment with a simple two-machine setup: the driver runs on dev-1, and dev-2 (10.0.0.5 in the stack trace below) hosts a Mesos master, a ZooKeeper instance, and a Mesos slave.

Stack trace:

14/07/11 18:00:05 INFO SparkEnv: Connecting to MapOutputTracker: akka.tcp://spark@dev-1:58136/user/MapOutputTracker
14/07/11 18:00:06 INFO SparkEnv: Connecting to BlockManagerMaster: akka.tcp://spark@dev-1:58136/user/BlockManagerMaster
14/07/11 18:00:06 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20140711180006-dea8
14/07/11 18:00:06 INFO MemoryStore: MemoryStore started with capacity 589.2 MB.
14/07/11 18:00:06 INFO ConnectionManager: Bound socket to port 60708 with id = ConnectionManagerId(10.0.0.5,60708)
14/07/11 18:00:06 INFO BlockManagerMaster: Trying to register BlockManager
14/07/11 18:00:06 INFO BlockManagerMaster: Registered BlockManager
java.util.NoSuchElementException: spark.httpBroadcast.uri
    at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:149)
    at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:149)
    at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
    at scala.collection.AbstractMap.getOrElse(Map.scala:58)
    at org.apache.spark.SparkConf.get(SparkConf.scala:149)
    at org.apache.spark.broadcast.HttpBroadcast$.initialize(HttpBroadcast.scala:130)
    at org.apache.spark.broadcast.HttpBroadcastFactory.initialize(HttpBroadcastFactory.scala:31)
    at org.apache.spark.broadcast.BroadcastManager.initialize(BroadcastManager.scala:48)
    at org.apache.spark.broadcast.BroadcastManager.<init>(BroadcastManager.scala:35)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:218)
    at org.apache.spark.executor.Executor.<init>(Executor.scala:85)
    at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:56)
Exception in thread "Thread-2"
I0711 18:00:06.454962 14037 exec.cpp:412] Deactivating the executor libprocess

If I manually set spark.httpBroadcast.uri to http://dev-1, I get the following error, presumably because I'm not setting the port correctly (and I don't think I have any way of knowing the right port):

14/07/11 18:31:27 ERROR Executor: Exception in task ID 4
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
    at sun.net.www.http.HttpClient.New(HttpClient.java:308)
    at sun.net.www.http.HttpClient.New(HttpClient.java:326)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:996)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:932)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:850)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1300)
    at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:196)
    at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:89)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at ...
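In case it helps while this thread is open: the stack trace shows that spark.httpBroadcast.uri is read only by HttpBroadcast during executor initialization, so one avenue worth trying (an assumption on my part, not a confirmed Mesos fix) is switching to the torrent-based broadcast factory that also ships with Spark 1.0, which doesn't use that property at all:

```shell
# Hypothetical workaround sketch, not a confirmed fix: bypass HttpBroadcast
# entirely by selecting the torrent-based broadcast implementation.
# Append to conf/spark-defaults.conf on the driver, or set the same key on
# the SparkConf before creating the SparkContext.
echo "spark.broadcast.factory  org.apache.spark.broadcast.TorrentBroadcastFactory" \
    >> "$SPARK_HOME/conf/spark-defaults.conf"
```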
Re: Spark Streaming RDD to Shark table
OK... I needed to set the JVM classpath for the worker so it could find the fb303 class:

env.put("SPARK_JAVA_OPTS", "-Djava.class.path=/home/myInc/hive-0.9.0-bin/lib/libfb303.jar");

Now I am seeing the following spark.httpBroadcast.uri error. What am I missing?

java.util.NoSuchElementException: spark.httpBroadcast.uri
    at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:151)
    at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:151)
    at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
    at scala.collection.AbstractMap.getOrElse(Map.scala:58)
    at org.apache.spark.SparkConf.get(SparkConf.scala:151)
    at org.apache.spark.broadcast.HttpBroadcast$.initialize(HttpBroadcast.scala:104)
    at org.apache.spark.broadcast.HttpBroadcastFactory.initialize(HttpBroadcast.scala:70)
    at org.apache.spark.broadcast.BroadcastManager.initialize(Broadcast.scala:81)
    at org.apache.spark.broadcast.BroadcastManager.<init>(Broadcast.scala:68)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:175)
    at org.apache.spark.executor.Executor.<init>(Executor.scala:110)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:56)
    ...
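One thing to double-check here (a sketch using the jar path from your post, not a verified fix): passing -Djava.class.path as a system property doesn't change the classpath of the launched JVM, which takes its classpath from -cp. The usual way to get an extra jar onto the executor classpath in 0.9.1 is SPARK_CLASSPATH in conf/spark-env.sh on each worker:

```shell
# Sketch under the thread's setup, not a verified fix: put the Thrift fb303
# jar on the executor classpath via conf/spark-env.sh on each worker.
# SPARK_CLASSPATH is appended to the executor JVM's -cp by the launcher,
# whereas -Djava.class.path set as a system property is ignored for classpath
# purposes. (Also note -Djava.library.path is for native .so libraries,
# not jars.)
export SPARK_CLASSPATH=/home/myInc/hive-0.9.0-bin/lib/libfb303.jar
```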
14/05/27 15:26:45 INFO CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://sp...@clim2-dsv.myinc.ad.myinccorp.com:3694/user/CoarseGrainedScheduler
14/05/27 15:26:46 ERROR CoarseGrainedExecutorBackend: Slave registration failed: Duplicate executor ID: 8

=== Full Stack: ===

Spark Executor Command: /usr/lib/jvm/java-7-openjdk-i386/bin/java -cp :/home/myInc/spark-0.9.1-bin-hadoop1/conf:/home/myInc/spark-0.9.1-bin-hadoop1/assembly/target/scala-2.10/spark-assembly_2.10-0.9.1-hadoop1.0.4.jar -Djava.library.path=/home/myInc/hive-0.9.0-bin/lib/libfb303.jar -Djava.library.path=/home/myInc/hive-0.9.0-bin/lib/libfb303.jar -Xms512M -Xmx512M org.apache.spark.executor.CoarseGrainedExecutorBackend akka.tcp://sp...@clim2-dsv.myinc.ad.myinccorp.com:3694/user/CoarseGrainedScheduler 8 tahiti-ins.myInc.ad.myInccorp.com 1 akka.tcp://sparkwor...@tahiti-ins.myinc.ad.myinccorp.com:37841/user/Worker app-20140527152556-0029

log4j:WARN No appenders could be found for logger (akka.event.slf4j.Slf4jLogger).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
14/05/27 15:26:44 INFO CoarseGrainedExecutorBackend: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/05/27 15:26:44 INFO WorkerWatcher: Connecting to worker akka.tcp://sparkwor...@tahiti-ins.myinc.ad.myinccorp.com:37841/user/Worker
14/05/27 15:26:44 INFO CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://sp...@clim2-dsv.myinc.ad.myinccorp.com:3694/user/CoarseGrainedScheduler
14/05/27 15:26:45 INFO WorkerWatcher: Successfully connected to akka.tcp://sparkwor...@tahiti-ins.myinc.ad.myinccorp.com:37841/user/Worker
14/05/27 15:26:45 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
14/05/27 15:26:45 INFO Slf4jLogger: Slf4jLogger started
14/05/27 15:26:45 INFO Remoting: Starting remoting
14/05/27 15:26:45 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sp...@tahiti-ins.myinc.ad.myinccorp.com:43488]
14/05/27 15:26:45 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sp...@tahiti-ins.myinc.ad.myinccorp.com:43488]
14/05/27 15:26:45 INFO SparkEnv: Connecting to BlockManagerMaster: akka.tcp://sp...@clim2-dsv.myinc.ad.myinccorp.com:3694/user/BlockManagerMaster
14/05/27 15:26:45 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20140527152645-b13b
14/05/27 15:26:45 INFO MemoryStore: MemoryStore started with capacity 297.0 MB.
14/05/27 15:26:45 INFO ConnectionManager: Bound socket to port 55853 with id = ConnectionManagerId(tahiti-ins.myInc.ad.myInccorp.com,55853)
14/05/27 15:26:45 INFO BlockManagerMaster: Trying to register BlockManager
14/05/27 15:26:45 INFO BlockManagerMaster: Registered BlockManager
14/05/27 15:26:45 ERROR OneForOneStrategy: spark.httpBroadcast.uri
java.util.NoSuchElementException: spark.httpBroadcast.uri
    at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:151)
    at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:151)
    at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
    at scala.collection.AbstractMap.getOrElse(Map.scala:58)
    at org.apache.spark.SparkConf.get(SparkConf.scala:151)
    at org.apache.spark.broadcast.HttpBroadcast$.initialize(HttpBroadcast.scala:104)
    at org.apache.spark.broadcast.HttpBroadcastFactory.initialize(HttpBroadcast.scala:70)
    at org.apache.spark.broadcast.BroadcastManager.initialize(Broadcast.scala:81)
    at ...