Hi All,

My application works when I use spark-submit with master=local[*]. But if I deploy the application to a standalone cluster with master=spark://master:7077, it polls twice from the Kafka topic and then stops working. I don't get any error logs.
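For reference, I launch the job roughly like this (the jar and main class names below are placeholders for my actual ones):

    # works as expected
    spark-submit --class com.example.StreamingApp --master "local[*]" my-app.jar

    # polls twice from Kafka, then goes quiet
    spark-submit --class com.example.StreamingApp --master spark://master:7077 my-app.jar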
I can see the application connecting to Kafka, but it just doesn't seem to receive any messages in the standalone cluster environment. The application code is as follows (sparkConf, zkQuorum, group, topicpMap and the Data case class are defined elsewhere in the application):

    import org.apache.spark.SparkContext
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val sc = new SparkContext(sparkConf)
    val sqlContext = new SQLContext(sc)
    import sqlContext._

    // reuse the existing SparkContext; passing sparkConf here would try to create a second one
    val ssc = new StreamingContext(sc, Seconds(3))
    ssc.checkpoint("checkpoint")

    val dStream = KafkaUtils.createStream(ssc, zkQuorum, group, topicpMap)
    val eventData = dStream.map(_._2)
      .window(Seconds(9))
      .map(_.split(","))
      .map(data => Data(data(0), data(1), data(2), data(3), data(4)))

    // register each windowed batch as a temp table and count the 'Active' rows
    val result = eventData.transform { (rdd, time) =>
      sqlContext.registerRDDAsTable(rdd, "data")
      sql("SELECT count(state) FROM data WHERE state = 'Active'")
    }

    result.print()
    ssc.start()
    ssc.awaitTermination()

I would appreciate your help.

Thanks,
Ali