Re: Multiple Kafka topics processing in Spark 2.2

2017-09-08 Thread Dan Dong
empty[OffsetRange] directKafkaStream.transform { rdd => offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges rdd }.map { ... }.foreachRDD { rdd => for (o <- offsetRanges) { println(s"${o.topic} ${o.partition}
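The quoted fragment matches the offset-tracking recipe from the Spark Streaming + Kafka integration guide. A fuller sketch, assuming `directKafkaStream` was created with `KafkaUtils.createDirectStream` and `ssc` is an active `StreamingContext`:

```scala
import org.apache.spark.streaming.kafka010.{HasOffsetRanges, OffsetRange}

// Driver-side holder for the offsets captured in transform().
var offsetRanges = Array.empty[OffsetRange]

directKafkaStream.transform { rdd =>
  // HasOffsetRanges is only available on the first operation on the
  // stream, before any shuffle, so the offsets must be captured here.
  offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  rdd
}.foreachRDD { rdd =>
  // foreachRDD runs on the driver, so offsetRanges is visible here.
  for (o <- offsetRanges) {
    println(s"${o.topic} ${o.partition} ${o.fromOffset} ${o.untilOffset}")
  }
}
```

Note that the cast to `HasOffsetRanges` succeeds only on the RDD produced directly by the Kafka stream; any repartitioning first would break it.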

Multiple Kafka topics processing in Spark 2.2

2017-09-06 Thread Dan Dong
Hi, All, I have one issue here about how to process multiple Kafka topics in a Spark 2.* program. My question is: how to get the topic name from a message received from Kafka? E.g.: .. val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder]( ssc,
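With the spark-streaming-kafka-0-10 integration (the usual choice for Spark 2.x), each stream element is a `ConsumerRecord` that carries its source topic, so no extra bookkeeping is needed. A minimal sketch; the broker address, group id, and topic names are placeholders:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092",            // placeholder broker
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "example-group"                       // placeholder group id
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc, PreferConsistent,
  Subscribe[String, String](Seq("topicA", "topicB"), kafkaParams))

// Each record is a ConsumerRecord[String, String], so the topic name
// travels with every message.
stream.map(r => (r.topic, r.value)).print()
```

The older 0.8-style `createDirectStream[String, String, StringDecoder, StringDecoder]` shown in the message does not expose the topic per record; there the topic has to be recovered from the `OffsetRange` of each partition instead.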

Could not access Spark webUI on OpenStack VMs

2016-04-28 Thread Dan Dong
Hi, all, I'm having a problem accessing the web UI of my Spark cluster. The cluster is composed of a few virtual machines running on an OpenStack platform. The VMs are launched from the CentOS 7.0 server image available from the official site. Spark itself runs well, and the master and worker processes are all

Re: java.lang.NoSuchMethodError for list.toMap.

2015-07-27 Thread Dan Dong
. Thanks Best Regards On Fri, Jul 24, 2015 at 2:15 AM, Dan Dong dongda...@gmail.com wrote: Hi, When I ran the following simple Spark program with spark-submit: import org.apache.spark.SparkContext._ import org.apache.spark.SparkConf import org.apache.spark.rdd.RDD import

java.lang.NoSuchMethodError for list.toMap.

2015-07-23 Thread Dan Dong
Hi, When I ran the following simple Spark program with spark-submit: import org.apache.spark.SparkContext._ import org.apache.spark.SparkConf import org.apache.spark.rdd.RDD import org.apache.spark.SparkContext import org.apache.spark._ import SparkContext._ object TEST2{ def

Re: spark-submit and spark-shell behaviors mismatch.

2015-07-23 Thread Dan Dong
or with --master? I'd try with --master set to whatever master you use for spark-submit. Also, if you're using standalone mode, I believe the worker log contains the launch command for the executor -- you probably want to examine that classpath carefully. On Wed, Jul 22, 2015 at 5:25 PM, Dan Dong dongda

Re: How to share a Map among RDDS?

2015-07-22 Thread Dan Dong
do a join On 22 Jul 2015 07:54, Dan Dong dongda...@gmail.com wrote: Hi, All, I am trying to access a Map from RDDs that are on different compute nodes, but without success. The Map is like: val map1 = Map("aa" -> 1, "bb" -> 2, "cc" -> 3, ...) All RDDs will have to check against it to see if the key

Re: How to share a Map among RDDS?

2015-07-22 Thread Dan Dong
Thanks Andrew, exactly. 2015-07-22 14:26 GMT-05:00 Andrew Or and...@databricks.com: Hi Dan, `map2` is a broadcast variable, not your map. To access the map on the executors you need to do `map2.value(a)`. -Andrew 2015-07-22 12:20 GMT-07:00 Dan Dong dongda...@gmail.com: Hi, Andrew
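Andrew's point can be sketched end to end: wrap the map in a broadcast variable on the driver, then dereference it with `.value` inside the closure that runs on the executors. A minimal sketch, assuming `sc` is an existing `SparkContext`:

```scala
import org.apache.spark.broadcast.Broadcast

val map1 = Map("aa" -> 1, "bb" -> 2, "cc" -> 3)

// Ship one read-only copy of the map to every executor.
val map2: Broadcast[Map[String, Int]] = sc.broadcast(map1)

val keys = sc.parallelize(Seq("aa", "bb", "zz"))

// Inside the closure, map2 is the Broadcast wrapper, not the map itself:
// call .value to get the underlying Map before doing the lookup.
val hits = keys.filter(k => map2.value.contains(k)).collect()
```

Referencing `map1` directly in the closure would also work for small maps (it gets serialized into each task), but broadcasting sends the data to each executor only once.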

spark-submit and spark-shell behaviors mismatch.

2015-07-22 Thread Dan Dong
Hi, I have a simple test Spark program as below; the strange thing is that it runs well under spark-shell, but gets a runtime error of java.lang.NoSuchMethodError in spark-submit, which indicates that the line val maps2 = maps.collect.toMap has a problem. But why does the compilation have no

How to share a Map among RDDS?

2015-07-21 Thread Dan Dong
Hi, All, I am trying to access a Map from RDDs that are on different compute nodes, but without success. The Map is like: val map1 = Map("aa" -> 1, "bb" -> 2, "cc" -> 3, ...) All RDDs will have to check against it to see if the key is in the Map or not, so it seems I have to make the Map itself global; the problem

To access elements of a org.apache.spark.mllib.linalg.Vector

2015-07-14 Thread Dan Dong
Hi, I'm wondering how to access elements of a linalg.Vector, e.g.: sparseVector: Seq[org.apache.spark.mllib.linalg.Vector] = List((3,[1,2],[1.0,2.0]), (3,[0,1,2],[3.0,4.0,5.0])) scala> sparseVector(1) res16: org.apache.spark.mllib.linalg.Vector = (3,[0,1,2],[3.0,4.0,5.0]) How to get the

Re: To access elements of a org.apache.spark.mllib.linalg.Vector

2015-07-14 Thread Dan Dong
(sVec.values).toMap Best, Burak On Tue, Jul 14, 2015 at 12:23 PM, Dan Dong dongda...@gmail.com wrote: Hi, I'm wondering how to access elements of a linalg.Vector, e.g.: sparseVector: Seq[org.apache.spark.mllib.linalg.Vector] = List((3,[1,2],[1.0,2.0]), (3,[0,1,2],[3.0,4.0,5.0])) scala>
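Putting the thread's answer together, a sketch of both access styles on an mllib vector: positional access via `apply`, and the stored entries via `indices`/`values`:

```scala
import org.apache.spark.mllib.linalg.{SparseVector, Vector, Vectors}

// The second vector from the example: size 3, indices [0,1,2], values [3.0,4.0,5.0].
val v: Vector = Vectors.sparse(3, Array(0, 1, 2), Array(3.0, 4.0, 5.0))

// apply() gives dense-style positional access to any element.
val second: Double = v(1)  // 4.0

// For the stored (non-zero) entries only, cast to SparseVector and
// zip indices with values, as suggested in the reply.
val sVec = v.asInstanceOf[SparseVector]
val asMap: Map[Int, Double] = sVec.indices.zip(sVec.values).toMap
```

`v(i)` works uniformly for dense and sparse vectors; the `indices`/`values` route is useful when only the explicitly stored entries matter.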

How to specify PATHS for user defined functions.

2015-07-09 Thread Dan Dong
Hi, All, I have a function and want to access it in my Spark programs, but I got an Exception in thread "main" java.lang.NoSuchMethodError in spark-submit. I put the function under ./src/main/scala/com/aaa/MYFUNC/MYFUNC.scala: package com.aaa.MYFUNC object MYFUNC{ def FUNC1(input:
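A layout that typically makes such a call resolve; the parameter type and body of `FUNC1` are illustrative, since the original message truncates them:

```scala
// src/main/scala/com/aaa/MYFUNC/MYFUNC.scala
package com.aaa.MYFUNC

object MYFUNC {
  // Signature and body are hypothetical; the post cuts off after "input:".
  def FUNC1(input: Seq[String]): Seq[String] = input.map(_.toUpperCase)
}

// Caller, anywhere else in the project:
// import com.aaa.MYFUNC.MYFUNC
// val res = MYFUNC.FUNC1(Seq("a", "b"))
```

When this compiles but spark-submit throws NoSuchMethodError at runtime, a common cause is a stale or mismatched jar: rebuild with `sbt package` and confirm the jar passed to spark-submit actually contains `com/aaa/MYFUNC/MYFUNC.class`.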

question about the TFIDF.

2015-05-06 Thread Dan Dong
Hi, All, When I try to follow the document about TF-IDF from: http://spark.apache.org/docs/latest/mllib-feature-extraction.html val conf = new SparkConf().setAppName("TFIDF") val sc = new SparkContext(conf) val
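The guide's TF-IDF example, reconstructed with the pieces the message truncates; the input path is a placeholder:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.feature.{HashingTF, IDF}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

val conf = new SparkConf().setAppName("TFIDF")
val sc = new SparkContext(conf)

// One document per line; each document is a sequence of terms.
val documents: RDD[Seq[String]] =
  sc.textFile("data/docs.txt").map(_.split(" ").toSeq)

// Hash terms into a fixed-size feature space and count them.
val hashingTF = new HashingTF()
val tf: RDD[Vector] = hashingTF.transform(documents)

tf.cache()  // IDF needs two passes over tf: one to fit, one to transform.
val idf = new IDF().fit(tf)
val tfidf: RDD[Vector] = idf.transform(tf)
```

This follows the mllib feature-extraction guide linked in the message; only the input path and app name are assumptions.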

multiple programs compilation by sbt.

2015-04-29 Thread Dan Dong
Hi, Following the Quick Start guide: https://spark.apache.org/docs/latest/quick-start.html I could compile and run a Spark program successfully; now my question is how to compile multiple programs with sbt in one build. E.g., two programs as: ./src ./src/main ./src/main/scala
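One way to build several programs together is sbt's multi-project setup. A sketch of a root build.sbt; the project names, directory layout, and Scala/Spark versions are assumptions chosen to match the era of the thread:

```scala
// build.sbt (sbt 0.13 style) -- names and versions are illustrative.
lazy val commonSettings = Seq(
  scalaVersion := "2.10.4",
  libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.1" % "provided"
)

// Each program lives in its own subdirectory with its own src/main/scala tree.
lazy val jobA = (project in file("jobA")).settings(commonSettings: _*)
lazy val jobB = (project in file("jobB")).settings(commonSettings: _*)

// `sbt package` at the root builds every aggregated subproject in one go.
lazy val root = (project in file(".")).aggregate(jobA, jobB)
```

Alternatively, several objects with `main` methods can simply live in one project; sbt compiles them all, and spark-submit selects one with `--class`.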

Re: multiple programs compilation by sbt.

2015-04-29 Thread Dan Dong
Hi, Ted, I will have a look at it, thanks a lot. Cheers, Dan On Apr 29, 2015 at 5:00 PM, Ted Yu yuzhih...@gmail.com wrote: Have you looked at http://www.scala-sbt.org/0.13/tutorial/Multi-Project.html ? Cheers On Wed, Apr 29, 2015 at 2:45 PM, Dan Dong dongda...@gmail.com wrote: Hi

Error: no snappyjava in java.library.path

2015-02-26 Thread Dan Dong
Hi, All, When I ran a small program in spark-shell, I got the following error: ... Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886) at java.lang.Runtime.loadLibrary0(Runtime.java:849) at

no snappyjava in java.library.path

2015-01-12 Thread Dan Dong
Hi, My Spark job failed with "no snappyjava in java.library.path": Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1857) at java.lang.Runtime.loadLibrary0(Runtime.java:870) at
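A workaround that often helps with this error: snappy-java extracts a native library to a temp directory at runtime, which fails when the jar is missing from the classpath or when /tmp is mounted noexec. One hedged fix is to pass the jar explicitly and point snappy's temp dir somewhere executable; every path, version, and class name below is a placeholder:

```shell
# All paths/versions are illustrative; match snappy-java to your Spark build.
spark-submit \
  --jars /path/to/snappy-java-1.1.1.7.jar \
  --driver-java-options "-Dorg.xerial.snappy.tempdir=/var/tmp" \
  --conf "spark.executor.extraJavaOptions=-Dorg.xerial.snappy.tempdir=/var/tmp" \
  --class MyApp target/scala-2.10/myapp_2.10-1.0.jar
```

`org.xerial.snappy.tempdir` is snappy-java's own system property for relocating the native-library extraction directory; it must be set on both driver and executors.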