Re: how to get file name of record being read in spark

2016-05-31 Thread Vikash Kumar
Can anybody suggest a different solution using inputFileName or input_file_name? On Tue, May 31, 2016 at 11:43 PM, Vikash Kumar wrote: > thanks Ajay but I have this below code to generate dataframes, So I wanted > to change in df only to achieve this. I thought inputFileName
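
A minimal Scala sketch of the DataFrame-side approach being asked about, using the built-in input_file_name function (Spark 1.6+); the path is a placeholder and, depending on the data source, the column may come back empty:

    import org.apache.spark.sql.functions.input_file_name

    // Placeholder path; attach the source file name as an extra column
    val df = sqlContext.read.text("/input/files/*.txt")
      .withColumn("file_name", input_file_name())
    df.show()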

Re: Map tuple to case class in Dataset

2016-05-31 Thread Tim Gautier
That's really odd. I copied that code directly out of the shell and it errored out on me, several times. I wonder if something I did previously caused some instability. I'll see if it happens again tomorrow. On Tue, May 31, 2016, 8:37 PM Ted Yu wrote: > Using spark-shell of

Re: Map tuple to case class in Dataset

2016-05-31 Thread Ted Yu
Using spark-shell of 1.6.1 : scala> case class Test(a: Int) defined class Test scala> Seq(1,2).toDS.map(t => Test(t)).show +---+ | a| +---+ | 1| | 2| +---+ FYI On Tue, May 31, 2016 at 7:35 PM, Tim Gautier wrote: > 1.6.1 The exception is a null pointer exception.
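
A sketch extending Ted's one-field example to the two-field tuple from the original question, assuming a spark-shell session where the SQL implicits are already imported; per the rest of the thread, some 1.6.1 shells may still hit encoder problems here:

    case class TestPair(a: Int, b: Int)
    // Map each tuple element-wise into the case class
    val ds = Seq((1, 2), (3, 4)).toDS().map { case (a, b) => TestPair(a, b) }
    ds.show()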

Re: Map tuple to case class in Dataset

2016-05-31 Thread Tim Gautier
1.6.1 The exception is a null pointer exception. I'll paste the whole thing after I fire my cluster up again tomorrow. I take it by the responses that this is supposed to work? Anyone know when the next version is coming out? I keep running into bugs with 1.6.1 that are hindering my progress.

Re: Map tuple to case class in Dataset

2016-05-31 Thread Saisai Shao
It works fine in my local test. I'm using the latest master; maybe this bug has already been fixed. On Wed, Jun 1, 2016 at 7:29 AM, Michael Armbrust wrote: > Version of Spark? What is the exception? > > On Tue, May 31, 2016 at 4:17 PM, Tim Gautier > wrote:

Re: Map tuple to case class in Dataset

2016-05-31 Thread Michael Armbrust
Version of Spark? What is the exception? On Tue, May 31, 2016 at 4:17 PM, Tim Gautier wrote: > How should I go about mapping from say a Dataset[(Int,Int)] to a > Dataset[]? > > I tried to use a map, but it throws exceptions: > > case class Test(a: Int) >

Map tuple to case class in Dataset

2016-05-31 Thread Tim Gautier
How should I go about mapping from say a Dataset[(Int,Int)] to a Dataset[]? I tried to use a map, but it throws exceptions: case class Test(a: Int) Seq(1,2).toDS.map(t => Test(t)).show Thanks, Tim

Re: Protobuf class not found exception

2016-05-31 Thread Nikhil Goyal
http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-find-proto-buffer-class-error-with-RDD-lt-protobuf-gt-td14529.html But has this been solved? On Tue, May 31, 2016 at 3:26 PM, Nikhil Goyal wrote: > I am getting this error when I am trying to create rdd of

Protobuf class not found exception

2016-05-31 Thread Nikhil Goyal
I am getting this error when I am trying to create rdd of (protokey, value). When I change this to (*protokey.toString*, value) it works fine. *This is the stack trace:* java.lang.RuntimeException: Unable to find proto buffer class at
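
One workaround that is sometimes suggested for this kind of protobuf serialization failure (not confirmed as the fix in this thread) is to switch to Kryo and register the generated protobuf class explicitly; MyProtoKey below is a placeholder for the actual generated class:

    import org.apache.spark.SparkConf

    // Placeholder class name; register the real generated protobuf class here
    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[MyProtoKey]))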

Re: Spark + Kafka processing trouble

2016-05-31 Thread Malcolm Lockyer
Thanks for the suggestions. I agree that there isn't some magic configuration setting, or that the sql options have some flaw - I just intended to explain the frustration of having a non-trivial (but still simple) Spark streaming job running on tiny amounts of data performing absolutely horribly.

Re: Debug spark jobs on Intellij

2016-05-31 Thread Marcelo Oikawa
> Is this python right? I'm not used to it, I'm used to scala, so > No. It is Java. > val toDebug = rdd.foreachPartition(partition -> { //breakpoint stop here > *// by val toDebug I mean to assign the result of foreachPartition to a > variable* > partition.forEachRemaining(message -> { >

Re: Debug spark jobs on Intellij

2016-05-31 Thread Dirceu Semighini Filho
Try this: Is this python right? I'm not used to it, I'm used to scala, so val toDebug = rdd.foreachPartition(partition -> { //breakpoint stop here *// by val toDebug I mean to assign the result of foreachPartition to a variable* partition.forEachRemaining(message -> { //breakpoint

Re: Debug spark jobs on Intellij

2016-05-31 Thread Marcelo Oikawa
> Hi Marcelo, this is because the operations in rdd are lazy, you will only > stop at this inside foreach breakpoint when you call a first, a collect or > a reduce operation. Isn't forEachRemaining a terminal method, like first, collect or reduce? Anyway, I guess this is not the problem itself

Re: Debug spark jobs on Intellij

2016-05-31 Thread Dirceu Semighini Filho
Hi Marcelo, this is because the operations on an RDD are lazy; you will only stop at the breakpoint inside foreach when you call a first, collect or reduce operation. That is when Spark will run the operations. Have you tried that? Cheers. 2016-05-31 17:18 GMT-03:00 Marcelo Oikawa
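
A minimal Scala illustration of the lazy-evaluation point: a breakpoint set inside a transformation closure is only reached once an action forces execution (and, with a remote-debug session, only if that closure runs in the JVM being debugged, e.g. local mode):

    val rdd = sc.textFile("data.txt")        // lazy: nothing is read yet
    val upper = rdd.map(_.toUpperCase)       // lazy: a breakpoint in this closure is not hit yet
    upper.count()                            // action: the map closure actually runs now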

Debug spark jobs on Intellij

2016-05-31 Thread Marcelo Oikawa
Hello, list. I'm trying to debug my spark application on Intellij IDE. Before I submit my job, I ran the command line: export SPARK_SUBMIT_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=4000 after that: bin/spark-submit app-jar-with-dependencies.jar The IDE connects with

Re: Splitting RDD to exact number of partitions

2016-05-31 Thread Ovidiu-Cristian MARCU
Hi Ted, Any chance of elaborating more on the SQLConf parameters, in the sense of giving more explanation for changing these settings? Not all of them are made clear in the descriptions. Thanks! Best, Ovidiu > On 31 May 2016, at 16:30, Ted Yu wrote: > > Maciej: > You can refer

RE: About a problem when mapping a file located within a HDFS vmware cdh-5.7 image

2016-05-31 Thread David Newberger
Have you tried it without either of the setMaster lines? Also, CDH 5.7 uses Spark 1.6.0 with some patches. I would recommend using the Cloudera repo for the Spark dependencies in build.sbt. I'd also check the other dependencies in the build.sbt to see if there are CDH-specific versions. David Newberger From:
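
A build.sbt sketch of what that might look like; the repository URL and the CDH artifact version are assumptions to verify against Cloudera's documentation:

    // Assumed coordinates -- check the Cloudera repository for the exact version string
    resolvers += "cloudera-repo" at "https://repository.cloudera.com/artifactory/cloudera-repos/"
    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0-cdh5.7.0" % "provided"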

Re: About a problem when mapping a file located within a HDFS vmware cdh-5.7 image

2016-05-31 Thread Alonso Isidoro Roman
Hi David, the one on the develop branch. I think it should be the same, but actually not sure... Regards Alonso Isidoro Roman about.me/alonso.isidoro.roman

Re: how to get file name of record being read in spark

2016-05-31 Thread Vikash Kumar
Thanks Ajay, but I have the below code to generate dataframes, so I wanted to change only the df to achieve this. I thought inputFileName would work but it's not working. private def getPaths: String = { val regex = (conf.namingConvention + conf.extension).replace("?", ".?").replace("**",

Re: how to get file name of record being read in spark

2016-05-31 Thread Ajay Chander
Hi Vikash, These are my thoughts, read the input directory using wholeTextFiles() which would give a paired RDD with key as file name and value as file content. Then you can apply a map function to read each line and append key to the content. Thank you, Aj On Tuesday, May 31, 2016, Vikash
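
A Scala sketch of the approach Ajay describes; the input/output paths are placeholders and appending the file name after a comma is just one way to combine the two:

    // Key = full file path, value = whole file content
    val withFileName = sc.wholeTextFiles("/input/files/")
      .flatMap { case (fileName, content) =>
        content.split("\n").map(line => s"$line,$fileName")
      }
    withFileName.saveAsTextFile("/output/files/")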

RE: About a problem when mapping a file located within a HDFS vmware cdh-5.7 image

2016-05-31 Thread David Newberger
Is https://github.com/alonsoir/awesome-recommendation-engine/blob/master/build.sbt the build.sbt you are using? David Newberger QA Analyst WAND - The Future of Restaurant Technology (W) www.wandcorp.com (E)

Recommended way to close resources in a Spark streaming application

2016-05-31 Thread Mohammad Tariq
Dear fellow Spark users, I have a streaming app which is reading data from Kafka, doing some computations and storing the results into HBase. Since I am new to Spark streaming I feel that there could still be scope for making my app better. To begin with, I was wondering what's the best way to
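
One commonly used pattern for this (a sketch only, with hypothetical helpers rather than a real HBase API, and stream standing in for the Kafka DStream) is to open the connection inside foreachPartition so it is created on the executor and closed when the partition is done, instead of being serialized with the closure:

    stream.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        val conn = createConnection()                // hypothetical helper
        try {
          records.foreach(r => writeRecord(conn, r)) // hypothetical helper
        } finally {
          conn.close()                               // always release the resource
        }
      }
    }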

how to get file name of record being read in spark

2016-05-31 Thread Vikash Kumar
I have a requirement in which I need to read the input files from a directory and append the file name to each record in the output. e.g. I have the directory /input/files/ which has the following files: ABC_input_0528.txt ABC_input_0531.txt suppose input file ABC_input_0528.txt contains 111,abc,234

Re: GraphX Java API

2016-05-31 Thread Sonal Goyal
It's very much possible to use GraphX through Java, though some boilerplate may be needed. Here is an example. Create a graph from edge and vertex RDDs (JavaRDD> vertices, JavaRDD edges ) ClassTag longTag = scala.reflect.ClassTag$.MODULE$.apply(Long.class);

About a problem when mapping a file located within a HDFS vmware cdh-5.7 image

2016-05-31 Thread Alonso
I have a VMware Cloudera image, cdh-5.7, running with CentOS 6.8. I am using OS X as my development machine and the cdh image to run the code; I upload the code to the cdh image using git. I have modified the /etc/hosts file located in the cdh image with a line like this: 127.0.0.1

Re: Spark + Kafka processing trouble

2016-05-31 Thread Cody Koeninger
> 500ms is I believe the minimum batch interval for Spark micro batching. It's better to test than to believe, I've run 250ms jobs. Same applies to the comments around JDBC, why assume when you could (dis)prove? It's not like it's a lot of effort to set up a minimal job that does
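
For reference, a sub-500ms interval is simply a constructor argument; a minimal sketch:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Milliseconds, StreamingContext}

    val conf = new SparkConf().setAppName("micro-batch-test")
    val ssc = new StreamingContext(conf, Milliseconds(250))   // 250ms batches, as discussed above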

Re: Spark + Kafka processing trouble

2016-05-31 Thread Mich Talebzadeh
500ms is I believe the minimum batch interval for Spark micro-batching. However, a JDBC call uses a Unix file descriptor and a context switch, and it does have performance implications. That is irrespective of Kafka; what is actually happening is that one is going through Hive JDBC. It is a classic

Re: Behaviour of RDD sampling

2016-05-31 Thread firemonk9
Yes, Spark needs to create the RDD first (loads all the data) to create the sample. You can split the files into two sets outside of Spark in order to load only the sample set. Thank you, Dhiraj

Re: Spark + Kafka processing trouble

2016-05-31 Thread Cody Koeninger
There isn't a magic spark configuration setting that would account for multiple-second-long fixed overheads, you should be looking at maybe 200ms minimum for a streaming batch. 1024 kafka topicpartitions is not reasonable for the volume you're talking about. Unless you have really extreme

Re: Splitting RDD to exact number of partitions

2016-05-31 Thread Takeshi Yamamuro
If you don't mind using the newest version, you can try v2.0-preview. http://spark.apache.org/news/spark-2.0.0-preview.html There, you can control the number of input partitions without shuffles by the two parameters below: spark.sql.files.maxPartitionBytes spark.sql.files.openCostInBytes ( Not
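
A sketch of setting those two parameters on a 2.0-preview SparkSession; the byte values are examples only, not recommendations:

    // spark is the SparkSession in 2.0-preview
    spark.conf.set("spark.sql.files.maxPartitionBytes", 128L * 1024 * 1024) // target bytes per input partition
    spark.conf.set("spark.sql.files.openCostInBytes", 4L * 1024 * 1024)     // estimated cost of opening a file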

Re: Splitting RDD to exact number of partitions

2016-05-31 Thread Ted Yu
Maciej: You can refer to the doc in sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala for these parameters. On Tue, May 31, 2016 at 7:27 AM, Takeshi Yamamuro wrote: > If you don't hesitate the newest version, you try to use v2.0-preview. >

java.io.FileNotFoundException

2016-05-31 Thread kishore kumar
Hi, We installed Spark 1.2.1 on a single node, running a job in yarn-client mode which loads data into HBase and Elasticsearch. The error which we are encountering is Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 38 in stage 26800.0

Re: Splitting RDD to exact number of partitions

2016-05-31 Thread Maciej Sokołowski
Thanks. Under what conditions can the number of partitions be higher than minPartitions when reading a textFile? Should this be considered an infrequent situation? To sum up - is there a more efficient way to ensure an exact number of partitions than the following: rdd = sc.textFile("perf_test1.csv",
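
One simple way to guarantee the count (a sketch; unlike minPartitions it always incurs a shuffle) is repartition:

    val rdd = sc.textFile("perf_test1.csv").repartition(128)
    println(rdd.partitions.length)   // always exactly 128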

Re: Splitting RDD to exact number of partitions

2016-05-31 Thread Maciej Sokołowski
After setting shuffle to true I get the expected 128 partitions, but I'm worried about the performance of such a solution - especially since I see that some shuffling is done because the size of the partitions changes: scala> sc.textFile("hdfs:///proj/dFAB_test/testdata/perf_test1.csv", minPartitions=128).coalesce(128,

Re: Splitting RDD to exact number of partitions

2016-05-31 Thread Ted Yu
Value for shuffle is false by default. Have you tried setting it to true ? Which Spark release are you using ? On Tue, May 31, 2016 at 6:13 AM, Maciej Sokołowski wrote: > Hello Spark users and developers. > > I read file and want to ensure that it has exact number of

Splitting RDD to exact number of partitions

2016-05-31 Thread Maciej Sokołowski
Hello Spark users and developers. I read a file and want to ensure that it has an exact number of partitions, for example 128. In documentation I found: def textFile(path: String, minPartitions: Int = defaultMinPartitions): RDD[String] But the argument here is the minimal number of partitions, so I use

Re: spark.hadoop.dfs.replication parameter not working for kafka-spark streaming

2016-05-31 Thread Abhishek Anand
I also tried jsc.sparkContext().sc().hadoopConfiguration().set("dfs.replication", "2") But it's still not working. Any ideas why it's not working? Abhi On Tue, May 31, 2016 at 4:03 PM, Abhishek Anand wrote: > My spark streaming checkpoint directory is being

Re: Spark Thrift Server run job as hive user

2016-05-31 Thread Radhika Kothari
Hi, Sorry, I meant to say that service management of my cluster is configured by the Ambari server. With the help of the Ambari server web UI, I can start and stop the Thrift server. Warm Regards, -Radhika On Tue, May 31, 2016 at 5:48 PM, Radhika Kothari < radhikakothari100...@gmail.com> wrote: > Hi, > I mean

Re: Splitting RDD to exact number of partitions

2016-05-31 Thread Maciej Sokołowski
Hello Spark users and developers. I read a file and want to ensure that it has an exact number of partitions, for example 128. In documentation I found: def textFile(path: String, minPartitions: Int = defaultMinPartitions): RDD[String] But the argument here is the minimal number of partitions, so I use

RE: Running R codes in sparkR

2016-05-31 Thread Kumar, Saurabh 5. (Nokia - IN/Bangalore)
Hi Arunkumar, SparkR has much more limited functionality than R, and a few data types like 'data.table' in R are not available in SparkR. So you need to check the compatibility of your R code carefully with SparkR. Regards, Saurabh -Original Message- From: mylistt...@gmail.com

Re: Running R codes in sparkR

2016-05-31 Thread mylisttech
Hi Arunkumar, Yes, R can be integrated with Spark to give you SparkR. There are a couple of blogs on the net. The Spark dev page has it too. https://spark.apache.org/docs/latest/sparkr.html Just remember that not all R packages that you may have worked with in R are supported in SparkR.

Running R codes in sparkR

2016-05-31 Thread Arunkumar Pillai
Hi, I have a basic doubt regarding SparkR. 1. Can we run R code in Spark using SparkR, or are there only some Spark functionalities that can be executed in Spark through R? -- Thanks and Regards Arun

Re: Behaviour of RDD sampling

2016-05-31 Thread nguyen duc tuan
Spark will load the whole dataset. The sampling action can be viewed as a filter. The real implementation can be more complicated, but I'll give you the idea with a simple implementation. val rand = new Random(); val subRdd = rdd.filter(x => rand.nextDouble() <= 0.3) To prevent recomputing data, you
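
The same idea is available as a built-in; a sketch, with cache() added so the file is not re-read if the sample is reused (the path is a placeholder):

    val rdd = sc.textFile("data.json").cache()
    val sample = rdd.sample(withReplacement = false, fraction = 0.3)   // ~30% of the records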

processing twitter data

2016-05-31 Thread Ashok Kumar
Hi all, I know very little about the subject. We would like to get streaming data from Twitter and Facebook. So, questions please: - what format is the data from Twitter? Is it JSON format? - can I use Spark and Spark Streaming for analyzing the data? - can data be fed in/streamed via

Re: Spark Thrift Server run job as hive user

2016-05-31 Thread Jacek Laskowski
What's "With the help of UI"? Pozdrawiam, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark http://bit.ly/mastering-apache-spark Follow me at https://twitter.com/jaceklaskowski On Tue, May 31, 2016 at 1:02 PM, Radhika Kothari wrote:

Re: Accessing s3a files from Spark

2016-05-31 Thread Mayuresh Kunjir
How do I use it? I'm accessing s3a from Spark's textFile API. On Tue, May 31, 2016 at 7:16 AM, Deepak Sharma wrote: > Hi Mayuresh > Instead of s3a , have you tried the https:// uri for the same s3 bucket? > > HTH > Deepak > > On Tue, May 31, 2016 at 4:41 PM, Mayuresh

Re: Accessing s3a files from Spark

2016-05-31 Thread Mayuresh Kunjir
On Tue, May 31, 2016 at 7:05 AM, Gourav Sengupta wrote: > Hi, > > And on another note, is it required to use s3a? Why not use s3:// only? I > prefer to use s3a:// only while writing files to S3 from EMR > ​Does Spark support s3://? I am using s3a over s3n because I

Re: Accessing s3a files from Spark

2016-05-31 Thread Deepak Sharma
Hi Mayuresh Instead of s3a , have you tried the https:// uri for the same s3 bucket? HTH Deepak On Tue, May 31, 2016 at 4:41 PM, Mayuresh Kunjir wrote: > > > On Tue, May 31, 2016 at 5:29 AM, Steve Loughran > wrote: > >> which s3 endpoint? >> >> >

Re: Accessing s3a files from Spark

2016-05-31 Thread Mayuresh Kunjir
On Tue, May 31, 2016 at 5:29 AM, Steve Loughran wrote: > which s3 endpoint? > > ​I have tried both s3.amazonaws.com and s3-external-1.amazonaws.com​. > > > On 29 May 2016, at 22:55, Mayuresh Kunjir wrote: > > I'm running into permission issues

Re: Accessing s3a files from Spark

2016-05-31 Thread Gourav Sengupta
Hi, And on another note, is it required to use s3a? Why not use s3:// only? I prefer to use s3a:// only while writing files to S3 from EMR. Regards, Gourav Sengupta On Tue, May 31, 2016 at 12:04 PM, Gourav Sengupta wrote: > Hi, > > Is your spark cluster running in

Re: Accessing s3a files from Spark

2016-05-31 Thread Gourav Sengupta
Hi, Is your spark cluster running in EMR or via self created SPARK cluster using EC2 or from a local cluster behind firewall? What is the SPARK version you are using? Regards, Gourav Sengupta On Sun, May 29, 2016 at 10:55 PM, Mayuresh Kunjir wrote: > I'm running into

Re: Spark Thrift Server run job as hive user

2016-05-31 Thread Radhika Kothari
Hi, I am using a multinode cluster and the Spark Thrift server is running on all the nodes. With the help of the UI, I started the Spark Thrift server. With Spark beeline, I connect to port 10015, create a UDF and use it in beeline. I am running the job as user1, but internally it is running as the hive user.

Re: Spark Thrift Server run job as hive user

2016-05-31 Thread Jacek Laskowski
Hi, How do you start the Thrift server? What's your user name? I think it takes that user and always runs as it. I've seen proxyUser today in spark-submit, which may or may not be useful here. Jacek On 31 May 2016 10:01 a.m., "Radhika Kothari" wrote: Hi Anyone knows about

spark.hadoop.dfs.replication parameter not working for kafka-spark streaming

2016-05-31 Thread Abhishek Anand
My Spark streaming checkpoint directory is being written to HDFS with the default replication factor of 3. In my streaming application, where I am listening to Kafka and setting dfs.replication = 2 as below, the files are still being written with replication factor 3. SparkConf sparkConfig = new
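
A Scala sketch of the approach being attempted (per the thread it did not take effect for the checkpoint files, so treat it as the attempt rather than a confirmed fix); the checkpoint path is a placeholder:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val sc = new SparkContext(new SparkConf().setAppName("kafka-stream"))
    sc.hadoopConfiguration.set("dfs.replication", "2")   // intended replication for HDFS writes
    val ssc = new StreamingContext(sc, Seconds(10))
    ssc.checkpoint("hdfs:///checkpoints/app")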

Compute the global rank of the column

2016-05-31 Thread Dai, Kevin
Hi all, I want to compute the rank of some column in a table. Currently, I use a window function to do it. However, all the data ends up in one partition. Is there a better solution? Regards, Kevin.
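
One alternative (a sketch only; ties are not handled, and df and the "score" column are hypothetical names) is to sort globally and use zipWithIndex, which spreads the data across partitions instead of collapsing it into a single window partition:

    val ranked = df.select("score").rdd
      .map(_.getDouble(0))     // pull out the column value
      .sortBy(identity)        // global sort across partitions
      .zipWithIndex()          // (value, 0-based global rank)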

Re: Accessing s3a files from Spark

2016-05-31 Thread Steve Loughran
which s3 endpoint? On 29 May 2016, at 22:55, Mayuresh Kunjir wrote: I'm running into permission issues while accessing data stored in an S3 bucket using the s3a file system from a local Spark cluster. Has anyone found success with this? My setup
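
For context, a sketch of the standard hadoop-aws s3a settings applied on the Hadoop configuration; the credential values and bucket path are placeholders:

    sc.hadoopConfiguration.set("fs.s3a.access.key", "<ACCESS_KEY>")
    sc.hadoopConfiguration.set("fs.s3a.secret.key", "<SECRET_KEY>")
    sc.hadoopConfiguration.set("fs.s3a.endpoint", "s3.amazonaws.com")
    val lines = sc.textFile("s3a://my-bucket/path/")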

Re: Spark + Kafka processing trouble

2016-05-31 Thread Alonso Isidoro Roman
Mich's idea is quite fine; if I were you, I would follow his idea... Alonso Isidoro Roman about.me/alonso.isidoro.roman 2016-05-31 6:37 GMT+02:00 Mich Talebzadeh

Spark Streaming: Combine MLlib Prediction and Features on Dstreams

2016-05-31 Thread obaidul karim
Hi nguyen, Thanks again. Yes, flatMap may do the trick as well. I may try it out. I will let you know the result when done. On Tue, May 31, 2016 at 3:58 PM, nguyen duc tuan wrote: > 1. RandomForest 'predict'

Re: Behaviour of RDD sampling

2016-05-31 Thread Patrick Baier
I would assume that the driver has to count the number of lines in the json file anyway. Otherwise, how could it tell the workers which lines they should work on? 2016-05-31 10:03 GMT+02:00 Gavin Yue : > If not reading the whole dataset, how do you know the total number

Re: Behaviour of RDD sampling

2016-05-31 Thread Gavin Yue
If not reading the whole dataset, how do you know the total number of records? If not knowing total number, how do you choose 30%? > On May 31, 2016, at 00:45, pbaier wrote: > > Hi all, > > I have to following use case: > I have around 10k of jsons that I want to

Spark Thrift Server run job as hive user

2016-05-31 Thread Radhika Kothari
Hi, Does anyone know why the Spark Thrift server always takes the hive user when I am running a job as another user? Warm Regards, -Radhika

Re: Spark Streaming: Combine MLlib Prediction and Features on Dstreams

2016-05-31 Thread nguyen duc tuan
1. RandomForest 'predict' method supports both RDD or Vector as input ( http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.tree.model.RandomForestModel) . So, in this case, the function extract_feature should return the tuple (prediction, rawtext). If each input text can
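
The same idea expressed in Scala (a sketch; records, rfModel and getFeatures are stand-ins for the thread's Python objects): keep the raw record next to its prediction by returning a tuple from the map:

    import org.apache.spark.mllib.linalg.Vectors

    val predictions = records.map { line =>
      val features = Vectors.dense(getFeatures(line).split(',').map(_.toDouble))
      (rfModel.predict(features), line)   // (prediction, raw record)
    }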

Behaviour of RDD sampling

2016-05-31 Thread pbaier
Hi all, I have the following use case: I have around 10k of jsons that I want to use for learning. The jsons are all stored in one file. For learning a ML model, however, I only need around 30% of the jsons (the rest is not needed at all). So, my idea was to load all data into an RDD and then use

Re: Spark Streaming: Combine MLlib Prediction and Features on Dstreams

2016-05-31 Thread obaidul karim
Hi nguyen, Thanks a lot for your time and I really appreciate the good suggestions. Please find my concerns inline below: def extract_feature(rf_model, x): text = getFeatures(x).split(',') fea = [float(i) for i in text] prediction = rf_model.predict(fea) return (prediction, x) <<< this will return

Re: can not use udf in hivethriftserver2

2016-05-31 Thread ??????
Hi, Lalit Sharma. I added the jar to the SPARK_CLASSPATH env variable, but the Spark Thrift server cannot start. Errors: ### 16/05/31 02:29:12 ERROR SparkContext: Error initializing SparkContext. org.apache.spark.SparkException: Found both spark.executor.extraClassPath and SPARK_CLASSPATH. Use only the

Fwd: User finding issue in Spark Thrift server

2016-05-31 Thread Radhika Kothari
Warm Regards, -Radhika -- Forwarded message -- From: Radhika Kothari Date: Tue, May 31, 2016 at 11:44 AM Subject: Fwd: User finding issue in Spark Thrift server To: user@spark.apache.org Warm Regards, -Radhika -- Forwarded message

Fwd: User finding issue in Spark Thrift server

2016-05-31 Thread Radhika Kothari
Warm Regards, -Radhika -- Forwarded message -- From: Radhika Kothari Date: Tue, May 31, 2016 at 11:42 AM Subject: Fwd: User finding issue in Spark Thrift server To: user@spark.apache.org Warm Regards, -Radhika -- Forwarded message

Fwd: User finding issue in Spark Thrift server

2016-05-31 Thread Radhika Kothari
Warm Regards, -Radhika -- Forwarded message -- From: Radhika Kothari Date: Tue, May 31, 2016 at 11:41 AM Subject: Fwd: User finding issue in Spark Thrift server To: user-i...@spark.apache.org Warm Regards, -Radhika -- Forwarded message

RE: GraphX Java API

2016-05-31 Thread Santoshakhilesh
Hi, Scala has a similar package structure to Java and it ultimately runs on the JVM, so you probably got the impression that it's in Java. As far as I know there is no Java API for GraphX. I used GraphX last year and at that time I had to code in Scala to use the GraphX APIs. Regards, Santosh Akhilesh
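
For anyone following along, a minimal Scala sketch of building and querying a GraphX graph (example data only):

    import org.apache.spark.graphx.{Edge, Graph}

    val vertices = sc.parallelize(Seq((1L, "alice"), (2L, "bob")))
    val edges    = sc.parallelize(Seq(Edge(1L, 2L, "follows")))
    val graph    = Graph(vertices, edges)
    println(graph.vertices.count())   // 2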

Re: Spark SQL Errors

2016-05-31 Thread ayan guha
Unfortunately, I do not have it, as it is 3rd-party code :( But essentially I am trying to overwrite data in a Hive table from a source On Tue, May 31, 2016 at 4:01 PM, Mich Talebzadeh wrote: > ok what is the exact spark code that is causing the issue. > > can you
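
For reference, the typical DataFrame way to overwrite a Hive table (a sketch; whether it matches the third-party code in question is unknown, and the table name is a placeholder):

    df.write.mode("overwrite").saveAsTable("target_db.target_table")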

Re: Spark Streaming: Combine MLlib Prediction and Features on Dstreams

2016-05-31 Thread nguyen duc tuan
I'm not sure what you mean by saying "does not return any value". How do you use this method? I would use this method as follows: def extract_feature(rf_model, x): text = getFeatures(x).split(',') fea = [float(i) for i in text] prediction = rf_model.predict(fea) return (prediction, x) def

Re: Spark SQL Errors

2016-05-31 Thread Mich Talebzadeh
OK, what is the exact Spark code that is causing the issue? Can you show it in its entirety? HTH Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: Spark Streaming: Combine MLlib Prediction and Features on Dstreams

2016-05-31 Thread obaidul karim
Sorry for lots of typos (writing from mobile) On Tuesday, 31 May 2016, obaidul karim wrote: > foreachRDD does not return any value. It can be used just to send results to > another place/context, like a db, file etc. > I could use that but it seems like the overhead of having another