Hi Fanoos

I would be careful about using collect(). You need to make sure your local
computer has enough memory to hold your entire data set.

Eventually I will need to do something similar. I haven't written the code
yet. My plan is to load the data into a data frame and then write a UDF that
actually publishes to Kafka.

If you are using RDDs you could use map() or some other transform to cause
the data to be published.
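As a rough sketch of the per-partition publishing idea (untested, and the
producer interface here is an assumption modeled on kafka-python's
KafkaProducer — adapt it to whatever Kafka client you use):

```python
# Hedged sketch: publish one RDD partition's records to Kafka.
# make_producer is a zero-arg factory so the producer is created on the
# executor rather than serialized with the closure. The send(topic, value)
# / flush() shape mirrors kafka-python's KafkaProducer (an assumption).

def publish_partition(records, make_producer):
    producer = make_producer()
    for record in records:
        # "events" is a hypothetical topic name for illustration
        producer.send("events", value=str(record).encode("utf-8"))
    producer.flush()  # block until buffered messages are delivered

# In a Spark job this would be wired up roughly as:
#   rdd.foreachPartition(lambda part: publish_partition(part, make_producer))
```

Note foreachPartition() is an action, so the publishing actually runs; a bare
map() is lazy and only executes once some action consumes it. The flush() at
the end matters because most Kafka clients buffer sends asynchronously.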

Andy

From:  fanooos <dev.fano...@gmail.com>
Date:  Tuesday, March 29, 2016 at 4:26 AM
To:  "user @spark" <user@spark.apache.org>
Subject:  Re: Sending events to Kafka from spark job

> I think I find a solution but I have no idea how this affects the execution
> of the application.
> 
> At the end of the script I added  a sleep statement.
> 
> import time
> time.sleep(1)
> 
> 
> This solved the problem.
> 
> 
> 
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Sending-events-to-Kafka-from-spark-job-tp26622p26624.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
> 
> 

