Hi Fanoos,

I would be careful about using collect(). You need to make sure your local computer has enough memory to hold your entire data set.
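As a rough sanity check before calling collect(), you can estimate whether the data plausibly fits in driver memory. This is a hypothetical helper (not part of Spark); the numbers and the safety margin are assumptions, and real memory use also depends on JVM/Python object overhead:

```python
def fits_in_driver(row_count, avg_row_bytes, driver_mem_bytes, safety=0.5):
    """Rough estimate: can collect() plausibly fit in driver memory?

    Hypothetical helper -- leaves half the driver memory (safety=0.5)
    as headroom for deserialization and object overhead.
    """
    return row_count * avg_row_bytes <= driver_mem_bytes * safety

# Example: 1M rows of ~1 KB each against a 4 GB driver fits,
# but 10M rows of the same size does not.
print(fits_in_driver(1_000_000, 1_000, 4 * 1024**3))   # True
print(fits_in_driver(10_000_000, 1_000, 4 * 1024**3))  # False
```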
Eventually I will need to do something similar. I haven't written the code yet. My plan is to load the data into a data frame and then write a UDF that actually publishes to Kafka. If you are using RDDs, you could use map() or some other transform to cause the data to be published.

Andy

From: fanooos <dev.fano...@gmail.com>
Date: Tuesday, March 29, 2016 at 4:26 AM
To: "user @spark" <user@spark.apache.org>
Subject: Re: Sending events to Kafka from spark job

> I think I found a solution, but I have no idea how this affects the execution
> of the application.
>
> At the end of the script I added a sleep statement.
>
> import time
> time.sleep(1)
>
> This solved the problem.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Sending-events-to-Kafka-from-spark-job-tp26622p26624.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
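A minimal sketch of that approach, assuming the kafka-python client library; the broker address, topic name, and helper names are placeholders. It uses foreachPartition (a variation on the map() idea) so each partition opens a single producer. Note the explicit flush() before closing: KafkaProducer.send() is asynchronous, so exiting before buffered sends complete can drop messages, which is likely why the sleep(1) workaround appeared to help:

```python
import json

def rows_to_messages(rows):
    """Serialize row dicts to JSON-encoded bytes for Kafka."""
    return [json.dumps(row, sort_keys=True).encode("utf-8") for row in rows]

def publish_partition(rows):
    """Publish one partition's rows to Kafka (runs on an executor)."""
    from kafka import KafkaProducer  # assumed client: kafka-python
    producer = KafkaProducer(bootstrap_servers="localhost:9092")  # placeholder
    for msg in rows_to_messages(r.asDict() for r in rows):
        producer.send("events", msg)  # "events" is a placeholder topic
    # send() is async; flush() blocks until buffered sends finish,
    # so no sleep() is needed before the script exits.
    producer.flush()
    producer.close()

# Usage (hypothetical DataFrame `df`):
# df.foreachPartition(publish_partition)
```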