Hi,
I have an application that creates a Kafka Direct Stream from one topic
with 5 partitions. As a result, each batch is composed of an RDD with 5
partitions.
In order to apply transformations to each batch, I decided to convert the
RDD to a DataFrame (DF) so that I can easily add columns to it.
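
To make the setup concrete, here is a minimal sketch of what I am doing,
assuming Spark 1.x with the Kafka 0.8 direct API; the broker address, topic
name, and added column are placeholders:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.length
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("KafkaToDataFrame")
val ssc = new StreamingContext(conf, Seconds(10))

// Placeholder broker and topic; the topic is assumed to have 5 partitions,
// so each batch RDD also has 5 partitions (one per Kafka partition).
val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
val topics = Set("my-topic")

val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, topics)

stream.foreachRDD { rdd =>
  val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
  import sqlContext.implicits._
  // Convert the (key, value) RDD to a DataFrame and add a derived column.
  rdd.toDF("key", "value")
    .withColumn("value_length", length($"value"))
    .show()
}

ssc.start()
ssc.awaitTermination()
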
Never mind, I found the answer to my questions.
The following Spark configuration property will let you process multiple
Kafka direct streams in parallel:

--conf spark.streaming.concurrentJobs=<N>
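
For anyone finding this later, a minimal sketch of how to apply it, either
via spark-submit or programmatically; the value 2 is only an example, and
note that this property is undocumented, so its behaviour may change between
releases:

// On the command line (the value 2 is an example, not a recommendation):
//   spark-submit --conf spark.streaming.concurrentJobs=2 ...

// Or set it programmatically before creating the StreamingContext:
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("MultipleStreams")
  .set("spark.streaming.concurrentJobs", "2")  // run up to 2 jobs in parallel
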