In Spark, every action (foreach, collect, etc.) is converted into a Spark job, and jobs are executed sequentially.
You may want to refactor your code in calculateUseCase1..4 to run only transformations (map, flatMap) and call a single action at the end.

On Sun, Aug 16, 2015 at 3:19 PM, mohanaugust <mohanaug...@gmail.com> wrote:

> JavaPairReceiverInputDStream<String, byte[]> messages =
>     KafkaUtils.createStream(...);
> JavaPairDStream<String, byte[]> filteredMessages =
>     filterValidMessages(messages);
>
> JavaDStream<String> useCase1 = calculateUseCase1(filteredMessages);
> JavaDStream<String> useCase2 = calculateUseCase2(filteredMessages);
> JavaDStream<String> useCase3 = calculateUseCase3(filteredMessages);
> JavaDStream<String> useCase4 = calculateUseCase4(filteredMessages);
> ...
>
> I retrieve messages from Kafka, filter them, and use the same messages for
> multiple use cases. Here useCase1 to 4 are independent of each other and
> can be calculated in parallel. However, when I look at the logs, I see
> that the calculations happen sequentially. How can I make them run in
> parallel? Any suggestion would be helpful.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-Parallel-Processing-of-messages-from-Kafka-Java-tp24284.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
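P.S. The "keep everything a lazy transformation, trigger one action" idea can be illustrated with plain java.util.stream, which shares Spark's lazy-evaluation model. This is only a sketch of the pattern with made-up data and names, not Spark API:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyPipelineSketch {
    public static void main(String[] args) {
        // Hypothetical stand-in for the filtered Kafka messages.
        List<String> filteredMessages = List.of("a", "bb", "ccc");

        // Each "use case" stays a lazy transformation (like map/flatMap
        // on a DStream); nothing is computed yet.
        Stream<String> useCase1 = filteredMessages.stream().map(m -> "uc1:" + m);
        Stream<String> useCase2 = filteredMessages.stream().map(m -> "uc2:" + m.length());

        // A single terminal operation (the lone "action") triggers the
        // whole pipeline at once, instead of one job per use case.
        List<String> results = Stream.concat(useCase1, useCase2)
                                     .collect(Collectors.toList());
        System.out.println(results);
        // prints [uc1:a, uc1:bb, uc1:ccc, uc2:1, uc2:2, uc2:3]
    }
}
```

In Spark terms, the analogue would be merging the independent DStreams (e.g. with union) and calling one output action on the result, so one job covers all four computations.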