Use foreachPartition and batch the writes
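
A rough sketch of what I mean, not a definitive implementation. The helper names (`batched`, `write_partition`, `ListSink`) and the batch size are made up for illustration, and the sink is a stand-in for whatever MongoDB or HDFS client wrapper you use; it is only assumed to offer `insert_many` and `close`.

```python
from itertools import islice


def batched(records, batch_size):
    """Yield lists of at most batch_size items from any iterable."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def write_partition(records, open_sink, batch_size=500):
    """Write one partition's records in batches over a single connection.

    open_sink is a hypothetical factory; it is called once per partition,
    so the connection cost is not paid for every message.
    """
    sink = open_sink()
    try:
        for batch in batched(records, batch_size):
            sink.insert_many(batch)  # one round trip per batch
    finally:
        sink.close()


class ListSink:
    """In-memory stand-in for a real client, used only to try the helper."""

    def __init__(self):
        self.batches = []
        self.closed = False

    def insert_many(self, docs):
        self.batches.append(list(docs))

    def close(self):
        self.closed = True


if __name__ == "__main__":
    demo = ListSink()
    write_partition(iter(range(5)), lambda: demo, batch_size=2)
    print(demo.batches)
```

In a streaming job this would be wired up along the lines of `dstream.foreachRDD(lambda rdd: rdd.foreachPartition(lambda part: write_partition(part, open_mongo_sink)))`, where `open_mongo_sink` is a factory you would have to write yourself. The connection is opened inside the partition function, so it is created on the executor and never has to be serialized from the driver.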

On Sat, Jul 25, 2015 at 9:14 AM, <> wrote:

> Hello,
> I am a new Spark user and would like to know the best practice
> for the following scenario:
> - Spark Streaming receives XML messages from Kafka
> - Spark transforms each message of the RDD (xml2json + some enrichments)
> - Spark stores the transformed/enriched messages in MongoDB and HDFS
> (Mongo key as file name)
> Basically, I would say that I have to handle the messages one by one inside a
> foreach loop over the RDD and write each message one by one to MongoDB and
> HDFS.
> Do you think this is the best way to do it?
> Tks
> Nicolas