Hello,
So I have about 500 Spark streams, and I want to know the fastest and most
reliable way to process each of them. Right now, I am creating and processing
them from a parallel collection of (table name, path) pairs:
val ssc = new StreamingContext(sc, Minutes(10))

// paths is a Seq[(String, String)] of (table name, input directory)
val streams = paths.par.map { case (name, path) =>
  (name, ssc.textFileStream(path))
}

// streams is already a parallel collection from paths.par.map
streams.foreach { case (name, stream) =>
  stream.foreachRDD { rdd =>
    val df = sqlContext.jsonRDD(rdd)
    df.insertInto(name)
  }
}
ssc.start()
Is this the best way to do this? Are there any better, faster methods?