Re: Programmatically launch several hundred Spark Streams in parallel

2015-07-24 Thread Brandon White
Thanks. Sorry, the last section was supposed to be:

  streams.par.foreach { nameAndStream =>
    nameAndStream._2.foreachRDD { rdd =>
      val df = sqlContext.jsonRDD(rdd)
      df.insertInto(nameAndStream._1)
    }
  }
  ssc.start()

On Fri, Jul 24, 2015 at 10:39 AM, Dean Wampler deanwamp...@gmail.com wrote: You don't

Re: Programmatically launch several hundred Spark Streams in parallel

2015-07-24 Thread Dean Wampler
You don't need the par (parallel) versions of the Scala collections, actually. Recall that you are building a pipeline in the driver; it doesn't start running cluster tasks until ssc.start() is called, at which point Spark will figure out the task parallelism. In fact, you might as well do the
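
Dean's point in concrete form: each foreachRDD call only registers a step in the job graph, so a plain sequential loop over the streams is enough. A minimal sketch of the same setup without .par, assuming streams is a Map[String, DStream[String]] keyed by table name and that sqlContext and ssc are already in scope (names carried over from Brandon's snippet, not a confirmed API surface):

  streams.foreach { case (table, stream) =>
    stream.foreachRDD { rdd =>
      // This closure only describes what to do per batch; nothing runs yet.
      val df = sqlContext.jsonRDD(rdd)
      df.insertInto(table)
    }
  }
  // Cluster-side parallelism is determined here, not by .par in the driver loop.
  ssc.start()

The pattern-match form (case (table, stream)) also avoids the nameAndStream._1/._2 tuple accessors, which is where the stream._1 slip in the earlier snippet crept in.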