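Yes, your understanding is correct: by default the batches are processed one at a time, so the foreachRDD work for one batch finishes before the next batch's work starts. There are undocumented parameters that allow batches to run concurrently, but I do not recommend using them :)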
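
To make the sequential behaviour concrete, here is a toy model in plain Python. It is not Spark code. It only assumes that output actions are handed to a fixed-size worker pool in batch order, and that the pool has a single worker by default. The names `run_batches`, `output_action` and `workers` are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_batches(num_batches, workers):
    """Submit one job per batch, in order, and record when each starts and ends."""
    events = []  # list.append is atomic in CPython, so no lock is needed here

    def output_action(batch_id):
        # Stand-in for the foreachRDD body: join, update counters, save back.
        events.append(("start", batch_id))
        time.sleep(0.01)  # pretend to do some work
        events.append(("end", batch_id))

    # Leaving the `with` block waits for all submitted jobs to finish.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch_id in range(num_batches):
            pool.submit(output_action, batch_id)
    return events


# One worker (the assumed default): each batch ends before the next one starts.
events = run_batches(num_batches=3, workers=1)
print(events)
```

With `workers=1` the start/end pairs never interleave. Raising `workers` in this model lets batches overlap, which is the situation the join-update-save pattern against Cassandra would then have to cope with.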

TD

On Wed, Mar 25, 2015 at 6:57 AM, Luis Ángel Vicente Sánchez <langel.gro...@gmail.com> wrote:

> I have a simple and probably dumb question about foreachRDD.
>
> We are using spark streaming + cassandra to compute concurrent users every
> 5min. Our batch size is 10secs and our block interval is 2.5secs.
>
> At the end of the world we are using foreachRDD to join the data in the
> RDD with existing data in Cassandra, update the counters and then save it
> back to Cassandra.
>
> To the best of my understanding, in this scenario, spark streaming
> produces one RDD every 10secs and foreachRDD executes them sequentially,
> that is, foreachRDD would never run in parallel.
>
> Am I right?
>
> Regards,
>
> Luis