Spark in general isn't a good fit if you're trying to make sure that
certain tasks only run on certain executors.
You can look at overriding getPreferredLocations and increasing the value
of spark.locality.wait, but even then, what do you do when an executor
fails?
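
If you do want to experiment with it anyway, here's a minimal sketch of
a wrapper RDD that pins partitions to hosts (PinnedRDD and
hostForPartition are made-up names for illustration; the hostnames you
return are only hints to the scheduler, not guarantees):

    import org.apache.spark.{Partition, TaskContext}
    import org.apache.spark.rdd.RDD

    // Wraps a parent RDD and suggests a host list for each partition.
    class PinnedRDD[T: scala.reflect.ClassTag](
        parent: RDD[T],
        hostForPartition: Int => Seq[String]) // e.g. i => Seq("worker-1")
      extends RDD[T](parent) {

      override protected def getPartitions: Array[Partition] =
        parent.partitions

      // No transformation; just pass the parent's data through.
      override def compute(split: Partition, context: TaskContext): Iterator[T] =
        parent.iterator(split, context)

      // Tell the scheduler where we'd *prefer* each partition to run.
      // It's a preference only: once spark.locality.wait expires, or the
      // executor on that host is lost, the task runs somewhere else.
      override protected def getPreferredLocations(split: Partition): Seq[String] =
        hostForPartition(split.index)
    }

You'd pair that with something like --conf spark.locality.wait=30s at
submit time to make the scheduler hold out longer for the preferred
hosts, but even then the hint stops mattering the moment the preferred
executor goes away.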
On Fri, Feb 26, 2016 at 8:08, the original poster wrote:
Hi,
I am working on a streaming application integrated with Kafka via the
createDirectStream API. The application consumes a topic that has 10
partitions (on Kafka) and runs with 10 workers (--num-executors 10).
When it reads data from Kafka/ZooKeeper, Spark creates 10 tasks (the
same as