On 28 Mar 2014, at 02:10, Scott Clasen <[email protected]> wrote:
> Thanks everyone for the discussion.
>
> Just to note, I restarted the job yet again, and this time there are indeed
> tasks being executed by both worker nodes. So the behavior does seem
> inconsistent/broken atm.
>
> Then I added a third node to the cluster, and a third executor came up, and
> everything broke :|

This is Kafka's high-level consumer. Try raising the rebalance retries. Also, since this consumer is threaded, it has some protection against this failure: it first waits some time and then rebalances. But for a Spark cluster I think this wait is not long enough. If there were a way to wait for every Spark executor to start, rebalance, and only then start to consume, this issue would be less visible.

> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/KafkaInputDStream-mapping-of-partitions-to-tasks-tp3360p3391.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
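To make the "raise rebalance retries" suggestion concrete, here is a rough sketch of the consumer properties involved. It assumes the Kafka 0.8 high-level consumer keys `rebalance.max.retries` and `rebalance.backoff.ms`, with defaults of 4 and 2000 ms as I recall them from the 0.8 documentation; please check them against your Kafka version. The helper names `consumer_params` and `rebalance_window_ms`, and the host and group values, are made up for illustration. The window computed is only an approximation (retries times backoff), not an exact model of the consumer's behaviour.

```python
# Assumed defaults for the Kafka 0.8 high-level consumer; verify for your version.
DEFAULT_MAX_RETRIES = 4
DEFAULT_BACKOFF_MS = 2000


def consumer_params(zk_quorum, group_id,
                    max_retries=DEFAULT_MAX_RETRIES,
                    backoff_ms=DEFAULT_BACKOFF_MS):
    """Build a string-to-string property map for the high-level consumer.

    Hypothetical helper: it only assembles the settings, it does not
    talk to Kafka or ZooKeeper.
    """
    return {
        "zookeeper.connect": zk_quorum,
        "group.id": group_id,
        "rebalance.max.retries": str(max_retries),
        "rebalance.backoff.ms": str(backoff_ms),
    }


def rebalance_window_ms(params):
    """Approximate time the consumer keeps retrying a rebalance before giving up."""
    retries = int(params["rebalance.max.retries"])
    backoff = int(params["rebalance.backoff.ms"])
    return retries * backoff


# With the assumed defaults the consumer gives up after roughly 8 seconds,
# which may be shorter than the time Spark needs to start every executor.
defaults = consumer_params("zkhost:2181", "my-group")

# Raising both values widens that window to roughly 50 seconds.
raised = consumer_params("zkhost:2181", "my-group",
                         max_retries=10, backoff_ms=5000)

print(rebalance_window_ms(defaults), rebalance_window_ms(raised))
```

A map like `raised` is the kind of thing that would go into the `kafkaParams` argument of `KafkaUtils.createStream` on the Spark side; how large the window needs to be depends on how long your executors take to come up.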
