Hi, as far as I know, there is a 1:1 mapping between Spark partition and Kafka partition, and in Spark's fault-tolerance mechanism, if a partition failed, another partition will be used to recompute those data. And my questions are below:
When a partition (worker node) fails in Spark Streaming, 1. Is its computation passed to another partition, or just waits for the failed partition to restart? 2. How does the restarted partition know the offset range it should consume from Kafka? It should consume the some data as the before-failed one, right?