The per-partition offsets are part of the RDD as defined on the driver.
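
To make that concrete, here is a toy sketch in plain Python. It is not Spark's API: `OffsetRange`, `BatchRDD`, `compute` and `kafka_log` are hypothetical names used only for illustration, and the Kafka log is modeled as an in-memory list per partition. It only shows the idea that the offset range is fixed when the batch is defined, so a retried task for a lost partition reads exactly the same messages, wherever it runs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class OffsetRange:
    """Hypothetical stand-in for a per-partition offset range (not Spark's class)."""
    topic: str
    partition: int
    from_offset: int   # inclusive
    until_offset: int  # exclusive


class BatchRDD:
    """Toy model of one batch: the ranges are fixed when the 'driver' defines it."""

    def __init__(self, offset_ranges: List[OffsetRange]):
        # Stored once, immutably; every attempt at a partition sees the same range.
        self.offset_ranges = tuple(offset_ranges)

    def compute(self, index: int, log: Dict[int, List[str]]) -> List[str]:
        # A task (first attempt or retry) reads only the range recorded for its partition.
        r = self.offset_ranges[index]
        return log[r.partition][r.from_offset:r.until_offset]


# Toy "Kafka" log: partition id -> ordered list of messages.
kafka_log = {
    0: [f"p0-m{i}" for i in range(10)],
    1: [f"p1-m{i}" for i in range(10)],
}

# The "driver" defines the batch, including the offsets for each partition.
batch = BatchRDD([
    OffsetRange("events", 0, 3, 6),
    OffsetRange("events", 1, 2, 5),
])

first_attempt = batch.compute(0, kafka_log)

# Assume the task is lost and, meanwhile, more messages arrive in the topic.
kafka_log[0].append("p0-m10")

# The retry uses the same recorded range, so it reads the same data.
retry = batch.compute(0, kafka_log)
print(first_attempt == retry)
```

Because the range travels with the batch definition rather than with the worker, nothing needs to be negotiated after a failure: the rescheduled task simply reads the recorded range again.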
Have you read

and/or watched

On Wed, Feb 24, 2016 at 9:05 PM, Yuhang Chen wrote:

> Hi, as far as I know, there is a 1:1 mapping between Spark partitions and
> Kafka partitions, and in Spark's fault-tolerance mechanism, if a partition
> fails, another partition is used to recompute its data. My
> questions are below:
> When a partition (worker node) fails in Spark Streaming,
> 1. Is its computation passed to another partition, or does it just wait
> for the failed partition to restart?
> 2. How does the restarted partition know the offset range it should
> consume from Kafka? It should consume the same data as the one that
> failed, right?
