The reason is the following code:

val numStreams = numShards
val kinesisStreams = (0 until numStreams).map { i =>
  KinesisUtils.createStream(ssc, streamName, endpointUrl,
    kinesisCheckpointInterval,
    InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2)
}

In the above code, numStreams is set to numShards, so one receiver is started
per shard. Each receiver occupies a worker, and at least one more worker is
needed to process the received data, hence the need for #shards + 1 workers.
If you set numStreams to Math.min(numShards, numAvailableWorkers - 1), you can
run with fewer workers than shards. Makes sense?
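
To make the arithmetic concrete, here is a minimal sketch of that calculation.
ReceiverPlanning and receiverCount are hypothetical names used only for
illustration (they are not part of Spark), and numAvailableWorkers is assumed
to be a number you already know for your cluster:

```scala
// Hypothetical helper (not part of Spark): chooses how many Kinesis
// receivers (streams) to start for a given cluster size.
object ReceiverPlanning {
  def receiverCount(numShards: Int, numAvailableWorkers: Int): Int = {
    require(numShards >= 1, "the stream must have at least one shard")
    // One worker must stay free for processing, so at least two are needed.
    require(numAvailableWorkers >= 2,
      "need at least one worker for a receiver and one for processing")
    // Never start more receivers than shards, and always leave one worker free.
    math.min(numShards, numAvailableWorkers - 1)
  }
}

object ReceiverPlanningDemo extends App {
  // 4 shards, 3 workers: 2 receivers share the shards, 1 worker processes.
  println(ReceiverPlanning.receiverCount(numShards = 4, numAvailableWorkers = 3)) // prints 2
  // 2 shards, 8 workers: one receiver per shard is enough.
  println(ReceiverPlanning.receiverCount(numShards = 2, numAvailableWorkers = 8)) // prints 2
}
```

The result would be used in place of numShards when setting numStreams in the
snippet above.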

On Sun Dec 14 2014 at 10:06:36 A.K.M. Ashrafuzzaman <
ashrafuzzaman...@gmail.com> wrote:

> Thanks Aniket,
> The trick is to have #workers >= #shards + 1. But I don’t know why that
> is.
> http://spark.apache.org/docs/latest/streaming-kinesis-integration.html
>
> Here in the figure [spark streaming kinesis architecture], it seems like
> one node should be able to take on more than one shard.
>
>
> A.K.M. Ashrafuzzaman
> Lead Software Engineer
> NewsCred <http://www.newscred.com/>
>
> (M) 880-175-5592433
> Twitter <https://twitter.com/ashrafuzzaman> | Blog
> <http://jitu-blog.blogspot.com/> | Facebook
> <https://www.facebook.com/ashrafuzzaman.jitu>
>
> Check out The Academy <http://newscred.com/theacademy>, your #1 source
> for free content marketing resources
>
> On Nov 26, 2014, at 6:23 PM, A.K.M. Ashrafuzzaman <
> ashrafuzzaman...@gmail.com> wrote:
>
> Hi guys,
> When we use Kinesis with 1 shard, it works fine. But when we use more
> than 1, it falls into an infinite loop and no data is processed by
> Spark Streaming. In the Kinesis DynamoDB table, I can see that it keeps
> increasing the leaseCounter, but it does not start processing.
>
> I am using,
> scala: 2.10.4
> java version: 1.8.0_25
> Spark: 1.1.0
> spark-streaming-kinesis-asl: 1.1.0
