The reason is the following code:

    val numStreams = numShards
    val kinesisStreams = (0 until numStreams).map { i =>
      KinesisUtils.createStream(ssc, streamName, endpointUrl, kinesisCheckpointInterval,
        InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2)
    }

In the above code, numStreams is set to numShards. Each stream starts its own receiver, and a receiver occupies a worker for as long as the job runs, so at least one more worker is needed to process the received data. This enforces the need to have #shards + 1 workers. If you set numStreams to Math.min(numShards, numAvailableWorkers - 1), you can run with fewer workers than shards. Does that make sense?

On Sun Dec 14 2014 at 10:06:36 A.K.M. Ashrafuzzaman <ashrafuzzaman...@gmail.com> wrote:

> Thanks Aniket,
> The trick is to have #workers >= #shards + 1. But I don’t know why that is.
> http://spark.apache.org/docs/latest/streaming-kinesis-integration.html
>
> In the figure there [spark streaming kinesis architecture], it seems like
> one node should be able to take on more than one shard.
>
> A.K.M. Ashrafuzzaman
> Lead Software Engineer
> NewsCred <http://www.newscred.com/>
>
> (M) 880-175-5592433
> Twitter <https://twitter.com/ashrafuzzaman> | Blog
> <http://jitu-blog.blogspot.com/> | Facebook
> <https://www.facebook.com/ashrafuzzaman.jitu>
>
> Check out The Academy <http://newscred.com/theacademy>, your #1 source
> for free content marketing resources
>
> On Nov 26, 2014, at 6:23 PM, A.K.M. Ashrafuzzaman
> <ashrafuzzaman...@gmail.com> wrote:
>
> Hi guys,
> When we use Kinesis with 1 shard, it works fine. But when we use more
> than 1 shard, it falls into an infinite loop and no data is processed by
> Spark Streaming. In the Kinesis DynamoDB table, I can see that it keeps
> increasing the leaseCounter, but it does not start processing.
>
> I am using:
> scala: 2.10.4
> java version: 1.8.0_25
> Spark: 1.1.0
> spark-streaming-kinesis-asl: 1.1.0
>
> A.K.M. Ashrafuzzaman
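
P.S. For anyone finding this thread later, here is a minimal sketch of the capping suggested at the top of this message. It only computes the number of streams and does not touch Spark. ReceiverCount and its numStreams method are hypothetical names I made up for illustration, and numAvailableWorkers is something you have to supply yourself (for example from your cluster configuration); none of this is part of Spark or the Kinesis module.

```scala
// Hypothetical helper, not part of Spark or spark-streaming-kinesis-asl.
object ReceiverCount {
  // Each receiver holds on to one worker, so keep at least one worker
  // free for processing the received batches.
  def numStreams(numShards: Int, numAvailableWorkers: Int): Int = {
    require(numShards >= 1, "need at least one shard")
    require(numAvailableWorkers >= 2,
      "need one worker for a receiver plus one for processing")
    math.min(numShards, numAvailableWorkers - 1)
  }
}
```

With 4 shards and 3 workers this gives 2 streams, leaving one worker for processing. You would then use the result in place of numShards in (0 until numStreams).map { ... }.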