Re: repartition on direct kafka stream

2015-09-04 Thread Cody Koeninger
The answer already given is correct. You shouldn't doubt this, because you've already seen the shuffle data change accordingly.
On Fri, Sep 4, 2015 at 11:25 AM, Shushant Arora wrote:
> But Kafka stream has underlying RDD which consists of offset ranges only -
> so

Re: repartition on direct kafka stream

2015-09-04 Thread Shushant Arora
Yes, agreed: the shuffle data reveals that the offsets plus the data are transformed. I wanted to understand whether mapPartitions (or any other transformation) in directKafkaStream.repartition(numexecutors).mapPartitions(...) happens before or after the shuffle. If after the shuffle - is this due to the reason that very

repartition on direct kafka stream

2015-09-03 Thread Shushant Arora
1. Does repartitioning a direct Kafka stream shuffle only the offsets, or the actual Kafka messages, across executors? Say I have a direct Kafka stream directKafkaStream.repartition(numexecutors).mapPartitions(new FlatMapFunction>, String>(){ ... } Say originally I have

Re: repartition on direct kafka stream

2015-09-03 Thread Saisai Shao
Yes, not the offset ranges but the real data will be shuffled when you use repartition().
Thanks
Saisai
On Fri, Sep 4, 2015 at 12:42 PM, Shushant Arora wrote:
> 1.Does repartitioning on direct kafka stream shuffles only the offsets or
> exact kafka messages across
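
For readers following the thread, the ordering being discussed can be pictured with a toy model in plain Java. This is only a sketch and does not use Spark or Kafka: the names RepartitionSketch, OffsetRange, fetch, repartition and mapPartitions are hypothetical stand-ins, the message strings are made up, and plain round-robin distribution is an assumption made for simplicity. It illustrates the point in the replies: offset ranges are resolved into real messages before the shuffle, the messages themselves are redistributed, and a function passed to mapPartitions after repartition only ever sees the post-shuffle partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Toy model only: none of these names are Spark or Kafka APIs.
public class RepartitionSketch {

    // Hypothetical stand-in for one Kafka offset range: [fromOffset, untilOffset).
    static final class OffsetRange {
        final int kafkaPartition;
        final long fromOffset;
        final long untilOffset;

        OffsetRange(int kafkaPartition, long fromOffset, long untilOffset) {
            this.kafkaPartition = kafkaPartition;
            this.fromOffset = fromOffset;
            this.untilOffset = untilOffset;
        }
    }

    // Before the shuffle: an offset range is resolved into actual messages.
    static List<String> fetch(OffsetRange range) {
        List<String> messages = new ArrayList<>();
        for (long offset = range.fromOffset; offset < range.untilOffset; offset++) {
            messages.add("p" + range.kafkaPartition + "-msg" + offset);
        }
        return messages;
    }

    // The shuffle: fetched messages (not offset ranges) are redistributed.
    // Plain round-robin is an assumption made to keep the model simple.
    static List<List<String>> repartition(List<OffsetRange> ranges, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        int next = 0;
        for (OffsetRange range : ranges) {
            for (String message : fetch(range)) {
                partitions.get(next % numPartitions).add(message);
                next++;
            }
        }
        return partitions;
    }

    // After the shuffle: the function is applied to each repartitioned partition.
    static List<List<String>> mapPartitions(List<List<String>> partitions,
                                            Function<List<String>, List<String>> f) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> partition : partitions) {
            result.add(f.apply(partition));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two Kafka partitions with 4 and 2 messages, spread over 3 partitions.
        List<OffsetRange> ranges =
                Arrays.asList(new OffsetRange(0, 0, 4), new OffsetRange(1, 0, 2));
        List<List<String>> shuffled = repartition(ranges, 3);
        List<List<String>> mapped =
                mapPartitions(shuffled, p -> Arrays.asList(p.size() + " messages"));
        System.out.println(shuffled);
        System.out.println(mapped);
    }
}
```

In this model the two uneven input ranges end up as three evenly sized partitions of messages, which is the reason one would call repartition in the first place, at the cost of moving the message payloads.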