The answer already given is correct. You shouldn't doubt this, because
you've already seen the shuffle data change accordingly.
On Fri, Sep 4, 2015 at 11:25 AM, Shushant Arora
wrote:
> But the Kafka stream has an underlying RDD which consists of offset ranges only -
> so
Yes, agreed, the shuffle data reveals that offsets+data is transformed.
I wanted to understand whether mapPartitions, or any other transformation in
directKafkaStream.repartition(numexecutors).mapPartitions(...), happens
before the shuffle or after it.
If after shuffle - is this due to the reason that very
1. Does repartitioning a direct Kafka stream shuffle only the offsets, or the
actual Kafka messages, across executors?
Say I have a direct Kafka stream:
directKafkaStream.repartition(numexecutors).mapPartitions(new
FlatMapFunction>, String>(){
...
}
Say originally I have
Yes, it is not the offset ranges but the real data that will be shuffled
when you use repartition().
Thanks
Saisai
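
To make the ordering concrete, here is a minimal sketch in plain Java. It is a toy model, not Spark code: the class ShuffleOrderSketch, the OffsetRange holder and the fetch/repartition/partitionSizes helpers are all hypothetical names made up for illustration, and the round-robin redistribution only approximates what a real shuffle does. It shows the point made above: the offset ranges are turned into actual messages first, those messages are what get moved during the repartition step, and a per-partition function chained after repartition() sees the post-shuffle partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShuffleOrderSketch {

    // Hypothetical stand-in for a Kafka offset range: partition id plus [from, until).
    static final class OffsetRange {
        final int kafkaPartition;
        final long from;
        final long until;

        OffsetRange(int kafkaPartition, long from, long until) {
            this.kafkaPartition = kafkaPartition;
            this.from = from;
            this.until = until;
        }
    }

    // Before the shuffle: a source partition resolves its offset range into real messages.
    static List<String> fetch(OffsetRange r) {
        List<String> out = new ArrayList<>();
        for (long o = r.from; o < r.until; o++) {
            out.add("p" + r.kafkaPartition + "-msg" + o);
        }
        return out;
    }

    // The shuffle: the messages themselves (not the ranges) are redistributed round-robin.
    static List<List<String>> repartition(List<List<String>> input, int numPartitions) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            out.add(new ArrayList<>());
        }
        int next = 0;
        for (List<String> part : input) {
            for (String msg : part) {
                out.get(next % numPartitions).add(msg);
                next++;
            }
        }
        return out;
    }

    // After the shuffle: a mapPartitions-style function applied to each partition.
    static List<Integer> partitionSizes(List<List<String>> parts) {
        List<Integer> sizes = new ArrayList<>();
        for (List<String> part : parts) {
            sizes.add(part.size());
        }
        return sizes;
    }

    public static void main(String[] args) {
        List<OffsetRange> ranges = Arrays.asList(
                new OffsetRange(0, 0, 6),   // 6 messages in Kafka partition 0
                new OffsetRange(1, 0, 2));  // 2 messages in Kafka partition 1

        List<List<String>> fetched = new ArrayList<>();
        for (OffsetRange r : ranges) {
            fetched.add(fetch(r));
        }
        List<List<String>> shuffled = repartition(fetched, 4);

        System.out.println("before shuffle: " + partitionSizes(fetched));  // [6, 2]
        System.out.println("after shuffle:  " + partitionSizes(shuffled)); // [2, 2, 2, 2]
    }
}
```

In this model the skewed input (6 and 2 messages) only evens out because whole messages are moved, which matches the observation in this thread that the shuffle data contains the records and not just the offsets.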
On Fri, Sep 4, 2015 at 12:42 PM, Shushant Arora
wrote:
> 1. Does repartitioning a direct Kafka stream shuffle only the offsets or the
> actual Kafka messages across