1. Does repartitioning a direct Kafka stream shuffle only the offsets, or
the actual Kafka messages, across executors?

Say I have a direct Kafka stream:

directKafkaStream.repartition(numexecutors).mapPartitions(
    new FlatMapFunction<Iterator<Tuple2<byte[],byte[]>>, String>() {
        ...
    });

Say I originally have 5*numexecutors partitions in Kafka.

Shouldn't only the offset ranges be shuffled to the executors, rather than
the actual Kafka messages? Yet I am seeing a very large amount of shuffle
read/write data in the Streaming UI. When I remove the repartition, shuffle
read/write drops to 0.
