Yes not the offset ranges, but the real data will be shuffled when you
using repartition().

Thanks
Saisai

On Fri, Sep 4, 2015 at 12:42 PM, Shushant Arora <shushantaror...@gmail.com>
wrote:

> 1.Does repartitioning on direct kafka stream shuffles only the offsets or
> exact kafka messages across executors?
>
> Say I have a direct kafkastream
>
> directKafkaStream.repartition(numexecutors).mapPartitions(new
> FlatMapFunction<Iterator<Tuple2<byte[],byte[]>>, String>(){
> ...
> }
>
> Say originally I have 5*numexceutor partitons in kafka.
>
> Now only the offset ranges should be shuffled to executors not exact kafka
> messages? But I am seeing a very large size of shuffles data read/write on
> streaming ui. When I remove this repartition - shuffle read /write becomes
> 0.
>
>

Reply via email to