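Yes. It is not the offset ranges that get shuffled: when you use repartition(), the actual Kafka messages are shuffled. The offset ranges only define the partitions of the RDD before the shuffle. Each task first fetches the messages for its range, and repartition() then redistributes those messages across executors.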
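
To make the point above concrete, here is a toy model in plain Java. It is not Spark or Kafka code: the class RepartitionSketch, the Range type, the round-robin dealing, and the message counts and sizes are all made up for illustration. It only shows why the shuffle volume follows the message payloads rather than the (tiny) offset ranges.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: plain Java, no Spark or Kafka classes involved.
final class RepartitionSketch {

    // Hypothetical stand-in for a Kafka offset range: a few numbers, no payload.
    static final class Range {
        final int partition;
        final long fromOffset;
        final long untilOffset;

        Range(int partition, long fromOffset, long untilOffset) {
            this.partition = partition;
            this.fromOffset = fromOffset;
            this.untilOffset = untilOffset;
        }

        long count() {
            return untilOffset - fromOffset;
        }
    }

    // Models the task that owns a range fetching its messages.
    // Each message is a dummy payload of messageBytes bytes (an assumption).
    static List<byte[]> fetch(Range range, int messageBytes) {
        List<byte[]> messages = new ArrayList<byte[]>();
        for (long i = 0; i < range.count(); i++) {
            messages.add(new byte[messageBytes]);
        }
        return messages;
    }

    // Models repartition(n): every fetched message is dealt round-robin to one
    // of n target partitions. Returns the number of payload bytes moved.
    static long repartition(List<List<byte[]>> fetched, int n, List<List<byte[]>> out) {
        for (int i = 0; i < n; i++) {
            out.add(new ArrayList<byte[]>());
        }
        long movedBytes = 0;
        int next = 0;
        for (List<byte[]> partition : fetched) {
            for (byte[] message : partition) {
                out.get(next % n).add(message);
                movedBytes += message.length;
                next++;
            }
        }
        return movedBytes;
    }

    public static void main(String[] args) {
        // 10 Kafka partitions, 1000 messages each, 100 bytes per message.
        List<List<byte[]>> fetched = new ArrayList<List<byte[]>>();
        for (int p = 0; p < 10; p++) {
            fetched.add(fetch(new Range(p, 0L, 1000L), 100));
        }
        List<List<byte[]>> out = new ArrayList<List<byte[]>>();
        long moved = repartition(fetched, 2, out);
        System.out.println("bytes moved by repartition: " + moved);
    }
}
```

In this model the bytes moved equal the total size of the fetched messages, which matches the large shuffle read/write you see in the streaming UI. Without repartition() the direct stream keeps one RDD partition per Kafka partition, so nothing is shuffled.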

Thanks
Saisai

On Fri, Sep 4, 2015 at 12:42 PM, Shushant Arora <shushantaror...@gmail.com> wrote:

> 1.Does repartitioning on direct kafka stream shuffles only the offsets or
> exact kafka messages across executors?
>
> Say I have a direct kafkastream
>
> directKafkaStream.repartition(numexecutors).mapPartitions(new
> FlatMapFunction<Iterator<Tuple2<byte[],byte[]>>, String>(){
> ...
> }
>
> Say originally I have 5*numexceutor partitons in kafka.
>
> Now only the offset ranges should be shuffled to executors not exact kafka
> messages? But I am seeing a very large size of shuffles data read/write on
> streaming ui. When I remove this repartition - shuffle read /write becomes
> 0.