With DirectKafkaStream there are two approaches.
1. you increase the number of KAfka partitions Spark will automatically
read in parallel
2. if that's not possible, then explicitly repartition only if there are
more cores in the cluster than the number of Kafka partitions, AND the
first map-like st
does spark streaming 1.3 launches task for each partition offset range
whether that is 0 or not ?
If yes, how can I enforce it to not to launch tasks for empty rdds.Not able
t o use coalesce on directKafkaStream.
Shall we enforce repartitioning always before processing direct stream ?
use case i