I have a stream obtained from Kafka with the direct approach, say inputStream. I need to:

1. Create another DStream, derivedStream, by applying map or mapPartitions to inputStream (enriching the data from a reference table).
2. Join derivedStream with inputStream.

In my use case, I don't need to shuffle data. Each partition in derivedStream only needs to be joined with the corresponding partition of the parent inputStream it was generated from.

My questions are:

1. Is there a Partitioner defined in KafkaRDD at all?
2. How can I preserve the partitioning scheme and avoid a data shuffle?

--
Chen Song
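
P.S. To make the intent concrete, here is a minimal sketch in plain Python (not Spark code) that models one batch as a list of partitions and joins each derived partition only with the partition it came from. The names REFERENCE, enrich, and join_partition and the sample records are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, int]      # (key, value), standing in for one Kafka message
Enriched = Tuple[str, str]    # (key, label) produced by the enrichment step

# Hypothetical reference table used for enrichment.
REFERENCE: Dict[str, str] = {"a": "alpha", "b": "beta", "c": "gamma"}


def enrich(partition: List[Record]) -> List[Enriched]:
    """Per-partition enrichment, analogous to mapPartitions; keys are unchanged."""
    return [(key, REFERENCE.get(key, "unknown")) for key, _ in partition]


def join_partition(left: List[Record], right: List[Enriched]) -> List[Tuple[str, int, str]]:
    """Inner join of two co-located partitions by key; no record leaves its partition."""
    lookup: Dict[str, List[str]] = {}
    for key, label in right:
        lookup.setdefault(key, []).append(label)
    return [(key, value, label)
            for key, value in left
            for label in lookup.get(key, [])]


# One batch of the input stream, already split into two partitions.
input_partitions: List[List[Record]] = [[("a", 1), ("b", 2)], [("c", 3)]]

# Step 1: derive a new partitioned data set, one output partition per input partition.
derived_partitions = [enrich(p) for p in input_partitions]

# Step 2: join partition i of the input with partition i of the derived data only.
joined = [join_partition(left, right)
          for left, right in zip(input_partitions, derived_partitions)]

print(joined)
```

What I am after is the Spark equivalent of step 2, applied to every batch, without a shuffle between the two steps.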