Hey folks, I have a very large number of Kafka topics (many thousands of partitions in total) that I want to consume, filter with topic-specific filters, and then produce back to per-topic filtered destination topics in Kafka.
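For the topic-specific part, one approach is the createDirectStream overload that takes a messageHandler, so each record carries its source topic and can be routed through a per-topic predicate and on to a per-topic destination. This is only a sketch: `ssc`, `kafkaParams`, `fromOffsets`, and `producerProps` are assumed to be defined elsewhere, and the topic names, filter bodies, and ".filtered" suffix are made up.

```scala
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.streaming.kafka.KafkaUtils

// Hypothetical per-topic predicates, keyed by source topic name.
val filters: Map[String, String => Boolean] = Map(
  "topicA" -> (_.contains("keep")),
  "topicB" -> (_.nonEmpty)
)

// The messageHandler keeps the source topic with each record,
// so topic-specific logic can run downstream.
val stream = KafkaUtils.createDirectStream[
    String, String, StringDecoder, StringDecoder, (String, String)](
  ssc, kafkaParams, fromOffsets,
  (mmd: MessageAndMetadata[String, String]) => (mmd.topic, mmd.message())
)

stream
  .filter { case (topic, msg) => filters.get(topic).exists(_(msg)) }
  .foreachRDD { rdd =>
    rdd.foreachPartition { records =>
      // One producer per task for simplicity; in practice you would
      // cache a producer per executor rather than recreate it each batch.
      val producer = new KafkaProducer[String, String](producerProps)
      records.foreach { case (topic, msg) =>
        producer.send(new ProducerRecord(topic + ".filtered", msg))
      }
      producer.close()
    }
  }
```

Records whose topic has no entry in the map are simply dropped by the filter; a default predicate could be substituted if pass-through is preferred.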
Using the receiver-less direct approach in Spark 1.4.1 (described here<https://github.com/koeninger/kafka-exactly-once/blob/master/blogpost.md>), I am able to use either KafkaUtils.createDirectStream or KafkaUtils.createRDD to consume from many topics and filter them all with the same filters, but I can't seem to wrap my head around how to apply topic-specific filters, or how to produce to topic-specific destination topics. Another point is that I will need to checkpoint the offset metadata after each successful batch and write the per-partition starting offsets back to ZK. I expect I can do that on the final RDDs after casting them accordingly, but if anyone has expertise/guidance doing that and is willing to share, I'd be pretty grateful.
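For the offset bookkeeping, the cast mentioned above is to HasOffsetRanges, and it only works on the RDD produced directly by the stream (before any shuffle breaks the 1:1 partition mapping). A sketch of committing to ZK after each successful batch, using the old Kafka 0.8 ZkUtils helpers; the ZK connect string, group name, and path layout below are assumptions:

```scala
import kafka.utils.{ZKStringSerializer, ZkUtils}
import org.I0Itec.zkclient.ZkClient
import org.apache.spark.streaming.kafka.{HasOffsetRanges, OffsetRange}

var offsetRanges = Array.empty[OffsetRange]

stream.transform { rdd =>
  // Must be the first operation on the stream: only the RDD created by
  // the direct stream itself can be cast to HasOffsetRanges.
  offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  rdd
}.foreachRDD { rdd =>
  // ... filter and produce the batch here ...

  // Only after the batch has fully succeeded, write each partition's
  // ending offset back to ZK so a restart resumes from here.
  val zk = new ZkClient("zkhost:2181", 30000, 30000, ZKStringSerializer)
  offsetRanges.foreach { o =>
    val path = s"/consumers/myGroup/offsets/${o.topic}/${o.partition}"
    ZkUtils.updatePersistentPath(zk, path, o.untilOffset.toString)
  }
  zk.close()
}
```

Writing to the standard /consumers path keeps the offsets visible to Kafka's own monitoring tools, but any ZK path works as long as it is read back into `fromOffsets` on startup.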