If you really just want to skip the bad topic-partitions, you can use the overload of createDirectStream that takes fromOffsets: Map[TopicAndPartition, Long] and leave the broken topic-partitions out of the map.
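Something along these lines (untested sketch against the Spark 1.x direct stream API; the topic name, broker list, bad-partition numbers and starting offsets are placeholders -- in practice you'd read the offsets from your own offset store or fetch them from the brokers):

  import kafka.common.TopicAndPartition
  import kafka.message.MessageAndMetadata
  import kafka.serializer.StringDecoder
  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}
  import org.apache.spark.streaming.kafka.KafkaUtils

  val conf = new SparkConf().setAppName("skip-bad-partitions")
  val ssc = new StreamingContext(conf, Seconds(10))

  // Placeholder topic and the partitions whose leaders are missing.
  val topic = "xxx"
  val badPartitions = Set(251, 253, 53, 161, 71, 163, 73)

  // Build fromOffsets only for the healthy partitions. The 0L starting
  // offsets are placeholders; substitute the offsets you actually want
  // to resume from.
  val fromOffsets: Map[TopicAndPartition, Long] =
    (0 until 256)
      .filterNot(badPartitions.contains)
      .map(p => TopicAndPartition(topic, p) -> 0L)
      .toMap

  val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")

  // The message handler decides what each record becomes; here just the value.
  val messageHandler = (mmd: MessageAndMetadata[String, String]) => mmd.message()

  val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder, String](
    ssc, kafkaParams, fromOffsets, messageHandler)

Since this variant skips the leader lookup for partitions you never ask for, it won't hit the "Couldn't find leaders" check for the excluded partitions, but you obviously lose whatever data is in them until their leaders come back.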

On Mon, Nov 23, 2015 at 4:54 PM, Hudong Wang <justupl...@hotmail.com> wrote:
> Hi folks,
>
> We have a 10 node cluster and have several topics. Each topic has about
> 256 partitions with 3 replica factor. Now we run into an issue that in some
> topic, a few partition (< 10)'s leader is -1 and all of them has only one
> synced partition.
>
> Exception in thread "main" org.apache.spark.SparkException:
> org.apache.spark.SparkException: Couldn't find leaders for Set([xxx,251],
> [xxx,253], [xxx,53], [xxx,161], [xxx,71], [xxx,163], [xxx,73])
>         at org.apache.spark.streaming.kafka.KafkaUtils$$anonfun$createDirectStream$2.apply(KafkaUtils.scala:416)
>         at org.apache.spark.streaming.kafka.KafkaUtils$$anonfun$createDirectStream$2.apply(KafkaUtils.scala:416)
>         at scala.util.Either.fold(Either.scala:97)
>         at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:415)
>
> Is there any workaround so I can still createDirectStream with bad sets in
> system?
>
> Many thanks!
> Tony