[ https://issues.apache.org/jira/browse/SPARK-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ambud Sharma updated SPARK-21378:
---------------------------------

Description:

Kafka direct stream fails with a poll timeout:

{code:java}
JavaInputDStream<ConsumerRecord<String, String>> stream =
    KafkaUtils.createDirectStream(ssc,
        LocationStrategies.PreferConsistent(),
        ConsumerStrategies.<String, String>Subscribe(topicsCollection, kafkaParams, fromOffsets));
{code}

Digging deeper shows that there is an assert statement such that if no records are returned (which is a valid case), a failure happens:

https://github.com/apache/spark/blob/39e2bad6a866d27c3ca594d15e574a1da3ee84cc/external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/CachedKafkaConsumer.scala#L75

The fix proposed in https://issues.apache.org/jira/browse/SPARK-19275 causes OOM.
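One commonly cited mitigation for these poll timeouts is to give the cached consumer more time via `spark.streaming.kafka.consumer.poll.ms`, which is read from the SparkConf rather than from the Kafka consumer properties. A minimal sketch of the two configuration maps involved; the broker address, group id, and the 10 s timeout are illustrative assumptions, not values from this report:

```java
import java.util.HashMap;
import java.util.Map;

public class KafkaDirectStreamConfig {

    // Consumer properties for the kafkaParams argument of createDirectStream.
    // Broker address and group id below are placeholders.
    public static Map<String, Object> kafkaParams() {
        Map<String, Object> params = new HashMap<>();
        params.put("bootstrap.servers", "localhost:9092");   // placeholder broker
        params.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        params.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        params.put("group.id", "example-group");             // placeholder group id
        params.put("auto.offset.reset", "earliest");
        return params;
    }

    // Spark settings: the poll timeout used by CachedKafkaConsumer is a
    // SparkConf entry, so it does NOT belong in kafkaParams above.
    public static Map<String, String> sparkConfSettings() {
        Map<String, String> conf = new HashMap<>();
        conf.put("spark.streaming.kafka.consumer.poll.ms", "10000"); // assumed 10 s value
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(kafkaParams().get("group.id"));
        System.out.println(sparkConfSettings().get("spark.streaming.kafka.consumer.poll.ms"));
    }
}
```

Raising the timeout only papers over slow brokers; it does not change the assert that fails when a poll legitimately returns no records, which is the bug this issue tracks.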
> Spark Poll timeout when specific offsets are passed
> ---------------------------------------------------
>
>                 Key: SPARK-21378
>                 URL: https://issues.apache.org/jira/browse/SPARK-21378
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.0.0, 2.0.2
>            Reporter: Ambud Sharma

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)