HeartSaVioR commented on a change in pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer
URL: https://github.com/apache/spark/pull/22138#discussion_r319500695
 
 

 ##########
 File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala
 ##########
 @@ -269,9 +300,12 @@ private[kafka010] case class InternalKafkaConsumer(
           // When there is some error thrown, it's better to use a new consumer to drop all cached
           // states in the old consumer. We don't need to worry about the performance because this
           // is not a common path.
-          resetConsumer()
-          reportDataLoss(failOnDataLoss, s"Cannot fetch offset $toFetchOffset", e)
-          toFetchOffset = getEarliestAvailableOffsetBetween(toFetchOffset, untilOffset)
+          releaseConsumer()
+          fetchedData.reset()
 
 Review comment:
   Yes, because FetchedData is designed to be modified per task. Based on the
desired offset, the pool will in most cases provide the same FetchedData
instance. Once you have the one for that task, you can modify it in place and
reset it when necessary.
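
   To illustrate the idea, here is a minimal sketch rather than the PR's actual code. `FetchedDataSketch`, its simplified `FetchedData`, the `FetchedDataPool` class, the string key, and the `UnknownOffset` sentinel are all hypothetical stand-ins. It only shows a pool handing back the same mutable instance when the desired offset matches, and the task resetting it after an error.

```scala
import scala.collection.mutable

object FetchedDataSketch {
  // Hypothetical sentinel meaning "no valid position cached".
  val UnknownOffset: Long = -2L

  // Hypothetical, simplified stand-in for FetchedData: mutable state owned by one task at a time.
  final class FetchedData {
    var records: List[String] = Nil
    var nextOffset: Long = UnknownOffset

    // Drop all cached state so stale records are never served after an error.
    def reset(): Unit = {
      records = Nil
      nextOffset = UnknownOffset
    }
  }

  // Hypothetical pool keyed by a topic-partition string.
  final class FetchedDataPool {
    private val cache = mutable.Map.empty[String, mutable.ListBuffer[FetchedData]]

    // Return the cached instance positioned at desiredOffset, or a fresh one if none matches.
    def acquire(key: String, desiredOffset: Long): FetchedData = {
      val entries = cache.getOrElseUpdate(key, mutable.ListBuffer.empty[FetchedData])
      entries.find(_.nextOffset == desiredOffset) match {
        case Some(hit) =>
          entries -= hit // borrowed instances leave the pool until released
          hit
        case None =>
          new FetchedData
      }
    }

    // Hand the instance back so the next task reading from the same offset can reuse it.
    def release(key: String, data: FetchedData): Unit =
      cache.getOrElseUpdate(key, mutable.ListBuffer.empty[FetchedData]) += data
  }
}
```

   In this sketch a task acquires the instance for its starting offset, mutates it while reading, calls `reset()` if a fetch fails, and releases it at the end.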

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
