[ https://issues.apache.org/jira/browse/SPARK-11698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003478#comment-15003478 ]

Liang-Chi Hsieh commented on SPARK-11698:
-----------------------------------------

Yes, but it is intentional. We don't want to increase data latency due to heavy 
data loading, so we need to skip some data in each iteration and keep 
consuming the latest data.

> Add option to ignore kafka messages that are out of limit rate
> --------------------------------------------------------------
>
>                 Key: SPARK-11698
>                 URL: https://issues.apache.org/jira/browse/SPARK-11698
>             Project: Spark
>          Issue Type: Improvement
>          Components: Streaming
>            Reporter: Liang-Chi Hsieh
>
> With spark.streaming.kafka.maxRatePerPartition, we can control the maximum 
> rate limit. However, we cannot ignore the messages that exceed the limit; 
> those messages will be consumed in the next iteration. We have a use case 
> where we need to ignore these messages and process the latest messages in 
> the next iteration.
> In other words, we simply want to consume part of the messages in each 
> iteration and ignore the remaining messages that are not consumed.
> We add an option for this purpose.
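The requested behavior can be sketched as a per-batch offset calculation (a minimal, hypothetical model; `next_offset_range` and the `ignore_backlog` flag are illustrative names, not Spark's actual API):

```python
def next_offset_range(current, latest, max_rate, batch_interval_sec,
                      ignore_backlog=False):
    """Return the (from_offset, until_offset) one batch would consume
    for a single Kafka partition, given a max rate per partition.
    Hypothetical sketch, not Spark's implementation."""
    limit = max_rate * batch_interval_sec
    if not ignore_backlog:
        # Default behavior: resume where the last batch stopped; messages
        # beyond the rate limit carry over to the next iteration.
        return current, min(current + limit, latest)
    # Proposed option: drop the backlog and read only the newest
    # messages that fit within the rate limit.
    return max(current, latest - limit), latest

# Backlog of 10,000 messages, limit of 1,000 per batch:
print(next_offset_range(0, 10_000, 1_000, 1))                        # (0, 1000)
print(next_offset_range(0, 10_000, 1_000, 1, ignore_backlog=True))   # (9000, 10000)
```

In the default case the lag keeps growing under heavy load; with the option, each batch stays current at the cost of permanently skipping the unconsumed range.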



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
