[ https://issues.apache.org/jira/browse/SPARK-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tathagata Das updated SPARK-9619:
---------------------------------
    Summary: Restarting the receiver's BlockGenerator does not clear previously buffered data  (was: Restarting the receiver's BlockGenerator does clear previous data)

> Restarting the receiver's BlockGenerator does not clear previously buffered data
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-9619
>                 URL: https://issues.apache.org/jira/browse/SPARK-9619
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>            Reporter: Tathagata Das
>            Assignee: Tathagata Das
>            Priority: Minor
>
> The internal default block generator used by receivers is reused across
> receiver restarts. Data still buffered from before a restart can therefore
> be emitted again, which leads to duplicate data. This is sort of okay, as
> receivers provide at-least-once guarantees at best. Furthermore, reliable
> receivers such as ReliableKafkaReceiver did not reuse BlockGenerator objects
> and hence did not have this problem.
> The solution is to ensure that the internal buffer of the BlockGenerator is
> cleared every time it is started.
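
The behaviour described in the issue can be illustrated with a small toy model. This is only a sketch in Python, not Spark's Scala code: `ToyBlockGenerator`, its `clear_on_start` flag, and `restart_and_replay` are hypothetical names, and the model assumes that stopping the generator leaves unflushed records in the buffer and that the source replays those records after the restart.

```python
class ToyBlockGenerator:
    """Hypothetical stand-in for a receiver's block generator; not Spark's API."""

    def __init__(self, clear_on_start):
        self.clear_on_start = clear_on_start
        self.buffer = []      # records buffered but not yet turned into a block
        self.running = False

    def start(self):
        if self.clear_on_start:
            # The fix proposed in the issue: drop whatever a previous run left behind.
            self.buffer.clear()
        self.running = True

    def add(self, record):
        if not self.running:
            raise RuntimeError("generator is not running")
        self.buffer.append(record)

    def stop(self):
        # Assumption of this sketch: stopping does not flush the buffer.
        self.running = False


def restart_and_replay(clear_on_start):
    """Simulate one receiver restart that reuses the same generator object."""
    gen = ToyBlockGenerator(clear_on_start)
    gen.start()
    gen.add("a")
    gen.add("b")
    gen.stop()     # receiver goes down with "a" and "b" still buffered
    gen.start()    # receiver restarts and reuses the generator
    gen.add("a")   # assumed: the source replays the unacknowledged records
    gen.add("b")
    return list(gen.buffer)
```

With `clear_on_start=False` the reused buffer holds each record twice after the restart; with `clear_on_start=True` it holds each record once, which is the effect the proposed fix aims for.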