Hari Shreedharan created SPARK-4707:
---------------------------------------

             Summary: Reliable Kafka Receiver can lose data if the block 
generator fails to store data
                 Key: SPARK-4707
                 URL: https://issues.apache.org/jira/browse/SPARK-4707
             Project: Spark
          Issue Type: Bug
          Components: Streaming
    Affects Versions: 1.2.0
            Reporter: Hari Shreedharan


The Reliable Kafka Receiver commits offsets only after events have actually
been stored, which ensures that on restart it resumes where it left off.
But if the store() call fails and the block generator reports an error, the
receiver does nothing: it continues reading from the current offset rather
than from the last committed offset. As a result, the messages between the
last commit and the current offset are lost.
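
To illustrate the failure mode, here is a minimal, self-contained sketch in
plain Python. It is not the Spark receiver code and does not use the Kafka or
Spark APIs; ToyReceiver, rewind_on_failure and fail_on_attempts are
hypothetical names made up for this example. Rewinding to the last committed
offset is shown only as one possible way to avoid the loss and is an
assumption, not necessarily what the patch does.

```python
class ToyReceiver:
    """Toy model: offsets are committed only after a successful store."""

    def __init__(self, log, rewind_on_failure):
        self.log = log                        # stand-in for one Kafka partition
        self.rewind_on_failure = rewind_on_failure
        self.committed = 0                    # offset to resume from after a restart
        self.position = 0                     # current read offset
        self.stored = []                      # messages that were stored successfully

    def run(self, batch_size, fail_on_attempts):
        """Read the whole log in batches; store attempts whose 1-based
        number is in fail_on_attempts are treated as failed."""
        attempt = 0
        while self.position < len(self.log):
            batch = self.log[self.position:self.position + batch_size]
            self.position += len(batch)
            attempt += 1
            if attempt in fail_on_attempts:
                # The store failed, so this batch was not stored or committed.
                if self.rewind_on_failure:
                    # Assumed remedy: go back to the last committed offset.
                    self.position = self.committed
                # Without the rewind, reading simply continues from
                # self.position and the failed batch is skipped.
                continue
            self.stored.extend(batch)
            self.committed = self.position    # commit only after a successful store
        return self.stored


log = list(range(6))

buggy = ToyReceiver(log, rewind_on_failure=False)
print(buggy.run(batch_size=2, fail_on_attempts={2}))   # [0, 1, 4, 5]

fixed = ToyReceiver(log, rewind_on_failure=True)
print(fixed.run(batch_size=2, fail_on_attempts={2}))   # [0, 1, 2, 3, 4, 5]
```

In the first run, messages 2 and 3 are never stored, and the next successful
store commits an offset past them, so a restart would not recover them either.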

I will send a PR for this soon. I have a patch that needs some minor fixes,
which I still need to test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

