Gabor Somogyi created SPARK-23438:
-------------------------------------

             Summary: DStreams could lose blocks with WAL enabled when driver 
crashes
                 Key: SPARK-23438
                 URL: https://issues.apache.org/jira/browse/SPARK-23438
             Project: Spark
          Issue Type: Bug
          Components: DStreams
    Affects Versions: 1.6.0
            Reporter: Gabor Somogyi


There is a race condition introduced in SPARK-11141 which could cause data loss.

This affects all versions since 1.6.0.

Problematic situation:
 # Start streaming job with 2 receivers with WAL enabled.
 # Receiver 1 receives a block and does the following

 ** Writes a BlockAdditionEvent into WAL
 ** Puts the block into it's received block queue with ID 1
 # Receiver 2 receives a block and does the following

 ** Writes a BlockAdditionEvent into WAL

 # Spark allocates all blocks from it's received block queue and writes 
AllocatedBlocks(IDs=(1)) into WAL
 # Driver crashes
 # New Driver recovers from WAL
 # Realise block with ID 2 never processed

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to