[ https://issues.apache.org/jira/browse/SPARK-23438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365861#comment-16365861 ]
Gabor Somogyi commented on SPARK-23438:
---------------------------------------

I'm working on that.

> DStreams could lose blocks with WAL enabled when driver crashes
> ---------------------------------------------------------------
>
>                 Key: SPARK-23438
>                 URL: https://issues.apache.org/jira/browse/SPARK-23438
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 1.6.0
>            Reporter: Gabor Somogyi
>            Priority: Critical
>
> There is a race condition introduced in SPARK-11141 which could cause data loss.
> This affects all versions since 1.6.0.
> Problematic situation:
> # Start a streaming job with 2 receivers and WAL enabled.
> # Receiver 1 receives a block and does the following:
> ** Writes a BlockAdditionEvent into the WAL
> ** Puts the block into its received block queue with ID 1
> # Receiver 2 receives a block and does the following:
> ** Writes a BlockAdditionEvent into the WAL
> # Spark allocates all blocks from its received block queue and writes AllocatedBlocks(IDs=(1)) into the WAL
> # The driver crashes
> # A new driver recovers from the WAL
> # The block with ID 2 is never processed
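The sequence in the description can be replayed with a small toy model. The sketch below is not Spark code: the `ToyTracker` class, its method names, and the `simulate` helper are hypothetical. In particular, the recovery rule (replaying an AllocatedBlocks record clears every block queued before it) is an assumption made for illustration, not something stated in the issue.

```python
class ToyTracker:
    """Hypothetical stand-in for the driver-side block tracker (not a Spark API)."""

    def __init__(self):
        self.wal = []        # durable log: survives a driver crash
        self.queue = []      # in-memory received block queue: lost on a crash
        self.allocated = []  # blocks handed out for processing

    def log_addition(self, block_id):
        # First half of "receive a block": write the event to the WAL.
        self.wal.append(("BlockAdditionEvent", block_id))

    def enqueue(self, block_id):
        # Second half: put the block into the received block queue.
        self.queue.append(block_id)

    def allocate(self):
        # Allocate everything currently queued and record that in the WAL.
        batch = tuple(self.queue)
        self.wal.append(("AllocatedBlocks", batch))
        self.allocated.extend(batch)
        self.queue.clear()

    @classmethod
    def recover(cls, wal):
        # Rebuild state on a new driver by replaying the WAL in order.
        tracker = cls()
        tracker.wal = list(wal)
        for kind, payload in wal:
            if kind == "BlockAdditionEvent":
                tracker.queue.append(payload)
            else:  # "AllocatedBlocks"
                tracker.allocated.extend(payload)
                # ASSUMPTION: replay treats everything queued so far as allocated.
                tracker.queue.clear()
        return tracker


def simulate():
    driver = ToyTracker()
    # Receiver 1: WAL write and enqueue both complete.
    driver.log_addition(1)
    driver.enqueue(1)
    # Receiver 2: WAL write completes, enqueue has not happened yet.
    driver.log_addition(2)
    # Allocation runs in that window and records AllocatedBlocks(IDs=(1)).
    driver.allocate()
    # Driver crashes: only the WAL survives; a new driver recovers from it.
    recovered = ToyTracker.recover(driver.wal)
    logged = {p for kind, p in recovered.wal if kind == "BlockAdditionEvent"}
    # Blocks that were logged but are neither allocated nor queued after recovery.
    return logged - set(recovered.allocated) - set(recovered.queue)


print(simulate())
```

Under these assumptions `simulate()` returns `{2}`: block 2 has a BlockAdditionEvent in the WAL, yet after recovery it is neither allocated nor waiting in the queue.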