[ https://issues.apache.org/jira/browse/SPARK-23438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marcelo Vanzin resolved SPARK-23438.
------------------------------------
       Resolution: Fixed
         Assignee: Gabor Somogyi
    Fix Version/s: 2.4.0
                   2.3.1
                   2.2.2
                   2.1.3
                   2.0.3

> DStreams could lose blocks with WAL enabled when driver crashes
> ---------------------------------------------------------------
>
>                 Key: SPARK-23438
>                 URL: https://issues.apache.org/jira/browse/SPARK-23438
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 1.6.0
>            Reporter: Gabor Somogyi
>            Assignee: Gabor Somogyi
>            Priority: Critical
>             Fix For: 2.0.3, 2.1.3, 2.2.2, 2.3.1, 2.4.0
>
>
> There is a race condition introduced in SPARK-11141 which could cause data
> loss.
> This affects all versions since 1.6.0.
> Problematic situation:
> # Start a streaming job with 2 receivers and WAL enabled.
> # Receiver 1 receives a block and does the following:
> ** Writes a BlockAdditionEvent into the WAL
> ** Puts the block into its received block queue with ID 1
> # Receiver 2 receives a block and does the following:
> ** Writes a BlockAdditionEvent into the WAL (the block is not yet in its
> received block queue)
> # Spark allocates all blocks from its received block queue and writes
> AllocatedBlocks(IDs=(1)) into the WAL
> # The driver crashes
> # A new driver recovers from the WAL
> # The block with ID 2 is never processed
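
The interleaving above can be replayed with a small toy model. This is only a sketch in plain Python, not Spark code: the classes and the functions run_until_crash and recover are hypothetical names made up for illustration. It also assumes that recovery clears the whole queue of unallocated blocks whenever it replays an allocation event; the report does not spell out the recovery logic, so treat that as one plausible way block 2 can go missing.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class BlockAdditionEvent:
    block_id: int


@dataclass(frozen=True)
class AllocatedBlocks:
    block_ids: Tuple[int, ...]


Event = Union[BlockAdditionEvent, AllocatedBlocks]


def run_until_crash() -> List[Event]:
    """Replay the interleaving from the report and return the surviving WAL."""
    wal: List[Event] = []
    queue: List[int] = []

    # Receiver 1: log the addition, then enqueue the block.
    wal.append(BlockAdditionEvent(1))
    queue.append(1)

    # Receiver 2: log the addition, but allocation runs before the enqueue.
    wal.append(BlockAdditionEvent(2))

    # Allocation drains the queue (only block 1 is in it) and logs the result.
    allocated = tuple(queue)
    queue.clear()
    wal.append(AllocatedBlocks(allocated))

    # The driver crashes here; only the WAL survives.
    return wal


def recover(wal: List[Event]) -> Tuple[List[int], List[int]]:
    """Rebuild (allocated, still_queued) by replaying the WAL.

    Assumption (not stated in the report): replaying an allocation
    event clears the whole queue of unallocated blocks.
    """
    queue: List[int] = []
    allocated: List[int] = []
    for event in wal:
        if isinstance(event, BlockAdditionEvent):
            queue.append(event.block_id)
        else:
            allocated.extend(event.block_ids)
            queue.clear()
    return allocated, queue


wal = run_until_crash()
allocated, queued = recover(wal)

# A block is lost if the WAL recorded it but recovery neither allocated
# it nor left it queued for a later batch.
lost = [
    e.block_id
    for e in wal
    if isinstance(e, BlockAdditionEvent)
    and e.block_id not in allocated
    and e.block_id not in queued
]
print(lost)  # [2]
```

Under this model, a recovery step that removes only the block IDs named in AllocatedBlocks from the queue would leave block 2 queued for a later batch instead of dropping it.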