[ 
https://issues.apache.org/jira/browse/SPARK-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tathagata Das updated SPARK-7139:
---------------------------------
    Priority: Blocker  (was: Critical)

> Allow received block metadata to be saved to WAL and recovered on driver 
> failure
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-7139
>                 URL: https://issues.apache.org/jira/browse/SPARK-7139
>             Project: Spark
>          Issue Type: Improvement
>          Components: Streaming
>            Reporter: Tathagata Das
>            Assignee: Tathagata Das
>            Priority: Blocker
>
> The received API allows arbitrary metadata to be added for each block. 
> However that information is not saved in the WAL as part of the block 
> information in the driver. 
> To fix this, the following needs to be done. 
> 1. Forward the metadata to the ReceivedBlockTracker in the driver.
> 2. ReceivedBlockTracker saves the metadata and recovers it on restart. 
> However there is one tricky thing. The ReceivedBlockTracker WAL is enabled 
> only when `spark.streaming.receiver.writeAheadLog.enable = true`. This means 
> that only when  receiver WAL is enabled is the driver WAL enabled. This is 
> not desired as the one may want to save and recovered block metadata 
> information (especially information like Kafka offsets or Kinesis sequence 
> numbers) that can be used to recover data without actually saving the data to 
> the receiver WAL. So we have to always enable the tracker WAL. 
> 3. Always enable the ReceivedBlockTracker WAL. However, make sure that the 
> WriteAheadLogBackedBlockRDD skips block lookup after restart as the blocks 
> are obviously gone.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to