FYI re WAL on S3

http://search-hadoop.com/m/q3RTtFMpd41A7TnH/WAL+S3&subj=WAL+on+S3



On 18 September 2015 at 13:32, Alan Dipert <a...@dipert.org> wrote:

> Hello,
>
> Thanks, all, for considering our problem.  We are doing transformations in
> Spark Streaming.  We have also since learned that the WAL to S3 on 1.4 is
> "not reliable" [1].
>
> We are just going to wait for EMR to support 1.5 and hopefully this won't
> be a problem anymore [2].
>
> Alan
>
> 1.
> https://mail-archives.apache.org/mod_mbox/spark-user/201508.mbox/%3CCA+AHuKkH9r0BwQMgQjDG+j=qdcqzpow1rw1u4d0nrcgmq5x...@mail.gmail.com%3E
> 2. https://issues.apache.org/jira/browse/SPARK-9215
>
> On Fri, Sep 18, 2015 at 4:23 AM, Nick Pentreath <nick.pentre...@gmail.com>
> wrote:
>
>> Are you doing actual transformations / aggregation in Spark Streaming? Or
>> just using it to bulk write to S3?
>>
>> If the latter, then you could just use your AWS Lambda function to read
>> directly from the Kinesis stream. If the former, then you could either
>> look into the WAL option that Aniket mentioned, or write the processed
>> RDD back to Kinesis and have the Lambda function read that stream and
>> write to Redshift?
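>>
>> If you go the WAL route, a minimal sketch of enabling it (assuming Spark
>> 1.4's config keys and Scala API; the app name and checkpoint path below
>> are placeholders):
>>
>>     import org.apache.spark.SparkConf
>>     import org.apache.spark.streaming.{Minutes, StreamingContext}
>>
>>     val conf = new SparkConf()
>>       .setAppName("kinesis-to-s3")
>>       // Persist received blocks to a write-ahead log so they can be
>>       // replayed after a driver failure
>>       .set("spark.streaming.receiver.writeAheadLog.enable", "true")
>>
>>     val ssc = new StreamingContext(conf, Minutes(15))
>>     // The WAL is written under the checkpoint directory, so it needs to
>>     // live on durable storage that survives the cluster (HDFS or S3)
>>     ssc.checkpoint("s3n://our-bucket/spark-checkpoints")
>>
>> (With the WAL enabled you can also drop the receiver's in-memory
>> replication, e.g. StorageLevel.MEMORY_AND_DISK_SER rather than
>> MEMORY_AND_DISK_SER_2.)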
>>
>> On Thu, Sep 17, 2015 at 5:48 PM, Alan Dipert <a...@dipert.org> wrote:
>>
>>> Hello,
>>> We are using Spark Streaming 1.4.1 in AWS EMR to process records from
>>> Kinesis.  Our Spark program saves RDDs to S3, after which the records are
>>> picked up by a Lambda function that loads them into Redshift.  It is
>>> important to us that no data is lost during processing.
>>>
>>> We have set our Kinesis checkpoint interval to 15 minutes, which is also
>>> our window size.
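>>>
>>> A rough sketch of what this looks like (assuming the 1.4.1 kinesis-asl
>>> Scala API; the app, stream, region, and bucket names below are
>>> placeholders):
>>>
>>>     import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream
>>>     import org.apache.spark.SparkConf
>>>     import org.apache.spark.storage.StorageLevel
>>>     import org.apache.spark.streaming.{Minutes, StreamingContext}
>>>     import org.apache.spark.streaming.kinesis.KinesisUtils
>>>
>>>     val ssc = new StreamingContext(
>>>       new SparkConf().setAppName("kinesis-to-s3"), Minutes(15))
>>>
>>>     // 15-minute Kinesis checkpoint interval, matching our 15-minute window
>>>     val stream = KinesisUtils.createStream(
>>>       ssc, "kinesis-to-s3", "our-stream",
>>>       "https://kinesis.us-east-1.amazonaws.com", "us-east-1",
>>>       InitialPositionInStream.TRIM_HORIZON, Minutes(15),
>>>       StorageLevel.MEMORY_AND_DISK_2)
>>>
>>>     // Write each batch to S3; the Lambda function loads the files into Redshift
>>>     stream.map(bytes => new String(bytes, "UTF-8")).foreachRDD { (rdd, time) =>
>>>       rdd.saveAsTextFile("s3n://our-bucket/batches/" + time.milliseconds)
>>>     }
>>>
>>>     ssc.start()
>>>     ssc.awaitTermination()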
>>>
>>> Unfortunately, checkpointing happens after receiving data from Kinesis,
>>> not after we have successfully written to S3.  If batches back up in Spark,
>>> and the cluster is terminated, whatever data was in memory will be lost
>>> because it was checkpointed but not actually saved to S3.
>>>
>>> We are considering forking and modifying the kinesis-asl library so that
>>> we can perform the Kinesis checkpoint manually, at the right time.  We'd
>>> rather not do this.
>>>
>>> Are we overlooking an easier way to deal with this problem?  Thank you
>>> in advance for your insight!
>>>
>>> Alan
>>>
>>
>>
>
