Hi Sanjeet,

I have been using Spark Streaming to process files in S3 and HDFS.
I am also using SQS messages for the same purpose as yours, i.e. as
pointers to S3 files.
As of now, I have a separate SQS job which receives messages from the SQS
queue and fetches the corresponding file from S3.
Now, I want to integrate the SQS receiver with Spark Streaming, so that my
Spark Streaming job would listen for new SQS messages and proceed
accordingly.
I was wondering if you found any solution to this. Please let me know if
you did!
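
For reference, here is a minimal sketch of the kind of integration I have
in mind, written against Spark's custom receiver API and the AWS SDK for
Java. The SqsRecord type, the queue URL, and the polling parameters are
all illustrative assumptions, not from any existing library:

    import com.amazonaws.services.sqs.AmazonSQSClient
    import com.amazonaws.services.sqs.model.ReceiveMessageRequest
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.receiver.Receiver

    import scala.collection.JavaConversions._

    // Each record carries the S3 path from the message body plus the
    // receipt handle needed to delete the message later.
    case class SqsRecord(s3Path: String, receiptHandle: String)

    class SqsReceiver(queueUrl: String)
        extends Receiver[SqsRecord](StorageLevel.MEMORY_AND_DISK_SER) {

      def onStart(): Unit = {
        new Thread("SQS Receiver") {
          override def run(): Unit = poll()
        }.start()
      }

      def onStop(): Unit = { } // the polling loop checks isStopped()

      private def poll(): Unit = {
        // Client is created on the worker, not serialized with the receiver.
        val sqs = new AmazonSQSClient() // default credential chain
        val request = new ReceiveMessageRequest(queueUrl)
          .withMaxNumberOfMessages(10)
          .withWaitTimeSeconds(20) // long polling
        while (!isStopped()) {
          for (m <- sqs.receiveMessage(request).getMessages) {
            store(SqsRecord(m.getBody, m.getReceiptHandle))
          }
        }
      }
    }

You would then plug it in with something like
val records = ssc.receiverStream(new SqsReceiver(queueUrl))
where ssc is your StreamingContext.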

In your approach above, you can achieve #4 in the following way:
When you pass a function to foreachRDD to be applied to each RDD of the
DStream, you can carry the information from the SQS message (like the
receipt handle needed for deleting the message) along with that particular
file.
After success or failure in processing, you can then delete your SQS
message accordingly.
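
As a rough sketch of that idea (reusing the hypothetical SqsRecord type
from above; processFile is just a stand-in for whatever processing you do
on each S3 file):

    import com.amazonaws.services.sqs.AmazonSQSClient
    import org.apache.spark.streaming.dstream.DStream

    def handleRecords(records: DStream[SqsRecord],
                      queueUrl: String,
                      processFile: String => Unit): Unit = {
      records.foreachRDD { rdd =>
        rdd.foreachPartition { partition =>
          // Create the client on the worker, once per partition,
          // since it is not serializable.
          val sqs = new AmazonSQSClient()
          for (record <- partition) {
            try {
              processFile(record.s3Path) // your processing logic
              // Delete only after successful processing.
              sqs.deleteMessage(queueUrl, record.receiptHandle)
            } catch {
              case e: Exception =>
                // Leave the message on the queue; SQS makes it visible
                // again after the visibility timeout expires.
            }
          }
        }
      }
    }

Since the message is deleted only after processing succeeds, a crash
before the delete means SQS redelivers the message once its visibility
timeout expires, which is what gives you the at-least-once behaviour.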


Thanks
--Lalit



-----
Lalit Yadav
la...@sigmoidanalytics.com