Hi,

I would like to better understand the reliability aspects of custom
receivers, in the context of the code in spark-streaming-jms
(https://github.com/mattf/spark-streaming-jms).

Based on the documentation at
http://spark.apache.org/docs/latest/streaming-custom-receivers.html#receiver-reliability,
I understand that when the store API is called with multiple records, the
records are stored reliably because the call blocks until they have been
stored. On the other hand, when the store API is called with a single
record, it is not reliable, because the call returns to the caller before
the record has actually been stored.

Given that, I have a few questions:

1. Which store APIs take multiple records? Are they the ones that have
scala.collection.mutable.ArrayBuffer<T>, scala.collection.Iterator<T>, or
java.util.Iterator<T> in their parameter signature? (See
http://spark.apache.org/docs/latest/api/java/org/apache/spark/streaming/receiver/Receiver.html)

2. Is there sample code that shows how to build up multiple records like
that and pass them to the appropriate store API?

3. Taking spark-streaming-jms as an example, the onMessage method of the
JMSReceiver class calls the store API with a single JMSEvent. Does that mean
this code does not guarantee reliable storage of the received message, even
if the storage level is set to MEMORY_AND_DISK_SER_2?
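
To make question 2 concrete, here is a rough Java sketch of the kind of
batching I have in mind. BatchingBuffer is a hypothetical helper of my own,
not something from Spark or spark-streaming-jms. The sink parameter is only
a stand-in for a call such as Receiver.store(java.util.Iterator<T>), which I
am assuming (from my reading of the documentation) is one of the blocking
variants.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: collects single records and hands them to a sink
// in batches. Not part of Spark or spark-streaming-jms.
final class BatchingBuffer<T> {
    private final int batchSize;
    // Stand-in for a multi-record store call, e.g. it -> store(it)
    // inside a Receiver<T> (assumption, see the note above).
    private final Consumer<Iterator<T>> sink;
    private List<T> pending = new ArrayList<>();

    BatchingBuffer(int batchSize, Consumer<Iterator<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Called once per incoming message, e.g. from onMessage.
    synchronized void add(T record) {
        pending.add(record);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Hands all pending records to the sink in a single call.
    // Does nothing when there is nothing pending.
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<T> batch = pending;
        pending = new ArrayList<>();
        sink.accept(batch.iterator());
    }
}
```

Inside the receiver I would expect to create it with something like
new BatchingBuffer<>(100, it -> store(it)), call add from onMessage, and
call flush periodically so that a partial batch does not wait forever. The
batch size of 100 is arbitrary. Is this roughly the intended usage?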

Regards,
Sourav
