Re: question on Write Ahead Log (Spark Streaming)

2017-03-10 Thread Dibyendu Bhattacharya
Hi, you could also use this Receiver: https://github.com/dibbhatt/kafka-spark-consumer It is also part of spark-packages: https://spark-packages.org/package/dibbhatt/kafka-spark-consumer You do not need to enable the WAL with this receiver, and it can still recover from a driver failure with no data loss. You can

Re: question on Write Ahead Log (Spark Streaming)

2017-03-08 Thread Saisai Shao
IIUC, your scenario is quite like what ReliableKafkaReceiver currently does. You can only send an ack to the upstream source after the WAL write has been persisted; otherwise, because data receiving and data processing are asynchronous, there is still a chance data could be lost if you send the ack before the WAL write completes.
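The ordering constraint described above can be sketched in plain Python (not Spark; `WriteAheadLog`, `ReplayableSource`, and `receive_reliably` are hypothetical stand-ins for the real receiver machinery). The point is that persisting to the WAL before acking turns a crash between the two steps into a harmless re-delivery rather than data loss:

```python
class WriteAheadLog:
    """Toy durable log: records here survive a simulated crash."""
    def __init__(self):
        self.records = []

    def write(self, record):
        # Stand-in for a synchronous, fsynced write to stable storage.
        self.records.append(record)


class ReplayableSource:
    """Toy upstream source that re-delivers anything not yet acked."""
    def __init__(self, records):
        self.pending = list(records)

    def poll(self):
        return self.pending[0] if self.pending else None

    def ack(self, record):
        self.pending.remove(record)


def receive_reliably(source, wal):
    """Persist first, ack second.

    If the process crashes between wal.write() and source.ack(), the
    record is both durable in the WAL and still pending upstream, so
    the worst case is a duplicate, never a lost record. Acking before
    the WAL write would invert that: a crash loses the record.
    """
    record = source.poll()
    if record is None:
        return None
    wal.write(record)   # step 1: make the record durable
    source.ack(record)  # step 2: only now let upstream drop it
    return record
```

This is only a model of the protocol, not of ReliableKafkaReceiver itself, but the write-then-ack ordering is the same invariant the message describes.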

question on Write Ahead Log (Spark Streaming)

2017-03-08 Thread kant kodali
Hi All, I am using a Receiver-based approach, and I understand that the Spark Streaming APIs will convert the data received from the receiver into blocks, and that these in-memory blocks are also stored in the WAL if one enables it. My upstream source, which is not Kafka, can also replay, by which I mean if
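For reference, enabling the receiver write-ahead log mentioned above is a single configuration flag; the WAL also requires checkpointing to be set up, since the log files live under the checkpoint directory. A minimal sketch (the checkpoint path is a placeholder):

```
# spark-defaults.conf (or the equivalent SparkConf setting)
spark.streaming.receiver.writeAheadLog.enable  true

# In the application, a checkpoint directory must also be configured,
# e.g. ssc.checkpoint("hdfs:///path/to/checkpoints") on the
# StreamingContext, as the WAL is written beneath it.
```

With the flag on, received blocks are written to the WAL on a fault-tolerant filesystem before being acknowledged, which is what allows recovery after a driver failure.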