Spark streaming data loss due to timeout in writing BlockAdditionEvent to WAL by the driver

Arijit Mon, 14 Nov 2016 10:26:06 -0800

Hi,


We are seeing another case of data loss when the following exception occurs. 
Although it is only logged as a WARN, it caused 2095 events to be dropped 
from processing.


16/10/26 19:24:08 WARN ReceivedBlockTracker: Exception thrown while writing record: BlockAdditionEvent(ReceivedBlockInfo(12,Some(2095),None,WriteAheadLogBasedStoreResult(input-12-1477508431881,Some(2095),FileBasedWriteAheadLogSegment(hdfs://mycluster/commerce/streamingContextCheckpointDir/receivedData/12/log-1477509840005-1477509900005,0,2097551)))) to the WriteAheadLog.
java.util.concurrent.TimeoutException: Futures timed out after [5000 milliseconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.streaming.util.BatchedWriteAheadLog.write(BatchedWriteAheadLog.scala:81)
        at org.apache.spark.streaming.scheduler.ReceivedBlockTracker.writeToLog(ReceivedBlockTracker.scala:232)
        at org.apache.spark.streaming.scheduler.ReceivedBlockTracker.addBlock(ReceivedBlockTracker.scala:87)
        at org.apache.spark.streaming.scheduler.ReceiverTracker.org$apache$spark$streaming$scheduler$ReceiverTracker$$addBlock(ReceiverTracker.scala:321)
        at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$receiveAndReply$1$$anon$1$$anonfun$run$1.apply$mcV$sp(ReceiverTracker.scala:500)
        at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1229)
        at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$receiveAndReply$1$$anon$1.run(ReceiverTracker.scala:498)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

We tried increasing the timeout to 60 seconds, but that did not eliminate the 
issue completely. Any suggestions on how to stop this data loss would be 
appreciated.
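
For reference, a minimal sketch of how such a timeout is typically raised. The 
message does not name the setting, so this assumes it is 
spark.streaming.driver.writeAheadLog.batchingTimeout, whose 5000 ms default 
matches the "Futures timed out after [5000 milliseconds]" line in the log. The 
object name, method name, and application name below are hypothetical, and the 
snippet needs spark-core on the classpath.

```scala
import org.apache.spark.SparkConf

// Hypothetical helper: builds a SparkConf with a longer driver-side WAL write timeout.
object WalTimeoutConf {
  def build(): SparkConf =
    new SparkConf()
      .setAppName("wal-timeout-example") // placeholder name, not from the report
      // Assumed to be the timeout behind the TimeoutException above.
      // The value is in milliseconds, so 60000 corresponds to 60 seconds.
      .set("spark.streaming.driver.writeAheadLog.batchingTimeout", "60000")
}
```

The same key can also be passed on the command line as 
--conf spark.streaming.driver.writeAheadLog.batchingTimeout=60000. As reported 
above, a longer timeout only reduced the drops and did not eliminate them.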

Thanks, Arijit
