Hello, Users.

While using write-ahead logs in Spark Streaming, I got a
NullPointerException at FileBasedWriteAheadLogRandomReader.scala:48 [1].

[1]: 
https://github.com/apache/spark/blob/v2.4.4/streaming/src/main/scala/org/apache/spark/streaming/util/FileBasedWriteAheadLogRandomReader.scala#L48
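
For what it's worth, my reading of the linked source (an assumption on my
part, not a confirmed diagnosis) is that the reader caches an input stream
that can be null when the underlying log file has been deleted, and close()
dereferences it unconditionally. A minimal self-contained repro of that
pattern (RandomReaderLike and Repro are hypothetical names, not Spark code):

---- Sketch (Scala)
import java.io.{Closeable, InputStream}

// Stand-in for FileBasedWriteAheadLogRandomReader: the stream may be null
// if the log file disappeared between listing and opening (assumption).
class RandomReaderLike(openStream: () => InputStream) extends Closeable {
  private val instream: InputStream = openStream() // null <=> file is gone
  private var closed = (instream == null)          // guarded at construction...

  override def close(): Unit = synchronized {
    closed = true
    instream.close() // ...but not here: NPEs when instream is null
  }
}

object Repro extends App {
  val reader = new RandomReaderLike(() => null) // simulate a deleted log file
  reader.close()                                // java.lang.NullPointerException
}
----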

---- Full stack trace
Caused by: org.apache.spark.SparkException: Could not read data from write ahead log record FileBasedWriteAheadLogSegment(hdfs://.../receivedData/0/log-...,...)
        at org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD.org$apache$spark$streaming$rdd$WriteAheadLogBackedBlockRDD$$getBlockFromWriteAheadLog$1(WriteAheadLogBackedBlockRDD.scala:145)
        at org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD$$anonfun$compute$1.apply(WriteAheadLogBackedBlockRDD.scala:173)
        at org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD$$anonfun$compute$1.apply(WriteAheadLogBackedBlockRDD.scala:173)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD.compute(WriteAheadLogBackedBlockRDD.scala:173)
        //...
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        ... 3 more
Caused by: java.lang.NullPointerException
        at org.apache.spark.streaming.util.FileBasedWriteAheadLogRandomReader.close(FileBasedWriteAheadLogRandomReader.scala:48)
        at org.apache.spark.streaming.util.FileBasedWriteAheadLog.read(FileBasedWriteAheadLog.scala:122)
        at org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD.org$apache$spark$streaming$rdd$WriteAheadLogBackedBlockRDD$$getBlockFromWriteAheadLog$1(WriteAheadLogBackedBlockRDD.scala:142)
        ... 50 more
----

- Spark version: 2.4.4
- Hadoop version: 2.7.1
- Spark conf (a minimal setup sketch follows):
    - "spark.streaming.receiver.writeAheadLog.enable" -> "true"

Did I do something wrong?

Best regards,
Minwoo Kang
