Hi

With Spark Streaming 1.3.0, when I use the WAL (write-ahead log) and
checkpoints, I sometimes hit FileNotFound exceptions.

Here's the complete stacktrace:
https://gist.github.com/akhld/126b945f7fef408a525e

The application reads data from Kafka and runs a simple word count over it.
The batch duration is 1 second and the processing delay is around 3-6
seconds. (Standalone 2-node cluster, each node with 15 GB of memory and 4 cores.)
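
For reference, the job has roughly the following shape. This is a sketch, not the exact code: the ZooKeeper address, consumer group, topic name, checkpoint path and receiver storage level are placeholders chosen for illustration.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaWordCount {
  // Placeholder values (assumptions, not taken from the real job).
  val checkpointDir = "hdfs:///tmp/streaming-checkpoint"
  val zkQuorum      = "localhost:2181"
  val groupId       = "wordcount-group"
  val topics        = Map("words" -> 1) // topic name -> receiver threads

  def createContext(): StreamingContext = {
    val conf = new SparkConf()
      .setAppName("KafkaWordCount")
      // Turn on the receiver write-ahead log.
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")

    // 1-second batches, as described above.
    val ssc = new StreamingContext(conf, Seconds(1))
    // The WAL is written under the checkpoint directory.
    ssc.checkpoint(checkpointDir)

    // Receiver-based Kafka stream of (key, message); keep only the message.
    // The storage level here is an assumption.
    val lines = KafkaUtils
      .createStream(ssc, zkQuorum, groupId, topics, StorageLevel.MEMORY_AND_DISK_SER)
      .map(_._2)

    // Per-batch word count.
    val counts = lines
      .flatMap(_.split(" "))
      .map(word => (word, 1L))
      .reduceByKey(_ + _)
    counts.print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Recover from the checkpoint if one exists, otherwise build a new context.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```

For the runs without WAL and checkpoints, the same sketch applies with the writeAheadLog setting and the checkpoint/getOrCreate calls removed, and the storage level swapped for the one being tested.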


Without WAL and checkpoints, using only MEMORY_ONLY as the StorageLevel,
the exception is BlockNotFound instead of FileNotFound. It occurs less
often with MEMORY_ONLY_2 as the StorageLevel. With MEMORY_AND_DISK,
performance is really awful and /tmp fills up
with spark-d2ad4262-0f6f-409d-b51f-a0a871cbf64f files.

Any thoughts on this are welcome.


Thanks
Best Regards
