Hello all, I am running a Spark application which schedules multiple Spark jobs. Something like:
    val df = sqlContext.read.parquet("/path/to/file")
    filterExpressions.par.foreach { expression =>
      df.filter(expression).count()
    }

When the block manager fails to fetch a block, it throws an exception which eventually kills the application: http://pastebin.com/2ggwv68P

This code works when I run it on a single thread:

    filterExpressions.foreach { expression =>
      df.filter(expression).count()
    }

But I really need the jobs to run in parallel. Is there any way around this? It looks like a bug in the BlockManager's doGetRemote function. I have tried the HTTP Block Manager as well.
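For context, here is a minimal self-contained sketch of the setup (the app name, parquet path, and filter expressions below are placeholders, not my real ones; my actual list of expressions is much larger):

    // Minimal sketch of the setup; Spark 1.x with SQLContext.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    object ParallelCounts {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("parallel-counts"))
        val sqlContext = new SQLContext(sc)

        val df = sqlContext.read.parquet("/path/to/file")

        // Placeholder filter expressions for illustration only.
        val filterExpressions = Seq("colA > 10", "colB = 'x'", "colC IS NOT NULL")

        // Each count() triggers its own Spark job; .par submits them
        // concurrently from multiple driver threads.
        filterExpressions.par.foreach { expression =>
          df.filter(expression).count()
        }
      }
    }

The only difference between the failing and working runs is the .par on the collection, i.e. whether the jobs are submitted from multiple driver threads or one.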