How can we catch exceptions that are thrown from custom RDDs or custom map functions?
We have a custom RDD that throws an exception we would like to catch, but the exception that comes back to the caller is an *org.apache.spark.SparkException* that contains no useful information about the original exception. The detail message is a string representation of the original stack trace, but it's hard to do anything useful with that.

Below is a small class that exhibits the issue. It uses a map function instead of a custom RDD, but the symptom is the same: the original *RuntimeException* is lost. I tested this with Spark 1.2.1 and 1.3.0.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkErrorExample {
    public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf().setAppName("SparkExample").setMaster("local[*]");
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);
        JavaRDD<String> data = ctx.parallelize(Arrays.asList("1", "2", "3"));

        try {
            // The RuntimeException is thrown inside the task, on an executor,
            // and never reaches this catch block as-is.
            data.map(line -> {
                throw new RuntimeException();
            }).count();
        } catch (Exception ex) {
            // ex is a SparkException; getCause() is null, and the original
            // RuntimeException is only visible as text inside getMessage().
            System.out.println("Exception class: " + ex.getClass());
            System.out.println("Exception message: " + ex.getMessage());
            System.out.println("Exception cause: " + ex.getCause());
        }
    }
}
```
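To illustrate why the string-only detail message is hard to work with: the closest thing to a workaround we can see is matching on the exception class name embedded in the message, which is clearly brittle. A minimal sketch of that idea follows (the `contains` check is an assumption about how the remote stack trace happens to be rendered into the message, not something we want to rely on):

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkErrorWorkaround {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("SparkExample").setMaster("local[*]");
        JavaSparkContext ctx = new JavaSparkContext(conf);
        JavaRDD<String> data = ctx.parallelize(Arrays.asList("1", "2", "3"));

        try {
            data.map(line -> {
                throw new RuntimeException();
            }).count();
        } catch (Exception ex) {
            // The original exception object is gone; all that survives is its
            // stack trace rendered into the message string, so the only way to
            // distinguish failure types here is to match on that string.
            String msg = ex.getMessage();
            if (msg != null && msg.contains("java.lang.RuntimeException")) {
                System.out.println("Job apparently failed with a RuntimeException");
            }
        }
    }
}
```

Is there a supported way to get at the original exception (or at least its type) instead of scraping the message like this?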