Hi, in my Spark Streaming application, computations depend on users' input in terms of * user-defined functions * computation rules * etc. that can throw exceptions in various cases (think: exception in UDF, division by zero, invalid access by key etc.).
Now I am wondering about what is a good/reasonable way to deal with those errors. I think I want to continue processing (the whole stream processing pipeline should not die because of one single malformed item in the stream), i.e., catch the exception, but still I need a way to tell the user something went wrong. So how can I get the information that something went wrong back to the driver and what is a reasonable way to do that? While writing this, something like the following came into my mind: val errors = sc.accumulator(...) // of type List[Throwable] dstream.map(item => { Try { someUdf(item) } match { case Success(value) => value case Failure(err) => errors += err // remember error 0 // default value } }) Does this make sense? Thanks Tobias