We already have SparkException, indeed. The ID is an interesting idea;
simple to implement and might help disambiguate.
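
For reference, a minimal sketch of what the ID part could look like
(purely illustrative; not what's in the codebase today):

    import java.util.concurrent.atomic.AtomicLong

    object SparkException {
      // Monotonically increasing, process-local ID source.
      private val nextId = new AtomicLong(0L)
    }

    class SparkException(message: String, cause: Throwable = null)
        extends Exception(message, cause) {
      // Unique ID for this instance, appended to the message so that
      // repeated log entries can be correlated to a single failure.
      val id: Long = SparkException.nextId.incrementAndGet()

      override def getMessage: String =
        s"${super.getMessage} (exception id: $id)"
    }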

Does it solve a lot of problems of this form? If something is
squelching Exception or SparkException, the result will be the same. #2
is something we can sniff out with static analysis pretty easily, but
not so much #1. Ideally we'd just fix blocks like this, but I bet there
are lots of them.
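
To be concrete, #1 and #2 look roughly like this (contrived Scala, but
representative):

    def cleanup(): Unit = throw new IllegalStateException("disk full")

    // #1: caught with no logging and no rethrow; the failure vanishes
    try cleanup() catch {
      case e: Exception => // swallowed
    }

    // #2: rethrown, but the new exception drops the cause, so the
    // original stacktrace is masked
    try cleanup() catch {
      case e: Exception => throw new RuntimeException("cleanup failed")
    }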

I like the idea, but for a different reason: it's probably best to
control the exceptions that propagate from the public API, since in
some cases they're a meaningful part of the API (see
https://issues.apache.org/jira/browse/SPARK-8393, which I'm hoping to
fix now).

And the catch there is -- throwing checked exceptions from Scala code
in a way that Java code can catch requires annotating lots of methods.
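
That is, something like the following on every public method whose
exceptions are part of the contract (class and method names here are
made up). Without @throws, the throws clause never makes it into the
bytecode, and javac rejects a Java-side catch (SparkException e) as
unreachable:

    import org.apache.spark.SparkException

    class Lookups {
      // @throws is what puts the checked exception into the class
      // file's method signature for Java callers.
      @throws[SparkException]("if the key cannot be resolved")
      def lookup(key: String): String =
        throw new SparkException(s"no such key: $key")
    }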

On Mon, Apr 18, 2016 at 8:16 PM, Reynold Xin <r...@databricks.com> wrote:
> Josh's pull request on rpc exception handling got me to think ...
>
> In my experience, there have been a few exception-related things that
> created a lot of trouble for us in production debugging:
>
> 1. Some exception is thrown, but is caught by some try/catch that does
> not do any logging or rethrow.
> 2. Some exception is thrown, but is caught by some try/catch that does
> not do any logging, but does rethrow. However, the original exception
> is now masked.
> 3. Multiple exceptions are logged at different places close to each
> other, but we don't know whether they are caused by the same problem or
> not.
>
>
> To mitigate some of the above, here's an idea ...
>
> (1) Create a common root class (e.g. call it SparkException) for all
> the exceptions used in Spark. We should make sure that every time we
> catch an exception from a 3rd-party library, we rethrow it as a
> SparkException (a lot of places already do that). In SparkException's
> constructor, log the exception and the stacktrace.
>
> (2) SparkException has a monotonically increasing ID, and this ID appears in
> the exception error message (say at the end).
>
>
> I think (1) will eliminate most of the cases in which an exception gets
> swallowed. The main downside I can think of is that we might log an
> exception multiple times. However, I'd argue exceptions should be rare,
> and it is not that big of a deal to log them two or three times. The
> unique ID (2) can help us correlate exceptions if they appear multiple
> times.
>
> Thoughts?
