Hi Ruijing,

Spark uses SerializationDebugger (
https://spark.apache.org/docs/latest/api/java/org/apache/spark/serializer/SerializationDebugger.html)
as the default debugger to detect serialization issues. You can get more
detailed serialization exception information by setting the following while
creating a cluster:

spark.driver.extraJavaOptions -Dsun.io.serialization.extendedDebugInfo=true
spark.executor.extraJavaOptions -Dsun.io.serialization.extendedDebugInfo=true
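
If you submit jobs with spark-submit rather than through a cluster
configuration UI, the same flags can be passed on the command line (the
class and jar names below are placeholders):

  spark-submit \
    --conf "spark.driver.extraJavaOptions=-Dsun.io.serialization.extendedDebugInfo=true" \
    --conf "spark.executor.extraJavaOptions=-Dsun.io.serialization.extendedDebugInfo=true" \
    --class com.example.MyApp my-app.jar

For context, here is a minimal sketch of the kind of code that typically
triggers these exceptions (ConnectionPool is a made-up, non-serializable
helper, not a real API):

  import org.apache.spark.sql.SparkSession

  // Hypothetical helper that does not implement java.io.Serializable
  class ConnectionPool(url: String)

  object ClosureExample {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder().appName("closure-example").getOrCreate()
      val pool = new ConnectionPool("jdbc:postgresql://example/db")

      // Referencing `pool` inside the closure forces Spark to serialize it,
      // so the job fails with "Task not serializable". With
      // extendedDebugInfo=true, the JVM appends the chain of fields and
      // objects that pulled the non-serializable object into the closure.
      spark.sparkContext.parallelize(1 to 10).foreach { i =>
        println(pool)
      }

      spark.stop()
    }
  }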

Maxim Gekk

Software Engineer

Databricks, Inc.


On Tue, Feb 18, 2020 at 1:02 PM Ruijing Li <liruijin...@gmail.com> wrote:

> Hi all,
>
> When working with Spark jobs, I sometimes have to tackle
> serialization issues, and I have a difficult time trying to fix them. A
> lot of times, the serialization issues happen only in cluster mode across
> the network in a Mesos container, so I can’t debug locally. And the
> exception thrown by Spark is not very helpful for finding the cause.
>
> I’d love to hear some tips on how to debug in the right places. Also, I’d
> be interested to know if in future releases it would be possible to point
> out which class or function is causing the serialization issue (right now I
> find it’s either Java generic classes or the class Spark is running itself).
> Thanks!
> --
> Cheers,
> Ruijing Li
>
