I have both SPARK-2878 and SPARK-2893.
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/SPARK-2878-Kryo-serialisation-with-custom-Kryo-registrator-failing-tp7719p8046.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com
)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
@rxin With the fixes, I could run it fine on top of branch-1.0
On master when running using YARN I am getting another KryoException:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due
to stage failure: Task 247 in stage 52.0 failed 4 times, most recent
failure: Lost task
I am still a bit confused about why this issue did not show up in 0.9. At
that time there was no spark-submit, and the context was constructed with
low-level calls.
The Kryo registration for ALS was always in my application code.
Was this bug introduced in 1.0, or was it always there?
On Aug 14, 2014
Hi Deb,
The only alternative serialiser is the JavaSerializer (the default).
Theoretically Spark supports custom serialisers, but due to a related
issue, custom serialisers currently can't live in application jars and must
be available to all executors at launch. My PR fixes this issue as well,
Graham,
Thanks for working on this. This is an important bug to fix.
I don't have the whole context and obviously I haven't spent nearly as much
time on this as you have, but I'm wondering: what if we always pass the
executor's ClassLoader to the Kryo serializer? Will that solve this problem?
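
To illustrate the suggestion above, here is a minimal sketch in plain Java (no Spark or Kryo involved) of resolving a class by name through an explicitly supplied ClassLoader instead of whichever loader happened to load the calling code. The class LoaderSketch, the helper canResolve, and the name com.example.MissingRegistrator are hypothetical and exist only for this illustration; this is not how Spark itself is implemented.

```java
import java.net.URL;
import java.net.URLClassLoader;

class LoaderSketch {
    // Hypothetical helper: look a class up by name through an explicit loader.
    // Returns false if that loader (and its parents) cannot find the class.
    static boolean canResolve(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a loader that would know about application jars.
        // No jar URLs are added here, so it only delegates to its parent.
        ClassLoader appLoader =
            new URLClassLoader(new URL[0], LoaderSketch.class.getClassLoader());

        // A JDK class is visible through any loader via parent delegation.
        System.out.println(canResolve("java.util.ArrayList", appLoader));
        // A class that is on no classpath cannot be resolved.
        System.out.println(canResolve("com.example.MissingRegistrator", appLoader));
    }
}
```

The point of the sketch is only that the outcome of a by-name lookup depends on which loader is consulted, which is why the choice of ClassLoader handed to the serializer matters for classes that live in application jars.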
Hi Reynold,
That would solve this specific issue, but you'd need to be careful that you
never created a serialiser instance before the first task is received.
Currently in Executor.TaskRunner.run a closure serialiser instance is
created before any application jars are downloaded, but that could
In part, my assertion was based on a comment by sryza on my PR (
https://github.com/apache/spark/pull/1890#issuecomment-51805750), however I
thought I had also seen it in the YARN code base. However, now that I look
for it, I can't find where this happens, so perhaps I was imagining the
YARN
Graham,
SparkEnv only creates a KryoSerializer, but as I understand it, that serializer
doesn't actually initialize the registrator, since the registrator is only
invoked from newKryo(), which is called when a KryoSerializerInstance is initialized.
Basically I'm thinking a quick fix for 1.2:
1. Add a classLoader field
I now have a complete pull request for this issue that I'd like to get
reviewed and committed. The PR is available here:
https://github.com/apache/spark/pull/1890 and includes a testcase for the
issue I described. I've also submitted a related PR (
https://github.com/apache/spark/pull/1827) that
I've submitted a work-in-progress pull request for this issue that I'd like
feedback on. See https://github.com/apache/spark/pull/1890 . I've also
submitted a pull request for the related issue that exceptions hit when
trying to use a custom Kryo registrator are being swallowed:
Hi Spark devs,
I’ve posted an issue on JIRA (
https://issues.apache.org/jira/browse/SPARK-2878) which occurs when using
Kryo serialisation with a custom Kryo registrator to register custom
classes with Kryo. This is an insidious issue that non-deterministically
causes Kryo to have different ID
I don't think it was a conscious design decision to not include the
application classes in the connection manager serializer. We should fix
that. Where is it deserializing data in that thread?
4 might make sense in the long run, but it adds a lot of complexity to the
code base (whole separate
See my comment on https://issues.apache.org/jira/browse/SPARK-2878 for the
full stacktrace, but it's in the BlockManager/BlockManagerWorker where it's
trying to fulfil a getBlock request for another node. The objects that
would be in the block haven't yet been serialised, and that then causes the