You'll need to either turn off required registration
(spark.kryo.registrationRequired) or register the missing classes via a
custom registrator (spark.kryo.registrator); rough sketches of both below.
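
The quickest unblock is to stop requiring registration. A minimal conf
sketch, assuming you set your Spark conf programmatically:

    val conf = new org.apache.spark.SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrationRequired", "false") // back to the default

Kryo then falls back to writing the full class name for each unregistered
class, which costs some space per record but works.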

http://spark.apache.org/docs/latest/configuration.html#compression-and-serialization
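
If you'd rather keep registrationRequired=true, a registrator along these
lines should cover the class in your trace (MyRegistrator and com.example
are placeholder names, and you will likely hit further internal classes
that need registering the same way):

    package com.example

    import com.esotericsoftware.kryo.Kryo
    import org.apache.spark.serializer.KryoRegistrator

    class MyRegistrator extends KryoRegistrator {
      override def registerClasses(kryo: Kryo): Unit = {
        // GenericRow is Spark-internal, so register it by name rather
        // than compiling against catalyst directly
        kryo.register(Class.forName(
          "org.apache.spark.sql.catalyst.expressions.GenericRow"))
      }
    }

then pass --conf spark.kryo.registrator=com.example.MyRegistrator to
spark-submit. On 1.2+ there is also spark.kryo.classesToRegister, which
takes a comma-separated list of class names and saves you the custom class.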

On Mon, Dec 14, 2015 at 2:17 AM, Linh M. Tran <linh.mtran...@gmail.com>
wrote:

> Hi everyone,
> I'm using HiveContext and Spark SQL to query a Hive table and doing a
> join operation on it.
> After changing the default serializer to Kryo with
> spark.kryo.registrationRequired = true, the Spark application failed with
> the following error:
>
> java.lang.IllegalArgumentException: Class is not registered: org.apache.spark.sql.catalyst.expressions.GenericRow
> Note: To register this class use: kryo.register(org.apache.spark.sql.catalyst.expressions.GenericRow.class);
>         at com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:442)
>         at com.esotericsoftware.kryo.util.DefaultClassResolver.writeClass(DefaultClassResolver.java:79)
>         at com.esotericsoftware.kryo.Kryo.writeClass(Kryo.java:472)
>         at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:565)
>         at com.twitter.chill.Tuple2Serializer.write(TupleSerializers.scala:36)
>         at com.twitter.chill.Tuple2Serializer.write(TupleSerializers.scala:33)
>         at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:568)
>         at org.apache.spark.serializer.KryoSerializationStream.writeObject(KryoSerializer.scala:124)
>         at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:204)
>         at org.apache.spark.util.collection.ExternalSorter.spillToPartitionFiles(ExternalSorter.scala:375)
>         at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:211)
>         at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:64)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
>
> I'm using Spark 1.3.1 (HDP 2.3.0) and submitting the Spark application to
> YARN in cluster mode.
> Any help is appreciated.
> --
> Linh M. Tran
>
