You are only required to register classes with Kryo if you enable this setting:

// require registration of all classes with Kryo
.set("spark.kryo.registrationRequired","true")

Here's an example of my setup. I think this is the best approach because it forces me to really think about what I am serializing:

// the Kryo serializer wants every class that needs to be serialized registered
Class<?>[] kryoClassArray = new Class<?>[]{DropResult.class, DropEvaluation.class, PrintHetSharing.class};

SparkConf sparkConf = new SparkConf()
    .setAppName("MyAppName")
    .setMaster("spark://ipaddress:7077")
    // now for the Kryo stuff
    .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    // require registration of all classes with Kryo
    .set("spark.kryo.registrationRequired", "true")
    // don't forget to register ALL classes or you will get an error
    .registerKryoClasses(kryoClassArray);
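
On the spark.kryo.classesToRegister property from the original question: as far as I know, Spark reads it as a comma-separated list of fully qualified class names, and registerKryoClasses fills in that same property for you. Below is a minimal sketch in plain Java (no Spark dependency) of building that value from a class array. KryoClassList and toProperty are hypothetical names, not Spark API, and the two JDK classes only stand in for your own classes:

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Spark: builds the comma-separated
// value expected by the spark.kryo.classesToRegister property.
class KryoClassList {

    // Joins the fully qualified names of the given classes with commas.
    static String toProperty(Class<?>... classes) {
        return Arrays.stream(classes)
                .map(Class::getName)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Stand-ins for application classes such as DropResult.class.
        String value = toProperty(java.util.ArrayList.class, java.util.HashMap.class);
        // The result could then be passed to
        // sparkConf.set("spark.kryo.classesToRegister", value).
        System.out.println(value);
    }
}
```

With spark.kryo.registrationRequired set to true, the list has to be complete whichever way you supply it, since an unregistered class still produces an error.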




On 01/27/2016 12:58 PM, Shixiong(Ryan) Zhu wrote:
It depends. The default Kryo serializer cannot handle all cases. If you encounter any issue, you can follow the Kryo doc to set up a custom serializer: https://github.com/EsotericSoftware/kryo/blob/master/README.md

On Wed, Jan 27, 2016 at 3:13 AM, amit tewari <amittewar...@gmail.com> wrote:

    This is what I have added in my code:

    rdd.persist(StorageLevel.MEMORY_ONLY_SER())

    conf.set("spark.serializer","org.apache.spark.serializer.KryoSerializer");

    Do I compulsorily need to do anything via
    spark.kryo.classesToRegister?

    Or is the above code sufficient to achieve a performance gain using
    Kryo serialization?

    Thanks

    Amit
