Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19586 I tend to agree with @cloud-fan. I think you can implement your own serializer outside of Spark, specialized for your application; that will certainly be more efficient than the built-in one. But Spark's default solution should be general enough to cover all cases, and setting a flag or a configuration for this is not intuitive, in my understanding. Also, for ML, can you please provide an example of how this could be improved with your approach? From my understanding, your approach is most useful when leveraging a custom class definition, like `Person` in your example. But for ML/SQL cases, all the types should be predefined or primitives; will that improve things much?
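As a sketch of the "implement your own serializer outside of Spark" route suggested above, an application can plug a specialized Kryo serializer in through Spark's standard configuration (`spark.serializer` plus `spark.kryo.registrator`), with no change to Spark itself. The `Person` case class, `PersonSerializer`, and `MyRegistrator` below are hypothetical illustrations, not code from the PR; this is a configuration sketch that assumes Spark and Kryo on the classpath:

```scala
import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoRegistrator

// Hypothetical application class, echoing the `Person` example in the thread.
case class Person(name: String, age: Int)

// A serializer specialized for Person: it writes exactly two fields,
// bypassing Kryo's generic reflection-based field serialization.
class PersonSerializer extends Serializer[Person] {
  override def write(kryo: Kryo, output: Output, p: Person): Unit = {
    output.writeString(p.name)
    output.writeInt(p.age)
  }
  override def read(kryo: Kryo, input: Input, cls: Class[Person]): Person =
    Person(input.readString(), input.readInt())
}

// Registrator wired in via standard Spark configuration; MyRegistrator
// is a placeholder name for your own class.
class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit =
    kryo.register(classOf[Person], new PersonSerializer)
}

// Plain SparkConf settings; no Spark-internal flag is required.
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator", "MyRegistrator")
```

This illustrates the point in the comment: the specialization lives entirely in application code, so Spark's default path can stay general.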