Github user ConeyLiu commented on the issue:

    https://github.com/apache/spark/pull/19586
  
    Hi @cloud-fan, in most cases the elements all have the same data type, so 
I think this optimization is valuable: it can save considerable space and CPU 
resources. What about setting a flag on the RDD that indicates whether all of 
its elements have the same type? If that is not workable, could we put it in 
the ml package as a special serializer that users could configure? In that 
case, though, the exact ClassTag of the RDD must be provided for serialization, 
because of the relocation of serialized objects in UnsafeShuffleWriter.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
