[GitHub] spark issue #21157: [SPARK-22674][PYTHON] Removed the namedtuple pickling pa...

HyukjinKwon Thu, 26 Apr 2018 02:59:23 -0700

Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21157
  
    I don't like the hack either, but removing it completely basically means 
we are going to drop namedtuple support in RDDs without, for example, any 
deprecation warning. Spark is very conservative about this kind of change, and 
this is going to break compatibility. So I was thinking we could do this in 
Spark 3.0. We have already started to talk about it.
    
    We should probably discuss this on the mailing list since it's a breaking 
change. One thing that is clear is that the complete removal should target 
3.0.0.
    
    For now, yes, `cloudpickle` solves it, but it's probably less performant.
    
    Logically, `cloudpickle` [fixed 
it](https://github.com/cloudpipe/cloudpickle/pull/113) (our copy of cloudpickle 
matches 
[0.4.3](https://github.com/cloudpipe/cloudpickle/releases/tag/v0.4.3)), so we 
could follow the same fix on the normal pickle side, couldn't we?
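
To make the trade-off concrete, here is a minimal sketch of why namedtuples are awkward for the normal pickle side. It uses only the standard library. The helpers `dumps_by_value` and `loads_by_value` are hypothetical names made up for illustration; this is neither the hack in PySpark nor the `cloudpickle` fix, just an assumption-level picture of "serialize the class description along with the values".

```python
import pickle
from collections import namedtuple


def dumps_by_value(obj):
    """Hypothetical helper: serialize a namedtuple instance by value."""
    cls = type(obj)
    # Ship the class description (name and fields) together with the values.
    return pickle.dumps((cls.__name__, cls._fields, tuple(obj)))


def loads_by_value(data):
    """Hypothetical helper: rebuild the class, then the instance."""
    typename, fields, values = pickle.loads(data)
    return namedtuple(typename, fields)(*values)


# The class is named "Point" but bound to the variable `Row`, so plain pickle
# cannot find "Point" by name in the defining module.
Row = namedtuple("Point", ["x", "y"])
p = Row(1, 2)

try:
    pickle.dumps(p)  # plain pickle stores only a reference to the class
    by_reference_ok = True
except (pickle.PicklingError, AttributeError):
    by_reference_ok = False

# Round trip through the by-value helpers.
restored = loads_by_value(dumps_by_value(p))
```

Plain pickle records only the module and name of the class, so it depends on the receiving side being able to import that name. The by-value sketch avoids that dependency, but it rebuilds the class on every load, which hints at where the extra cost of a by-value approach comes from.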



