[ https://issues.apache.org/jira/browse/SPARK-36337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17410300#comment-17410300 ]
Yikun Jiang edited comment on SPARK-36337 at 9/6/21, 2:36 AM:
--------------------------------------------------------------

Good news: the double NaN pickling issue was solved in https://github.com/irmen/pickle/issues/7

Bad news: Spark [is using Pyrolite|https://github.com/apache/spark/blob/1ccb06ca8cf439e0c13ffbfb50365402e7d1330d/core/pom.xml#L418], which includes the following packages:
- net.razorvine.pickle (the only one Spark currently uses; [~hyukjin.kwon], could you double-check?)
- net.razorvine.pyro

Pyrolite has also been split into two separate repos: https://github.com/irmen/Pyrolite/tree/master#where-is-pickle

So it looks like we have 2 choices:
1. Wait for the [NaN fix to be backported to pyrolite|https://github.com/irmen/pickle/issues/7#issuecomment-913293219] and bump the version to 4.32, or
2. Change the dependency from irmen/pyrolite to irmen/pickle.

> decimal('Nan') is unsupported in net.razorvine.pickle
> ------------------------------------------------------
>
>                 Key: SPARK-36337
>                 URL: https://issues.apache.org/jira/browse/SPARK-36337
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 3.2.0
>            Reporter: Yikun Jiang
>            Priority: Major
>
> Decimal('NaN') is currently not supported by net.razorvine.pickle.
> In Python
> {code:java}
> >>> import decimal, pickle, cloudpickle
> >>> pickled = cloudpickle.dumps(decimal.Decimal('NaN'))
> >>> pickled
> b'\x80\x05\x95!\x00\x00\x00\x00\x00\x00\x00\x8c\x07decimal\x94\x8c\x07Decimal\x94\x93\x94\x8c\x03NaN\x94\x85\x94R\x94.'
> >>> pickle.loads(pickled)
> Decimal('NaN')
> {code}
> In Scala
> {code:java}
> scala> import net.razorvine.pickle.{Pickler, Unpickler, PickleUtils}
> scala> val unpickle = new Unpickler
> scala> unpickle.loads(PickleUtils.str2bytes("\u0080\u0005\u0095!\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u008c\u0007decimal\u0094\u008c\u0007Decimal\u0094\u0093\u0094\u008c\u0003NaN\u0094\u0085\u0094R\u0094."))
> net.razorvine.pickle.PickleException: problem construction object: java.lang.reflect.InvocationTargetException
>   at net.razorvine.pickle.objects.AnyClassConstructor.construct(AnyClassConstructor.java:29)
>   at net.razorvine.pickle.Unpickler.load_reduce(Unpickler.java:773)
>   at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:213)
>   at net.razorvine.pickle.Unpickler.load(Unpickler.java:123)
>   at net.razorvine.pickle.Unpickler.loads(Unpickler.java:136)
>   ... 48 elided
> {code}
> I submitted an issue upstream in pickle: [https://github.com/irmen/pickle/issues/7].
> We should bump pickle to the latest version once it is fixed.
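
For reference, the stack trace above fails in load_reduce, i.e. while the unpickler handles the REDUCE opcode. A minimal sketch for inspecting what that opcode is applied to, using only the Python standard library. It assumes that stdlib pickle at protocol 5 emits the same kind of stream as the cloudpickle call in the report (the reported bytes start with b'\x80\x05'); the names `ops`, `opcode_names` and `string_args` are illustrative only.

```python
import decimal
import pickle
import pickletools

# Assumption: protocol 5, matching the b'\x80\x05' header of the reported bytes.
pickled = pickle.dumps(decimal.Decimal("NaN"), protocol=5)

# Walk the pickle stream and collect (opcode name, argument) pairs.
ops = [(op.name, arg) for op, arg, _pos in pickletools.genops(pickled)]
opcode_names = [name for name, _arg in ops]

# String arguments pushed onto the stack: the global's module and name,
# followed by the single constructor argument.
string_args = [arg for _name, arg in ops if isinstance(arg, str)]

# CPython rebuilds the value by calling the global with that argument tuple.
restored = pickle.loads(pickled)

if __name__ == "__main__":
    for name, arg in ops:
        print(name, "" if arg is None else repr(arg))
    print("restored is NaN:", restored.is_nan())
```

The listing should show the string 'NaN' being passed as the only argument to the Decimal global via REDUCE, so any unpickler on the JVM side has to construct its decimal type from the literal string "NaN".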