[ https://issues.apache.org/jira/browse/SPARK-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037678#comment-15037678 ]
Xusen Yin edited comment on SPARK-7379 at 12/3/15 11:19 AM:
------------------------------------------------------------

OK, I fixed my code in that PR. In my case, getSplits returns an Array[Double], and the _java2py() method in mllib/common.py fails to match it as a JavaList.

{code}
if isinstance(r, (bytearray, bytes)):
    r = PickleSerializer().loads(bytes(r), encoding=encoding)
{code}

The code above catches the Array[Double] but fails to deserialize it under Python 3.

> pickle.loads expects a string instead of bytes in Python 3.
> -----------------------------------------------------------
>
>                 Key: SPARK-7379
>                 URL: https://issues.apache.org/jira/browse/SPARK-7379
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.4.0
>            Reporter: Xiangrui Meng
>            Assignee: Davies Liu
>
> In PickleSerializer, we call pickle.loads in Python 3. However, the input obj
> could be bytes, which works in Python 2 but not 3.
> The error message is
> {code}
>   File "/home/jenkins/workspace/SparkPullRequestBuilder@3/python/pyspark/serializers.py", line 418, in loads
>     return pickle.loads(obj, encoding=encoding)
> TypeError: must be a unicode character, not bytes
> {code}
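
To make the failure mode discussed above easier to experiment with outside Spark, here is a minimal standalone sketch in plain Python 3. It is not the PySpark code path: `payload` and `typecode_error` are hypothetical names introduced for illustration, and the pickled list of doubles is only a stand-in for what _java2py() might receive. The second half shows one way a "must be a unicode character, not bytes" TypeError can arise in Python 3, namely a bytes typecode reaching array.array. Whether that is the actual cause in this issue is an assumption, not something the thread confirms.

```python
import array
import pickle

# Hypothetical stand-in for the value _java2py() receives: a pickled list of
# doubles wrapped in a bytearray (not the real payload produced by Spark).
payload = bytearray(pickle.dumps([0.0, 0.5, 1.0], protocol=2))

# In Python 3, pickle.loads takes a bytes-like object. The `encoding` keyword
# only controls how 8-bit strings pickled by Python 2 are decoded.
splits = pickle.loads(bytes(payload), encoding="bytes")


def typecode_error(typecode):
    """Return the TypeError message raised by array.array, or None on success."""
    try:
        array.array(typecode, [0.0, 0.5, 1.0])
    except TypeError as exc:
        return str(exc)
    return None


# A str typecode works; a bytes typecode is rejected in Python 3.
# Assumption: a pickled array whose typecode is decoded as bytes would
# fail in the same way.
ok = typecode_error("d")
failed = typecode_error(b"d")
```

If the bytes typecode turns out to be the culprit, the sketch suggests looking at how the typecode is decoded rather than at the bytes passed to pickle.loads itself.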