[jira] [Updated] (SPARK-34463) toPandas failed with error: buffer source array is read-only when Arrow with self-destruct is enabled
[ https://issues.apache.org/jira/browse/SPARK-34463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-34463:
---------------------------------
    Affects Version/s:     (was: 3.0.2)
                           3.2.0

> toPandas failed with error: buffer source array is read-only when Arrow with self-destruct is enabled
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-34463
>                 URL: https://issues.apache.org/jira/browse/SPARK-34463
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.2.0
>            Reporter: Weichen Xu
>            Priority: Major
>
> Environment:
> apache/spark master
> pandas version > 1.0.5
> Reproduce code:
> {code:java}
> spark.conf.set('spark.sql.execution.arrow.pyspark.enabled', True)
> spark.conf.set('spark.sql.execution.arrow.pyspark.selfDestruct.enabled', True)
> spark.createDataFrame(sc.parallelize([(i,) for i in range(13)], 1), 'id long').selectExpr('IF(id % 3==0, id+1, NULL) AS f1', '(id+1) % 2 AS label').toPandas()['label'].value_counts()
> {code}
> Get error like:
> {quote}Traceback (most recent call last):
>   File "", line 1, in
>   File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/base.py", line 1033, in value_counts
>     dropna=dropna,
>   File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/algorithms.py", line 820, in value_counts
>     keys, counts = value_counts_arraylike(values, dropna)
>   File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/algorithms.py", line 865, in value_counts_arraylike
>     keys, counts = f(values, dropna)
>   File "pandas/_libs/hashtable_func_helper.pxi", line 1098, in pandas._libs.hashtable.value_count_int64
>   File "stringsource", line 658, in View.MemoryView.memoryview_cwrapper
>   File "stringsource", line 349, in View.MemoryView.memoryview.__cinit__
> ValueError: buffer source array is read-only
> {quote}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Updated] (SPARK-34463) toPandas failed with error: buffer source array is read-only when Arrow with self-destruct is enabled
[ https://issues.apache.org/jira/browse/SPARK-34463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-34463:
---------------------------------
    Summary: toPandas failed with error: buffer source array is read-only when Arrow with self-destruct is enabled  (was: toPandas failed with error: buffer source array is read-only)

> toPandas failed with error: buffer source array is read-only when Arrow with self-destruct is enabled
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-34463
>                 URL: https://issues.apache.org/jira/browse/SPARK-34463
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.0.2
>            Reporter: Weichen Xu
>            Priority: Major
>
> Environment:
> apache/spark master
> pandas version > 1.0.5
> Reproduce code:
> {code:java}
> spark.conf.set('spark.sql.execution.arrow.pyspark.enabled', True)
> spark.conf.set('spark.sql.execution.arrow.pyspark.selfDestruct.enabled', True)
> spark.createDataFrame(sc.parallelize([(i,) for i in range(13)], 1), 'id long').selectExpr('IF(id % 3==0, id+1, NULL) AS f1', '(id+1) % 2 AS label').toPandas()['label'].value_counts()
> {code}
> Get error like:
> {quote}Traceback (most recent call last):
>   File "", line 1, in
>   File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/base.py", line 1033, in value_counts
>     dropna=dropna,
>   File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/algorithms.py", line 820, in value_counts
>     keys, counts = value_counts_arraylike(values, dropna)
>   File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/algorithms.py", line 865, in value_counts_arraylike
>     keys, counts = f(values, dropna)
>   File "pandas/_libs/hashtable_func_helper.pxi", line 1098, in pandas._libs.hashtable.value_count_int64
>   File "stringsource", line 658, in View.MemoryView.memoryview_cwrapper
>   File "stringsource", line 349, in View.MemoryView.memoryview.__cinit__
> ValueError: buffer source array is read-only
> {quote}
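
The traceback ends inside a Cython memoryview constructor, which suggests the hashing routine asked for a writable buffer over data that was exposed read-only. The sketch below is only a standard-library stand-in for that situation, not Spark or pandas code: it builds a read-only view over int64 values shaped like the 'label' column, shows that in-place writes are rejected, and then counts values from a writable copy. The names `labels`, `readonly_view`, `try_write`, `writable_copy`, and `label_counts` are hypothetical, and the idea that copying the column would sidestep the error in pandas is an assumption that has not been verified against the reported setup.

```python
from array import array
from collections import Counter

# Hypothetical stand-in for the int64 'label' column: (id + 1) % 2 for id in 0..12.
labels = array("q", [(i + 1) % 2 for i in range(13)])

# A read-only view over the same memory, standing in for a column that is
# backed by an immutable buffer (assumption: this mirrors the reported case).
readonly_view = memoryview(labels).toreadonly()


def try_write(buf):
    """Return True if buf accepts an in-place write, False if it is read-only."""
    try:
        buf[0] = buf[0]  # rewrite the same value; only tests writability
        return True
    except TypeError:
        return False


# Copying the values into a fresh array yields a buffer that is writable again.
writable_copy = array("q", readonly_view.tolist())

# Counting works on the copy regardless of how the source buffer was exposed.
label_counts = Counter(writable_copy)

print(try_write(readonly_view), try_write(writable_copy), dict(label_counts))
```

In pandas terms, the analogous step would be copying the affected column before calling `value_counts()`; whether that is sufficient for this issue is not confirmed in the thread.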