[jira] [Resolved] (SPARK-39130) How do I read parquet with python object
[ https://issues.apache.org/jira/browse/SPARK-39130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ben Wan resolved SPARK-39130.
-----------------------------
    Resolution: Won't Do

> How do I read parquet with python object
> ----------------------------------------
>
>                 Key: SPARK-39130
>                 URL: https://issues.apache.org/jira/browse/SPARK-39130
>             Project: Spark
>          Issue Type: Question
>          Components: PySpark
>    Affects Versions: 2.4.5
>         Environment: pyspark 2.4.5
>            Reporter: Ben Wan
>            Priority: Trivial
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> In Python:
>
> import pandas as pd
> a = pd.DataFrame([[1, [2.3, 1.2]]], columns=['a', 'b'])
> a.to_parquet('a.parquet')
>
> Then in PySpark:
>
> d2 = spark.read.parquet('a.parquet')
>
> this returns the error:
>
> An error was encountered: An error occurred while calling o277.showString. :
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 14 in
> stage 9.0 failed 4 times, most recent failure: Lost task 14.2 in stage 9.0
> (TID 63, 10.169.0.196, executor 15): java.lang.IllegalArgumentException:
> Illegal Capacity: -221
>
> How can I fix it?
> Thanks.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39130) How do I read parquet with python object
Ben Wan created SPARK-39130:
-------------------------------

             Summary: How do I read parquet with python object
                 Key: SPARK-39130
                 URL: https://issues.apache.org/jira/browse/SPARK-39130
             Project: Spark
          Issue Type: Question
          Components: PySpark
    Affects Versions: 2.4.5
         Environment: pyspark 2.4.5
            Reporter: Ben Wan


In Python:

import pandas as pd
a = pd.DataFrame([[1, [2.3, 1.2]]], columns=['a', 'b'])
a.to_parquet('a.parquet')

Then in PySpark:

d2 = spark.read.parquet('a.parquet')

this returns the error:

An error was encountered: An error occurred while calling o277.showString. :
org.apache.spark.SparkException: Job aborted due to stage failure: Task 14 in
stage 9.0 failed 4 times, most recent failure: Lost task 14.2 in stage 9.0
(TID 63, 10.169.0.196, executor 15): java.lang.IllegalArgumentException:
Illegal Capacity: -221

How can I fix it?
Thanks.
[jira] [Updated] (SPARK-38224) How do I get a lot of results in KDE
[ https://issues.apache.org/jira/browse/SPARK-38224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ben Wan updated SPARK-38224:
----------------------------
    Priority: Trivial  (was: Major)

> How do I get a lot of results in KDE
> ------------------------------------
>
>                 Key: SPARK-38224
>                 URL: https://issues.apache.org/jira/browse/SPARK-38224
>             Project: Spark
>          Issue Type: Question
>          Components: ML
>    Affects Versions: 2.4.5
>            Reporter: Ben Wan
>            Priority: Trivial
>
> I have a pyspark DataFrame. I converted one of its columns to an RDD and
> performed kernel density estimation (KDE) on it. I need the KDE estimate for
> every value in the column, added back to the DataFrame as a new column for
> subsequent work. How can I do this in Spark?
[jira] [Created] (SPARK-38224) How do I get a lot of results in KDE
Ben Wan created SPARK-38224:
-------------------------------

             Summary: How do I get a lot of results in KDE
                 Key: SPARK-38224
                 URL: https://issues.apache.org/jira/browse/SPARK-38224
             Project: Spark
          Issue Type: Question
          Components: ML
    Affects Versions: 2.4.5
            Reporter: Ben Wan


I have a pyspark DataFrame. I converted one of its columns to an RDD and
performed kernel density estimation (KDE) on it. I need the KDE estimate for
every value in the column, added back to the DataFrame as a new column for
subsequent work. How can I do this in Spark?
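Editor's note: in Spark, pyspark.mllib.stat.KernelDensity can evaluate the estimate at a list of query points (setSample on the RDD, setBandwidth, then estimate), and the resulting array can be joined back onto the DataFrame by row. The quantity it computes is a plain Gaussian KDE; the sketch below shows that computation in standalone NumPy (an illustration of the math, not the Spark API), evaluating the estimate at every sample point, which is exactly the per-row vector the reporter wants as a new column.

```python
import numpy as np

def gaussian_kde(samples, points, bandwidth):
    """Evaluate a Gaussian kernel density estimate at `points`.

    f(x) = (1 / (n * h)) * sum_i N((x - x_i) / h)
    This mirrors what KernelDensity.estimate computes distributedly;
    passing the samples themselves as `points` yields one density
    value per row, ready to attach as a new DataFrame column.
    """
    samples = np.asarray(samples, dtype=float)
    points = np.asarray(points, dtype=float)
    # Pairwise standardized distances, shape (len(points), len(samples)).
    z = (points[:, None] - samples[None, :]) / bandwidth
    densities = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return densities.mean(axis=1) / bandwidth

values = [1.0, 1.2, 3.5, 4.0]
estimates = gaussian_kde(values, values, bandwidth=0.5)
```

Note the NumPy version is O(n^2) in memory for n points, which is precisely why one would use Spark's KernelDensity for large columns and only fall back to a local computation for small ones.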