Ben Wan created SPARK-39130:
-------------------------------

             Summary: How do I read a Parquet file that contains Python objects?
                 Key: SPARK-39130
                 URL: https://issues.apache.org/jira/browse/SPARK-39130
             Project: Spark
          Issue Type: Question
          Components: PySpark
    Affects Versions: 2.4.5
         Environment: pyspark2.4.5
            Reporter: Ben Wan


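Python (pandas):

import pandas as pd

# Column 'b' holds a Python list in each row.
a = pd.DataFrame([[1, [2.3, 1.2]]], columns=['a', 'b'])
a.to_parquet('a.parquet')

PySpark:

d2 = spark.read.parquet('a.parquet')
d2.show()  # the trace below comes from showString, i.e. from show()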

This fails with the following error:

An error was encountered: An error occurred while calling o277.showString. : 
org.apache.spark.SparkException: Job aborted due to stage failure: Task 14 in 
stage 9.0 failed 4 times, most recent failure: Lost task 14.2 in stage 9.0 (TID 
63, 10.169.0.196, executor 15): java.lang.IllegalArgumentException: Illegal 
Capacity: -221

How can I fix it?

Thanks.



