Takuya Ueshin created SPARK-35638:
-------------------------------------

             Summary: Introduce Field to manage dtypes and StructField.
                 Key: SPARK-35638
                 URL: https://issues.apache.org/jira/browse/SPARK-35638
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 3.2.0
            Reporter: Takuya Ueshin


Currently there are some performance issues in the pandas-on-Spark layer.

One of them is that the layer accesses the Java DataFrame and runs the analysis
phase too many times, often just to retrieve the current column names or data types.

We should reduce these unnecessary accesses.
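
As a rough illustration of the idea in the summary, the sketch below keeps column
names and data types in a small metadata record next to the frame, so repeated
lookups do not go back to the underlying DataFrame. It is not the actual change
proposed for Spark: ExpensiveFrame, Field, FrameWithFields and the type mapping
are hypothetical stand-ins, and the "analysis phase" is only simulated by a counter.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


class ExpensiveFrame:
    """Hypothetical stand-in for a DataFrame whose schema access is costly."""

    def __init__(self, columns: List[Tuple[str, str]]) -> None:
        self._columns = columns
        self.analysis_runs = 0  # counts simulated analysis-phase executions

    def schema(self) -> List[Tuple[str, str]]:
        # Each call simulates a JVM round trip plus an analysis pass.
        self.analysis_runs += 1
        return list(self._columns)


@dataclass(frozen=True)
class Field:
    """Hypothetical record pairing a column name with its dtype and Spark type."""

    name: str
    dtype: str
    spark_type: str


# Assumed, illustrative mapping from Spark type names to pandas dtype names.
_SPARK_TO_PANDAS = {"bigint": "int64", "double": "float64", "string": "object"}


class FrameWithFields:
    """Keeps Field metadata beside the frame so lookups skip schema access."""

    def __init__(
        self, frame: ExpensiveFrame, fields: Optional[List[Field]] = None
    ) -> None:
        self.frame = frame
        self._fields = fields  # may be supplied up front when already known

    @property
    def fields(self) -> List[Field]:
        # Resolve the schema at most once, then reuse the cached metadata.
        if self._fields is None:
            self._fields = [
                Field(name, _SPARK_TO_PANDAS.get(spark_type, "object"), spark_type)
                for name, spark_type in self.frame.schema()
            ]
        return self._fields

    @property
    def column_names(self) -> List[str]:
        return [f.name for f in self.fields]

    @property
    def dtypes(self) -> Dict[str, str]:
        return {f.name: f.dtype for f in self.fields}


frame = ExpensiveFrame([("id", "bigint"), ("score", "double"), ("label", "string")])
wrapped = FrameWithFields(frame)
for _ in range(5):
    wrapped.column_names
    wrapped.dtypes
print(frame.analysis_runs)  # prints 1
```

When an operation already knows the resulting names and types, it could pass
`fields` to the constructor directly, in which case the simulated schema access
is never triggered at all.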



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

