Takuya Ueshin created SPARK-35638:
-------------------------------------

             Summary: Introduce Field to manage dtypes and StructField.
                 Key: SPARK-35638
                 URL: https://issues.apache.org/jira/browse/SPARK-35638
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 3.2.0
            Reporter: Takuya Ueshin

Currently there are some performance issues in the pandas-on-Spark layer. One of them is that it accesses the Java DataFrame and runs the analysis phase too many times, often just to retrieve the current column names or data types. We should reduce this unnecessary access.
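
To illustrate the idea, here is a minimal sketch of caching per-column metadata so that column names and data types can be read without going back to the Java DataFrame. It is not the actual pandas-on-Spark implementation: the `Field` class shown here, `FakeJavaDataFrame`, `Frame`, and all of their attributes are hypothetical names, and the JVM round trip is modeled by a simple call counter so the example runs without Spark.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Field:
    """Hypothetical cached metadata for one column (assumed shape, not the real class)."""
    name: str
    dtype: str        # pandas-style dtype name, e.g. "int64"
    spark_type: str   # stand-in for the Spark type carried by a StructField
    nullable: bool = True


class FakeJavaDataFrame:
    """Stand-in for the JVM-side DataFrame; counts how often the schema is requested."""

    def __init__(self, columns: List[Tuple[str, str, str]]) -> None:
        self._columns = columns
        self.schema_calls = 0

    def schema(self) -> List[Tuple[str, str, str]]:
        # Each call models one JVM round trip plus one analysis pass.
        self.schema_calls += 1
        return list(self._columns)


class Frame:
    """Hypothetical Python-side frame that resolves its fields once and reuses them."""

    def __init__(self, jdf: FakeJavaDataFrame) -> None:
        self._jdf = jdf
        self._fields: Optional[List[Field]] = None

    @property
    def fields(self) -> List[Field]:
        # Resolve the schema on first use only; later reads hit the cache.
        if self._fields is None:
            self._fields = [Field(n, d, s) for n, d, s in self._jdf.schema()]
        return self._fields

    @property
    def columns(self) -> List[str]:
        return [f.name for f in self.fields]

    @property
    def dtypes(self) -> Dict[str, str]:
        return {f.name: f.dtype for f in self.fields}


jdf = FakeJavaDataFrame([("id", "int64", "bigint"), ("name", "object", "string")])
frame = Frame(jdf)
print(frame.columns, frame.dtypes, jdf.schema_calls)
```

In this sketch, reading `columns` and `dtypes` any number of times triggers only one schema lookup, which is the kind of reduction in access the issue asks for.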