Eric Liang created SPARK-27669:
----------------------------------

             Summary: Refactor DataFrameWriter to always go through Catalyst for analysis
                 Key: SPARK-27669
                 URL: https://issues.apache.org/jira/browse/SPARK-27669
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.4.3
            Reporter: Eric Liang

Currently, DataFrameWriter.save() does a large amount of ad-hoc analysis (e.g., loading data source classes, validating options, and so on) before executing the command. This code runs outside the scope of any SQL execution, which is unfortunate: it is untracked by Spark (e.g., it does not appear in the Spark UI), and df.write operations cannot be manipulated by custom Catalyst rules prior to execution.

These issues can be largely resolved by creating a command that represents df.write.save()/saveAsTable().
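
To make the proposal concrete, here is a minimal sketch of the idea in plain Python. It is a toy model, not Spark code: SaveCommand, resolve_source, force_compression, run_tracked, and KNOWN_SOURCES are all hypothetical names invented for illustration, and none of them correspond to real Spark or Catalyst APIs. The sketch only shows the shape of the change: the write is captured as a plan node, the validation that save() performs ad hoc becomes an ordinary rule, and both custom rules and tracking apply because everything runs inside one execution scope.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Dict, List

# Toy model only: every name below is hypothetical, not a Spark API.


@dataclass(frozen=True)
class SaveCommand:
    """Plan node standing in for df.write.save()."""
    source: str                                   # e.g. "parquet"
    options: Dict[str, str] = field(default_factory=dict)
    resolved: bool = False                        # set once analysis has passed


Rule = Callable[[SaveCommand], SaveCommand]

# Assumed registry of data sources, for the sketch only.
KNOWN_SOURCES = {"parquet", "json", "csv"}


def resolve_source(cmd: SaveCommand) -> SaveCommand:
    """Analysis rule: the checks save() would otherwise do ad hoc."""
    if cmd.source not in KNOWN_SOURCES:
        raise ValueError(f"unknown data source: {cmd.source}")
    if "path" not in cmd.options:
        raise ValueError("option 'path' is required")
    return replace(cmd, resolved=True)


def force_compression(cmd: SaveCommand) -> SaveCommand:
    """Example custom rule: rewrite the write before it executes."""
    return replace(cmd, options={**cmd.options, "compression": "snappy"})


def run_tracked(cmd: SaveCommand, rules: List[Rule], events: List[str]) -> SaveCommand:
    """Apply rules and 'execute' inside one scope, recording each step."""
    events.append("execution started")
    for rule in rules:
        cmd = rule(cmd)
        events.append(f"applied {rule.__name__}")
    # Stand-in for the actual write; nothing touches the filesystem here.
    events.append(f"wrote {cmd.source} to {cmd.options['path']}")
    events.append("execution finished")
    return cmd
```

A usage example under the same assumptions: calling run_tracked(SaveCommand("parquet", {"path": "/tmp/out"}), [force_compression, resolve_source], events) returns a resolved command with the extra compression option, and events records every step, including the analysis that is currently invisible.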