Eric Liang created SPARK-27669:
----------------------------------

             Summary: Refactor DataFrameWriter to always go through Catalyst for analysis
                 Key: SPARK-27669
                 URL: https://issues.apache.org/jira/browse/SPARK-27669
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.4.3
            Reporter: Eric Liang


Currently, DataFrameWriter.save() performs a large amount of ad-hoc analysis 
(e.g., loading data source classes and validating options) before executing 
the command.

This code runs outside the scope of any SQL execution. As a result, Spark does 
not track it (e.g., it does not appear in the Spark UI), and custom Catalyst 
rules cannot manipulate df.write operations before they execute.

Most of these issues can be resolved by introducing a command that represents 
df.write.save()/saveAsTable(), so that the analysis goes through Catalyst.
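
As a rough illustration of the idea, the sketch below models a write as a 
plain command object that is only analyzed inside a tracked execution, so that 
user-supplied rules can rewrite it first. It is a self-contained toy in Python, 
not Spark code: SaveCommand, KNOWN_SOURCES, resolve_source, 
lowercase_option_keys, redirect_to_staging, and execute are all hypothetical 
names invented for this example, and the "events" list merely stands in for 
whatever tracking the Spark UI would do.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Dict, List


# Hypothetical stand-in for a logical plan node describing a write.
# Building it performs no analysis; it only records what was requested.
@dataclass(frozen=True)
class SaveCommand:
    source: str                                   # e.g. "parquet"
    path: str
    options: Dict[str, str] = field(default_factory=dict)


# A rule maps one command to another (a toy analogue of a Catalyst rule).
Rule = Callable[[SaveCommand], SaveCommand]

# Toy registry of data sources, assumed purely for illustration.
KNOWN_SOURCES = {"parquet", "json", "csv"}


def resolve_source(cmd: SaveCommand) -> SaveCommand:
    """Analysis step: fail if the data source is not registered."""
    if cmd.source not in KNOWN_SOURCES:
        raise ValueError(f"unknown data source: {cmd.source}")
    return cmd


def lowercase_option_keys(cmd: SaveCommand) -> SaveCommand:
    """Analysis step: normalize option keys to lower case."""
    return replace(cmd, options={k.lower(): v for k, v in cmd.options.items()})


def redirect_to_staging(cmd: SaveCommand) -> SaveCommand:
    """Example of a custom rule that rewrites the write before it runs."""
    return replace(cmd, path="/staging" + cmd.path)


def execute(cmd: SaveCommand, rules: List[Rule], events: List[str]) -> SaveCommand:
    """Run all rules and the (simulated) write inside one tracked execution."""
    events.append("start")
    try:
        for rule in rules:
            cmd = rule(cmd)
            events.append(f"rule:{rule.__name__}")
        events.append(f"write:{cmd.source}:{cmd.path}")  # no real I/O happens
        return cmd
    finally:
        events.append("end")  # recorded even when analysis fails
```

Because the analysis steps and the custom rule are all applied inside 
execute(), a failure in resolve_source is recorded within the same tracked 
span as a successful write, which is the property the ad-hoc path lacks.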



