gabrywu created SPARK-38258:
-------------------------------

             Summary: [proposal] collect & update statistics automatically when Spark SQL is running
                 Key: SPARK-38258
                 URL: https://issues.apache.org/jira/browse/SPARK-38258
             Project: Spark
          Issue Type: Wish
          Components: Spark Core, SQL
    Affects Versions: 3.2.0, 3.1.0, 3.0.0
            Reporter: gabrywu
As we all know, table & column statistics are very important to the Spark SQL optimizer. However, we currently have to collect & update them manually, using
{code:java}
analyze table tableName compute statistics{code}
That is a little inconvenient, so why can't we collect & update statistics automatically when a Spark stage runs and finishes? For example, when an insert overwrite table statement finishes, we could update the corresponding table statistics from the SQL metrics of that write, and subsequent queries could then use the fresh statistics in the optimizer. What do you think of it? [~yumwang]

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
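For context, a sketch of the manual workflow today, plus the partial automation Spark already has: as far as I know, {{spark.sql.statistics.size.autoUpdate.enabled}} keeps the table *size in bytes* up to date after inserts, but not row counts or column-level statistics, which is what this proposal would add. The column names {{c1}}, {{c2}} below are just placeholders.

{code:sql}
-- manual collection today
ANALYZE TABLE tableName COMPUTE STATISTICS;                     -- table-level stats (size, row count)
ANALYZE TABLE tableName COMPUTE STATISTICS FOR COLUMNS c1, c2;  -- column-level stats
DESC EXTENDED tableName;                                        -- the Statistics field shows what was collected

-- existing partial automation: keep table size in sync after writes
SET spark.sql.statistics.size.autoUpdate.enabled=true;
{code}

The proposal would effectively extend that auto-update path so that row counts (and possibly column stats) are refreshed from the SQL metrics of the finished write, with no extra ANALYZE pass.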