[ https://issues.apache.org/jira/browse/SPARK-8312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582551#comment-14582551 ]
Apache Spark commented on SPARK-8312:
-------------------------------------

User 'navis' has created a pull request for this issue:
https://github.com/apache/spark/pull/6767

> Populate statistics info of hive tables if it's needed to be
> ------------------------------------------------------------
>
>                 Key: SPARK-8312
>                 URL: https://issues.apache.org/jira/browse/SPARK-8312
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Navis
>            Priority: Minor
>
> Currently, spark-sql uses the stats in the metastore to estimate the size of
> a hive table, which means the analyze command should be run before the table
> is accessed to get better planning, especially for joins. But even with the
> stats, the estimate cannot reflect the real input size of the query when the
> query contains a partition pruning predicate.
> Even worse, hive cannot update metastore stats for external tables, a bug
> that was fixed recently in HIVE-6727. According to that issue, the bug
> affects all hive versions between 0.13.0 and 1.2.0.
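
The partition pruning point in the description can be illustrated with a minimal sketch. It uses no Spark or Hive API. The names `Partition`, `table_level_estimate`, `pruned_estimate` and `can_broadcast`, as well as the partition sizes, are hypothetical and chosen only for illustration. The 10 MB limit is an assumption that mirrors the default of `spark.sql.autoBroadcastJoinThreshold`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

MB = 1024 * 1024

# Assumed size limit under which a table is considered small enough to broadcast.
BROADCAST_THRESHOLD_BYTES = 10 * MB


@dataclass
class Partition:
    """Hypothetical stand-in for one partition and its on-disk size."""
    spec: Dict[str, str]
    size_bytes: int


def table_level_estimate(partitions: List[Partition]) -> int:
    # What a single table-wide statistic reports: the size of every partition.
    return sum(p.size_bytes for p in partitions)


def pruned_estimate(partitions: List[Partition],
                    predicate: Callable[[Dict[str, str]], bool]) -> int:
    # Size of only the partitions that survive the pruning predicate.
    return sum(p.size_bytes for p in partitions if predicate(p.spec))


def can_broadcast(estimate_bytes: int) -> bool:
    # A planner would pick a broadcast join only below the threshold.
    return estimate_bytes <= BROADCAST_THRESHOLD_BYTES


partitions = [
    Partition({"dt": "2015-06-01"}, 8 * MB),
    Partition({"dt": "2015-06-02"}, 6 * MB),
    Partition({"dt": "2015-06-03"}, 9 * MB),
]

full = table_level_estimate(partitions)
pruned = pruned_estimate(partitions, lambda spec: spec["dt"] == "2015-06-02")

print(full // MB, can_broadcast(full))
print(pruned // MB, can_broadcast(pruned))
```

With these made-up sizes, the table-wide figure exceeds the threshold while the single partition actually read by the query does not, so a plan based only on the table-level stat would skip a broadcast join that the pruned input would allow.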