[ https://issues.apache.org/jira/browse/HIVE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jesus Camacho Rodriguez updated HIVE-16957:
-------------------------------------------
    Attachment: HIVE-16957.04.patch

> Support CTAS for auto gather column stats
> -----------------------------------------
>
>                 Key: HIVE-16957
>                 URL: https://issues.apache.org/jira/browse/HIVE-16957
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Pengcheng Xiong
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>         Attachments: HIVE-16957.01.patch, HIVE-16957.02.patch,
> HIVE-16957.03.patch, HIVE-16957.04.patch, HIVE-16957.patch
>
>
> The idea is to rely as much as possible on the logic in
> ColumnStatsSemanticAnalyzer, as other operations do. In particular, they
> create an 'analyze table t compute statistics for columns' statement, use
> ColumnStatsSemanticAnalyzer to parse it, and connect the resulting plan to
> the existing INSERT/INSERT OVERWRITE statement. The challenge for CTAS or
> CREATE MATERIALIZED VIEW is that the table object does not exist yet, hence
> we cannot rely fully on ColumnStatsSemanticAnalyzer.
> Thus, we use the same process, but ColumnStatsSemanticAnalyzer produces a
> statement for column stats collection that uses a table values clause
> instead of the original table reference:
> {code}
> select compute_stats(col1), compute_stats(col2), compute_stats(col3)
> from table(values(cast(null as int), cast(null as int), cast(null as
> string))) as t(col1, col2, col3);
> {code}
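
To make the shape of the generated statement concrete, here is a minimal Python sketch that assembles such a query from a list of column names and types. It is not Hive's implementation (the actual logic lives in the Java class ColumnStatsSemanticAnalyzer); the function name `build_column_stats_query` and its parameters are hypothetical, and the sketch assumes simple unquoted identifiers and primitive type names.

```python
def build_column_stats_query(columns, alias="t"):
    """Build a column-stats query over a table values clause.

    `columns` is a list of (name, type) pairs. Hypothetical helper for
    illustration only; it does not quote identifiers or validate types.
    """
    if not columns:
        raise ValueError("at least one column is required")

    names = [name for name, _ in columns]
    # One compute_stats(...) call per column in the select list.
    select_list = ", ".join(f"compute_stats({name})" for name in names)
    # A single row of typed NULLs stands in for the table that does not exist yet.
    null_row = ", ".join(f"cast(null as {col_type})" for _, col_type in columns)
    # The alias gives the placeholder row the target table's column names.
    alias_list = ", ".join(names)

    return (
        f"select {select_list} "
        f"from table(values({null_row})) as {alias}({alias_list})"
    )


query = build_column_stats_query(
    [("col1", "int"), ("col2", "int"), ("col3", "string")]
)
print(query)
```

The typed NULL row carries only the schema, so the statement can be parsed and planned before the CTAS target table has been created.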