Aman Sinha created IMPALA-10723: ----------------------------------- Summary: Allow querying and computing stats on a materialized view Key: IMPALA-10723 URL: https://issues.apache.org/jira/browse/IMPALA-10723 Project: IMPALA Issue Type: Improvement Components: Frontend Reporter: Aman Sinha Assignee: Aman Sinha
Currently, in Impala, a Materialized View (MV) created via Hive is visible through the metadata catalog and can be queried but the query is expanded to its corresponding view definition. This is incorrect because a materialized view is a regular (physical) table, not a view. Even though Impala does not support either creating MV or automatic rewriting to use MV, querying an MV directly should be allowed. Here's the current behavior: {noformat} [localhost:21050] functional> explain select * from materialized_view; Query: explain select * from materialized_view +------------------------------------------------------------------------------------+ | Explain String | +------------------------------------------------------------------------------------+ | Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 | | Per-Host Resource Estimates: Memory=10MB | | WARNING: The following tables are missing relevant table and/or column statistics. | | functional.insert_only_transactional_table | | | | PLAN-ROOT SINK | | | | | 01:EXCHANGE [UNPARTITIONED] | | | | | 00:SCAN HDFS [functional.insert_only_transactional_table] | | HDFS partitions=1/1 files=0 size=0B | | row-size=4B cardinality=0 | +------------------------------------------------------------------------------------+ {noformat} Note that the plan shows the scan of the underlying table instead of the materialized_view table. We should only be scanning the MV (including applying partition pruning, predicate pushdown etc.) and not treating this as a view. This JIRA is to enhance the frontend to recognize a materialized view as a table rather than a view. This will further allow commands such as COMPUTE STATS, DROP STATS, SHOW [TABLE | COLUMN] STATS to be run on the MV. One motivation for doing this is to allow an external frontend that supports automatic query rewrites using MVs to leverage the statistics on MVs. Since Impala is not creating the MV, we will need to block DML operations on the MV. Further, special handling needs to be done for Ranger authorization policies such that any column masking/row filtering policies defined on the source tables of the MV are taken into consideration. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org