Hello all,

Hive 3 brought significant changes to the community by making ACID tables
the default for managed tables. With ACID tables, we can use features such
as materialized views, query result caching for BI tools, and more. For
non-ACID tables such as external tables, however, Hive doesn't support any
of these advanced features, which makes a majority of cloud-native users
like me sad :(.

I propose that we support a more limited, read-only kind of external table
for which materialized views and query result caching would work.
For example:

CREATE EXTERNAL TABLE table_name (..) STORED AS ORC
LOCATION 's3://some-bucket/some-dir'
TBLPROPERTIES ('read-only'='true');

On such tables, data modification operations such as INSERT and UPDATE
would fail, while DDL operations that add or remove partitions, such as
"ALTER TABLE ... ADD PARTITION", would still succeed. Since changes to the
table would then only happen through partition-level DDL that Hive can
observe, Hive could invalidate the query result cache and materialized
views even though the table is external.
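
To make the intended behavior concrete, here is a rough sketch of how
statements might behave against such a table. None of this exists in Hive
today: the 'read-only' property is only what I am proposing, and the table,
column, and partition names (sales, amount, ds) and the dates are made up
for illustration.

```sql
-- Hypothetical: 'read-only' is the proposed property, not an existing one.
CREATE EXTERNAL TABLE sales (id BIGINT, amount DOUBLE)
PARTITIONED BY (ds STRING)
STORED AS ORC
LOCATION 's3://some-bucket/some-dir'
TBLPROPERTIES ('read-only'='true');

-- Would be rejected under the proposal: modifies data through Hive.
INSERT INTO sales PARTITION (ds='2019-01-01') VALUES (1, 9.99);

-- Would still succeed: partition-level DDL, which Hive could treat as the
-- signal to invalidate cached results and mark materialized views stale.
ALTER TABLE sales ADD PARTITION (ds='2019-01-02')
LOCATION 's3://some-bucket/some-dir/ds=2019-01-02';
ALTER TABLE sales DROP PARTITION (ds='2019-01-01');

-- What the proposal aims to enable: a materialized view over the
-- external table.
CREATE MATERIALIZED VIEW sales_by_day AS
SELECT ds, SUM(amount) AS total FROM sales GROUP BY ds;
```

The idea is that new data is written to the bucket outside of Hive and only
becomes visible once a partition is added, so the partition DDL is the one
place where Hive needs to react.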

Let me know what you think, and maybe I can start writing a wiki
document describing the approach in greater detail.

Thanks,
Thai
