Have you looked at the insert-only ACID tables in Hive 3 (
https://issues.apache.org/jira/browse/HIVE-14535 )?  These were designed
specifically with the cloud in mind, since the way Hive traditionally adds
new data doesn't work well in the cloud.  They also do not require ORC; they
work with any file format.
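
For reference, a minimal sketch of what such a table could look like. It
assumes a Hive 3 setup with the transaction manager enabled; the table name
and columns are hypothetical, and Parquet is used only to illustrate that
ORC is not required.

```sql
-- Hypothetical insert-only ACID table stored as Parquet rather than ORC.
-- Assumes Hive transactions are enabled on the cluster.
CREATE TABLE events_insert_only (
  id      BIGINT,
  payload STRING
)
STORED AS PARQUET
TBLPROPERTIES (
  'transactional' = 'true',                    -- make the table ACID
  'transactional_properties' = 'insert_only'   -- inserts only, no UPDATE/DELETE
);
```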

Alan.

On Wed, Apr 24, 2019 at 12:04 PM Thai Bui <blquyt...@gmail.com> wrote:

> Hello all,
>
> Hive 3 has brought significant changes to the community with the support
> for ACID tables as default managed tables. With ACID tables, we can use
> features such as materialized views, query result caching for BI tools and
> more. But for non-ACID tables, such as external tables, Hive doesn't support
> any of these advanced features, which makes a majority of cloud-native users
> like me sad :(.
>
> I propose that we support a limited, read-only form of external
> tables so that materialized views and query result caching would work.
> For example:
>
> CREATE EXTERNAL TABLE table_name (..) STORED AS ORC
> LOCATION 's3://some-bucket/some-dir'
> TBLPROPERTIES ('read-only'='true');
>
> In such tables, data modification operations such as INSERT and UPDATE
> would fail, while DDL operations that add or remove partitions, such as
> "ALTER TABLE ... ADD PARTITION", would succeed. This would
> make it possible for Hive to invalidate the cache and materialized views
> even when the table is an external table.
>
> Let me know what you think, and maybe I can start writing a wiki
> document describing the approach in greater detail.
>
> Thanks,
> Thai
>
