Hive SQL extension

Peter Vary Thu, 22 Oct 2020 00:49:49 -0700

Hi Hive experts,

I would like to extend the Hive SQL language to provide a way to create 
partitioned Iceberg tables like this:
create table iceberg_test(
        level string,
        event_time timestamp,
        message string,
        register_time date,
        telephone array <string>
    )
    partition by spec(
        level identity,
        event_time identity,
        event_time hour,
        register_time day
    )
    stored as iceberg;


The problem is that this syntax is very specific to Iceberg, and I think it is 
not a good idea to change the Hive syntax globally to accommodate a specific 
use case.
The following CREATE TABLE statement could achieve the same thing:
create table iceberg_test(
        level string,
        event_time timestamp,
        message string,
        register_time date,
        telephone array <string>
    )
    STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' 
    TBLPROPERTIES ('iceberg.mr.table.partition.spec'='...');

I am looking for a way to rewrite the original query (which is not 
syntactically valid in Hive) into the new, syntactically valid one.
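To make the intended transformation concrete, here is a minimal sketch of the 
text rewrite itself, done on the raw query string before it reaches the parser. 
Everything in it is an assumption for illustration: IcebergDdlRewriter and the 
specSerializer argument are hypothetical names, not part of Hive or Iceberg, 
and the format of the 'iceberg.mr.table.partition.spec' value is left to the 
caller. It does not answer where in Hive such a rewrite could be plugged in.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Hive or Iceberg class): rewrites the proposed
// "partition by spec(...) stored as iceberg" clause into STORED BY / TBLPROPERTIES.
final class IcebergDdlRewriter {

    // Matches the trailing clause; group 1 is the raw text between the parentheses.
    // Assumes the spec itself contains no nested parentheses.
    private static final Pattern CLAUSE = Pattern.compile(
        "partition\\s+by\\s+spec\\s*\\((.*?)\\)\\s*stored\\s+as\\s+iceberg\\s*;?\\s*$",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // specSerializer is an assumption: it turns the raw spec text into whatever
    // value the storage handler expects in the table property.
    static String rewrite(String sql, UnaryOperator<String> specSerializer) {
        Matcher m = CLAUSE.matcher(sql);
        if (!m.find()) {
            return sql; // not the extended syntax, leave the query untouched
        }
        // Collapse whitespace so the spec becomes a single-line string.
        String spec = m.group(1).trim().replaceAll("\\s+", " ");
        // Assumes backslash escaping of single quotes inside the string literal.
        String value = specSerializer.apply(spec).replace("'", "\\'");
        return sql.substring(0, m.start())
            + "STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'\n"
            + "    TBLPROPERTIES ('iceberg.mr.table.partition.spec'='" + value + "');";
    }
}
```

Calling IcebergDdlRewriter.rewrite(query, s -> s) on the first statement above 
would keep the column list as is and replace only the trailing clause; queries 
without that clause are returned unchanged.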

I looked at the hooks as a possible solution, but I have found that:
- HiveDriverRunHook.preDriverRun receives the original, syntactically invalid 
  query, but I have found no way to rewrite it into a syntactically valid one 
  (the query appears to be read-only there).
- HiveSemanticAnalyzerHook can rewrite the AST, but it needs a syntactically 
  valid query to start with.

Any other ideas on how to achieve the goals above, either with hooks or in 
some other way?

Thanks,
Peter
