Hi Hive experts,

I would like to extend the Hive SQL language to provide a way to create 
Iceberg partitioned tables like this:
create table iceberg_test(
        level string,
        event_time timestamp,
        message string,
        register_time date,
        telephone array <string>
    )
    partition by spec(
        level identity,
        event_time identity,
        event_time hour,
        register_time day
    )
    stored as iceberg;

The problem is that this syntax is very specific to Iceberg, and I think it is 
not a good idea to change the Hive syntax globally to accommodate a specific 
use case.
The following CREATE TABLE statement could achieve the same thing:
create table iceberg_test(
        level string,
        event_time timestamp,
        message string,
        register_time date,
        telephone array <string>
    )
    STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' 
    TBLPROPERTIES ('iceberg.mr.table.partition.spec'='...');

I am looking for a way to rewrite the original query (which is not 
syntactically valid Hive) into a new, syntactically valid one.
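
To make the intended transformation concrete, here is a minimal sketch of the 
rewrite as a plain string operation. It is only an illustration under several 
assumptions: the class SpecRewriteSketch and the methods rewrite and 
encodeSpec are hypothetical names, not part of Hive or Iceberg; the regular 
expression only handles the simple clause shape shown above; and encodeSpec 
is a placeholder, since the real value format of 
'iceberg.mr.table.partition.spec' is not spelled out here. Where such a 
rewrite could be plugged into Hive is exactly the open question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Hive or Iceberg API.
final class SpecRewriteSketch {

    // Matches "partition by spec( ... ) stored as iceberg", ignoring case.
    // Assumption: the spec body contains no nested parentheses.
    private static final Pattern SPEC_CLAUSE = Pattern.compile(
        "partition\\s+by\\s+spec\\s*\\(([^)]*)\\)\\s*stored\\s+as\\s+iceberg",
        Pattern.CASE_INSENSITIVE);

    // Placeholder encoding: trims the spec body and collapses whitespace.
    // The real property value format is an open detail and would replace this.
    static String encodeSpec(String rawSpec) {
        return rawSpec.trim().replaceAll("\\s+", " ");
    }

    // Returns the statement with the custom clause replaced by STORED BY and
    // TBLPROPERTIES, or the input unchanged if the clause is not present.
    static String rewrite(String sql) {
        Matcher m = SPEC_CLAUSE.matcher(sql);
        if (!m.find()) {
            return sql;
        }
        String replacement =
            "STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'\n"
            + "    TBLPROPERTIES ('iceberg.mr.table.partition.spec'='"
            + encodeSpec(m.group(1)) + "')";
        // Splice with substring so the replacement text is taken literally.
        return sql.substring(0, m.start()) + replacement + sql.substring(m.end());
    }

    public static void main(String[] args) {
        String sql = "create table iceberg_test(level string)\n"
            + "    partition by spec(\n"
            + "        level identity\n"
            + "    )\n"
            + "    stored as iceberg;";
        System.out.println(rewrite(sql));
    }
}
```

A regex is used only to keep the sketch short; a real solution would more 
likely need to work at the parser level.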

I looked at the hooks as a possible solution, but found that:
- HiveDriverRunHook.preDriverRun receives the original, syntactically invalid 
  query, but I found no way to rewrite it into a valid one (the query appears 
  to be read-only at that point).
- HiveSemanticAnalyzerHook can rewrite the AST, but it needs a syntactically 
  valid query to start with.

Are there any other ideas for how to achieve the goals above, either with 
hooks or in some other way?

Thanks,
Peter
