Re: Airflow Integration

2021-03-02 Thread Ryan Blue
I think it’s difficult for Iceberg to support the partition-related commands because it ends up being a scan over metadata files rather than a metastore operation. We have been trying to move away from our own getPartitions API because it is expensive to satisfy those queries compared to Hive becau

Re: Airflow Integration

2021-03-02 Thread Gustavo Torres Torres
Thanks Peter! So in that case we do let users create Iceberg tables with Hive DDL `CREATE EXTERNAL TABLE ice_table PARTITIONED BY ...` but my guess is that `SHOW PARTITIONS ice_table` would not work. Has there been any discussion about whether Iceberg tables in Hive should support these partition

Re: Airflow Integration

2021-03-02 Thread Peter Vary
Hi Gustavo, I'm not too familiar with the Airflow user base/use cases, but we had to consider similar things when we decided what to do with `CREATE EXTERNAL TABLE ice_table PARTITIONED BY ...` Hive queries. See: https://github.com/apache/iceberg/pull/1917

Airflow Integration

2021-03-01 Thread Gustavo Torres Torres
Hey folks, Lately I've been thinking about integration between Airflow & Iceberg for a smooth transition from Hive-based tables to Iceberg ones and would like to hear about your experience. Specifically about Iceberg partition sensors in Airflow. From the way I see it, there are two ways to go a