Hey folks,

Lately I've been thinking about integrating Airflow with Iceberg to smooth
the transition from Hive-based tables to Iceberg ones, and I would like to
hear about your experience, specifically with Iceberg partition sensors in
Airflow.

The way I see it, there are two ways to go about this (at least for
Hive-based catalogs):


   1. Modify our Hive Metastore API so that partition API calls are handled
   directly by the Iceberg API. This has the advantage of being mostly
   transparent to users, but the downside of being confusing, since Iceberg
   creates tables in the Hive catalog as external, non-partitioned tables.
   2. Create a separate sensor that makes it clear that we are sensing over
   an Iceberg table. This is probably the most straightforward approach, but
   if we do this we would probably need to do the same for any other tool
   that uses the metastore to get partition information.
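
To make option 2 a bit more concrete, here is a minimal sketch of what the
core of such a sensor might look like. Everything named here is hypothetical:
`IcebergPartitionSensorSketch`, `partition_present`, and the `list_partitions`
callable are illustrative names, not an existing Airflow or Iceberg API. The
assumption is that `list_partitions` returns the rows of the Iceberg table's
`partitions` metadata table as dicts, however you choose to read it. In a real
DAG the class would subclass Airflow's `BaseSensorOperator`, whose `poke`
method is called repeatedly until it returns True.

```python
from typing import Callable, Iterable, Mapping

# Hypothetical type: a callable that takes a table name and yields one
# mapping per partition (e.g. rows read from Iceberg's `partitions`
# metadata table). How it is implemented is left open on purpose.
PartitionLister = Callable[[str], Iterable[Mapping[str, object]]]


def partition_present(
    table_name: str,
    wanted: Mapping[str, object],
    list_partitions: PartitionLister,
) -> bool:
    """Return True if some partition matches every key/value in `wanted`."""
    for partition in list_partitions(table_name):
        # Compare as strings so that "2024-01-01" matches a date-like value.
        if all(
            key in partition and str(partition[key]) == str(value)
            for key, value in wanted.items()
        ):
            return True
    return False


class IcebergPartitionSensorSketch:
    """Sketch only: in Airflow this would subclass BaseSensorOperator."""

    def __init__(
        self,
        table_name: str,
        partition: Mapping[str, object],
        list_partitions: PartitionLister,
    ) -> None:
        self.table_name = table_name
        self.partition = dict(partition)
        self.list_partitions = list_partitions

    def poke(self, context=None) -> bool:
        # Airflow re-invokes poke() until it returns True or the sensor
        # times out; here it just delegates to the pure function above.
        return partition_present(
            self.table_name, self.partition, self.list_partitions
        )
```

Keeping the partition lookup behind a callable means the same check could be
reused by other tools that currently ask the metastore for partitions, which
is the main cost of option 2.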


Would love to hear what your experiences have been.
Thanks
