CTTY opened a new issue, #2021:
URL: https://github.com/apache/iceberg-rust/issues/2021

   ### Is your feature request related to a problem or challenge?
   
   ## Background
   
   DataFusion already supports CREATE EXTERNAL TABLE ... STORED AS ICEBERG. 
Today, iceberg-rust integrates via IcebergTableProviderFactory, but the factory 
primarily supports registering a static table (e.g., created from a metadata 
JSON path). That works for:
   ```sql
   -- Static table (existing, backward compatible)
   CREATE EXTERNAL TABLE my_table
   STORED AS ICEBERG
   LOCATION '/path/to/metadata.json';
   ```
   
   However, we also want CREATE EXTERNAL TABLE to create a normal 
IcebergTableProvider backed by a Catalog, so users can define the catalog via 
SQL OPTIONS (and then resolve tables by identifier through that catalog).
   
   ### Describe the solution you'd like
   
   Dumping my thoughts here; feedback is welcome!
   
   ### Option A: Build Catalog inside the ProviderFactory using `OPTIONS`
   `IcebergTableProviderFactory` parses `OPTIONS` and uses a `CatalogBuilder` 
to construct the `Catalog` internally, then creates a normal 
`IcebergTableProvider`:
   ```sql
   CREATE EXTERNAL TABLE my_table
   STORED AS ICEBERG
   -- LOCATION is ignored if a catalog is configured
   LOCATION 'ignored_or_optional'
   OPTIONS (
     -- If the catalog type is not configured, fall back to creating a static table
     'datafusion.iceberg.catalog.type' = 'rest',
     'datafusion.iceberg.catalog.uri' = 'http://localhost:8181',
     'datafusion.iceberg.catalog.warehouse' = 's3://bucket/warehouse'
   );
   ```
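
To make the fallback rule in Option A concrete, here is a minimal, stdlib-only sketch of how the factory might split the `OPTIONS` map. Everything in it is an assumption for illustration: the `TableSource` enum, the `resolve_table_source` function, and the idea of stripping the `datafusion.iceberg.catalog.` prefix before handing properties to a catalog builder are hypothetical, not existing iceberg-rust or DataFusion APIs.

```rust
use std::collections::HashMap;

/// Prefix taken from the example option keys above (assumed naming).
const CATALOG_PREFIX: &str = "datafusion.iceberg.catalog.";

/// Hypothetical result of inspecting the `OPTIONS` clause.
#[derive(Debug, PartialEq)]
enum TableSource {
    /// No catalog type configured: fall back to a static table at `LOCATION`.
    Static { metadata_location: String },
    /// Catalog type configured: the remaining properties would be passed
    /// to a catalog builder for that type.
    Catalog {
        catalog_type: String,
        props: HashMap<String, String>,
    },
}

/// Decide between the catalog-backed path and the static-table fallback.
fn resolve_table_source(location: &str, options: &HashMap<String, String>) -> TableSource {
    // Keep only catalog-related keys and strip the shared prefix.
    let mut props: HashMap<String, String> = options
        .iter()
        .filter_map(|(key, value)| {
            key.strip_prefix(CATALOG_PREFIX)
                .map(|short| (short.to_string(), value.clone()))
        })
        .collect();

    // The presence of `type` selects the catalog-backed path.
    match props.remove("type") {
        Some(catalog_type) => TableSource::Catalog { catalog_type, props },
        None => TableSource::Static {
            metadata_location: location.to_string(),
        },
    }
}
```

With the options from the SQL example, this would yield `TableSource::Catalog` with `catalog_type == "rest"` and the `uri` and `warehouse` properties; with no catalog options, it would yield `TableSource::Static` using the `LOCATION` value.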
   
   ### Option B: Allow injecting a pre-built Catalog into the factory
   Essentially, we would have:
   ```rust
   pub struct IcebergTableProviderFactory {
     // When `None`, fall back to creating a static table.
     catalog: Option<Arc<dyn Catalog>>,
   }
   ...
   IcebergTableProviderFactory::new_with_catalog(Arc<dyn Catalog>)
   ```
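
A slightly fuller sketch of Option B follows, again stdlib-only so it can be compiled in isolation. The `Catalog` trait here is a one-method stand-in for the real trait, and `NamedCatalog` and `describe` are hypothetical helpers that only report which path the factory would take; the real factory would build a table provider instead.

```rust
use std::sync::Arc;

/// Stand-in for the real `Catalog` trait (hypothetical, reduced to one method).
trait Catalog: Send + Sync {
    fn name(&self) -> String;
}

struct IcebergTableProviderFactory {
    // When `None`, fall back to creating a static table.
    catalog: Option<Arc<dyn Catalog>>,
}

impl IcebergTableProviderFactory {
    /// Existing behavior: no catalog, static tables only.
    fn new() -> Self {
        Self { catalog: None }
    }

    /// Proposed constructor: inject a pre-built catalog.
    fn new_with_catalog(catalog: Arc<dyn Catalog>) -> Self {
        Self {
            catalog: Some(catalog),
        }
    }

    /// Hypothetical helper: report which path table creation would take.
    fn describe(&self, table_name: &str, location: &str) -> String {
        match &self.catalog {
            Some(catalog) => format!(
                "load `{}` through catalog `{}`",
                table_name,
                catalog.name()
            ),
            None => format!("static table from `{}`", location),
        }
    }
}

/// Toy catalog used only to exercise the sketch.
struct NamedCatalog(String);

impl Catalog for NamedCatalog {
    fn name(&self) -> String {
        self.0.clone()
    }
}
```

The point of the `Option` is that `IcebergTableProviderFactory::new()` keeps today's static-table behavior unchanged, while `new_with_catalog` opts in to catalog-backed resolution.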
   
   I prefer this option because it is much more straightforward. One drawback 
I can think of is that users cannot easily use multiple catalogs at the same 
time. A workaround would look like this:
   ```rust
   state
       .table_factories_mut()
       .insert(
           "ICEBERG_REST_A".to_string(),
           Arc::new(IcebergTableProviderFactory::new_with_catalog(rest_catalog_a)),
       );

   state
       .table_factories_mut()
       .insert(
           "ICEBERG_REST_B".to_string(),
           Arc::new(IcebergTableProviderFactory::new_with_catalog(rest_catalog_b)),
       );
   ```
   
   Then, when creating the table using SQL:
   ```sql
   CREATE EXTERNAL TABLE my_table
   STORED AS ICEBERG_REST_A
   ...
   ```
   
   ### Willingness to contribute
   
   I can contribute to this feature independently


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

