CTTY opened a new issue, #2021:
URL: https://github.com/apache/iceberg-rust/issues/2021
### Is your feature request related to a problem or challenge?
## Background
DataFusion already supports CREATE EXTERNAL TABLE ... STORED AS ICEBERG.
Today, iceberg-rust integrates via IcebergTableProviderFactory, but the factory
primarily supports registering a static table (e.g., created from a metadata
JSON path). That works for:
```sql
-- Static table (existing, backward compatible)
CREATE EXTERNAL TABLE my_table
STORED AS ICEBERG
LOCATION '/path/to/metadata.json';
```
However, we also want CREATE EXTERNAL TABLE to create a normal
IcebergTableProvider backed by a Catalog, so users can define the catalog via
SQL OPTIONS (and then resolve tables by identifier through that catalog).
### Describe the solution you'd like
Dumping my thoughts here; feedback is welcome!
### Option A: Build Catalog inside the ProviderFactory using `OPTIONS`
`IcebergTableProviderFactory` parses `OPTIONS` and uses a `CatalogBuilder`
to construct the `Catalog` internally, then creates a normal
`IcebergTableProvider`:
```sql
CREATE EXTERNAL TABLE my_table
STORED AS ICEBERG
-- LOCATION is ignored if a catalog is configured.
LOCATION 'ignored_or_optional'
OPTIONS (
  -- If the catalog type is not configured, fall back to creating a static table.
  'datafusion.iceberg.catalog.type' = 'rest',
  'datafusion.iceberg.catalog.uri' = 'http://localhost:8181',
  'datafusion.iceberg.catalog.warehouse' = 's3://bucket/warehouse'
);
```
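The fallback rule in Option A can be sketched with the standard library alone. This is only an illustration under stated assumptions: `TableSource` and `resolve_table_source` are hypothetical names, the options are assumed to arrive as a plain `HashMap<String, String>`, and the step that would hand the collected properties to a `CatalogBuilder` is left out.

```rust
use std::collections::HashMap;

/// Prefix shared by the catalog-related keys in the OPTIONS clause above.
const CATALOG_PREFIX: &str = "datafusion.iceberg.catalog.";

/// Hypothetical type: which path the factory would take for a given statement.
#[derive(Debug, PartialEq)]
pub enum TableSource {
    /// No catalog type configured: load the table from a metadata JSON path.
    Static { metadata_location: String },
    /// Catalog type configured: build a catalog from the remaining properties.
    Catalog {
        catalog_type: String,
        props: HashMap<String, String>,
    },
}

/// Hypothetical helper: decide between the static and catalog-backed paths.
pub fn resolve_table_source(location: &str, options: &HashMap<String, String>) -> TableSource {
    // Keep only the catalog keys and strip the shared prefix from them.
    let mut props: HashMap<String, String> = options
        .iter()
        .filter_map(|(key, value)| {
            key.strip_prefix(CATALOG_PREFIX)
                .map(|short| (short.to_string(), value.clone()))
        })
        .collect();

    // The presence of `type` selects the catalog path; LOCATION is then ignored.
    match props.remove("type") {
        Some(catalog_type) => TableSource::Catalog { catalog_type, props },
        None => TableSource::Static {
            metadata_location: location.to_string(),
        },
    }
}
```

With the options from the statement above, this sketch would return `TableSource::Catalog` with type `rest` and the `uri` and `warehouse` properties; with no catalog options it returns `TableSource::Static`, which keeps the existing behavior.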
### Option B: Allow injecting a pre-built Catalog into the factory
Essentially, we would have:
```rust
pub struct IcebergTableProviderFactory {
    // When `None`, fall back to the static table.
    catalog: Option<Arc<dyn Catalog>>,
}
// ...
impl IcebergTableProviderFactory {
    pub fn new_with_catalog(catalog: Arc<dyn Catalog>) -> Self {
        Self { catalog: Some(catalog) }
    }
}
```
I prefer this option because it is much more straightforward. One drawback I
can think of is that users cannot easily use multiple catalogs at the same
time. A workaround would look like this:
```rust
// Register one factory per catalog under its own `STORED AS` name.
state.table_factories_mut().insert(
    "ICEBERG_REST_A".to_string(),
    Arc::new(IcebergTableProviderFactory::new_with_catalog(rest_catalog_a)),
);
state.table_factories_mut().insert(
    "ICEBERG_REST_B".to_string(),
    Arc::new(IcebergTableProviderFactory::new_with_catalog(rest_catalog_b)),
);
```
Then, when creating the table using SQL:
```sql
CREATE EXTERNAL TABLE my_table
STORED AS ICEBERG_REST_A
...
```
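To make the Option B behavior concrete, here is a dependency-free sketch of the fallback and the one-factory-per-format workaround. Everything in it is a stand-in: the `Catalog` trait is reduced to a single `name` method, and `NamedCatalog`, `FactorySketch`, `describe`, and `build_registry` are hypothetical names. The plain `HashMap` only imitates the map returned by `table_factories_mut()`.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for the catalog trait, reduced to what this sketch needs.
pub trait Catalog: Send + Sync {
    fn name(&self) -> &str;
}

/// Hypothetical catalog used only for illustration.
pub struct NamedCatalog(pub &'static str);

impl Catalog for NamedCatalog {
    fn name(&self) -> &str {
        self.0
    }
}

/// Simplified stand-in for the proposed factory.
pub struct FactorySketch {
    /// When `None`, the factory falls back to the static table path.
    catalog: Option<Arc<dyn Catalog>>,
}

impl FactorySketch {
    pub fn new() -> Self {
        Self { catalog: None }
    }

    pub fn new_with_catalog(catalog: Arc<dyn Catalog>) -> Self {
        Self { catalog: Some(catalog) }
    }

    /// Report which path a `CREATE EXTERNAL TABLE` statement would take.
    pub fn describe(&self, table: &str, location: &str) -> String {
        match &self.catalog {
            Some(catalog) => format!("{} via catalog {}", table, catalog.name()),
            None => format!("{} from static metadata {}", table, location),
        }
    }
}

/// Register one factory per `STORED AS` name, mirroring the workaround above.
pub fn build_registry() -> HashMap<String, Arc<FactorySketch>> {
    let mut factories = HashMap::new();
    // The plain name keeps the existing static-table behavior.
    factories.insert("ICEBERG".to_string(), Arc::new(FactorySketch::new()));
    for (format, catalog_name) in [("ICEBERG_REST_A", "rest_a"), ("ICEBERG_REST_B", "rest_b")] {
        let catalog: Arc<dyn Catalog> = Arc::new(NamedCatalog(catalog_name));
        factories.insert(
            format.to_string(),
            Arc::new(FactorySketch::new_with_catalog(catalog)),
        );
    }
    factories
}
```

In this sketch, a statement using `STORED AS ICEBERG_REST_A` would resolve to the factory holding `rest_a`, while `STORED AS ICEBERG` would still take the static metadata path.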
### Willingness to contribute
I can contribute to this feature independently.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]