awslife opened a new issue, #10280:
URL: https://github.com/apache/gravitino/issues/10280
### Describe the feature
Currently, Iceberg snapshot maintenance procedures available in the native
Trino Iceberg Connector are not supported through the Gravitino Trino Connector.
This feature request proposes that the Gravitino Trino Connector fully support
delegating Iceberg system procedures, so that users can manage the
snapshot lifecycle entirely within Trino without relying on external
tools such as Spark or the Iceberg Java API.
### Motivation
When using Gravitino as a unified metadata layer, users naturally expect
that all catalog operations — including maintenance tasks — are accessible
through the same interface.
Currently, snapshot cleanup must be performed via a separate tool (e.g.,
Spark, Iceberg Java API), which introduces operational complexity and breaks
the unified access model that Gravitino aims to provide.
Supporting these procedures through the Gravitino Trino Connector would:
- Allow users to fully manage Iceberg table lifecycle within a single Trino
interface.
- Eliminate the need to maintain a separate Spark or Java-based pipeline
solely for snapshot cleanup.
- Strengthen Gravitino's value as a truly unified metadata and catalog
management layer.
### Describe the solution
The following Iceberg system procedures should be supported via the
Gravitino Trino Connector:
Procedure | Description
-- | --
`system.expire_snapshots` | Remove snapshots older than a given timestamp
`system.remove_orphan_files` | Delete orphan data files not referenced by any snapshot
`system.rewrite_data_files` | Compact small data files into larger ones
`system.rewrite_manifests` | Rewrite manifest files for improved query performance
**Example Usage** (expected to work after this feature is implemented)
```sql
-- Expire old snapshots
CALL gravitino_catalog.system.expire_snapshots(
    schema_name => 'my_schema',
    table_name => 'my_table',
    older_than => TIMESTAMP '2024-01-01 00:00:00'
);
-- Remove orphan files
CALL gravitino_catalog.system.remove_orphan_files(
    schema_name => 'my_schema',
    table_name => 'my_table'
);
-- Compact small files
CALL gravitino_catalog.system.rewrite_data_files(
    schema_name => 'my_schema',
    table_name => 'my_table'
);
```
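For comparison, the native Trino Iceberg Connector documents its maintenance
operations as table procedures run through `ALTER TABLE ... EXECUTE` rather
than `CALL`. A minimal sketch of what the equivalent statements might look like
once delegation works, assuming the same placeholder catalog, schema, and table
names as above; the retention and size values are illustrative only:
```sql
-- Expire snapshots older than the retention threshold (illustrative value)
ALTER TABLE gravitino_catalog.my_schema.my_table
    EXECUTE expire_snapshots(retention_threshold => '7d');
-- Remove files that are no longer referenced by any snapshot
ALTER TABLE gravitino_catalog.my_schema.my_table
    EXECUTE remove_orphan_files(retention_threshold => '7d');
-- Compact small data files (the threshold is an assumption, tune as needed)
ALTER TABLE gravitino_catalog.my_schema.my_table
    EXECUTE optimize(file_size_threshold => '128MB');
```
Whether the Gravitino Trino Connector should accept this form, the `CALL` form,
or both is left open here; the request is that the operation reaches the
underlying Iceberg catalog.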
**Current Behavior**
These procedure calls either fail with an error (e.g., procedure not found,
unsupported operation) or complete silently without actually performing the
expected maintenance operations.
This is because the Gravitino Trino Connector acts as a metadata proxy layer
and does not currently delegate Iceberg-specific system procedures to the
underlying catalog.
**Expected Behavior**
The Gravitino Trino Connector should properly intercept and delegate Iceberg
system procedure calls to the underlying Iceberg catalog, in the same way that
the native Trino Iceberg Connector handles them.
**Environment**
- Apache Gravitino version: 1.2.0-rc6
- Trino version: 472
- Iceberg version: 1.8
- Catalog type: Iceberg (backed by REST)
### Additional context
This feature is particularly important for production environments where
automated snapshot expiration and storage cost management are critical
operational requirements.
Without this feature, Gravitino cannot be adopted as a complete metadata
management solution for Iceberg-heavy workloads.
Thank you for considering this feature request!