awslife opened a new issue, #10280:
URL: https://github.com/apache/gravitino/issues/10280

   ### Describe the feature
   
   Currently, Iceberg snapshot maintenance procedures available in the native 
Trino Iceberg Connector are not supported through the Gravitino Trino Connector.
   This feature request proposes that the Gravitino Trino Connector fully 
support delegating Iceberg system procedures to the underlying catalog, so that 
users can manage the snapshot lifecycle entirely within Trino without relying on 
external tools such as Spark or the Iceberg Java API.
   
   ### Motivation
   
   When using Gravitino as a unified metadata layer, users naturally expect 
that all catalog operations — including maintenance tasks — are accessible 
through the same interface.
   Currently, snapshot cleanup must be performed via a separate tool (e.g., 
Spark, Iceberg Java API), which introduces operational complexity and breaks 
the unified access model that Gravitino aims to provide.
   Supporting these procedures through the Gravitino Trino Connector would:
   
   - Allow users to fully manage Iceberg table lifecycle within a single Trino 
interface.
   - Eliminate the need to maintain a separate Spark or Java-based pipeline 
solely for snapshot cleanup.
   - Strengthen Gravitino's value as a truly unified metadata and catalog 
management layer.
   
   ### Describe the solution
   
   The following Iceberg system procedures should be supported via the 
Gravitino Trino Connector:
   
   Procedure | Description
   -- | --
   system.expire_snapshots | Remove snapshots older than a given timestamp
   system.remove_orphan_files | Delete orphan data files not referenced by any snapshot
   system.rewrite_data_files | Compact small data files into larger ones
   system.rewrite_manifests | Rewrite manifest files for improved query performance
   
   Example Usage (Expected to work after this feature is implemented)
   ```sql
   -- Expire old snapshots
   CALL gravitino_catalog.system.expire_snapshots(
       schema_name => 'my_schema',
       table_name  => 'my_table',
       older_than  => TIMESTAMP '2024-01-01 00:00:00'
   );
   
   -- Remove orphan files
   CALL gravitino_catalog.system.remove_orphan_files(
       schema_name => 'my_schema',
       table_name  => 'my_table'
   );
   
   -- Compact small files
   CALL gravitino_catalog.system.rewrite_data_files(
       schema_name => 'my_schema',
       table_name  => 'my_table'
   );
   ```
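   
   The fourth procedure in the table has no example above. A minimal sketch, assuming it would take the same hypothetical `schema_name` and `table_name` arguments as the other calls (the final signature is not defined by this request):
   
   ```sql
   -- Rewrite manifests (hypothetical signature, mirroring the examples above)
   CALL gravitino_catalog.system.rewrite_manifests(
       schema_name => 'my_schema',
       table_name  => 'my_table'
   );
   ```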
   
   **Current Behavior**
   These procedure calls either fail with an error (e.g., procedure not found, 
unsupported operation) or complete silently without actually performing the 
expected maintenance operations.
   This is because the Gravitino Trino Connector acts as a metadata proxy layer 
and does not currently delegate Iceberg-specific system procedures to the 
underlying catalog.
   
   **Expected Behavior**
   The Gravitino Trino Connector should properly intercept and delegate Iceberg 
system procedure calls to the underlying Iceberg catalog, in the same way that 
the native Trino Iceberg Connector handles them.
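   
   For comparison, a sketch of the maintenance commands as the native Trino Iceberg Connector documents them: table procedures run through `ALTER TABLE ... EXECUTE` rather than `CALL`. The catalog name `iceberg` and the threshold values are illustrative assumptions, and the available commands may differ by Trino version:
   
   ```sql
   -- Expire snapshots older than the retention threshold
   ALTER TABLE iceberg.my_schema.my_table
       EXECUTE expire_snapshots(retention_threshold => '7d');
   
   -- Remove files that are no longer referenced by table metadata
   ALTER TABLE iceberg.my_schema.my_table
       EXECUTE remove_orphan_files(retention_threshold => '7d');
   
   -- Compact small data files into larger ones
   ALTER TABLE iceberg.my_schema.my_table
       EXECUTE optimize(file_size_threshold => '128MB');
   ```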
   
   **Environment**
   
   - Apache Gravitino version: 1.2.0-rc6
   - Trino version: 472
   - Iceberg version: 1.8
   - Catalog type: Iceberg (backed by REST)
   
   
   ### Additional context
   
   This feature is particularly important for production environments where 
automated snapshot expiration and storage cost management are critical 
operational requirements.
   Without this feature, Gravitino cannot be adopted as a complete metadata 
management solution for Iceberg-heavy workloads.
   Thank you for considering this feature request!

