advancedxy commented on PR #10755:
URL: https://github.com/apache/iceberg/pull/10755#issuecomment-2246811037

   >  I don't think I'd go with a general SetPartitionSpecs update, I think I'd 
have a RemovePartitionSpec, and the TableMetadata builder APIs to remove a 
given spec (which will have validation that we're not removing the current 
spec, the spec to remove is a valid partition spec etc).
   
   This is a nice suggestion and better approach. I will change the 
`withSpecs/setSpecs` by using `removePartitionSpec`.
   
   Replying other comments inline. 
   > 1. Should we included removing unused schemas? 
   
   No, and I think we are on the same page. `PruneUnusedSchemas` should be in a 
separate and dedicated action. However, it might not be possible to do that in 
current code as there's no SchemaId bounding to DataFile/DeleteFile. It's 
impossible to decide which schemas are unused?
   
   > In this approach I think we'd have to do a REST spec change to introduce 
the new update type. If we think this API is worth it for general purpose 
metadata cleaning, beyond preventing the drop column issue then we'd probably 
have to go through with the spec change. However, if this is only being 
introduced for the drop column issue, maybe we want to think through lighter 
weight options to solve that particular issue?
   
   I don't think we should expose `removePartitionSpec` to the REST catalog, at 
least for now. I think `RemoveUnusedSpecs` is a general metadata cleaning API 
and worth introducing, however the API is self-contained and doesn't have to be 
coupled with REST catalog. Like the comment in the TableMetadata, it's not safe 
for external client to simply call `removePartitionSpec` without checking the 
spec is unused, which might not an easy task in the REST catalog. Unless we 
think it's worthy to add check by reading manifest files in the REST catalog, 
in that way, we may expose the `removePartitionSpec` to the REST catalog. 
That's why the `org.apache.iceberg.MetadataUpdate.SetPartitionSpecs#applyTo` 
method throws an exception instead of actually implementing it.
   
   If introducing a new MetadataUpdate implies a REST spec change, I think we 
can change the `TableMetadata.Builder#build` to build without adding a 
`RemovePartitionSpec` change, how does that sounds to you?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to