emkornfield commented on code in PR #1876:
URL: https://github.com/apache/iceberg-rust/pull/1876#discussion_r2752454670


##########
crates/iceberg/src/spec/table_metadata.rs:
##########
@@ -468,9 +483,44 @@ impl TableMetadata {
         file_io: &FileIO,
         metadata_location: impl AsRef<str>,

Review Comment:
   > I think we may need to change metadata_location's type to the struct 
MetadataLocation, rather than impl AsRef<str>. 
   
   I was trying to avoid a breaking change here, are there general policies I 
should be aware of (for some reason I thought there was a preference not have 
breaking changes).  if we don't mind breaking changes, I'm happy to make the 
change).
   
   > Also we should not infer compression codec from suffix, instead we should 
infer it from TableMetadata itself. With this change, we could avoid a lot of 
validation in the method body.
   
   I don't believe we are inferring compression codec from the suffix.  The 
complexity here is since we are only deprecating `with_next_version`, we end up 
writing the wrong name under the following scenario.
   
   1.  Metadata file 0 - has no compression
   2.  Caller uses `with_next_version` to generate the next location.
   3.  Caller updates to use compression = 'gz'
   
   Without this logic we would write out a gzipped file without the compression 
suffix, which would break other readers (e.g. Java library).
   
   I realize I actually missed handling the same transition for 
`CompressionCodec::None => (json_data, 
metadata_location.as_ref().to_string()),` which would check if the suffix is 
compressed and remove it.
   
   if we chose to just delete `with_next_version` (or update to take 
TableMetadata on the existing method). A lot of complexity goes away (we no 
longer need parse the suffix, although it might be nice for display purposes) 
and we don't need the special checks here.
   
   So in short I think the two paths forward are:
   1.  Breaking change for MetadataLocation to force taking properties on next 
version (and not deprecate the method).
   2. Fix the Match on None to handle a similar down-grade.
   
   I think other options introduce a risk that clients can end up writing out 
metadata json files that other readers will not be able to read.
    



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to