yshcz opened a new issue, #2009:
URL: https://github.com/apache/iceberg-rust/issues/2009

   ### Apache Iceberg Rust version
   
   None
   
   ### Describe the bug
   
   When upgrading a table from v2 to v3, metadata serialization fails if the table contains existing snapshots. This occurs because `SnapshotV3` requires `first-row-id` and `added-rows`, but snapshots created before the upgrade do not have these values.
   
   According to the spec, when a table is upgraded to v3, existing snapshots should remain unmodified with `first-row-id` unset or null. iceberg-rust violates this by requiring all snapshots to have row lineage fields.
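
The behavior the spec describes can be sketched as follows. This is a minimal illustration only, not iceberg-rust code: `SnapshotSketch` and `to_v3_json` are hypothetical names, the JSON is assembled by hand to keep the example free of dependencies, and the real fix would presumably make the corresponding fields optional in the crate's serde types.

```rust
/// Hypothetical stand-in for a snapshot; NOT the iceberg-rust `Snapshot` type.
#[derive(Debug, Clone, PartialEq)]
struct SnapshotSketch {
    snapshot_id: i64,
    /// `Some((first_row_id, added_rows))` for snapshots written under v3,
    /// `None` for snapshots that predate the upgrade.
    row_range: Option<(u64, u64)>,
}

/// Render the snapshot as a JSON object for v3 metadata, emitting the row
/// lineage fields only when they are known instead of returning an error.
fn to_v3_json(s: &SnapshotSketch) -> String {
    let mut fields = vec![format!("\"snapshot-id\":{}", s.snapshot_id)];
    if let Some((first_row_id, added_rows)) = s.row_range {
        fields.push(format!("\"first-row-id\":{}", first_row_id));
        fields.push(format!("\"added-rows\":{}", added_rows));
    }
    format!("{{{}}}", fields.join(","))
}
```

With this shape, a snapshot carried over from v2 (`row_range: None`) serializes without the two lineage keys, while a snapshot written after the upgrade includes both.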
   
   The error message is: "v3 Snapshots must have first-row-id and rows-added fields set."
   
   This effectively makes v2 to v3 upgrades impossible for any table with snapshot history.
   
   ### To Reproduce
   
   Add the following test to crates/iceberg/src/spec/table_metadata.rs
   
   ```rust
   #[test]
   fn test_v2_to_v3_upgrade_with_existing_snapshot_serialization_fails() {
       // Create a v2 table metadata
       let schema = Schema::builder()
           .with_fields(vec![
               NestedField::required(1, "id", Type::Primitive(PrimitiveType::Long)).into(),
           ])
           .build()
           .unwrap();
   
       let v2_metadata = TableMetadataBuilder::new(
           schema,
           PartitionSpec::unpartition_spec().into_unbound(),
           SortOrder::unsorted_order(),
           "s3://bucket/test/location".to_string(),
           FormatVersion::V2,
           HashMap::new(),
       )
       .unwrap()
       .build()
       .unwrap()
       .metadata;
   
       // Add a v2 snapshot
       let snapshot = Snapshot::builder()
           .with_snapshot_id(1)
           .with_timestamp_ms(v2_metadata.last_updated_ms + 1)
           .with_sequence_number(1)
           .with_schema_id(0)
           .with_manifest_list("s3://bucket/test/metadata/snap-1.avro")
           .with_summary(Summary {
               operation: Operation::Append,
               additional_properties: HashMap::from([(
                   "added-data-files".to_string(),
                   "1".to_string(),
               )]),
           })
           .build();
   
       let v2_with_snapshot = v2_metadata
           .into_builder(Some("s3://bucket/test/metadata/v00001.json".to_string()))
           .add_snapshot(snapshot)
           .unwrap()
           .set_ref(crate::spec::snapshot::MAIN_BRANCH, SnapshotReference {
               snapshot_id: 1,
               retention: SnapshotRetention::Branch {
                   min_snapshots_to_keep: None,
                   max_snapshot_age_ms: None,
                   max_ref_age_ms: None,
               },
           })
           .unwrap()
           .build()
           .unwrap()
           .metadata;
   
       // Verify v2 serialization works fine
       let v2_json = serde_json::to_string(&v2_with_snapshot);
       assert!(v2_json.is_ok(), "v2 serialization should work");
   
       // Upgrade to v3
       let v3_metadata = v2_with_snapshot
           .into_builder(Some("s3://bucket/test/metadata/v00002.json".to_string()))
           .upgrade_format_version(FormatVersion::V3)
           .unwrap()
           .build()
           .unwrap()
           .metadata;
   
       assert_eq!(v3_metadata.format_version, FormatVersion::V3);
       assert_eq!(v3_metadata.snapshots.len(), 1);
   
       // Verify the snapshot has no row_range
       let snapshot = v3_metadata.snapshots.values().next().unwrap();
       assert!(
           snapshot.row_range().is_none(),
           "Snapshot should have no row_range after upgrade"
       );
   
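       // Try to serialize the upgraded v3 metadata. This currently fails,
       // so the assertion below documents the bug rather than the desired behavior.
       let v3_json = serde_json::to_string(&v3_metadata);
       assert!(v3_json.is_err());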
   }
   ```
   
   ### Expected behavior
   
   Upgraded v3 metadata should serialize successfully, with pre-upgrade snapshots keeping `first-row-id` unset.
   
   ### Willingness to contribute
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

