RussellSpitzer opened a new pull request, #15241: URL: https://github.com/apache/iceberg/pull/15241
### Context

This is part of the change to write metadata in columnar formats. Once that change is made, calls to ManifestReader() that do not pass through partitionSpecByID will error out on Parquet manifest files. Snapshot.addedFiles and its friends are some of the main non-test users of this path, so we need to remove those methods and switch our usage to a version that passes through partitionSpecByID. Otherwise, switching to Parquet manifests will cause issues throughout the codebase.

### This PR

We currently offer several methods for getting the files changed in a snapshot, but they rely on the assumption that the partition_spec can be read from the manifest metadata. With the move to Parquet manifests, we will no longer be able to rely on this part of the manifest read code. This PR deprecates those existing methods and creates a new utility class that does the same thing as the old Snapshot methods. The new utility class does not assume that the manifest read code can read the partition_spec info and instead takes it as an argument.

#### Production Code Changes

##### Core

1. CherryPickOperation
2. MicroBatches

##### Flink

1. TableChange

##### Spark

1. MicroBatchStream

#### Test Changes

Unfortunately, there are also a huge number of test usages of these methods; the majority of this commit is cleaning those up.
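
#### Illustrative sketch

To make the difference described under "This PR" concrete, here is a minimal, self-contained Java sketch of the two lookup styles. It does not use the Iceberg API: `SpecLookupSketch`, `Spec`, `Manifest`, `FileWithSpec`, `addedFilesFromEmbeddedSpec`, and `addedFiles` are all hypothetical names invented for illustration, and the real utility class in this PR may look quite different. The only idea it is meant to show is that the caller supplies the specs keyed by ID instead of the reader pulling the spec out of the manifest metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, NOT the Iceberg API: every type and method here is an assumption.
final class SpecLookupSketch {

  // Stand-in for a partition spec.
  static final class Spec {
    final int id;
    final String description;

    Spec(int id, String description) {
      this.id = id;
      this.description = description;
    }
  }

  // Stand-in for a manifest. embeddedSpec == null models a manifest format
  // whose metadata does not carry a readable partition spec.
  static final class Manifest {
    final int specId;
    final Spec embeddedSpec;
    final List<String> addedPaths;

    Manifest(int specId, Spec embeddedSpec, List<String> addedPaths) {
      this.specId = specId;
      this.embeddedSpec = embeddedSpec;
      this.addedPaths = addedPaths;
    }
  }

  // An added file paired with the spec it was resolved against.
  static final class FileWithSpec {
    final String path;
    final Spec spec;

    FileWithSpec(String path, Spec spec) {
      this.path = path;
      this.spec = spec;
    }
  }

  // Old style: assumes the spec can be read from the manifest itself.
  static List<FileWithSpec> addedFilesFromEmbeddedSpec(List<Manifest> manifests) {
    List<FileWithSpec> result = new ArrayList<>();
    for (Manifest manifest : manifests) {
      if (manifest.embeddedSpec == null) {
        throw new IllegalStateException("manifest carries no partition spec metadata");
      }
      for (String path : manifest.addedPaths) {
        result.add(new FileWithSpec(path, manifest.embeddedSpec));
      }
    }
    return result;
  }

  // New style: the caller passes the specs keyed by ID, so nothing is read
  // from manifest metadata beyond the spec ID.
  static List<FileWithSpec> addedFiles(List<Manifest> manifests, Map<Integer, Spec> specsById) {
    List<FileWithSpec> result = new ArrayList<>();
    for (Manifest manifest : manifests) {
      Spec spec = specsById.get(manifest.specId);
      if (spec == null) {
        throw new IllegalArgumentException("unknown partition spec ID: " + manifest.specId);
      }
      for (String path : manifest.addedPaths) {
        result.add(new FileWithSpec(path, spec));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<Integer, Spec> specsById = new HashMap<>();
    specsById.put(0, new Spec(0, "unpartitioned"));

    // A manifest without an embedded spec, as in the Parquet case described above.
    List<Manifest> manifests =
        Arrays.asList(new Manifest(0, null, Arrays.asList("a.parquet", "b.parquet")));

    for (FileWithSpec file : addedFiles(manifests, specsById)) {
      System.out.println(file.path + " -> spec " + file.spec.id);
    }
  }
}
```

In this model, the old-style method fails on a manifest with no embedded spec, while the new-style method succeeds as long as the caller supplies a map that covers the manifest's spec ID. That is the same trade the PR describes: callers take on the job of supplying the specs, and the read path stops depending on the manifest format.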
