Thanks, Kevin. The first change is not in the versioned docs, so it can be released anytime.
Regards,
Manu

On Fri, Jul 25, 2025 at 4:21 AM Kevin Liu <kevinjq...@apache.org> wrote:

> The 3 PRs above are merged. Thanks everyone for the review.
>
> I've added 2 more PRs to the 1.10 milestone. These are both nice-to-haves.
> - docs: add subpage for REST Catalog Spec in "Specification" #13521
>   <https://github.com/apache/iceberg/pull/13521>
> - REST-Fixture: Ensure strict mode on jdbc catalog for rest fixture #13599
>   <https://github.com/apache/iceberg/pull/13599>
>
> The first one changes the link for "REST Catalog Spec" in the left nav of
> https://iceberg.apache.org/spec/ from the swagger.io link to a dedicated
> page for IRC.
> The second one fixes the default behavior of the `iceberg-rest-fixture`
> image to align with the general expectation when creating a table in a
> catalog.
>
> Please take a look. I would like to have both of these as part of the
> 1.10 release.
>
> Best,
> Kevin Liu
>
>
> On Wed, Jul 23, 2025 at 1:31 PM Kevin Liu <kevinjq...@apache.org> wrote:
>
>> Here are the 3 PRs to add corresponding tests:
>> https://github.com/apache/iceberg/pull/13648
>> https://github.com/apache/iceberg/pull/13649
>> https://github.com/apache/iceberg/pull/13650
>>
>> I've tagged them with the 1.10 milestone; waiting for CI to complete :)
>>
>> Best,
>> Kevin Liu
>>
>> On Wed, Jul 23, 2025 at 1:08 PM Steven Wu <stevenz...@gmail.com> wrote:
>>
>>> Kevin, thanks for checking that. I will take a look at your backport
>>> PRs. Can you add them to the 1.10.0 milestone?
>>>
>>> On Wed, Jul 23, 2025 at 12:27 PM Kevin Liu <kevinjq...@apache.org>
>>> wrote:
>>>
>>>> Thanks again for driving this, Steven! We're very close!
>>>>
>>>> As mentioned in the community sync today, I wanted to verify feature
>>>> parity between Spark 3.5 and Spark 4.0 for this release, and I was
>>>> able to confirm that they have feature parity for this upcoming
>>>> release.
>>>> More details are in the other dev-list thread:
>>>> https://lists.apache.org/thread/7x7xcm3y87y81c4grq4nn9gdjd4jm05f
>>>>
>>>> Thanks,
>>>> Kevin Liu
>>>>
>>>> On Wed, Jul 23, 2025 at 12:17 PM Steven Wu <stevenz...@gmail.com>
>>>> wrote:
>>>>
>>>>> Another update on the release.
>>>>>
>>>>> The existing blocker PRs are almost done.
>>>>>
>>>>> During today's community sync, we identified the following issues/PRs
>>>>> to include in the 1.10.0 release:
>>>>>
>>>>> 1. Backport of PR 13100 to the main branch. I have created a
>>>>>    cherry-pick PR <https://github.com/apache/iceberg/pull/13647> for
>>>>>    that. There is a one-line difference from the original PR because
>>>>>    the deprecated RemoveSnapshot class was removed on the main branch
>>>>>    for the 1.10.0 target. Amogh has suggested using RemoveSnapshots
>>>>>    with a single snapshot id, which should be supported by all REST
>>>>>    catalog servers.
>>>>> 2. Flink compaction doesn't support row lineage; fail the compaction
>>>>>    for V3 tables. I created a PR
>>>>>    <https://github.com/apache/iceberg/pull/13646> for that and will
>>>>>    backport it after it is merged.
>>>>> 3. Spark: fix data frame joins based on different versions of the
>>>>>    same table, which may lead to surprising results. Anton is working
>>>>>    on a fix. It requires a small behavior change (table state may be
>>>>>    stale for up to the refresh interval), so it is better to include
>>>>>    it in the 1.10.0 release, where Spark 4.0 is first supported.
>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is very
>>>>>    close and will prioritize the review.
>>>>>
>>>>> Thanks,
>>>>> Steven
>>>>>
>>>>> The 1.10.0 milestone can be found here:
>>>>> https://github.com/apache/iceberg/milestone/54
>>>>>
>>>>>
>>>>> On Wed, Jul 16, 2025 at 9:15 AM Steven Wu <stevenz...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Ajantha/Robin, thanks for the note. We can include the PR in the
>>>>>> 1.10.0 milestone.
>>>>>>
>>>>>> On Wed, Jul 16, 2025 at 3:20 AM Robin Moffatt
>>>>>> <ro...@confluent.io.invalid> wrote:
>>>>>>
>>>>>>> Thanks Ajantha. Just to confirm, from a Confluent point of view, we
>>>>>>> will not be able to publish the connector on Confluent Hub until
>>>>>>> this CVE [1] is fixed.
>>>>>>> Since we would not publish a snapshot build, if the fix doesn't
>>>>>>> make it into 1.10 then we'd have to wait for 1.11 (or a dot release
>>>>>>> of 1.10) to be able to include the connector on Confluent Hub.
>>>>>>>
>>>>>>> Thanks, Robin.
>>>>>>>
>>>>>>> [1]
>>>>>>> https://github.com/apache/iceberg/issues/10745#issuecomment-3074300861
>>>>>>>
>>>>>>> On Wed, 16 Jul 2025 at 04:03, Ajantha Bhat <ajanthab...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I have approached the Confluent folks
>>>>>>>> <https://github.com/apache/iceberg/issues/10745#issuecomment-3058281281>
>>>>>>>> to help us publish the OSS Kafka Connect Iceberg sink plugin.
>>>>>>>> It seems we have a CVE from a dependency that blocks us from
>>>>>>>> publishing the plugin.
>>>>>>>>
>>>>>>>> Please include the PR below, which fixes that, in the 1.10.0
>>>>>>>> release:
>>>>>>>> https://github.com/apache/iceberg/pull/13561
>>>>>>>>
>>>>>>>> - Ajantha
>>>>>>>>
>>>>>>>> On Tue, Jul 15, 2025 at 10:48 AM Steven Wu <stevenz...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> > Engines may model operations as deleting/inserting rows or as
>>>>>>>>> > modifications to rows that preserve row ids.
>>>>>>>>>
>>>>>>>>> Manu, I agree this sentence probably lacks some context. The
>>>>>>>>> first half ("as deleting/inserting rows") is probably about the
>>>>>>>>> row-lineage handling with equality deletes, which is described in
>>>>>>>>> another place:
>>>>>>>>>
>>>>>>>>> "Row lineage does not track lineage for rows updated via Equality
>>>>>>>>> Deletes <https://iceberg.apache.org/spec/#equality-delete-files>,
>>>>>>>>> because engines using equality deletes avoid reading existing
>>>>>>>>> data before writing changes and can't provide the original row ID
>>>>>>>>> for the new rows. These updates are always treated as if the
>>>>>>>>> existing row was completely removed and a unique new row was
>>>>>>>>> added."
>>>>>>>>>
>>>>>>>>> On Mon, Jul 14, 2025 at 5:49 PM Manu Zhang
>>>>>>>>> <owenzhang1...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Steven, I missed that part, but the following sentence is
>>>>>>>>>> a bit hard to understand (maybe it's just me):
>>>>>>>>>>
>>>>>>>>>> Engines may model operations as deleting/inserting rows or as
>>>>>>>>>> modifications to rows that preserve row ids.
>>>>>>>>>>
>>>>>>>>>> Can you please help explain?
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 15, 2025 at 04:41, Steven Wu <stevenz...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Manu,
>>>>>>>>>>>
>>>>>>>>>>> The spec already covers the row-lineage carry-over (for
>>>>>>>>>>> replace):
>>>>>>>>>>> https://iceberg.apache.org/spec/#row-lineage
>>>>>>>>>>>
>>>>>>>>>>> "When an existing row is moved to a different data file for any
>>>>>>>>>>> reason, writers should write _row_id and
>>>>>>>>>>> _last_updated_sequence_number according to the following
>>>>>>>>>>> rules:"
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Steven
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 14, 2025 at 1:38 PM Steven Wu
>>>>>>>>>>> <stevenz...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Another update on the release.
>>>>>>>>>>>>
>>>>>>>>>>>> We have one open PR left for the 1.10.0 milestone
>>>>>>>>>>>> <https://github.com/apache/iceberg/milestone/54> (with 25
>>>>>>>>>>>> closed PRs). Amogh is actively working on the last blocker PR:
>>>>>>>>>>>> Spark 4.0: Preserve row lineage information on compaction
>>>>>>>>>>>> <https://github.com/apache/iceberg/pull/13555>
>>>>>>>>>>>>
>>>>>>>>>>>> I will publish a release candidate after the above blocker is
>>>>>>>>>>>> merged and backported.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Steven
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jul 7, 2025 at 11:56 PM Manu Zhang
>>>>>>>>>>>> <owenzhang1...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Amogh,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is it defined in the table spec that the "replace" operation
>>>>>>>>>>>>> should carry over existing lineage info instead of assigning
>>>>>>>>>>>>> new IDs? If not, we'd better define it in the spec first,
>>>>>>>>>>>>> because all engines and implementations need to follow it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Jul 8, 2025 at 11:44 AM Amogh Jahagirdar
>>>>>>>>>>>>> <2am...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> One other area I think we need to make sure works with row
>>>>>>>>>>>>>> lineage before the release is data file compaction. At the
>>>>>>>>>>>>>> moment
>>>>>>>>>>>>>> <https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/SparkBinPackFileRewriteRunner.java#L44>,
>>>>>>>>>>>>>> it looks like compaction will read the records from the data
>>>>>>>>>>>>>> files without projecting the lineage fields, which means
>>>>>>>>>>>>>> that on write of the newly compacted data files we'd lose
>>>>>>>>>>>>>> the lineage information. There's no data change in a
>>>>>>>>>>>>>> compaction, but we do need to make sure the lineage info
>>>>>>>>>>>>>> from carried-over records is materialized in the newly
>>>>>>>>>>>>>> compacted files so they don't get new IDs or inherit the new
>>>>>>>>>>>>>> file sequence number. I'm working on addressing this, and
>>>>>>>>>>>>>> I'd call it out as a blocker as well.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> *Robin Moffatt*
>>>>>>> *Sr. Principal Advisor, Streaming Data Technologies*
>>>>>>
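
---

The row-lineage carry-over rule quoted from the spec in this thread (a null
`_row_id` inherits `first_row_id` + position, a null
`_last_updated_sequence_number` inherits the file's data sequence number, and
writers moving rows between files should materialize both) can be sketched as
a small model. This is a hedged illustration in Python, not Iceberg code: the
functions `materialize_lineage` and `compact` and the dict-based row
representation are hypothetical stand-ins, not Iceberg APIs.

```python
def materialize_lineage(rows, first_row_id, data_seq_number):
    """Resolve inherited (null) lineage values for rows read from one data file.

    Per the spec rule quoted above: a null _row_id inherits
    first_row_id + position, and a null _last_updated_sequence_number
    inherits the file's data sequence number.
    """
    out = []
    for pos, row in enumerate(rows):
        resolved = dict(row)
        if resolved.get("_row_id") is None:
            resolved["_row_id"] = first_row_id + pos
        if resolved.get("_last_updated_sequence_number") is None:
            resolved["_last_updated_sequence_number"] = data_seq_number
        out.append(resolved)
    return out


def compact(files):
    """Rewrite several files into one, carrying lineage over explicitly.

    The compaction bug discussed in the thread is equivalent to dropping
    the two lineage fields before the rewrite; here they are materialized
    first so the compacted file neither re-assigns row IDs nor lets rows
    inherit the new file's sequence number.

    Each element of `files` is (rows, first_row_id, data_seq_number).
    """
    compacted = []
    for rows, first_row_id, seq in files:
        compacted.extend(materialize_lineage(rows, first_row_id, seq))
    return compacted


if __name__ == "__main__":
    # Two small files: file_a relies on inheritance (null lineage values),
    # file_b already has explicit values that must be preserved.
    file_a = ([{"id": 1, "_row_id": None, "_last_updated_sequence_number": None},
               {"id": 2, "_row_id": None, "_last_updated_sequence_number": None}],
              100, 7)
    file_b = ([{"id": 3, "_row_id": 50, "_last_updated_sequence_number": 3}],
              200, 9)

    result = compact([file_a, file_b])
    assert [r["_row_id"] for r in result] == [100, 101, 50]
    assert [r["_last_updated_sequence_number"] for r in result] == [7, 7, 3]
    print("lineage carried over:", result)
```

The point of the sketch is the ordering: inherited values are resolved against
the *old* file's metadata before the rewrite, so the new file's
`first_row_id` and sequence number never apply to carried-over rows.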