Thanks everyone for the review. Both PRs are merged. Looks like there's only 1 PR left in the 1.10 milestone <https://github.com/apache/iceberg/milestone/54> :)
Best,
Kevin Liu

On Thu, Jul 24, 2025 at 7:44 PM Manu Zhang <owenzhang1...@gmail.com> wrote:

> Thanks Kevin. The first change is not in the versioned doc, so it can be released anytime.
>
> Regards,
> Manu
>
> On Fri, Jul 25, 2025 at 4:21 AM Kevin Liu <kevinjq...@apache.org> wrote:
>
>> The 3 PRs above are merged. Thanks everyone for the review.
>>
>> I've added 2 more PRs to the 1.10 milestone. These are both nice-to-haves.
>> - docs: add subpage for REST Catalog Spec in "Specification" #13521 <https://github.com/apache/iceberg/pull/13521>
>> - REST-Fixture: Ensure strict mode on jdbc catalog for rest fixture #13599 <https://github.com/apache/iceberg/pull/13599>
>>
>> The first one changes the link for "REST Catalog Spec" on the left nav of https://iceberg.apache.org/spec/ from the swagger.io link to a dedicated page for IRC.
>> The second one fixes the default behavior of the `iceberg-rest-fixture` image to align with the general expectation when creating a table in a catalog.
>>
>> Please take a look. I would like to have both of these as part of the 1.10 release.
>>
>> Best,
>> Kevin Liu
>>
>> On Wed, Jul 23, 2025 at 1:31 PM Kevin Liu <kevinjq...@apache.org> wrote:
>>
>>> Here are the 3 PRs to add corresponding tests.
>>> https://github.com/apache/iceberg/pull/13648
>>> https://github.com/apache/iceberg/pull/13649
>>> https://github.com/apache/iceberg/pull/13650
>>>
>>> I've tagged them with the 1.10 milestone, waiting for CI to complete :)
>>>
>>> Best,
>>> Kevin Liu
>>>
>>> On Wed, Jul 23, 2025 at 1:08 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>
>>>> Kevin, thanks for checking that. I will take a look at your backport PRs. Can you add them to the 1.10.0 milestone?
>>>>
>>>> On Wed, Jul 23, 2025 at 12:27 PM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>
>>>>> Thanks again for driving this, Steven! We're very close!!
>>>>>
>>>>> As mentioned in the community sync today, I wanted to verify feature parity between Spark 3.5 and Spark 4.0 for this release, and I was able to verify that they have feature parity. More details in the other devlist thread: https://lists.apache.org/thread/7x7xcm3y87y81c4grq4nn9gdjd4jm05f
>>>>>
>>>>> Thanks,
>>>>> Kevin Liu
>>>>>
>>>>> On Wed, Jul 23, 2025 at 12:17 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>
>>>>>> Another update on the release.
>>>>>>
>>>>>> The existing blocker PRs are almost done.
>>>>>>
>>>>>> During today's community sync, we identified the following issues/PRs to be included in the 1.10.0 release.
>>>>>>
>>>>>> 1. Backport of PR 13100 to the main branch. I have created a cherry-pick PR <https://github.com/apache/iceberg/pull/13647> for that. There is a one-line difference compared to the original PR, due to the removal of the deprecated RemoveSnapshot class in the main branch for the 1.10.0 target. Amogh has suggested using RemoveSnapshots with a single snapshot id, which should be supported by all REST catalog servers.
>>>>>> 2. Flink compaction doesn't support row lineage. Fail the compaction for V3 tables. I created a PR <https://github.com/apache/iceberg/pull/13646> for that. Will backport after it is merged.
>>>>>> 3. Spark: fix data frame joins based on different versions of the same table, which may lead to incorrect results. Anton is working on a fix. It requires a small behavior change (table state may be stale up to the refresh interval), so it is better to include it in the 1.10.0 release, where Spark 4.0 is first supported.
>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is very close and will prioritize the review.
>>>>>>
>>>>>> Thanks,
>>>>>> Steven
>>>>>>
>>>>>> The 1.10.0 milestone can be found here: https://github.com/apache/iceberg/milestone/54
>>>>>>
>>>>>> On Wed, Jul 16, 2025 at 9:15 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>
>>>>>>> Ajantha/Robin, thanks for the note. We can include the PR in the 1.10.0 milestone.
>>>>>>>
>>>>>>> On Wed, Jul 16, 2025 at 3:20 AM Robin Moffatt <ro...@confluent.io.invalid> wrote:
>>>>>>>
>>>>>>>> Thanks Ajantha. Just to confirm, from a Confluent point of view, we will not be able to publish the connector on Confluent Hub until this CVE[1] is fixed.
>>>>>>>> Since we would not publish a snapshot build, if the fix doesn't make it into 1.10, we'd have to wait for 1.11 (or a dot release of 1.10) to be able to include the connector on Confluent Hub.
>>>>>>>>
>>>>>>>> Thanks, Robin.
>>>>>>>>
>>>>>>>> [1] https://github.com/apache/iceberg/issues/10745#issuecomment-3074300861
>>>>>>>>
>>>>>>>> On Wed, 16 Jul 2025 at 04:03, Ajantha Bhat <ajanthab...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I have approached Confluent people <https://github.com/apache/iceberg/issues/10745#issuecomment-3058281281> to help us publish the OSS Kafka Connect Iceberg sink plugin.
>>>>>>>>> It seems we have a CVE from a dependency that blocks us from publishing the plugin.
>>>>>>>>>
>>>>>>>>> Please include the below PR, which fixes that, in the 1.10.0 release:
>>>>>>>>> https://github.com/apache/iceberg/pull/13561
>>>>>>>>>
>>>>>>>>> - Ajantha
>>>>>>>>>
>>>>>>>>> On Tue, Jul 15, 2025 at 10:48 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> > Engines may model operations as deleting/inserting rows or as modifications to rows that preserve row ids.
>>>>>>>>>>
>>>>>>>>>> Manu, I agree this sentence probably lacks some context.
>>>>>>>>>> The first half (as deleting/inserting rows) is probably about the row lineage handling with equality deletes, which is described in another place:
>>>>>>>>>>
>>>>>>>>>> "Row lineage does not track lineage for rows updated via Equality Deletes <https://iceberg.apache.org/spec/#equality-delete-files>, because engines using equality deletes avoid reading existing data before writing changes and can't provide the original row ID for the new rows. These updates are always treated as if the existing row was completely removed and a unique new row was added."
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 14, 2025 at 5:49 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks Steven, I missed that part, but the following sentence is a bit hard to understand (maybe it's just me):
>>>>>>>>>>>
>>>>>>>>>>> Engines may model operations as deleting/inserting rows or as modifications to rows that preserve row ids.
>>>>>>>>>>>
>>>>>>>>>>> Can you please help explain?
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 15, 2025 at 04:41, Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Manu,
>>>>>>>>>>>>
>>>>>>>>>>>> The spec already covers the row lineage carry-over (for replace):
>>>>>>>>>>>> https://iceberg.apache.org/spec/#row-lineage
>>>>>>>>>>>>
>>>>>>>>>>>> "When an existing row is moved to a different data file for any reason, writers should write _row_id and _last_updated_sequence_number according to the following rules:"
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Steven
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jul 14, 2025 at 1:38 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Another update on the release.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We have one open PR left for the 1.10.0 milestone <https://github.com/apache/iceberg/milestone/54> (with 25 closed PRs). Amogh is actively working on the last blocker PR:
>>>>>>>>>>>>> Spark 4.0: Preserve row lineage information on compaction <https://github.com/apache/iceberg/pull/13555>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I will publish a release candidate after the above blocker is merged and backported.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Steven
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jul 7, 2025 at 11:56 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Amogh,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is it defined in the table spec that the "replace" operation should carry over existing lineage info instead of assigning new IDs? If not, we'd better define it in the spec first, because all engines and implementations need to follow it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Jul 8, 2025 at 11:44 AM Amogh Jahagirdar <2am...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> One other area I think we need to make sure works with row lineage before the release is data file compaction. At the moment, <https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/SparkBinPackFileRewriteRunner.java#L44> it looks like compaction will read the records from the data files without projecting the lineage fields. This means that on write of the new compacted data files, we'd be losing the lineage information.
>>>>>>>>>>>>>>> There's no data change in a compaction, but we do need to make sure the lineage info from carried-over records is materialized in the newly compacted files, so they don't get new IDs or inherit the new file's sequence number. I'm working on addressing this, but I'd call it out as a blocker as well.
>>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Robin Moffatt*
>>>>>>>> *Sr. Principal Advisor, Streaming Data Technologies*
>>>>>>>
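[Editor's note] The row-lineage carry-over rule quoted in the thread can be sketched as follows. This is a minimal, hypothetical Python illustration of the semantics being discussed, not Iceberg's actual implementation: the helper names `materialize_lineage` and `compact` are invented for this sketch, while the `_row_id` and `_last_updated_sequence_number` field names come from the table spec.

```python
# Hypothetical sketch of the row-lineage carry-over rule (not Iceberg code).
# Per the spec quoted above, null lineage fields are inherited at read time,
# and rewrites (like compaction) must write the resolved values explicitly.

def materialize_lineage(rows, first_row_id, data_seq_number):
    """Resolve lineage fields for rows read from one data file.

    A null _row_id is inherited as first_row_id + position in the file;
    a null _last_updated_sequence_number is inherited from the file's
    data sequence number.
    """
    out = []
    for pos, row in enumerate(rows):
        row = dict(row)
        if row.get("_row_id") is None:
            row["_row_id"] = first_row_id + pos
        if row.get("_last_updated_sequence_number") is None:
            row["_last_updated_sequence_number"] = data_seq_number
        out.append(row)
    return out


def compact(files):
    """Rewrite data files with no data change.

    Lineage values are materialized before writing, so carried-over rows
    keep their original IDs and sequence numbers instead of inheriting
    new ones from the compacted file.
    """
    compacted = []
    for f in files:
        compacted.extend(
            materialize_lineage(f["rows"], f["first_row_id"], f["seq"])
        )
    return compacted


# Two input files from commits with data sequence numbers 1 and 2; the
# writers left lineage fields null and relied on inheritance.
files = [
    {"first_row_id": 0, "seq": 1, "rows": [{"v": "a"}, {"v": "b"}]},
    {"first_row_id": 2, "seq": 2, "rows": [{"v": "c"}]},
]
rows = compact(files)
print([r["_row_id"] for r in rows])                        # [0, 1, 2]
print([r["_last_updated_sequence_number"] for r in rows])  # [1, 1, 2]
```

The point of the sketch is the bug Amogh describes: if compaction skipped `materialize_lineage` and wrote the lineage fields as null, every rewritten row would re-inherit a fresh ID and the new file's sequence number, silently losing its lineage.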