Hi Everyone, I would like to report a performance regression we've identified in Spark queries on Iceberg tables stored in cloud storage (tested with GCS), which I believe should be addressed in the 1.11.0 release.
The current SerializableFileIOWithSize drops the file length, which causes a performance regression due to excessive metadata calls to cloud storage: https://github.com/apache/iceberg/issues/16283
The fix overrides InputFile newInputFile(String path, long length) to preserve the file length and avoid the unnecessary metadata calls: https://github.com/apache/iceberg/pull/16284

On 2026/05/08 15:27:05 Péter Váry wrote:
> Just to clarify:
>
> The following PRs are already merged to 1.11.0:
>
> - https://github.com/apache/iceberg/pull/14297 - Spark: Support writing shredded variant in Iceberg-Spark
> - https://github.com/apache/iceberg/pull/15512 - Spark: fix delete from branch for canDeleteWhere where it does not resolve to the correct branch - WAP fix
> - https://github.com/apache/iceberg/pull/15475 - Flink: Add Nanosecond Precision Support for Flink-Iceberg Integration
>
>
> The missing ones are the ones backporting those to other engine versions:
>
> - For: 14297 <https://github.com/apache/iceberg/pull/14297>:
>    - 16241 <https://github.com/apache/iceberg/pull/16241> - Backport for variant shredding in Spark 4.0
> - For: 15512 <https://github.com/apache/iceberg/pull/15512>:
>    - 16245 <https://github.com/apache/iceberg/pull/16245> - Spark: backport PR #15512 to v3.4, v3.5, v4.0 for WAP branch delete fix
> - For: 15475 <https://github.com/apache/iceberg/pull/15475>:
>    - #16183 <https://github.com/apache/iceberg/pull/16183>, #16239 <https://github.com/apache/iceberg/pull/16239>, #16240 <https://github.com/apache/iceberg/pull/16240> - Backport for Nano timestamps for Flink 2.0/1.20
>
>
> So the PRs needed on 1.11.0 are:
> https://github.com/apache/iceberg/pull/16241
> https://github.com/apache/iceberg/pull/16245
> https://github.com/apache/iceberg/pull/16183
> https://github.com/apache/iceberg/pull/16239
> https://github.com/apache/iceberg/pull/16240
> https://github.com/apache/iceberg/pull/16186
>
> Aihua Xu <[email protected]> wrote (on Fri, May 8, 2026, 17:13):
>
> > Thank you all for the feedback and for verifying the release candidate.
> > Based on the issues identified above, we will include the following fixes and cut RC2 with a new vote:
> >
> > https://github.com/apache/iceberg/pull/14297
> > https://github.com/apache/iceberg/pull/15512
> > https://github.com/apache/iceberg/pull/15475
> > https://github.com/apache/iceberg/pull/16186
> >
> > Please let me know if you have any questions or identified additional issues.
> >
> > Thanks,
> > Aihua
> >
> > On Thu, May 7, 2026 at 10:09 PM Aihua Xu <[email protected]> wrote:
> >
> >> I also looked into this. There is a configuration gcs.analytics-core.enabled to enable/disable GCS Analytics Core. The current implementation always requires runtime dependency of GCS Analytics Core even if the configuration is off. Ideally we can lazy load such dependency so the dependency is only required when the feature is explicitly enabled. But since GCP is likely to enable GCS Analytics Core by default, I feel it's reasonable for downstream projects using non-bundle jars to add this dependency.
> >>
> >>
> >> On Thu, May 7, 2026 at 6:54 PM Steven Wu <[email protected]> wrote:
> >>
> >>> Looked a little more.
> >>>
> >>> So Iceberg's cloud modules consistently use compileOnly for vendor SDKs and rely on either the bundle artifact or downstream coordination for runtime. So, both changes are expected for downstream consumers using the non-bundle jars. Maybe we don't need to change anything.
> >>>
> >>> iceberg-gcp module
> >>>
> >>> compileOnly platform(libs.google.libraries.bom)
> >>> compileOnly "com.google.cloud:google-cloud-storage"
> >>> compileOnly "com.google.cloud:google-cloud-kms"
> >>> compileOnly(libs.gcs.analytics.core)
> >>>
> >>>
> >>> On Thu, May 7, 2026 at 6:16 PM Steven Wu <[email protected]> wrote:
> >>>
> >>>> Yuya, thanks for reporting the discovery.
> >>>>
> >>>> Azure: I approved your PR and can merge it soon: https://github.com/apache/iceberg/pull/16186
> >>>> GCP: the new dependency is marked as compileOnly in PR 14333 <https://github.com/apache/iceberg/pull/14333>, as it is an opt-in feature. we need to either change the dep to implementation or update the code similar to the Azure fix above.
> >>>>
> >>>>
> >>>> On Thu, May 7, 2026 at 4:07 PM Yuya Ebihara <[email protected]> wrote:
> >>>>
> >>>>> Hi Aihua,
> >>>>>
> >>>>> Thanks for leading the release!
> >>>>>
> >>>>> Just a quick reminder about two dependency-related items from a downstream perspective:
> >>>>> * Azure module users will require azure-security-keyvault-keys, even when table encryption is not used, as noted in https://github.com/apache/iceberg/pull/16186
> >>>>> * GCS module users will require gcs-analytics-core
> >>>>>
> >>>>> I ran into CI failures with 1.11.0 in Trino because the project does not use the azure-bundle or gcp-bundle modules.
> >>>>> The CI passed once we explicitly added these two dependencies.
> >>>>>
> >>>>> Thanks,
> >>>>> Yuya Ebihara
> >>>>>
> >>>>> On Fri, May 8, 2026 at 4:58 AM Péter Váry <[email protected]> wrote:
> >>>>>
> >>>>>> First of all, thanks to everyone for the effort put into preparing this release!
> >>>>>>
> >>>>>> I would like to highlight that RC1 is built from a branch where the following features have not been backported to all engine versions:
> >>>>>> - Spark: Support writing shredded variant in Iceberg-Spark (https://github.com/apache/iceberg/pull/14297) - Available in Spark 4.1, but not in Spark 4.0
> >>>>>> - Spark: fix delete from branch for canDeleteWhere where it does not resolve to the correct branch (https://github.com/apache/iceberg/pull/15512) - Available in Spark 4.1, but not in Spark 4.0, 3.5, or 3.4
> >>>>>> - Flink: Add Nanosecond Precision Support for Flink-Iceberg Integration (https://github.com/apache/iceberg/pull/15475) - Available in Flink 2.1, but not in Flink 2.0 or 1.20
> >>>>>>
> >>>>>> It is up to the community to decide whether these missing backports should be considered release blockers. Most of the corresponding PRs have already been merged to main (except #15512), and including them in the release should be relatively straightforward.
> >>>>>>
> >>>>>> From my perspective, I would prefer not to release with these gaps. That said, I understand the urgency and the need for a release, and I am happy to go with the community’s decision.
> >>>>>>
> >>>>>> Peter
> >>>>>>
> >>>>>> Aihua Xu <[email protected]> wrote (on Thu, May 7, 2026, 18:26):
> >>>>>>
> >>>>>>> Hi Everyone,
> >>>>>>>
> >>>>>>> I propose that we release the following RC as the official Apache Iceberg 1.11.0 release.
> >>>>>>>
> >>>>>>> The commit ID is 0f657edf12dc29f8487a679bfdd4210e9588d014
> >>>>>>> * This corresponds to the tag: apache-iceberg-1.11.0-rc1
> >>>>>>> * https://github.com/apache/iceberg/commits/apache-iceberg-1.11.0-rc1
> >>>>>>> * https://github.com/apache/iceberg/tree/0f657edf12dc29f8487a679bfdd4210e9588d014
> >>>>>>>
> >>>>>>> The release tarball, signature, and checksums are here:
> >>>>>>> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.11.0-rc1
> >>>>>>>
> >>>>>>> You can find the KEYS file here:
> >>>>>>> * https://downloads.apache.org/iceberg/KEYS
> >>>>>>>
> >>>>>>> Convenience binary artifacts are staged on Nexus. The Maven repository URL is:
> >>>>>>> * https://repository.apache.org/content/repositories/orgapacheiceberg-1278/
> >>>>>>>
> >>>>>>> Please download, verify, and test.
> >>>>>>>
> >>>>>>> Instructions for verifying a release can be found here:
> >>>>>>> * https://iceberg.apache.org/how-to-release/#how-to-verify-a-release
> >>>>>>>
> >>>>>>> Please vote in the next 72 hours.
> >>>>>>>
> >>>>>>> [ ] +1 Release this as Apache Iceberg 1.11.0
> >>>>>>> [ ] +0
> >>>>>>> [ ] -1 Do not release this because...
> >>>>>>>
> >>>>>>> Only PMC members have binding votes, but other community members are encouraged to cast non-binding votes. This vote will pass if there are 3 binding +1 votes and more binding +1 votes than -1 votes.
> >>>>>>>
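
For reference, here is a minimal sketch of the length-dropping pattern described at the top of this message. It is not the code from the PR: SketchFileIO, SketchInputFile, CountingStore, and the two wrapper classes are hypothetical stand-ins, and the only assumption carried over from Iceberg is that the length-aware newInputFile overload has a default implementation that falls back to the path-only method.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LengthPreservingFileIOSketch {

  /** Hypothetical stand-in for an input file handle (not the real Iceberg interface). */
  interface SketchInputFile {
    long getLength();
  }

  /** Hypothetical stand-in for a FileIO whose length-aware overload discards the length by default. */
  interface SketchFileIO {
    SketchInputFile newInputFile(String path);

    default SketchInputFile newInputFile(String path, long length) {
      return newInputFile(path);
    }
  }

  /** Fake object store that counts how often a length has to be looked up remotely. */
  static class CountingStore implements SketchFileIO {
    final AtomicInteger metadataCalls = new AtomicInteger();

    @Override
    public SketchInputFile newInputFile(String path) {
      // Length unknown: resolving it costs one simulated metadata request.
      return () -> {
        metadataCalls.incrementAndGet();
        return 1024L;
      };
    }

    @Override
    public SketchInputFile newInputFile(String path, long length) {
      // Length supplied by the caller: no metadata request is needed.
      return () -> length;
    }
  }

  /** Wrapper that forwards only the path-only method, so the inherited default drops the length. */
  static class LengthDroppingWrapper implements SketchFileIO {
    protected final SketchFileIO delegate;

    LengthDroppingWrapper(SketchFileIO delegate) {
      this.delegate = delegate;
    }

    @Override
    public SketchInputFile newInputFile(String path) {
      return delegate.newInputFile(path);
    }
  }

  /** Wrapper that also forwards the length-aware overload to the delegate. */
  static class LengthPreservingWrapper extends LengthDroppingWrapper {
    LengthPreservingWrapper(SketchFileIO delegate) {
      super(delegate);
    }

    @Override
    public SketchInputFile newInputFile(String path, long length) {
      return delegate.newInputFile(path, length);
    }
  }

  /** Opens one file with a known length through the chosen wrapper and returns the metadata call count. */
  static int metadataCalls(boolean preserveLength) {
    CountingStore store = new CountingStore();
    SketchFileIO io =
        preserveLength ? new LengthPreservingWrapper(store) : new LengthDroppingWrapper(store);
    io.newInputFile("gs://bucket/data.parquet", 1024L).getLength();
    return store.metadataCalls.get();
  }

  public static void main(String[] args) {
    System.out.println("length dropped:   " + metadataCalls(false) + " metadata call(s)");
    System.out.println("length preserved: " + metadataCalls(true) + " metadata call(s)");
  }
}
```

In this sketch the dropping wrapper costs one simulated metadata call per file even though the caller already knew the length, while the preserving wrapper costs none. Multiplied across the data files of a scan, that is the kind of overhead the issue describes.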
