Hey Rahul, Aihua, I was looking into the same thing.
The PR you're referring to has been included since 1.15.0 <https://github.com/apache/parquet-java/commits/apache-parquet-1.15.0>, and Iceberg currently uses Parquet 1.15.2 <https://github.com/apache/iceberg/blob/76ff67c658066bd7d05ce4ce54a1d6340ee0a899/gradle/libs.versions.toml#L80>. I don't see anything obvious in the changelog <https://github.com/apache/parquet-java/releases/tag/apache-parquet-1.16.0-rc2> that might have caused the increase in size. Let me do a git bisect to find the PR that introduced the change.

Kind regards,
Fokko

On Tue, Sep 2, 2025 at 14:11, Rahul Sharma <[email protected]> wrote:

> Hi Aihua,
>
> Regarding the Iceberg failure, which parquet-java version is the test
> passing for? I suspect that the failure might be related to
> size-statistics. Could you try running the test with
> `parquet.size.statistics.enabled=false`? This flag was added in this PR
> <https://github.com/apache/parquet-java/pull/3060>.
>
> Thanks,
> Rahul
>
>
> On Tue, Sep 2, 2025 at 3:07 AM Aihua Xu <[email protected]> wrote:
>
> > Checked checksum and signature and ran unit tests.
> >
> > I'm also running the tests against Iceberg. Notice one failure
> > <https://github.com/apache/iceberg/blob/main/spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java#L308>
> > that is from Iceberg format version 3 that is writing row lineage. Seems the
> > file size increases after the version upgrade and I haven’t yet pinpointed
> > the exact change causing it. But I don't think that is a blocker for this
> > release though.
> >
> > org.opentest4j.AssertionFailedError: [Did not have the expected number of files]
> > expected: 20
> > but was: 21
> > at org.apache.iceberg.spark.actions.TestRewriteDataFilesAction.shouldHaveFiles(TestRewriteDataFilesAction.java:2144)
> > at org.apache.iceberg.spark.actions.TestRewriteDataFilesAction.testBinPackAfterPartitionChange(TestRewriteDataFilesAction.java:321)
> >
> >
> > On Mon, Sep 1, 2025 at 12:16 AM Gábor Szádovszky <[email protected]> wrote:
> >
> > > I've checked tarball content, checksum, and signature. Executed unit tests,
> > > and also some of our internal tests. All passed.
> > >
> > > +1 (binding)
> > >
> > > On Sat, Aug 30, 2025 at 8:47, Gang Wu <[email protected]> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I propose the following RC to be released as the official Apache Parquet
> > > > Java 1.16.0 release.
> > > >
> > > > The commit id is 402c3810c372d29603e181771acebfecc71bef61
> > > > * This corresponds to the tag: apache-parquet-1.16.0-rc2
> > > > * https://github.com/apache/parquet-java/tree/402c3810c372d29603e181771acebfecc71bef61
> > > >
> > > > The release tarball, signature, and checksums are here:
> > > > * https://dist.apache.org/repos/dist/dev/parquet/apache-parquet-1.16.0-rc2
> > > >
> > > > You can find the KEYS file here:
> > > > * https://downloads.apache.org/parquet/KEYS
> > > >
> > > > You can find the changelog here:
> > > > * https://github.com/apache/parquet-java/releases/tag/apache-parquet-1.16.0-rc2
> > > >
> > > > Binary artifacts are staged in Nexus here:
> > > > * https://repository.apache.org/content/groups/staging/org/apache/parquet/
> > > >
> > > > Please download, verify, and test.
> > > >
> > > > Please vote in the next 72 hours.
> > > >
> > > > [ ] +1 Release this as Apache Parquet Java 1.16.0
> > > > [ ] +0
> > > > [ ] -1 Do not release this because...
> > > >
> > > > Thanks,
> > > > Gang
