Here's a PR that removes the OpenAPI runtime Jar: https://github.com/apache/iceberg/pull/15655
I think we still want to produce a Docker image for testing across implementations in the project, but it shouldn't use a runtime Jar and we shouldn't publish the runtime Jar to Maven Central. This PR should unblock the releases.

On Thu, Apr 30, 2026 at 10:53 AM Russell Spitzer <[email protected]> wrote:

> I'm on board. The Kafka Connector already has "do it at home" instructions
> for building, so that shouldn't be a limitation.
>
> The OpenAPI release can also be ignored for now. We can always do a point
> release after we fix the issues if folks desperately need it and can't
> build it from scratch.
>
> On Thu, Apr 30, 2026 at 12:49 PM Ryan Blue <[email protected]> wrote:
>
>> Hi everyone,
>>
>> I have a quick update on LICENSE issues that are currently blocking 1.11
>> and 1.10.2. Also, sorry if you got this twice, but it looks like it didn't
>> go through the first time.
>>
>> TL;DR: I think we should:
>>
>> - Hold off on adding Kafka Connect to the release process
>> - Remove the iceberg-open-api-test-fixtures-runtime Jar from releases
>>
>> The background is that over the last few weeks, we found two fairly large
>> leaks that added transitive dependencies into Iceberg runtime Jars (fixed
>> by #15655 <https://github.com/apache/iceberg/pull/15655> and #15858
>> <https://github.com/apache/iceberg/pull/15858>). As a result, Russell
>> added a new way to track and validate the dependencies included in our
>> published artifacts. To make sure the new checks are correct, I’ve been
>> going through to validate the LICENSE/NOTICE files against the dependency
>> list. Unfortunately, there are more problems.
>>
>> The first problem is with our Kafka Connect distribution. There are two
>> zip distributions, a Hive and a non-Hive version. Robin has been working on
>> getting these published as part of our release process in #15212
>> <https://github.com/apache/iceberg/pull/15212>. The non-Hive
>> distribution is very large and has some dependencies that may not need to
>> be there, like Apache Commons Jars that aren’t used in Iceberg (and would
>> be provided by KC if needed?). #16147
>> <https://github.com/apache/iceberg/pull/16147> is a draft with some of
>> the non-Hive changes. The Hive distribution has about 100 more Jars than
>> non-Hive, and includes many dependencies that are almost certainly
>> unnecessary, like 3 hadoop-mapreduce-* Jars. *My recommendation is to
>> hold off on making Kafka Connect part of releases until the license issues
>> are solved*.
>>
>> Another issue is the open-api module. We added this to the Java build to
>> verify the REST catalog spec, but then added tests and fixtures for
>> validating REST implementations. #11279
>> <https://github.com/apache/iceberg/pull/11279> added a runtime Jar to
>> run a test service, but most PMC members I’ve talked to about it didn’t
>> know that we have been publishing it, and have been since 1.7. This
>> runtime Jar indiscriminately bundles far more libraries than it needs, like
>> the cloud provider libs, Hadoop common, JUnit, Jetty, and others. The Jar
>> is 200+ MB
>> <https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-open-api/1.10.1/>.
>> *My recommendation is to remove this Jar from publication to unblock
>> releases*.
>>
>> As a general rule, when we are considering adding a new runtime
>> distribution to the project, we need to check that it is something we need
>> to do (vs an easy alternative), and if it is, then minimize the
>> dependencies included to only those required to run it. Once that’s done,
>> we need to document the dependencies in LICENSE and NOTICE and, as of
>> #15855 <https://github.com/apache/iceberg/pull/15855>, ensure that the
>> bundled dependencies are tracked in a runtime-deps.txt file.
>>
>> I think the priority right now is to unblock the 1.11 and 1.10.2
>> releases. We can do that by not releasing these artifacts. After that, I
>> think we need to verify for all of these that they are needed, have minimal
>> included dependencies, and then document those dependencies. For example,
>> do we need a Kafka Connect Hive distribution or is the REST catalog version
>> enough? Does everyone agree that this is the right path forward?
>>
>> Thanks,
>>
>> Ryan
