+1, agree to unblock the 1.11 and 1.10.2 releases first and revisit.

Thanks,
Steve

On Thu, Apr 30, 2026 at 12:45 PM Jean-Baptiste Onofré <[email protected]> wrote:
>
> Hi Ryan,
>
> I agree with this approach. It is perfectly aligned with the points I 
> mentioned a while ago.
>
> Regards,
> JB
>
> On Thu, Apr 30, 2026 at 7:49 PM Ryan Blue <[email protected]> wrote:
>>
>> Hi everyone,
>>
>> I have a quick update on LICENSE issues that are currently blocking 1.11 and 
>> 1.10.2. Also, sorry if you got this twice, but it looks like it didn't go 
>> through the first time.
>>
>> TL;DR: I think we should:
>>
>> - Hold off on adding Kafka Connect to the release process
>> - Remove the iceberg-open-api-test-fixtures-runtime Jar from releases
>>
>> The background is that over the last few weeks, we found two fairly large 
>> leaks that added transitive dependencies into Iceberg runtime Jars (fixed by 
>> #15655 and #15858). As a result, Russell added a new way to track and 
>> validate the dependencies included in our published artifacts. To make sure 
>> the new checks are correct, I’ve been going through to validate the 
>> LICENSE/NOTICE files against the dependency list. Unfortunately, there are 
>> more problems.
>>
>> The first problem is with our Kafka Connect distribution. There are two zip 
>> distributions, a Hive and a non-Hive version. Robin has been working on 
>> getting these published as part of our release process in #15212. The 
>> non-Hive distribution is very large and has some dependencies that may not 
>> need to be there, like Apache Commons Jars that aren’t used in Iceberg (and 
>> would be provided by KC if needed?). #16147 is a draft with some of the 
>> non-Hive changes. The Hive distribution has about 100 more Jars than 
>> non-Hive, and includes many dependencies that are almost certainly 
>> unnecessary, like 3 hadoop-mapreduce-* Jars. My recommendation is to hold 
>> off on making Kafka Connect part of releases until the license issues are 
>> solved.
>>
>> Another issue is the open-api module. We added this to the Java build to 
>> verify the REST catalog spec, but then added tests and fixtures for 
>> validating REST implementations. #11279 added a runtime Jar to run a
>> test service, but most PMC members I’ve talked to about it didn’t know that
>> we have been publishing it, and have been since 1.7. This runtime Jar
>> indiscriminately bundles far more libraries than it needs, like the cloud 
>> provider libs, Hadoop common, JUnit, Jetty, and others. The Jar is 200+ MB. 
>> My recommendation is to remove this Jar from publication to unblock releases.
>>
>> As a general rule, when we are considering adding a new runtime distribution 
>> to the project, we need to check that it is something we need to do (vs an 
>> easy alternative), and if it is, then minimize the dependencies included to 
>> only those required to run it. Once that’s done, we need to document the 
>> dependencies in LICENSE and NOTICE and, as of #15855, ensure that the 
>> bundled dependencies are tracked in a runtime-deps.txt file.
>>
>> I think the priority right now is to unblock the 1.11 and 1.10.2 releases. 
>> We can do that by not releasing these artifacts. After that, I think we need 
>> to verify for all of these that they are needed, have minimal included 
>> dependencies, and then document those dependencies. For example, do we need 
>> a Kafka Connect Hive distribution or is the REST catalog version enough? 
>> Does everyone agree that this is the right path forward?
>>
>> Thanks,
>>
>> Ryan
