Hi,

I have perhaps managed to deadlock this process :) I'd appreciate some help
untangling it. The recap is in my previous email (below).

thanks, Robin.

On Thu, 29 Jan 2026 at 06:20, Robin Moffatt <[email protected]> wrote:

> Hi Kevin,
>
> Just recapping so that I'm clear, cos I'm getting confused :)
> I have two related PRs:
>
> #15124: Add Flink Quickstart docker image
> #15062: Add Flink quickstart (which includes the Dockerfile too)
>
> I can see a few routes forward:
>
> 1. Merge #15062, fast-follow with #15124 once we're happy with the publish
> script (I've not seen anything raised about it yet tho?)
> 2. Merge #15124 minus publish script, and then #15062 still relying on
> local image build (not sure what this would achieve vs the option above
> tho?)
> 3. Merge #15124 including publish script, then #15062 using the published
> image not the local build
>
> Either way, one thing that needs resolving is the Dockerfile location:
> flink/quickstart (#15062) vs docker/iceberg-flink-quickstart (#15124).
>
> LMK if I've missed an angle here.
>
> thanks, Robin
>
> On Wed, 28 Jan 2026 at 15:57, Kevin Liu <[email protected]> wrote:
>
>> Thanks for working on this, Robin! It looks like the complexity here is
>> publishing the docker image. What do you think about isolating that part
>> (i.e. just moving the publish script out of #15124)? We can start with the
>> Dockerfile definition, which allows us to build locally. This should
>> unblock us from merging the getting started docs in #15062.
>> Thoughts?
>>
>> Best,
>> Kevin Liu
>>
>> On Wed, Jan 28, 2026 at 5:57 AM Robin Moffatt via dev <
>> [email protected]> wrote:
>>
>>> Hi,
>>>
>>> Thanks for the discussion and input.
>>> It sounds like there are no major blockers. Could someone please review
>>> https://github.com/apache/iceberg/pull/15124 ?
>>>
>>> thanks,
>>>
>>> Robin.
>>>
>>> On Mon, 26 Jan 2026 at 16:36, Kevin Liu <[email protected]> wrote:
>>>
>>>> Hey folks,
>>>>
>>>> We have a Dockerfile defined in pyiceberg [1] that uses the Spark base
>>>> image and installs all the necessary jars. This is used for our integration
>>>> test setup [2] and is inspired by databricks/docker-spark-iceberg [3].
>>>> We've made many improvements such as upgrading to Spark 4, supporting Spark
>>>> Connect, and better image build caching.
>>>>
>>>> This is already self-contained and can be reused by other subprojects.
>>>> In fact, iceberg-rust already uses it [4] and I try to keep them in sync.
>>>> I think it would be beneficial for the project to publish this image
>>>> and something similar for Flink.
>>>>
>>>> Let me know what you think.
>>>>
>>>> Best,
>>>> Kevin Liu
>>>>
>>>>
>>>>
>>>> [1]
>>>> https://github.com/apache/iceberg-python/blob/6de6d6acad440885788fb1a24c04ed647b92af0e/dev/spark/Dockerfile
>>>> [2]
>>>> https://github.com/apache/iceberg-python/blob/6de6d6acad440885788fb1a24c04ed647b92af0e/dev/docker-compose-integration.yml#L20-L21
>>>> [3]
>>>> https://github.com/databricks/docker-spark-iceberg/blob/cf617dc29e8672792e76b9bcf6017af52f570020/spark/Dockerfile
>>>> [4]
>>>> https://github.com/apache/iceberg-rust/blob/330f21da894948fc10b57d541cb2d6f32c8bdbb8/crates/integration_tests/testdata/spark/Dockerfile
>>>>
>>>> On Mon, Jan 26, 2026 at 10:27 AM Steven Wu <[email protected]>
>>>> wrote:
>>>>
>>>>> > Since the integration code for both Spark and Flink lives in our
>>>>> repository, it might make sense to also store the Docker images and the
>>>>> corresponding scripts there.
>>>>>
>>>>> I agree with Peter here.
>>>>>
>>>>> The previous thread raised some concerns about whether the Iceberg
>>>>> project should host those docker images. Not sure if opinions have changed.
>>>>>
>>>>> On Mon, Jan 26, 2026 at 2:43 AM Robin Moffatt via dev <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Thanks Ajantha, I'd not seen that thread.
>>>>>> Having looked at it, am I understanding the view to be that ideally
>>>>>> Flink would publish a Docker image that included the Iceberg 
>>>>>> dependencies?
>>>>>>
>>>>>> However we do this, I feel that the user coming to run the Flink
>>>>>> quickstart should not have to build their own Docker image; this adds
>>>>>> unnecessary friction that is easily alleviated.
>>>>>>
>>>>>> If I've understood the situation correctly, then I'm happy to discuss
>>>>>> this idea with the Flink community; please let me know before I do so.
>>>>>>
>>>>>> thanks, Robin.
>>>>>>
>>>>>> On Fri, 23 Jan 2026 at 16:50, Ajantha Bhat <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Robin and Peter,
>>>>>>>
>>>>>>> I discussed community-maintained Docker images previously:
>>>>>>> https://lists.apache.org/thread/xl1cwq7vmnh6zgfd2vck2nq7dfd33ncq
>>>>>>>
>>>>>>> The consensus was to publish only the REST fixture Docker image
>>>>>>> <https://hub.docker.com/r/apache/iceberg-rest-fixture> (now at
>>>>>>> 100K+ total downloads) and use Docker images published by the main
>>>>>>> engines in the quickstart, instead of maintaining these images ourselves.
>>>>>>> See the thread above for more details.
>>>>>>>
>>>>>>> With respect to adding a Flink quickstart page, I’m in favor of
>>>>>>> adding it and relying on the Docker images provided by Flink rather than
>>>>>>> maintaining our own images.
>>>>>>> - Ajantha
>>>>>>>
>>>>>>> On Fri, Jan 23, 2026 at 9:43 PM Péter Váry <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Robin,
>>>>>>>> It would be nice to separate them. I expect that we will have some
>>>>>>>> extra stuff to do with the docker image. For example, make sure that we
>>>>>>>> have CI in place to build it.
>>>>>>>> Thanks,
>>>>>>>> Peter
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jan 23, 2026, 16:55 Robin Moffatt via dev <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the positive reception of this idea.
>>>>>>>>> I've drafted a PR [1] and would appreciate input :)
>>>>>>>>>
>>>>>>>>> Also, should I keep this and the quickstart PR [2] as separate
>>>>>>>>> PRs, or combine them?
>>>>>>>>>
>>>>>>>>> thanks, Robin.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] https://github.com/apache/iceberg/pull/15124
>>>>>>>>> [2] https://github.com/apache/iceberg/pull/15062
>>>>>>>>>
>>>>>>>>> On Fri, 23 Jan 2026 at 13:58, Jean-Baptiste Onofré <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> This is a great idea.
>>>>>>>>>>
>>>>>>>>>> If we are moving forward with an "official" Docker image
>>>>>>>>>> published by the project, we must ensure it is fully compliant with
>>>>>>>>>> ASF requirements regarding LICENSE/NOTICE files, etc. While this may seem
>>>>>>>>>> straightforward, it is a detail that is often overlooked.
>>>>>>>>>>
>>>>>>>>>> I would be happy to help with this process.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> JB
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 23, 2026 at 1:52 PM Maximilian Michels <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hey Robin,
>>>>>>>>>>>
>>>>>>>>>>> +1 That's a great idea. It's often a bit painful for new users to get
>>>>>>>>>>> all the dependencies in the right place.
>>>>>>>>>>>
>>>>>>>>>>> +1 for building upon the official Flink Docker images:
>>>>>>>>>>> https://hub.docker.com/r/apache/flink
>>>>>>>>>>>
>>>>>>>>>>> -Max
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 23, 2026 at 12:27 PM Péter Váry <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > Hi Robin,
>>>>>>>>>>> >
>>>>>>>>>>> > I would love to see the Flink quickstart image in the Iceberg
>>>>>>>>>>> repo.
>>>>>>>>>>> >
>>>>>>>>>>> > Ajantha was working on the Spark side:
>>>>>>>>>>> https://github.com/apache/iceberg/issues/13519
>>>>>>>>>>> > The conclusion was:
>>>>>>>>>>> >>
>>>>>>>>>>> >> we should both remove the vendor reference and bring this
>>>>>>>>>>> back up to date. My preference would be to rely on the Spark image <
>>>>>>>>>>> https://hub.docker.com/r/apache/spark> provided by the Apache
>>>>>>>>>>> Spark project, similar to what we do for the Hive <
>>>>>>>>>>> https://iceberg.apache.org/hive-quickstart/> quickstart. We
>>>>>>>>>>> should be able to load all the Iceberg-specific JARs through the
>>>>>>>>>>> spark.jars.packages configuration <
>>>>>>>>>>> https://spark.apache.org/docs/3.5.1/configuration.html>.
>>>>>>>>>>> >
>>>>>>>>>>> >
>>>>>>>>>>> > Ajantha also added the link to the old dev list thread:
>>>>>>>>>>> https://lists.apache.org/thread/4kknk8mvnffbmhdt63z8t4ps0mt1jbf4
>>>>>>>>>>> >
>>>>>>>>>>> > Thanks for working on this,
>>>>>>>>>>> > Peter
>>>>>>>>>>> >
>>>>>>>>>>> > On Thu, 22 Jan 2026 at 19:23, Robin Moffatt via dev <[email protected]> wrote:
>>>>>>>>>>> >>
>>>>>>>>>>> >> Hi,
>>>>>>>>>>> >>
>>>>>>>>>>> >> Following discussion on the Flink quickstart PR [1], what do
>>>>>>>>>>> >> people think about adding an official quickstart Docker image for
>>>>>>>>>>> >> Flink to the project?
>>>>>>>>>>> >> At the moment the Spark quickstart uses
>>>>>>>>>>> >> tabulario/spark-iceberg so perhaps that could be brought into the
>>>>>>>>>>> >> project too.
>>>>>>>>>>> >>
>>>>>>>>>>> >> thanks, Robin.
>>>>>>>>>>> >>
>>>>>>>>>>> >> [1] https://github.com/apache/iceberg/pull/15062
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>
>
>
