If there are no objections, I would prefer it in docker/iceberg-flink-quickstart.
Robin Moffatt via dev <[email protected]> wrote (on Fri, 6 Feb 2026, 11:13):

> Hi Peter,
>
> Thanks for the direction. I'll remove the publish step so that we can get the quickstart published, and then work on the publishing subsequently.
>
> Do you think the Dockerfile is best kept in flink/quickstart, or docker/iceberg-flink-quickstart ?
>
> thanks, Robin
>
> On Thu, 5 Feb 2026 at 16:10, Péter Váry <[email protected]> wrote:
>
>> I think we have two options:
>>
>> 1. Remove the image publication from this PR (https://github.com/apache/iceberg/pull/15124) for now, and proceed with adding the Docker image and updating the documentation.
>> 2. Alternatively, we could discuss publishing the Flink quickstart image at the next Iceberg Community Sync and use that as an opportunity to simplify both the documentation and the overall user experience.
>>
>> Robin Moffatt via dev <[email protected]> wrote (on Wed, 4 Feb 2026, 18:52):
>>
>>> Hi,
>>>
>>> I have perhaps managed to deadlock this process :) I'd appreciate some help untangling it. The recap is in my previous email (below).
>>>
>>> thanks, Robin.
>>>
>>> On Thu, 29 Jan 2026 at 06:20, Robin Moffatt <[email protected]> wrote:
>>>
>>>> Hi Kevin,
>>>>
>>>> Just recapping so that I'm clear, cos I'm getting confused :)
>>>> I have two related PRs:
>>>>
>>>> #15124: Add Flink Quickstart docker image
>>>> #15062: Add Flink quickstart (which includes the Dockerfile too)
>>>>
>>>> I can see a few routes forward:
>>>>
>>>> 1. Merge #15062, fast-follow with #15124 once we're happy with the publish script (I've not seen anything raised about it yet tho?)
>>>> 2. Merge #15124 minus publish script, and then #15062 still relying on local image build (not sure what this would achieve vs the option above tho?)
>>>> 3. Merge #15124 including publish script, then #15062 using the published image not the local build
>>>>
>>>> Either way, one thing that needs resolving is the Dockerfile location: flink/quickstart (#15062) vs docker/iceberg-flink-quickstart (#15124).
>>>>
>>>> LMK if I've missed an angle here.
>>>>
>>>> thanks, Robin
>>>>
>>>> On Wed, 28 Jan 2026 at 15:57, Kevin Liu <[email protected]> wrote:
>>>>
>>>>> Thanks for working on this, Robin! It looks like the complexity here is publishing the docker image. What do you think about isolating that part? (Just move the publish script out of #15124) We can start with the Dockerfile definition, which allows us to build locally. This should unblock us from merging the getting started docs in #15062
>>>>> Thoughts?
>>>>>
>>>>> Best,
>>>>> Kevin Liu
>>>>>
>>>>> On Wed, Jan 28, 2026 at 5:57 AM Robin Moffatt via dev <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Thanks for the discussion and input.
>>>>>> It sounds like there are no major blockers. Could someone please review https://github.com/apache/iceberg/pull/15124 ?
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Robin.
>>>>>>
>>>>>> On Mon, 26 Jan 2026 at 16:36, Kevin Liu <[email protected]> wrote:
>>>>>>
>>>>>>> Hey folks,
>>>>>>>
>>>>>>> We have a Dockerfile defined in pyiceberg [1] that uses the Spark base image and installs all the necessary jars. This is used for our integration test setup [2] and is inspired by databricks/docker-spark-iceberg [3]. We've made many improvements such as upgrading to Spark 4, supporting Spark Connect, and better image build caching.
>>>>>>>
>>>>>>> This is already self-contained and can be reused by other subprojects. In fact, iceberg-rust already uses it [4] and I try to keep them in sync.
>>>>>>> I think it would be beneficial for the project to publish this image and something similar for Flink.
>>>>>>>
>>>>>>> Let me know what you think.
>>>>>>>
>>>>>>> Best,
>>>>>>> Kevin Liu
>>>>>>>
>>>>>>> [1] https://github.com/apache/iceberg-python/blob/6de6d6acad440885788fb1a24c04ed647b92af0e/dev/spark/Dockerfile
>>>>>>> [2] https://github.com/apache/iceberg-python/blob/6de6d6acad440885788fb1a24c04ed647b92af0e/dev/docker-compose-integration.yml#L20-L21
>>>>>>> [3] https://github.com/databricks/docker-spark-iceberg/blob/cf617dc29e8672792e76b9bcf6017af52f570020/spark/Dockerfile
>>>>>>> [4] https://github.com/apache/iceberg-rust/blob/330f21da894948fc10b57d541cb2d6f32c8bdbb8/crates/integration_tests/testdata/spark/Dockerfile
>>>>>>>
>>>>>>> On Mon, Jan 26, 2026 at 10:27 AM Steven Wu <[email protected]> wrote:
>>>>>>>
>>>>>>>> > Since the integration code for both Spark and Flink lives in our repository, it might make sense to also store the Docker images and the corresponding scripts there.
>>>>>>>>
>>>>>>>> I agree with Peter here.
>>>>>>>>
>>>>>>>> The previous thread has some concerns if the Iceberg project should host those docker images. Not sure if the opinions have changed.
>>>>>>>>
>>>>>>>> On Mon, Jan 26, 2026 at 2:43 AM Robin Moffatt via dev <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Thanks Ajantha, I'd not seen that thread.
>>>>>>>>> Having looked at it, am I understanding the view to be that ideally Flink would publish a Docker image that included the Iceberg dependencies?
>>>>>>>>>
>>>>>>>>> However we do this, I feel that the user coming to run the Flink quickstart should not have to build their own Docker image; this adds unnecessary friction that is easily alleviated.
>>>>>>>>>
>>>>>>>>> If I've understood the situation correctly, then I'm happy to discuss this idea with the Flink community; please let me know before I do so.
>>>>>>>>>
>>>>>>>>> thanks, Robin.
>>>>>>>>>
>>>>>>>>> On Fri, 23 Jan 2026 at 16:50, Ajantha Bhat <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Robin and Peter,
>>>>>>>>>>
>>>>>>>>>> I discussed community-maintained Docker images previously: https://lists.apache.org/thread/xl1cwq7vmnh6zgfd2vck2nq7dfd33ncq
>>>>>>>>>>
>>>>>>>>>> The consensus was to publish only the REST fixture Docker image <https://hub.docker.com/r/apache/iceberg-rest-fixture> (now at 100K+ total downloads) and use Docker images published by the main engines in the quickstart, instead of maintaining these images ourselves. See the thread above for more details.
>>>>>>>>>>
>>>>>>>>>> With respect to adding a Flink quickstart page, I’m in favor of adding it and relying on the Docker images provided by Flink rather than maintaining our own images.
>>>>>>>>>> - Ajantha
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 23, 2026 at 9:43 PM Péter Váry <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Robin,
>>>>>>>>>>> It would be nice to separate them. I expect that we will have some extra stuff to do with the docker image. For example make sure that we have ci in place to build it.
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Peter
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 23, 2026, 16:55 Robin Moffatt via dev <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the positive reception of this idea.
>>>>>>>>>>>> I've drafted a PR [1] and would appreciate input :)
>>>>>>>>>>>>
>>>>>>>>>>>> Also, should I keep this and the quickstart PR [2] as separate PRs, or combine them?
>>>>>>>>>>>>
>>>>>>>>>>>> thanks, Robin.
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://github.com/apache/iceberg/pull/15124
>>>>>>>>>>>> [2] https://github.com/apache/iceberg/pull/15062
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 23 Jan 2026 at 13:58, Jean-Baptiste Onofré <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is a great idea.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If we are moving forward with an "official" Docker image published by the project, we must ensure it is fully compliant with ASF requirements regarding LICENSE/NOTICE files, etc. While this may seem straightforward, it is a detail that is often overlooked.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would be happy to help with this process.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> JB
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jan 23, 2026 at 1:52 PM Maximilian Michels <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hey Robin,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +1 That's a great idea. It's often a bit painful for new users to get all the dependencies in the right place.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +1 for building upon the official Flink Docker images: https://hub.docker.com/r/apache/flink
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Max
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jan 23, 2026 at 12:27 PM Péter Váry <[email protected]> wrote:
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Hi Robin,
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > I would love to see the Flink quickstart image in the Iceberg repo.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Ajantha was working on the Spark side: https://github.com/apache/iceberg/issues/13519
>>>>>>>>>>>>>> > The conclusion was:
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> we should both remove the vendor reference and bring this back up to date. My preference would be to rely on the Spark image <https://hub.docker.com/r/apache/spark> provided by the Apache Spark project, similar to what we do for the Hive <https://iceberg.apache.org/hive-quickstart/> quickstart. We should be able to load all the Iceberg-specific JARs through the spark.jars.packages configuration <https://spark.apache.org/docs/3.5.1/configuration.html>.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Ajantha also added the link to the old dev list thread: https://lists.apache.org/thread/4kknk8mvnffbmhdt63z8t4ps0mt1jbf4
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Thanks for working on this,
>>>>>>>>>>>>>> > Peter
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Robin Moffatt via dev <[email protected]> wrote (on Thu, 22 Jan 2026, 19:23):
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Hi,
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Following discussion on the Flink quickstart PR [1], what do people think about adding an official quickstart Docker image for Flink to the project?
>>>>>>>>>>>>>> >> At the moment the Spark quickstart uses tabulario/spark-iceberg so perhaps that could be brought into the project too.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> thanks, Robin.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> 1: https://github.com/apache/iceberg/pull/15062
