Thanks all for chiming in. I'll continue tomorrow with a VOTE as suggested
by Till.

Regarding my initially proposed timeline: I don't think we will have
everything ready before the first 1.10 RC, but I also think it's not that
big of a deal. ;-)

– Ufuk


On Fri, Jan 24, 2020 at 11:59 AM Till Rohrmann <trohrm...@apache.org> wrote:

> +1 for Ufuk's proposal on how to proceed. I guess the immediate next step
> would be a VOTE on accepting the Dockerfiles and where to store them.
>
> Cheers,
> Till
>
> On Wed, Jan 22, 2020 at 4:05 PM Fabian Hueske <fhue...@gmail.com> wrote:
>
> > Hi everyone,
> >
> > First of all, thank you very much Patrick for maintaining and publishing
> > the Flink Docker images so far and for starting this discussion!
> >
> > I'm in favor of adding the Dockerfiles in a separate repository and not in
> > the main Flink repository.
> > I also think that it makes sense to first focus on the contribution of the
> > Dockerfiles and consolidation of existing Dockerfiles before discussing
> > special cases for development and testing.
> >
> > In addition to the Dockerfiles in the Flink main repo, there is also one in
> > the flink-playgrounds repo [1] to build a customized Docker image for the
> > playground.
> >
> > Besides building and publishing "official" Flink images via DockerHub,
> > there is also the option to let ASF Infra build Docker images and publish
> > them under https://hub.docker.com/u/apache.
> > These images would not be "official" DockerHub images anymore, but
> > available under the Apache DockerHub user.
> > However, I think it would be a good idea to keep the current setup for the
> > main Flink images (those that depend on Flink releases) for better
> > visibility and to not confuse our users.
> > We might want to publish less critical images (playground images, dev
> > images, nightly builds, etc.) via Infra under the Apache DockerHub user.
> >
> > Best,
> > Fabian
> >
> > On Mon, Jan 13, 2020 at 11:38 AM Ufuk Celebi <u...@apache.org> wrote:
> >
> > > Hey all,
> > >
> > > first of all a big thank you for driving many of the Docker image releases
> > > in the last two years.
> > >
> > > *(1) Moving docker-flink/docker-flink to apache/docker-flink*
> > >
> > > +1 to do this as you outlined. I would propose to aim for a first
> > > integration with the 1.10 release without major changes to the existing
> > > Dockerfiles. The work items would be to move the Dockerfiles and update the
> > > release process documentation so everyone is on the same page.
> > >
> > > *(2) Consolidate Dockerfiles in apache/flink*
> > >
> > > +1 to start the process for this. I think this requires a bit of thinking
> > > about what the requirements are and which problems we want to solve. From
> > > skimming the existing Dockerfiles, it seems to me that the Docker image
> > > builds fulfil quite a few different tasks. We have a script that can bundle
> > > Hadoop, can copy an existing Flink distribution, can include user jars,
> > > etc. The scope of this is quite broad and would warrant a design document
> > > or a FLIP.
> > >
> > > I would move the questions about nightly builds, using a different base
> > > image or having image variants with debug tooling to after (1) and (2) or
> > > make them part of (2).
> > >
> > > *(3) Next steps*
> > >
> > > If there are no objections, I would propose to tackle (1) and (2)
> > > separately and to continue as follows:
> > >
> > > (i) Create tickets for (1) and aim to align with the 1.10 release timeline
> > > (ideally before the first RC). Since this does not touch any code in the
> > > release branches, I think this would not be affected by the feature freeze.
> > > The major work items would be updating the docs and potentially refactoring
> > > the existing process and Dockerfiles. I can help with the process to
> > > create a new repo.
> > >
> > > (ii) Create a first draft for consolidation of existing Dockerfiles. After
> > > this proposal is done, I would propose to bring it up for a separate
> > > discussion on the ML.
> > >
> > >
> > > What do you think? @Patrick: would you be interested in working on both (1)
> > > + (2) or did you mainly have (1) in mind?
> > >
> > > Best,
> > >
> > > Ufuk
> > >
> > > On Sun, Jan 12, 2020 at 8:30 PM Konstantin Knauf <konstan...@ververica.com> wrote:
> > >
> > > > Big +1 for
> > > >
> > > > * official images in a separate repository
> > > > * unified images (session cluster vs application cluster)
> > > > * images for development in the Apache Flink repository
> > > >
> > > > On Fri, Jan 10, 2020 at 7:14 PM Till Rohrmann <trohrm...@apache.org> wrote:
> > > >
> > > > > Thanks a lot for starting this discussion Patrick! I think it is a very
> > > > > good idea to move Flink's docker image more under the jurisdiction of the
> > > > > Flink PMC and to make releasing new docker images part of Flink's
> > > > > release process (not saying that we cannot release new docker images
> > > > > independent of Flink's release cycle).
> > > > >
> > > > > One thing I have no strong opinion about is where to place the Dockerfiles
> > > > > (apache/flink.git vs. apache/flink-docker.git). I see the point that one
> > > > > wants to separate concerns (Flink code vs. Dockerfiles) and, hence, that
> > > > > having separate repositories might help with this objective. But on the
> > > > > other hand, I don't have a lot of experience with Docker Hub and how to
> > > > > best host Dockerfiles. Consequently, it would be helpful if others who have
> > > > > some experience with this could share it with us.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Sat, Dec 21, 2019 at 2:28 PM Hequn Cheng <chenghe...@gmail.com> wrote:
> > > > >
> > > > > > Hi Patrick,
> > > > > >
> > > > > > Thanks a lot for your continued work on the Docker images. That’s
> > > > > > really a great job! And I have also benefited from it.
> > > > > >
> > > > > > Big +1 for integrating Docker image publication into the Flink release
> > > > > > process, since we can leverage the Flink release process to ensure a more
> > > > > > legitimate Docker publication. We can also check and vote on it during the
> > > > > > release.
> > > > > >
> > > > > > I think the most important thing we need to discuss first is whether to
> > > > > > have a dedicated Git repo for the Dockerfiles.
> > > > > >
> > > > > > Although it is a convention shared by nearly every other “official” image
> > > > > > on Docker Hub to have a dedicated repo, I'm still not sure about it. Maybe
> > > > > > I have missed something important. From my point of view, I think it’s
> > > > > > better to have the Dockerfiles in the (main) Flink repo.
> > > > > >   - First, I think the Dockerfiles can be treated as part of the release.
> > > > > >     It is also natural to put the corresponding version of the Dockerfile
> > > > > >     in the corresponding Flink release.
> > > > > >   - Second, we can put the Dockerfiles in a path like
> > > > > >     flink/docker-flink/version/, where the version varies between releases.
> > > > > >     For example, for release 1.8.3, we would have a flink/docker-flink/1.8.3
> > > > > >     folder (or maybe flink/docker-flink/1.8). Even though the Dockerfiles for
> > > > > >     the supported versions would not all be in one path, they would still be
> > > > > >     in one Git tree under different refs.
> > > > > >   - Third, it seems that Docker Hub also supports specifying different refs.
> > > > > >     For the file [1], we can change the GitRepo link from
> > > > > >     https://github.com/docker-flink/docker-flink.git to
> > > > > >     https://github.com/apache/flink.git and add a GitFetch for each tag,
> > > > > >     e.g., GitFetch: refs/tags/release-1.8.3. There are some examples in the
> > > > > >     file for ubuntu [2].
> > > > > >
> > > > > > If the above assumptions are right and there are no further obstacles, I'm
> > > > > > inclined to keep these Dockerfiles in the main Flink repo. In this case, we
> > > > > > can reduce the number of repos and the management overhead.
> > > > > > What do you think?
> > > > > >
> > > > > > Best,
> > > > > > Hequn
> > > > > >
> > > > > > [1] https://github.com/docker-library/official-images/blob/master/library/flink
> > > > > > [2] https://github.com/docker-library/official-images/blob/master/library/ubuntu
> > > > > >
> > > > > >
> > > > > > On Fri, Dec 20, 2019 at 5:29 PM Yang Wang <danrtsey...@gmail.com> wrote:
> > > > > >
> > > > > > >  Big +1 for this effort.
> > > > > > >
> > > > > > > It is really exciting that we have started this great work. More and more
> > > > > > > companies are starting to use Flink in container environments (Docker,
> > > > > > > Kubernetes, Mesos, even Yarn-3.x). So it is very important that we have a
> > > > > > > unified official image building and releasing process.
> > > > > > >
> > > > > > >
> > > > > > > The image building process in this proposal is really good, and I just have
> > > > > > > the following thoughts.
> > > > > > >
> > > > > > > >> Keep a dedicated repo for Dockerfiles to build official image
> > > > > > > I think this is a good way and we do not need to make any unnecessary
> > > > > > > changes to the Flink repository.
> > > > > > >
> > > > > > > >> Integrate building image into the Flink release process
> > > > > > > It will bring a better experience for container environment users. In my
> > > > > > > opinion, a complete release includes the official image. It should be
> > > > > > > verified to work well.
> > > > > > >
> > > > > > > >> Nightly building
> > > > > > > Would we support this for all release branches or just the master branch?
> > > > > > >
> > > > > > > >> Multiple purpose Flink images
> > > > > > > This is really needed. In the development and testing process, we need some
> > > > > > > profiling tools to help us investigate problems. Currently, we do not even
> > > > > > > have jstack/jmap in the image.
> > > > > > >
> > > > > > > >> Unify the Dockerfile in Flink repository
> > > > > > > In the current code base, we have flink-contrib/docker-flink/Dockerfile to
> > > > > > > build an image for a session cluster. However, it is not up to date. For a
> > > > > > > per-job cluster, flink-container/docker/Dockerfile could be used to build a
> > > > > > > Flink image with user artifacts. I think we need to unify them and provide
> > > > > > > a more powerful build script and entry point.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > > Yang
> > > > > > >
> > > > > > > On Thu, Dec 19, 2019 at 9:20 PM Patrick Lucas <patr...@ververica.com> wrote:
> > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > >
> > > > > > > > I would like to start a discussion about integrating publication of the
> > > > > > > > Flink Docker images hosted on Docker Hub[1] more tightly with the Flink
> > > > > > > > release process. Apologies in advance for the long post.
> > > > > > > >
> > > > > > > > More than two and a half years ago (time flies!) we introduced “official”
> > > > > > > > Docker images for Flink[2]. Since then, the popularity of running
> > > > > > > > containerized applications in general and containerized Flink in particular
> > > > > > > > has continued to grow. Today, Flink is one of the most popular “official”
> > > > > > > > images on Docker Hub[3].
> > > > > > > >
> > > > > > > > > A graph of Flink Docker image pulls over time:
> > > > > > > > > https://gist.githubusercontent.com/patricklucas/7312444b1056ff82528e9a129e74e2b3/raw/9c8e139c1abc70b2b3fb34aadd7f44d46a540fe8/docker-flink-pulls.png
> > > > > > > >
> > > > > > > > “Official” is in quotation marks because while that’s how the Docker
> > > > > > > > community refers to top-level images on Docker Hub (i.e. those that can be
> > > > > > > > run with just <docker run foo>), they are not official in the sense of
> > > > > > > > being officially endorsed by the Flink PMC.
> > > > > > > >
> > > > > > > > I think it’s time for that to change.
> > > > > > > >
> > > > > > > > Currently, the Dockerfiles that produce these images are maintained in a
> > > > > > > > repository called docker-flink[4] in a separate, community-managed GitHub
> > > > > > > > organization of the same name. When a new release of Flink is available, or
> > > > > > > > when other changes are necessary, these Dockerfiles (one per image) are
> > > > > > > > updated, and then a pull request[5] is made to the Docker Hub
> > > > > > > > official-images repo with an updated manifest of images and tags, after
> > > > > > > > which infrastructure run by Docker Hub builds, checks, and publishes the
> > > > > > > > images.
> > > > > > > >
> > > > > > > > A question that has come up regularly is “Why are the Dockerfiles in a
> > > > > > > > separate repository from Flink?”, and there are a few different answers:
> > > > > > > >
> > > > > > > >    - These Dockerfiles package only released, published distributions of
> > > > > > > >      Flink, and are therefore decoupled from a particular commit in the
> > > > > > > >      Flink repo
> > > > > > > >    - All the Dockerfiles for supported versions (and the corresponding Scala
> > > > > > > >      version variants) should be available in one Git tree for
> > > > > > > >      discoverability
> > > > > > > >    - The master branch of Flink is not the right place to encode what the
> > > > > > > >      supported versions are, or how to run previous versions of Flink; it
> > > > > > > >      should be concerned with the point-in-time of the code represented in
> > > > > > > >      that commit
> > > > > > > >
> > > > > > > >
> > > > > > > > But mostly, having a dedicated repo for Dockerfiles is a convention shared
> > > > > > > > by nearly every other “official” image on Docker Hub[6]. If the Flink
> > > > > > > > community wants to do this differently, we will need to work with the
> > > > > > > > Docker Hub maintainers to make sure we continue to work within their
> > > > > > > > guidelines and expectations.
> > > > > > > >
> > > > > > > > While it seems intuitive that integrating these images into the Flink
> > > > > > > > release process is a good thing, I don’t believe it is strictly necessary,
> > > > > > > > since the images only package approved and signed Flink releases, and do
> > > > > > > > not themselves build Flink from source. However, there are some concrete
> > > > > > > > advantages:
> > > > > > > >
> > > > > > > >    - Putting the Docker images on (almost) equal footing with Flink binary
> > > > > > > >      release artifacts will help the legitimacy of and user confidence in
> > > > > > > >      running Flink in containerized environments
> > > > > > > >    - By publishing release candidate (and possibly nightly) images, the
> > > > > > > >      release testing and automated testing processes could be improved
> > > > > > > >    - The delay between Flink releases and when the corresponding Docker
> > > > > > > >      images are available will be reduced
> > > > > > > >
> > > > > > > >
> > > > > > > > Considering all of this, I propose the following:
> > > > > > > >
> > > > > > > >    - We move the Git repository containing the Dockerfiles from the
> > > > > > > >      docker-flink GitHub organization to Apache, placing it under control of
> > > > > > > >      the Flink PMC
> > > > > > > >    - We codify updating these Dockerfiles and notifying Docker Hub into the
> > > > > > > >      Flink release process
> > > > > > > >       - For release candidates, Dockerfiles should be added to a special
> > > > > > > >         directory which will be automatically built and pushed to the Apache
> > > > > > > >         Docker Hub organization[7], e.g. apache/flink-rc:1.10.0-rc1
> > > > > > > >       - Upon release, the appropriate “release” Dockerfiles are added (e.g.
> > > > > > > >         under the 1.10 directory) and release candidate Dockerfiles removed,
> > > > > > > >         and then a pull request opened on the docker-library/official-images
> > > > > > > >         repository
> > > > > > > >    - Optionally, we introduce “nightly” builds, with an automated process
> > > > > > > >      building and pushing images to the Apache Docker Hub organization, e.g.
> > > > > > > >      apache/flink-dev:1.10-SNAPSHOT
> > > > > > > >
> > > > > > > >
> > > > > > > > If we choose to move forward in this direction, there are some further
> > > > > > > > steps we could take to improve the experience of both developing and using
> > > > > > > > Flink with Docker (these are actually mostly orthogonal to the proposed
> > > > > > > > changes above, but I think this is a natural first step and should make the
> > > > > > > > following ideas easier to implement).
> > > > > > > >
> > > > > > > > First, there are important differences between images meant for running
> > > > > > > > Flink and those meant for development: the former should strictly package
> > > > > > > > only released distributions of software and be as thin a layer as
> > > > > > > > possible over the software itself, while the latter can be used during
> > > > > > > > development and testing, and can easily be rebuilt from a “working copy” of
> > > > > > > > the software’s source code.
> > > > > > > >
> > > > > > > > By standardizing on defining such “production” images in the docker-flink
> > > > > > > > repository and “development” image(s) in the Flink repository itself, it
> > > > > > > > becomes much clearer to developers and users which Dockerfile or image
> > > > > > > > they should use for a given purpose. To that end, we could introduce one or
> > > > > > > > more documented Maven goals or Make targets for building a Docker image
> > > > > > > > from the current source tree or a specific release (including unreleased or
> > > > > > > > unsupported versions).
> > > > > > > >
> > > > > > > > Additionally, there has been discussion among Flink contributors for some
> > > > > > > > time about the confusing state of Dockerfiles within the Flink repository,
> > > > > > > > each meant for a different way of running Flink. I’m not completely up to
> > > > > > > > speed about these different efforts, but we could possibly solve this by
> > > > > > > > either building additional “official” images with different entrypoints for
> > > > > > > > these various purposes, or by developing an improved entrypoint script that
> > > > > > > > conveniently supports all cases. I defer to Till Rohrmann, Konstantin
> > > > > > > > Knauf, or Stephan Ewen for further discussion on this point.
> > > > > > > >
> > > > > > > > I apologize again for the wall of text, but if you made it this far, thank
> > > > > > > > you! These improvements have been a long time coming, and I hope we can
> > > > > > > > find a solution that serves the Flink and Docker communities well. Please
> > > > > > > > don’t hesitate to ask any questions.
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Patrick Lucas
> > > > > > > >
> > > > > > > > [1] https://hub.docker.com/_/flink
> > > > > > > >
> > > > > > > > [2] https://lists.apache.org/thread.html/c50297f8659aaa59d4f2ae327b69c4d46d1ab8ecc53138e659e4fe91%40%3Cdev.flink.apache.org%3E
> > > > > > > >
> > > > > > > > [3] On page 2 at the time we went to press:
> > > > > > > > https://hub.docker.com/search?q=&type=image&image_filter=official
> > > > > > > >
> > > > > > > > [4] https://github.com/docker-flink/docker-flink
> > > > > > > >
> > > > > > > > [5] https://github.com/docker-library/official-images/pulls?q=is%3Apr+label%3Alibrary%2Fflink
> > > > > > > >
> > > > > > > > [6] I looked at the 25 most popular “official” images (see [3]) as well as
> > > > > > > > “official” images of Apache software from the top 125; all use a dedicated
> > > > > > > > repo
> > > > > > > > [7] https://hub.docker.com/u/apache
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Konstantin Knauf | Solutions Architect
> > >
> >
>
