Bumping this thread from the other one [1].

> 1. Read sdk version from gradle.properties and use this as the default tag. Done with Python, need to implement it with Java and Go.

100% agree with this one. Using the same tag for local and release images has already caused a good deal of confusion. Filed BEAM-8570 and BEAM-8571 [2][3].

> 2. Remove pulling images before executing docker run command. This should be fixed for Python, Java and Go.

Valentyn (from [1]):
> I think pulling the latest image for the current tag is actually a desired behavior, in case the external image was updated (due to a bug fix for example).

There's a PR for this [4]. Once we fix the default tag for Java/Go containers, the dev and release containers will be distinct, so it will seldom matter whether or not the image is `docker pull`ed. Anyway, I agree with Thomas that implicitly running `docker pull` is confusing and requires some adjustments to work around. The user can always run `docker pull` themselves if that's the intention.

[1] https://lists.apache.org/thread.html/0f2ccbbe7969b91dc21ba331c1a30d730e268cc0355c1ac1ba0b7988@%3Cdev.beam.apache.org%3E
[2] https://issues.apache.org/jira/browse/BEAM-8570
[3] https://issues.apache.org/jira/browse/BEAM-8571
[4] https://github.com/apache/beam/pull/9972

On Wed, Oct 2, 2019 at 5:32 PM Ahmet Altay <[email protected]> wrote:

> I do not believe this is a blocker for Beam 2.16. I agree that it would be
> good to fix this.
>
> On Wed, Oct 2, 2019 at 3:15 PM Hannah Jiang <[email protected]>
> wrote:
>
>> Hi Thomas
>>
>> Thanks for bringing this up.
>>
>> Now Python uses sdk version as a default tag, while Java and Go use
>> latest as a default tag. I agree using latest as a tag is problematic. The
>> reason only Python uses sdk version as a default tag is Python has
>> version.py so the version is easy to read. For Java and Go, we need to read
>> it from gradle.properties when creating images with the default tag and
>> when setting the default image.
>>
>> Here is what we need to do:
>> 1. Read sdk version from gradle.properties and use this as the default
>> tag. Done with Python, need to implement it with Java and Go.
>> 2. Remove pulling images before executing docker run command. This should
>> be fixed for Python, Java and Go.
>>
>> Is this a blocker for 2.16? If so and above are too much work for 2.16 at
>> the moment, we can hardcode the default tag for release branch for now.
>>
>> Using timestamp as a tag is an option as well, as long as runners know
>> which timestamp they should use.
>>
>> Hannah
>>
>> On Wed, Oct 2, 2019 at 10:13 AM Alan Myrvold <[email protected]> wrote:
>>
>>> Yes, using the latest tag is problematic and can lead to unexpected
>>> behavior.
>>> Using a date/time or 2.17.0.dev-$USER tag would be better. The validates
>>> container shell script uses a datetime
>>> <https://github.com/apache/beam/blob/6551d0937ee31a8e310b63b222dbc750ec9331f8/sdks/python/container/run_validatescontainer.sh#L87>
>>> tag, which allows a unique name if no two tests are run in the same second.
>>>
>>> On Wed, Oct 2, 2019 at 10:05 AM Thomas Weise <[email protected]> wrote:
>>>
>>>> Want to bump this thread.
>>>>
>>>> If the current behavior is to replace locally built image with the last
>>>> published, then this is not only unexpected for developers but also
>>>> problematic for the CI, where tests should run against what was built from
>>>> source. Or am I missing something?
>>>>
>>>> Thanks,
>>>> Thomas
>>>>
>>>>
>>>> On Tue, Sep 24, 2019 at 7:08 PM Thomas Weise <[email protected]> wrote:
>>>>
>>>>> Hi Hannah,
>>>>>
>>>>> I believe this is unexpected from the developer perspective. When
>>>>> building something locally, we do expect that to be used. We may need to
>>>>> change to not pull when the image is available locally, at least when it
>>>>> is a snapshot/master branch. Release images should be immutable anyways.
>>>>>
>>>>> Thomas
>>>>>
>>>>>
>>>>> On Tue, Sep 24, 2019 at 4:13 PM Hannah Jiang <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> A minor update, with custom container, the pipeline would not fail,
>>>>>> it throws out warning and moves on to `docker run` command.
>>>>>>
>>>>>> On Tue, Sep 24, 2019 at 4:05 PM Hannah Jiang <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Brian
>>>>>>>
>>>>>>> If we pull docker images, it always downloads from remote
>>>>>>> repository, which is expected behavior.
>>>>>>> In case we want to run a local image and pull it only when the image
>>>>>>> is not available at local, we can use `docker run` command directly,
>>>>>>> without pulling it in advance. [1]
>>>>>>> In case we want to pull images only when they are not available at
>>>>>>> local, we can use `docker images -q` to check if images are existing at
>>>>>>> local before pulling it.
>>>>>>> Another option is re-tag your local image, pass your image to
>>>>>>> pipeline and overwrite default one, but the code is still trying to
>>>>>>> pull, so if your image is not pushed to the remote repository, it
>>>>>>> would fail.
>>>>>>>
>>>>>>> 1. https://github.com/docker/cli/pull/1498
>>>>>>>
>>>>>>> Hannah
>>>>>>>
>>>>>>> On Tue, Sep 24, 2019 at 11:56 AM Brian Hulette <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm working on a demo cross-language pipeline on a local flink
>>>>>>>> cluster that relies on my python row coder PR [1]. The PR includes some
>>>>>>>> changes to the Java worker code, so I need to build a Java SDK
>>>>>>>> container locally and use that in the pipeline.
>>>>>>>>
>>>>>>>> Unfortunately, whenever I run the pipeline,
>>>>>>>> the apachebeam/java_sdk:latest tag is moved off of my locally built
>>>>>>>> image to a newly downloaded image with a creation date 2 weeks ago,
>>>>>>>> and that image is used instead. It looks like the reason is we run
>>>>>>>> `docker pull` before running the container [2].
>>>>>>>> As the comment says this should be a
>>>>>>>> no-op if the image already exists, but that doesn't seem to be the
>>>>>>>> case. If I just run `docker pull apachebeam/java_sdk:latest` on my
>>>>>>>> local machine it downloads the 2 week old image and happily informs me:
>>>>>>>>
>>>>>>>> Status: Downloaded newer image for apachebeam/java_sdk:latest
>>>>>>>>
>>>>>>>> Does anyone know how I can prevent `docker pull` from doing this? I
>>>>>>>> can unblock myself for now just by commenting out the docker pull
>>>>>>>> command, but I'd like to understand what is going on here.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Brian
>>>>>>>>
>>>>>>>> [1] https://github.com/apache/beam/pull/9188
>>>>>>>> [2]
>>>>>>>> https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/DockerCommand.java#L80
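
For illustration, the two action items discussed above (a version-based default tag, and pulling only when the image is missing locally) could look roughly like the Python sketch below. This is not Beam's actual implementation. The function names are hypothetical, the `sdk_version` key in gradle.properties is an assumption, and the local check relies on `docker images -q <image>` printing an image ID only when the image exists locally, as suggested earlier in the thread.

```python
import subprocess
from typing import Callable, Optional


def parse_sdk_version(properties_text: str, key: str = "sdk_version") -> Optional[str]:
    """Return the value of `key` from gradle.properties-style text, or None.

    The key name "sdk_version" is an assumption made for this sketch.
    """
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and properties-file comments.
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def default_image(repository: str, properties_text: str) -> str:
    """Build "<repository>:<sdk version>" instead of "<repository>:latest"."""
    version = parse_sdk_version(properties_text)
    if version is None:
        raise ValueError("sdk version not found in properties text")
    return f"{repository}:{version}"


def image_exists_locally(image: str) -> bool:
    """True if `docker images -q <image>` prints an image ID."""
    result = subprocess.run(
        ["docker", "images", "-q", image],
        capture_output=True,
        text=True,
        check=True,
    )
    return bool(result.stdout.strip())


def needs_pull(image: str, exists: Callable[[str], bool] = image_exists_locally) -> bool:
    """Pull only when the image is not already available locally."""
    return not exists(image)
```

With a check like this, a locally built image is never replaced by a published one, and the user can still run `docker pull` explicitly when a refresh is intended.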
