+1 to having an official Beam-released container image.

Also, I would propose adding a verification step to (or after) the release
process to do a smoke check. Python has a ValidatesContainer test that runs a
basic pipeline using the newly built container for verification. Other SDK
languages can do something similar or add a common framework.
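
A minimal sketch of what a language-neutral smoke check could start from,
assuming only the standard `docker pull` and `docker image inspect` CLI
commands. It checks that a published image can be fetched and inspected, which
is weaker than running a pipeline the way ValidatesContainer does. The function
names and the example image name are hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence


def smoke_check_commands(image: str) -> List[List[str]]:
    """Return the docker CLI invocations used to check that an image is usable."""
    return [
        ["docker", "pull", image],              # the image can be fetched
        ["docker", "image", "inspect", image],  # the image exists locally afterwards
    ]


def run_smoke_check(
    image: str,
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> bool:
    """Run each command in order; succeed only if every one exits with code 0.

    `run` is injectable so the check can be exercised without a Docker daemon.
    """
    for cmd in smoke_check_commands(image):
        if run(cmd) != 0:
            return False
    return True
```

A release job could call `run_smoke_check("beam:python2-SNAPSHOT")` (a
hypothetical image name) and fail the build when it returns `False`.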

Mark

On Thu, Jan 17, 2019 at 5:56 AM Alan Myrvold <[email protected]> wrote:

> +1 This would be great. gcr.io seems like a good option for snapshots, given
> that Jenkins already has permission to upload and we can keep snapshots
> around.
>
> On Wed, Jan 16, 2019 at 6:51 PM Ruoyun Huang <[email protected]> wrote:
>
>> +1 This would be a great thing to have.
>>
>> On Wed, Jan 16, 2019 at 6:11 PM Ankur Goenka <[email protected]> wrote:
>>
>>> gcr.io seems to be a good option. Since we don't need the hosting
>>> server name in the image name, the host is easy to change later.
>>>
>>> The Docker container for Apache Flink is named "flink", and it has
>>> different tags for different releases and configurations:
>>> https://hub.docker.com/_/flink . We can follow a similar model and
>>> name the image "beam" ("beam" doesn't seem to be taken on Docker Hub) and
>>> use tags to distinguish Java/Python/Go, versions, etc.
>>>
>>> Tags will look like:
>>> java-SNAPSHOT
>>> java-2.10.1
>>> python2-SNAPSHOT
>>> python2-2.10.1
>>> go-SNAPSHOT
>>> go-2.10.1
>>>
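
The proposed tag scheme can be sketched as a small helper, assuming the
`<sdk>-<version>` pattern from the list above with `SNAPSHOT` standing in for
an unreleased build. The function names, the `SDKS` tuple, and the example
registry path are hypothetical.

```python
from typing import Optional

# SDK prefixes taken from the proposed tag list.
SDKS = ("java", "python2", "go")


def image_tag(sdk: str, version: Optional[str] = None) -> str:
    """Build a tag such as 'java-2.10.1', or 'java-SNAPSHOT' when no version is given."""
    if sdk not in SDKS:
        raise ValueError(f"unknown SDK: {sdk!r}")
    return f"{sdk}-{version or 'SNAPSHOT'}"


def image_ref(
    sdk: str,
    version: Optional[str] = None,
    registry: Optional[str] = None,
    name: str = "beam",
) -> str:
    """Build a full image reference; the registry prefix is optional so the host can change."""
    ref = f"{name}:{image_tag(sdk, version)}"
    return f"{registry}/{ref}" if registry else ref
```

For example, `image_ref("python2", "2.10.1")` yields `beam:python2-2.10.1`.
Keeping the registry out of the default name is what makes the hosting vendor
swappable later.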
>>>
>>> On Wed, Jan 16, 2019 at 5:56 PM Ahmet Altay <[email protected]> wrote:
>>>
>>>> For snapshots, we could use gcr.io. Permissions would not be a problem
>>>> since Jenkins is already set up correctly. The cost will be covered under
>>>> the apache-beam-testing project. And since this is only for snapshots, it
>>>> will hold only temporary artifacts, not release artifacts.
>>>>
>>>> On Wed, Jan 16, 2019 at 5:50 PM Valentyn Tymofieiev <
>>>> [email protected]> wrote:
>>>>
>>>>> +1, releasing containers is a useful process that we need to build in
>>>>> Beam and it is required for FnApi users. Among other reasons, having
>>>>> officially-released Beam SDK harness container images will make it easier
>>>>> for users to do simple customizations to container images, as they will
>>>>> be able to use a container image released by Beam as a base image.
>>>>>
>>>>> Good point about potential storage limitations on Bintray. With Beam's
>>>>> release cadence we may quickly exceed the 10 GB quota. It may also affect
>>>>> our decisions as to which images we want to release. For example, do we
>>>>> want to release only one container image with a Python 3 interpreter, or do
>>>>> we want to release a container image for each Python 3 minor version that
>>>>> Beam is compatible with?
>>>>>
>>>>
>>>> Probably worth a separate discussion. I would favor first releasing a
>>>> Python 3-compatible version before figuring out how we would target
>>>> multiple Python 3 versions.
>>>>
>>>
>>>>
>>>>>
>>>>> On Wed, Jan 16, 2019 at 5:48 PM Ankur Goenka <[email protected]>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jan 16, 2019 at 5:37 PM Ahmet Altay <[email protected]> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jan 16, 2019 at 5:28 PM Ankur Goenka <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> - Could we start from snapshots first and then do it for releases?
>>>>>>>> +1, releasing snapshots first makes sense to me.
>>>>>>>> - For snapshots, do we need to clean old containers after a while?
>>>>>>>> Otherwise I guess we will accumulate lots of containers.
>>>>>>>> For snapshots we can maintain a single snapshot image built daily from
>>>>>>>> git HEAD. Docker tracks an internal image ID that changes every time an
>>>>>>>> image is changed, and pulls new images as needed.
>>>>>>>>
>>>>>>>
>>>>>>> There is a potential use case this may not work for: a user picks up
>>>>>>> a snapshot build and wants to use it until the next release arrives. I
>>>>>>> guess in that case the user can copy the snapshotted container image and
>>>>>>> rely on that.
>>>>>>>
>>>>>>>
>>>>>> Yes, that should be reasonable.
>>>>>>
>>>>>>> - Do we also need additional code changes for snapshots and releases
>>>>>>>> to default to these specific containers? There could be a version based
>>>>>>>> mechanism to resolve the correct container to use.
>>>>>>>> The current image defaults have the username in them. We should be OK
>>>>>>>> just updating the default image URL to the published image URL.
>>>>>>>>
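
The version-based mechanism raised in this exchange could look roughly like the
sketch below. It assumes that snapshot builds carry a `.dev` or `-SNAPSHOT`
suffix in their SDK version string and that published tags follow an
`<sdk>-<version>` pattern. The function name `default_container_tag` is
hypothetical and is not an existing Beam API.

```python
def default_container_tag(sdk: str, sdk_version: str) -> str:
    """Pick the container tag that matches the running SDK version.

    Assumption: development builds end in '.dev' or '-SNAPSHOT'; anything else
    is treated as a released version with its own published tag.
    """
    is_snapshot = sdk_version.endswith(".dev") or sdk_version.endswith("-SNAPSHOT")
    if is_snapshot:
        return f"{sdk}-SNAPSHOT"
    return f"{sdk}-{sdk_version}"
```

With this, `default_container_tag("python2", "2.11.0.dev")` resolves to the
daily snapshot tag, while `default_container_tag("java", "2.10.1")` resolves to
the tag for that release.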
>>>>>>>> We should also check the pricing and the details of any Apache-Bintray
>>>>>>>> agreement before pushing images and changing defaults.
>>>>>>>>
>>>>>>>
>>>>>>> There is information on Bintray's pricing page about open source
>>>>>>> projects [1]. I do not know whether there is a special Apache-Bintray
>>>>>>> agreement. If there is no special agreement, there is a 10 GB storage
>>>>>>> limit for using Bintray.
>>>>>>>
>>>>>> As each image can easily run into gigabytes, 10 GB might not be
>>>>>> sufficient in the long run.
>>>>>> We can also register the Docker image with the Docker image registry and
>>>>>> not have "bintray" in the name, so that we can later host images with a
>>>>>> different vendor.
>>>>>>
>>>>>>
>>>>>>> [1] https://bintray.com/account/pricing?tab=account&type=pricing
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jan 16, 2019 at 5:11 PM Ahmet Altay <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> This sounds like a good idea. Some questions:
>>>>>>>>>
>>>>>>>>> - Could we start from snapshots first and then do it for releases?
>>>>>>>>> - For snapshots, do we need to clean old containers after a while?
>>>>>>>>> Otherwise I guess we will accumulate lots of containers.
>>>>>>>>> - Do we also need additional code changes for snapshots and
>>>>>>>>> releases to default to these specific containers? There could be a
>>>>>>>>> version-based mechanism to resolve the correct container to use.
>>>>>>>>>
>>>>>>>>> On Wed, Jan 16, 2019 at 4:42 PM Ankur Goenka <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi All,
>>>>>>>>>>
>>>>>>>>>> As portability/FnApi is taking shape and is compatible with ULR
>>>>>>>>>> and Flink, I wanted to discuss the release plan for SDK harness
>>>>>>>>>> Docker images. Of course users can create their own images, but it
>>>>>>>>>> will be useful to have a default image available out of the box.
>>>>>>>>>> Prebuilt images are a must for making FnApi available to users
>>>>>>>>>> and not just to developers.
>>>>>>>>>> The other purpose of these images is to serve as a base image
>>>>>>>>>> layer for building custom images.
>>>>>>>>>>
>>>>>>>>>> Apache already has Bintray repositories for Beam.
>>>>>>>>>> https://bintray.com/apache/beam-snapshots-docker
>>>>>>>>>> https://bintray.com/apache/beam-docker
>>>>>>>>>>
>>>>>>>>>> Shall we start pushing Python/Java/Go SDK harness containers to
>>>>>>>>>> https://bintray.com/apache/beam-docker for Beam releases and
>>>>>>>>>> maintain a daily snapshot at
>>>>>>>>>> https://bintray.com/apache/beam-snapshots-docker ?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Ankur
>>>>>>>>>>
>>>>>>>>>
>>
>> --
>> ================
>> Ruoyun  Huang
>>
>>
