* Yeah v2.0.0dev seemed superfluous - we only have one master at the end :)
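
With the version string dropped, the master tags reduce to a flavour plus a python version. A minimal sketch of how such tags could be generated, assuming the parts are still joined with dashes as in the proposal quoted below; the function names and the exact resulting tags (e.g. master-ci-python3.6) are only an illustration, not an agreed scheme:

```python
from typing import List, Optional

# Python versions named in this thread; 3.5 is treated as the default.
PYTHON_VERSIONS = ["3.5", "3.6"]


def image_tag(base: str, flavour: Optional[str] = None,
              python: Optional[str] = None) -> str:
    """Join the non-empty parts of a tag with dashes (hypothetical helper)."""
    parts = [base]
    if flavour:
        parts.append(flavour)
    if python:
        parts.append("python" + python)
    return "-".join(parts)


def master_tags(flavour: Optional[str] = None) -> List[str]:
    """All master tags for one flavour, plus the short alias without a python version."""
    tags = [image_tag("master", flavour, py) for py in PYTHON_VERSIONS]
    tags.append(image_tag("master", flavour))
    return tags
```

Here `master_tags("ci")` would list the per-python CI tags followed by the short `master-ci` alias, which in the proposal points at the python 3.5 variant.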

* The Main image is now "bare" airflow - it contains just the bare minimum of
apt-get dependencies needed to install the "*all*" extras. It's around 415 MB.
In the latest version of the CI image I use it for all the lightweight
operations - pylint/mypy/flake8 etc. are all run using this image.

* The CI image contains the 'devel_ci' extras + all the apt-get dependencies
needed to install them + all the apt-get deps needed to run all the tests
(including java, openssh, postgres-client, krb5-user, sqlite3) + tools
useful for debugging (tmux, vim) + hadoop + hive + minicluster - those are
not docker-composed but installed in the image itself. Maybe some day we
will be able to move all of them to docker compose (for now we are trying to
do it with the minicluster, together with Gerardo). It's around 1 GB.

Neither the Main nor the CI image is optimised for running airflow; both are
optimised for build time. One big difference is that the Cassandra pip driver
is not compiled - thus not optimised (compiling it adds around 10 minutes to
the build time on my PC). There might be a few other optimisations needed for
prod regarding docker internals.
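
To keep the two existing flavours straight, here is a small summary as data. It is only a sketch: the class and field names are made up for illustration, the sizes are the approximate numbers quoted above, and treating the Main image as carrying the "all" extras is my reading of the description rather than something checked against the Dockerfile:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ImageFlavour:
    name: str
    extras: str           # pip extras the image is built around
    approx_size_mb: int   # rough size quoted in this thread
    used_for: str


FLAVOURS: Tuple[ImageFlavour, ...] = (
    ImageFlavour("main", "all", 415, "lightweight checks (pylint/mypy/flake8)"),
    ImageFlavour("ci", "devel_ci", 1000, "running all the tests"),
)


def flavour(name: str) -> ImageFlavour:
    """Look up a flavour by name; raises KeyError for unknown names."""
    for candidate in FLAVOURS:
        if candidate.name == name:
            return candidate
    raise KeyError(name)
```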

* For the prod image I think we need to optimise it at the very least,
take into account how people are using the puckel image, and give
people a viable replacement (maybe with some workarounds). Also, puckel
contains some docker-compose files for the Local/Celery executors, which we
might also want to provide as "accompanying" the image (that's where the
labelling scheme will become important). I wanted to have some discussion
around it. That image has only a subset of extras, but it has a few things
we do not have in either image (pytz, pyasn1 etc.).

BTW, I believe in the current master '*devel_ci*' is almost the same as
'*all*'. '*all*' = devel_all from setup.py, whereas 'devel_ci' = devel_all -
snakebite (for python3). This IS a bit strange and I wanted to come back
to it after I finish the DockerImage change (I wanted to clarify that when
we do the official image).
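
The relation between the two extras can be sketched as below. The dependency list is a hypothetical, shortened stand-in (the real lists live in setup.py) and the helper name is made up for this example:

```python
# Hypothetical, shortened stand-in for the devel_all list in setup.py.
DEVEL_ALL = ["flake8", "mypy", "pylint", "snakebite"]


def devel_ci(devel_all, python_major):
    """Return devel_all, minus snakebite when building for python 3."""
    if python_major >= 3:
        return [dep for dep in devel_all if not dep.startswith("snakebite")]
    return list(devel_all)
```

So for python 3 the only difference between the two lists is the missing snakebite entry, which is why the two extras look almost identical.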

J.



On Mon, Jun 17, 2019 at 5:34 PM Ash Berlin-Taylor <[email protected]> wrote:

> Thinking about it a bit, I'd suggest we remove "v2.0.0dev0" from the master
> tags - at the point we release 2.0.0 and master becomes 2.1.0dev0 we want
> master to be that (and I don't think we want to have the old tags hanging around)
>
> What's the difference between Main non-CI and prod? The "prod" flavour
> should be the default, so `airflow:1.10.4` is designed/suitable for prod
> (without the `v` prefix seems to be the common pattern in docker, for
> example 'postgres:9.6')
>
> And the difference between main non-ci (as you had it) and CI is that one
> contains [all], and the other is that + devel?
>
>
> > On 17 Jun 2019, at 16:20, Jarek Potiuk <[email protected]> wrote:
> >
> > OK. The next build should prepare "master" tags for all the "master"
> > images: Would such revised labelling make sense?
> >
> > *Versions from master (development use only):*
> >
> >   - Main non-CI images (small): *airflow:master-v2.0.0dev0-python3.5,
> >   airflow:master-v2.0.0dev0-python3.6, airflow:master*
> >   == airflow:master-v2.0.0dev0-python3.5
> >   - CI images (big): *airflow:master-v2.0.0dev0-ci-python3.5,
> >   airflow:master-v2.0.0dev0-ci-python3.6,
> >   airflow:master-ci* == airflow:master-v2.0.0dev0-ci-python3.5
> >   - Production optimised images (future):
> >   *airflow:master-v2.0.0dev0-prod-python3.5,
> >   airflow:master-v2.0.0dev0-prod-python3.6, airflow:master-prod*
> >   == airflow:master-v2.0.0dev0-prod-python3.5
> >
> > *Release versions (future):*
> >
> >   - Main non-CI images (small): *airflow:v1.10.4-python3.5,
> >   airflow:v1.10.4-python3.6, airflow:latest* == *airflow:v1.10.4* (should
> >   we have them? I think it might be useful to have a reference image for
> >   tests)
> >   - CI images (big): *airflow:v1.10.4-ci-python3.5,
> >   airflow:v1.10.4-ci-python3.6,
> >   airflow:latest-ci* == *airflow:v1.10.4-ci* (should we have them? I think
> >   it might be useful to have a reference image for tests)
> >   - Production optimised images: *airflow:v1.10.4-prod-python3.5,
> >   airflow:v1.10.4-prod-python3.6, airflow:latest-prod* ==
> >   *airflow:v1.10.4-prod*
> >
> >
> > *J.*
> >
> > On Mon, Jun 17, 2019 at 4:56 PM Jarek Potiuk <[email protected]>
> > wrote:
> >
> >> Sure. I agree "latest" might be misleading until we work out how we
> >> release it so I am fine with changing to master :). It's super easy -
> >> merely changing tags in DockerHub.
> >>
> >> On Mon, Jun 17, 2019 at 4:25 PM Ash Berlin-Taylor <[email protected]> wrote:
> >>
> >>> That page does mention "Nightly" builds, which is close to what building
> >>> master would be. The other thing that matters is what we actually call A
> >>> Release.
> >>>
> >>>> Do not include any links on the project website that might encourage
> >>>> non-developers to download and use nightly builds, snapshots, release
> >>>> candidates, or any other similar package
> >>>
> >>> I think we're fine so long as we don't do that -- or in this case, since
> >>> we will probably want to link to the docker hub page once we have versioned
> >>> images there, so long as we make it clear that `:master` is not intended
> >>> for end users. By the same argument, if we have anything as `:latest` it
> >>> should be a docker image relating to an official Release.
> >>>
> >>> Jarek: no `latest` pointing at CI images please.
> >>>
> >>> -a
> >>>
> >>>> On 17 Jun 2019, at 15:04, Philippe Gagnon <[email protected]> wrote:
> >>>>
> >>>> One thing: we talked about releasing images under a "master" tag
> >>>> (perhaps in another thread?), we should check if this is compatible with
> >>>> Apache's release policy [1]. It's not clear to me if this is allowable or
> >>>> not after a cursory reading.
> >>>>
> >>>> [1] http://www.apache.org/legal/release-policy.html#what
> >>>>
> >>>> On Mon, Jun 17, 2019 at 9:48 AM Jarek Potiuk <[email protected]> wrote:
> >>>> Does anyone have more comments? I think the prevailing opinion is:
> >>>> 1) To keep all images in one repo (apache/airflow)
> >>>> 2) I am not sure about labelling but I might try to document all cases in a
> >>>> "production" image proposal that I would like to start as soon as we merge
> >>>> the current CI image (which I think is quite close to finalisation).
> >>>>
> >>>> J.
> >>>>
> >>>> On Tue, Jun 11, 2019 at 10:59 PM Jarek Potiuk <[email protected]> wrote:
> >>>>
> >>>>> It's super easy to do :)
> >>>>>
> >>>>> On Tue, Jun 11, 2019 at 10:33 PM Ash Berlin-Taylor <[email protected]> wrote:
> >>>>>
> >>>>>> I'm fine with us just publishing release images using the newest python
> >>>>>> release (i.e. 3.7), as the main reason we support older python versions is
> >>>>>> to support distros that ship those versions (i.e. Deb stable), but I don't
> >>>>>> think we need to support that in docker.
> >>>>>>
> >>>>>> (But if it's easy to do since we want them for ci then sure)
> >>>>>>
> >>>>>> -ash
> >>>>>>
> >>>>>> On 11 June 2019 21:21:28 BST, Jarek Potiuk <[email protected]> wrote:
> >>>>>>>
> >>>>>>> Yeah Kamil - python 3.5 is the default one for now. I think we should have
> >>>>>>> another discussion here - how many versions to support. There is this
> >>>>>>> ticket opened today: https://issues.apache.org/jira/browse/AIRFLOW-4762 about
> >>>>>>> supporting python 3.6 and 3.7 in tests. Does anyone have a strong opinion on
> >>>>>>> this? I am for testing on all of 3.5, 3.6 and 3.7 even if it increases the
> >>>>>>> build/test time on Travis. There are a number of differences between those
> >>>>>>> major versions (I have a blog post about it in writing) but I think there
> >>>>>>> is concern about eating Apache Travis time.
> >>>>>>>
> >>>>>>> Is anyone against those three?
> >>>>>>>
> >>>>>>> On Tue, Jun 11, 2019 at 8:38 PM Kamil Breguła <[email protected]> wrote:
> >>>>>>>
> >>>>>>> 1) I would prefer to use one repository.
> >>>>>>>> +1
> >>>>>>>>
> >>>>>>>> 2) The presented schema looks logical to me. I had doubts whether
> >>>>>>>> Python 3.5 was a good choice for "latest" version, but I checked that
> >>>>>>>> travis uses only this version.
> >>>>>>>>
> >>>>>>>> On Tue, Jun 11, 2019 at 3:04 PM Jarek Potiuk <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Hello everyone,
> >>>>>>>>>
> >>>>>>>>> We are close to finishing AIP-10 (Airflow image for CI) and it seems that we
> >>>>>>>>> will start working soon on an official image AIP, but in the meantime we have
> >>>>>>>>> the 1.10.4 release coming and we would like to agree on the tagging scheme
> >>>>>>>>> used for the current CI images. We discussed it a bit on Slack, but it's
> >>>>>>>>> time to bring it here. I created a JIRA issue for it:
> >>>>>>>>> https://issues.apache.org/jira/browse/AIRFLOW-4764 and my proposals
> >>>>>>>>> after the initial discussion are those:
> >>>>>>>>>
> >>>>>>>>> First of all we have different images that we can talk about:
> >>>>>>>>>
> >>>>>>>>>    1. "base" one - with bare development-ready airflow with a minimum set
> >>>>>>>>>    of dependencies
> >>>>>>>>>    2. "CI" with all the tools and packages that are needed for CI tests
> >>>>>>>>>    3. Soon we will likely have an "official" one which might be used in
> >>>>>>>>>    similar fashion as the "puckel" one.
> >>>>>>>>>
> >>>>>>>>> There are two decisions to make:
> >>>>>>>>>
> >>>>>>>>> 1) How to keep those images - in one repository or whether we should have
> >>>>>>>>> separate repos.
> >>>>>>>>>
> >>>>>>>>> It seems easier for now to keep all of them within the apache/airflow
> >>>>>>>>> <https://cloud.docker.com/u/apache/repository/docker/apache/airflow>
> >>>>>>>>> repository and use a labelling scheme to separate those (there is nothing
> >>>>>>>>> wrong with that but it might seem a bit hacky). It's a bit easier to
> >>>>>>>>> maintain with access and CI.
> >>>>>>>>>
> >>>>>>>>> We could also think about separate apache/airflow-ci, apache/airflow-dev,
> >>>>>>>>> apache/airflow-prod or something similar - that would require some
> >>>>>>>>> infrastructure tickets and is not very common.
> >>>>>>>>>
> >>>>>>>>> 2) What labelling scheme to use (apache/airflow:label). My proposal is
> >>>>>>>>> similar to this (if we keep everything in the airflow repository):
> >>>>>>>>>
> >>>>>>>>>    - *latest* = latest released version (python 3.5) =
> >>>>>>>>>    *v1.10.3-python3.5*
> >>>>>>>>>    - *master* = latest master version (python 3.5) =
> >>>>>>>>>    *v2.0.0dev0-python3.5*
> >>>>>>>>>    - *v1.10.3-python3.5, v1.10.3-python3.6* - released 1.10.3 with python
> >>>>>>>>>    3.5/3.6
> >>>>>>>>>    - *latest-ci* = latest released version of CI variant (python 3.5) =
> >>>>>>>>>    *v1.10.3-ci-python3.5*
> >>>>>>>>>    - *master-ci* = latest master version of CI variant (python 3.5) =
> >>>>>>>>>    *v2.0.0dev0-ci-python3.5*
> >>>>>>>>>    - *v1.10.3-ci-python3.5, v1.10.3-ci-python3.6* - released 1.10.3 with
> >>>>>>>>>    python 3.5/3.6
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> My preference is to keep all the images in one repo and use the labelling
> >>>>>>>>> scheme as above, but I am open to discussing this.
> >>>>>>>>>
> >>>>>>>>> J,
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>>
> >>>>>>>>> Jarek Potiuk
> >>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
> >>>>>>>>>
> >>>>>>>>> M: +48 660 796 129 <+48660796129>
> >>>>>>>>> [image: Polidea] <https://www.polidea.com/>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
