I also like the idea of SLIM images - always helpful.

Howard

On Wed, May 4, 2022 at 4:53 PM Ping Zhang <[email protected]> wrote:

> Hi Jarek,
>
> I really like the idea of having a slim airflow docker image.  500MB
> uncompressed is tiny 👍
>
>
> Thanks,
>
> Ping
>
>
> On Sun, May 1, 2022 at 8:41 AM Jarek Potiuk <[email protected]> wrote:
>
>> And just to clarify. Those "slim" images are not at all "toothless". You
>> can actually do stuff with them :)
>>
>> These 4 providers are preinstalled:
>>
>> apache-airflow-providers-ftp    | File Transfer Protocol (FTP) https://tools.ietf.org/html/rfc114 | 2.1.2
>> apache-airflow-providers-http   | Hypertext Transfer Protocol (HTTP) https://www.w3.org/Protocols/ | 2.1.2
>> apache-airflow-providers-imap   | Internet Message Access Protocol (IMAP) https://tools.ietf.org/html/rfc3501 | 2.2.3
>> apache-airflow-providers-sqlite | SQLite https://www.sqlite.org/ | 2.1.3
>>
>> We could probably further slim them down but that would limit the
>> extensibility a bit and I consider 500 MB uncompressed as pretty "decent" -
>> it's ~ 130-160 MB of compressed data when you pull the image.
>>
>> J.
>>
>>
>>
>> On Sun, May 1, 2022 at 5:26 PM Jarek Potiuk <[email protected]> wrote:
>>
>>> Hello everyone,
>>>
>>> TL;DR: I am looking for consensus on releasing "slim" versions of PROD
>>> images - ones that will be way smaller and contain no providers nor
>>> other extras and would be database-specific.
>>>
>>> Context:
>>>
>>> Now after we are done with some infra changes that were also released
>>> in 2.3.0 I came back to the issue raised in
>>> https://github.com/apache/airflow/issues/20849 which was originally
>>> about "vanilla" image for Airflow, but I renamed the idea to "slim"
>>> image (following similar convention by various distro and Python
>>> providers). The issue itself explains why there is a need for such
>>> images.
>>>
>>> The idea is to have a very small "base" ("slim") image that users will
>>> be able to extend, in addition to the "regular" (see the relation with
>>> "slim" :D ?) image where we pre-install a set of providers and
>>> support multiple database backends.
>>>
>>> The "slim" images also have the advantage that we can use
>>> "no-constraints" dependencies with them - which means that in those
>>> images, the dependencies are "latest" that airflow supports even if
>>> some providers would limit the dependencies.
>>>
>>> I looked at what it would mean and really what it translates to is
>>> that we would have to push many more images.
>>>
>>> The bad news:
>>>
>>> We need to push a matrix of 4 * 3 = 12 new "slim" images (plus some
>>> aliases for "latest")
>>> *  Python versions: 3.7, 3.8, 3.9, 3.10
>>> *  Database: postgres, mysql, mssql
>>>
>>> Postgres images would additionally be multiplatform (AMD64/ARM64) and
>>> for now MySQL and MsSQL would be just AMD64 (until we add support for
>>> ARM for those).
>>> That is a lot of images, but this is a rather normal approach if
>>> you look at a number of other images published by multiple
>>> "platform-like" products.
>>>
>>> The good news:
>>>
>>> We only need to do it at release time and we already have the right
>>> set of scripts and parameters to enable that. It will take a bit
>>> longer, but those images are much smaller and building and pushing
>>> them is WAY faster than for the regular image.
>>>
>>> Some comparison:
>>>
>>> Size (uncompressed): Regular (1.1G), Slim (500MB)
>>> Time to build a single image: Regular (6m), Slim (up to 3m)
>>>
>>> Overall the release process would take some 20 mins longer if we
>>> release the slim images (and I already made it a separate step so it
>>> should not block "regular" release).
>>>
>>> The very good news:
>>>
>>> I've actually prepared a PR:
>>> https://github.com/apache/airflow/pull/23391 to add this feature
>>> (including the docs), and it's a very small change. It does not change
>>> any of the source code of airflow or Dockerfile, we basically need to
>>> extend our "dev" script to build and push images to ... build and push
>>> more images. I actually even prepared and pushed 2.3.0 images of
>>> airflow to my private dockerhub account so that everyone can see what
>>> it will look like.
>>>
>>> You can see it here:
>>>
>>> https://hub.docker.com/repository/docker/potiuk/airflow/tags?page=1&ordering=last_updated&name=2.3.0
>>>
>>> I **believe** those changes don't even need PMC votes for release, and
>>> this is more a procedural change than software release, so we
>>> **could** release the "slim" 2.3.0 images even now - so that they are
>>> available as of 2.3.0. I think even if we see that this is a welcome
>>> change (despite the complexity of our dockerhub images available) it
>>> could even be agreed to via lazy-consensus if we see consensus
>>> forming.
>>>
>>> J.
>>>
>>
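
For anyone who wants to see the matrix from the original proposal spelled out (4 Python versions x 3 databases = 12 slim images, with only Postgres being multiplatform for now), here is a minimal sketch. The tag format and the helper name `slim_image_matrix` are hypothetical, made up for illustration only; they are not what the release scripts or the PR actually use.

```python
from itertools import product

# Values taken from the proposal in this thread.
PYTHON_VERSIONS = ["3.7", "3.8", "3.9", "3.10"]
DATABASES = ["postgres", "mysql", "mssql"]

# Platforms per database as described in the thread:
# Postgres is AMD64/ARM64, MySQL and MsSQL are AMD64 only for now.
PLATFORMS = {
    "postgres": ["linux/amd64", "linux/arm64"],
    "mysql": ["linux/amd64"],
    "mssql": ["linux/amd64"],
}


def slim_image_matrix(airflow_version="2.3.0"):
    """Return (tag, platforms) pairs for every slim image.

    The tag naming scheme here is an assumption, not the real one.
    """
    return [
        (f"{airflow_version}-slim-python{py}-{db}", PLATFORMS[db])
        for py, db in product(PYTHON_VERSIONS, DATABASES)
    ]


matrix = slim_image_matrix()
print(f"{len(matrix)} slim images")
for tag, platforms in matrix:
    print(tag, ",".join(platforms))
```

The point is only that the number of images is the product of the two lists, so adding a Python version or a database backend grows the release matrix accordingly.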
