On Tue, Apr 21, 2020 at 3:29 PM Ash Berlin-Taylor <[email protected]> wrote:

> I'm still not quite sure what problem we are solving here either...?
> What is broken with the current/already merged solution?
>

It's explained in https://github.com/apache/airflow/pull/8400 (just
read my first comment explaining it, plus some of the screenshots from
DockerHub showing the problem; there is more discussion that follows and
goes into a lot of detail). But I am happy to patiently explain it
again, so here it is:

The root cause is that we currently often merge things faster than
DockerHub can build them. It's not uncommon for a master build to wait
10-12 hours in the queue (well, that was before I implemented the "less
frequent" workaround that I am not happy about).

The current workaround (which was temporary from the beginning, until we
find a better solution/wait for INFRA) introduces a 10-hour "window" for
building the images. If an image is older than 10 hours, it will be
rebuilt - if not, it won't be. This has the disadvantage that it assumes
we are always "active" and new commits keep being merged to master. If
there was a sequence of merges within the 10-hour window, the image is
built only from the first commit and stays that way until the next merge
to master. That helps us catch up with the build queue, but we
potentially have a stale master image until the next merge. This is not
a problem for PRs/CI, because we always build incrementally from the
last available image (so we catch up on each PR build). But it is not
optimal - for example, if setup.py was changed, each CI job takes a few
minutes longer to catch up. Also, people who want to download the
recent master image might not see the most recent changes.
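
For illustration, the window check in the workaround boils down to
something like this (the image name and exact commands are hypothetical,
just to show the idea; assumes GNU date):

    # Hypothetical sketch of the 10-hour window check
    created=$(docker inspect -f '{{.Created}}' apache/airflow:master-ci)
    age_hours=$(( ( $(date +%s) - $(date -d "$created" +%s) ) / 3600 ))
    if [ "$age_hours" -gt 10 ]; then
        docker build -t apache/airflow:master-ci .  # too old - rebuild
    else
        echo "Image is fresh enough - reusing it"   # potentially stale!
    fi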


> From a philosophical view I don't like deleting tags, and this feels
> like a bit of a hack to work around limitations in other systems.
> (Welcome to being a developer I guess.) What you have proposed is better
> than having the tags build up, certainly, but I'm still not wild about
> it (And to check: we can't just re-push a single "nightly" tag as Docker
> Hub will not rebuild when a tag changes? Have we confirmed this?)
>

That might be a perfectly acceptable option for me as well. I know it
works when we push a new tag; I do not yet know whether it works when
you re-push the same tag pointing at a different commit. I wanted to
propose it in the previous mail, but to me it was conceptually the same
as removing/re-adding the tag - essentially that is what happens in git
when you move a tag. But I am also perfectly happy with a "nightly" tag.
I will check and let you know.


>
> I've read the discussion you linked to, but the only thing I see is this
> comment https://github.com/apache/airflow/pull/8400#issuecomment-614796124


From: https://github.com/apache/airflow/pull/8400#issuecomment-614783967

"Alternatively we might want to rebuild the images regularly - every night
for example, but this is a bit problematic (I explained it to @turbaszek
above) - we could generate a trigger URL and use it in an external service
(there is no cron build support in DockerHub. But I am not sure it's safe."
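
"Use it in an external service" would mean roughly a single POST to the
trigger URL (format as in DockerHub's classic automated builds; the
token here is obviously a placeholder, not a real one):

    # Illustrative only - TRIGGER_TOKEN is a placeholder
    curl -X POST \
        -H "Content-Type: application/json" \
        --data '{"build": true}' \
        https://registry.hub.docker.com/u/apache/airflow/trigger/TRIGGER_TOKEN/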

And there are lots of resolved comments above and below that, with
discussion about the different approaches we could take - where we
discussed it with Tomek and Daniel. I also summarised my concerns in my
original response to your proposal. There are also screenshots showing
the queue and the associated problems.

>
> > But is it safe to store such URL somewhere? Is it something that is
> > sustainable long term (who will take care that it is actually still
> > working :)) .... Who will watch the watcher. ?
>
> Yes, if we store that build URL in a secure secret, for instance using
> the encryption approach suggested here
>
> https://help.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets#limits-for-secrets
> we can get Apache Infra to add a single secret then we can add/change
> values easily in the future.


Sure, we could do that as well. But if the URL leaks, we can get DDoSed.
If we can avoid that, I think it is much better to have DockerHub pull on
our tag pushes than to have us push to a secret URL.


> In the past I've used https://github.com/voxpupuli/hiera-eyaml even
> outside of puppet as it only encrypts the values, not the whole file,
> which makes it a bit easier to see what setting is changed, even if the
> setting is not visible in the diff. So what I'd suggest is we ask Infra
> to create an random GPG key, put the private key in a Secret in Github
> and then provide us with the public key. I'm happy to set this up if
> it's the route we want to go down.
>

That's a fantastic idea! I have a number of ideas for where such secrets
could be used if we have that mechanism. Please set it up!
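
As I understand the proposed scheme, anyone could encrypt a new value
with the public key, while only CI (holding the private key as a GitHub
Secret) could decrypt it - roughly (key id illustrative):

    # Encrypt a new value with Infra's public key - safe to commit
    echo -n 'my-new-secret-value' | \
        gpg --encrypt --armor --recipient INFRA_KEY_ID > value.asc
    # Decrypt in CI, where the private key was imported from the Secret
    gpg --decrypt value.asc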


>
> If it's a nightly Github action, so we'd see CRON failures as we did
> with Travis, no?
>

We will. And it is designed to fail from time to time - for example
when, for whatever reason, our Dockerfile stops building or our
from-scratch dependencies no longer install, we should get a notification
and act on it (which we did). The CRON job's purpose is to rebuild
everything from scratch and see if everything still works when you do.
That is a fantastic tool for getting an early warning about external
causes of build failures - I can remember at least three occasions where
it let me introduce a fix at a calm pace that would otherwise have caused
failures for all the people trying to make PRs.
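
Concretely, the nightly from-scratch rebuild is mostly a matter of
disabling all caching (image tag illustrative):

    # No layer cache and freshly pulled base images, so breakage in
    # external dependencies surfaces in the nightly build, not in PRs
    docker build --no-cache --pull -t apache/airflow:master .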
