On Tue, Apr 21, 2020 at 3:29 PM Ash Berlin-Taylor <[email protected]> wrote:
> I'm still not quite sure what problem we are solving here either...?
> What is broken with the current/already merged solution?

It's explained in https://github.com/apache/airflow/pull/8400 (just read my first comment explaining it, plus some of the screenshots from DockerHub showing the problem; there is more discussion that follows and goes into a lot of detail). But I am super happy and patient to explain it again, so here it is:

The root cause is that we currently often merge things faster than DockerHub can build them. It's not uncommon for a master build to wait 10-12 hours in the queue (well, that was before I implemented the "less frequent" workaround that I am not happy about).

The current workaround (it was meant to be temporary from the beginning, until we find a better solution/wait for INFRA) introduces a 10-hour "window" for building the images: if an image is older than 10 hours, it will be rebuilt; if not, it won't be (see the rough sketch further below). The disadvantage is that it assumes we are always "active" and new commits keep being merged to master. If there was a sequence of merges within the 10-hour window and then no recent merge, the image will have been built only from the first commit (until the next merge to master). That helps us catch up with the build queue, but we potentially have a stale master image until the next merge.

This is not a problem for PRs/CI, because we will always incrementally build from the last available image (so we will catch up on each PR build). But it is not optimal - for example, if setup.py was changed, each CI job takes a few minutes longer to catch up. Also, people who would like to download the recent master image might not see the most recent changes.

> From a philosophical view I don't like deleting tags, and this feels
> like a bit of a hack to work around limitations in other systems.
> (Welcome to being a developer I guess.) What you have proposed is better
> than having the tags build up, certainly, but I'm still not wild about
> it (And to check: we can't just re-push a single "nightly" tag as Docker
> Hub will not rebuild when a tag changes? Have we confirmed this?)

That might be a perfectly acceptable option for me as well. I know it will work when we push a new tag; I do not yet know whether it works if we re-push the same tag pointing at a different commit. I wanted to propose it in the previous mail, but to me it is conceptually the same as removing/re-adding the tag - essentially that's what happens in git when you move a tag. But I am also perfectly happy with a "nightly" tag. I will check and let you know.

> I've read the discussion you linked to, but the only thing I see is this
> comment https://github.com/apache/airflow/pull/8400#issuecomment-614796124

From https://github.com/apache/airflow/pull/8400#issuecomment-614783967:

"Alternatively we might want to rebuild the images regularly - every night for example, but this is a bit problematic (I explained it to @turbaszek above) - we could generate a trigger URL and use it in an external service (there is no cron build support in DockerHub). But I am not sure it's safe."

And there are lots of resolved comments above and below, and discussion about the different approaches we could take - we discussed it there with Tomek and Daniel. I also summarised my concerns in the original response to your proposal. There are also screenshots showing the queue and the problems associated with it.
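To make the "window" logic concrete, this is roughly the kind of check I mean - a minimal sketch only, assuming a local docker daemon is available; the function name and threshold constant are illustrative, not the actual Breeze code:

from datetime import datetime, timedelta, timezone
import subprocess

MAX_IMAGE_AGE = timedelta(hours=10)  # the "window" discussed above

def image_needs_rebuild(image: str) -> bool:
    # Ask the local docker daemon when the image was created.
    created_raw = subprocess.run(
        ["docker", "inspect", "--format", "{{.Created}}", image],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    # Docker prints an RFC3339 timestamp such as
    # 2020-04-21T10:15:30.123456789Z - drop the fractional part and the
    # trailing Z before parsing, and treat it as UTC.
    created = datetime.fromisoformat(
        created_raw.split(".")[0].rstrip("Z")
    ).replace(tzinfo=timezone.utc)
    return datetime.now(timezone.utc) - created > MAX_IMAGE_AGE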
>> But is it safe to store such URL somewhere? Is it something that is
>> sustainable long term (who will take care that it is actually still
>> working :))... Who will watch the watcher?

> Yes, if we store that build URL in a secure secret, for instance using
> the encryption approach suggested here
> https://help.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets#limits-for-secrets
> we can get Apache Infra to add a single secret, then we can add/change
> values easily in the future.

Sure, we could do that as well. But if the URL leaks, we can get DDoSed. If we can avoid it, I think it's much better to have DockerHub pull our tags than to have us push to the secret URL.

> In the past I've used https://github.com/voxpupuli/hiera-eyaml even
> outside of puppet as it only encrypts the values, not the whole file,
> which makes it a bit easier to see what setting is changed, even if the
> setting is not visible in the diff. So what I'd suggest is we ask Infra
> to create a random GPG key, put the private key in a Secret in Github
> and then provide us with the public key. I'm happy to set this up if
> it's the route we want to go down.

That's a fantastic idea! I have a number of ideas for where such secrets could be used once we have that mechanism. Please set it up! (I put two rough sketches of how this could look at the end of this mail.)

> If it's a nightly Github action, so we'd see CRON failures as we did
> with Travis, no?

We will. And it is designed to fail from time to time - for example when, for whatever reason, our Dockerfile stops building or our from-scratch dependencies no longer install, we should get a notification and act on it (which we did). The CRON job's purpose is to rebuild everything from scratch and see whether everything still works when you do. It is a fantastic early-warning tool for external causes of build failures, and I can remember at least three occasions where it let me introduce a fix at a slow pace that would otherwise have caused failures for everyone trying to make a PR.
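As promised, two rough sketches. First, what a scheduled job calling the DockerHub build trigger could look like. This is only an illustration: AIRFLOW_DOCKERHUB_TRIGGER_URL is a hypothetical name for the encrypted secret, and the exact payload DockerHub expects would need to be verified against their docs:

import os
import urllib.request

def trigger_dockerhub_build() -> None:
    # The trigger URL itself is the secret here - it would be injected by
    # the CI system (e.g. from a GitHub encrypted secret), never committed.
    url = os.environ["AIRFLOW_DOCKERHUB_TRIGGER_URL"]  # hypothetical name
    request = urllib.request.Request(
        url,
        data=b'{"build": true}',  # assumed payload - check DockerHub docs
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        print("DockerHub responded with HTTP", response.status)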
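Second, the eyaml-style flow Ash describes: encrypting individual values with the Infra-provided public key, so that only the CI holding the private key can read them. A minimal sketch, assuming the public key has already been imported into the local keyring (the key id is made up):

import subprocess

def encrypt_secret(plaintext: str, key_id: str = "INFRA-PUBLIC-KEY-ID") -> str:
    # ASCII-armoured output can be committed next to plain-text settings,
    # so a diff still shows *which* value changed, even if not what it is.
    result = subprocess.run(
        ["gpg", "--encrypt", "--armor", "--recipient", key_id],
        input=plaintext.encode(),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode()

On the CI side, the corresponding "gpg --decrypt" call, with the private key imported from the GitHub Secret, would turn the committed value back into plaintext.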
