An update on my GitLab experience so far:

TL;DR: I think the POC has shown that we can fairly easily replicate the CI
in GitLab + Kubernetes. It generally works. I can plug it in for
master/v1-10-test builds in the main Airflow project for a few weeks to see
how it is doing (while I am on holidays), and once we see it running and get
support for PRs from GitLab, we can switch to it.

What do you think? Should I call a vote or just try to set it up?

Some details

   - I managed to get fully working builds in GitLab CI + Kubernetes, without
   the Kubernetes-specific tests yet, but those should be rather easy with
   kind (looking at it next).
   - Working example here - you can take a look and compare the UI and how
   easy it is to navigate, compared to Travis etc.:
   https://gitlab.com/Jarek.Potiuk/airflow/pipelines/74625817
   - Per-job it is a bit slower than Travis so far (still around 35 minutes
   in total), but I plan to optimise it further. I can play with memory/CPU
   settings of individual workers (I have some reasonable values now), and I
   can use local SSD disk as Docker storage/logs/etc.
   - I got approval for a 72 vCPU quota (up from the initial 24) - that
   should let us run 3 builds in parallel, independently of each other.
   - I managed to get preemptible nodes working (we have a built-in retry
   mechanism in GitLab to handle system failures like that).
   - Current spending with > 120 builds is 40 USD. According to my
   back-of-the-envelope calculations we should be below 500 USD/month,
   likely well below.
   - The current setup does not use GCR as cache and Kaniko as I originally
   planned. GCR would require custom authentication (and easy-to-steal
   secrets) and Kaniko does not yet handle multi-stage builds well (cache
   does not work: https://github.com/GoogleContainerTools/kaniko/issues/682).
   I updated
   https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-23+Migrate+out+of+Travis+CI
   to reflect that.
   - We only use GCR as a mirror of DockerHub, so that we have reliable
   downloads that do not depend on DockerHub's stability (it has problems
   sometimes).
   - All in all, it's GCP-independent. It could be run in any Kubernetes
   cluster (some optimisations, like mounting local volumes for the Docker
   engine, might have GCP-specific assumptions, but should be generally
   replicable).
   - You can take a look at the current source code in
   https://github.com/potiuk/airflow/commits/test-gitlab-ci
   - There will be some updates (I will get rid of custom builder Docker,
   simplify it a bit and implement Kubernetes tests) - it's mostly some
   cleanups + removal of Travis-specific variables + a .gitlab-ci.yml with
   job definitions.
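
For anyone who wants to redo the back-of-the-envelope numbers from the list
above, here is a rough sketch in Python. It only uses the figures quoted
there (40 USD, > 120 builds, 500 USD/month, 72 vCPU). The 24 vCPU per build
is an assumption inferred from the quota bullet, and the function names are
purely illustrative.

```python
import math

# Figures quoted in the list above.
SPEND_SO_FAR_USD = 40
BUILDS_SO_FAR = 120        # "> 120 builds", so per-build cost is an upper bound
MONTHLY_BUDGET_USD = 500
VCPU_QUOTA = 72
VCPU_PER_BUILD = 24        # assumption: one build fits in the initial 24 vCPU quota


def cost_per_build(spend_usd, builds):
    """Average cost of a single build in USD."""
    return spend_usd / builds


def builds_within_budget(budget_usd, spend_usd, builds):
    """Number of builds the budget covers at the observed spend rate."""
    return math.floor(budget_usd * builds / spend_usd)


def parallel_builds(vcpu_quota, vcpu_per_build):
    """Number of builds that fit in the vCPU quota at the same time."""
    return vcpu_quota // vcpu_per_build


if __name__ == "__main__":
    print("USD per build (upper bound):",
          round(cost_per_build(SPEND_SO_FAR_USD, BUILDS_SO_FAR), 2))
    print("Builds per month within budget:",
          builds_within_budget(MONTHLY_BUDGET_USD, SPEND_SO_FAR_USD, BUILDS_SO_FAR))
    print("Parallel builds:", parallel_builds(VCPU_QUOTA, VCPU_PER_BUILD))
```

Since the real build count is above 120, the per-build cost is an upper
bound and the number of builds that fit in the budget is a lower bound.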

J.


On Wed, Jul 31, 2019 at 10:57 AM Jarek Potiuk <jarek.pot...@polidea.com>
wrote:

> So GitLab is already working on automatically running builds for PRs :).
>
> Kamil got involved and will be our advocate on it:
> https://gitlab.com/gitlab-org/gitlab-ce/issues/65139
> J.
>
> Principal Software Engineer
> Phone: +48660796129
>
> On Fri, 26 Jul 2019, 18:12 Jarek Potiuk <jarek.pot...@polidea.com>
> wrote:
>
>> Update: I added an appropriate comment in the GitLab CI issue about PRs
>> and we are getting the attention of Jason Lenny, Director of Product
>> Management @ GitLab. Let's hope they prioritise it quickly enough.
>>
>> Speaking of potential complexity/maintenance - in order to alleviate any
>> maintenance worries, I am thinking about setting up the whole system on
>> GitLab CI + GKE and running it in parallel to Travis for quite some time
>> (even months) so that we can switch at any time. Then we will be able to
>> tune it according to real use cases and compare the experience of both
>> systems.
>>
>> Also, I am going on holidays in two weeks and I will make sure that there
>> will be someone with GitLab + Kubernetes experience (from my company) who
>> can take over and make sure there will be no problems. However, I am quite
>> confident :D nothing is going to happen while I am away. I would also
>> invite any committers who would like to join the project and GitLab
>> instance (once I set up the POC) to learn and see how easy and
>> maintenance-free it is going to be.
>>
>> J.
>>
>> On Fri, Jul 26, 2019 at 2:56 PM Kamil Breguła <kamil.breg...@polidea.com>
>> wrote:
>>
>>> GKE and its own CI will allow us to solve other problems - building
>>> and publishing documentation from the master branch. Currently,
>>> building is done using the RTD service. Unfortunately, our project is
>>> too large and often the documentation is not built properly.
>>> https://readthedocs.org/projects/airflow/builds/
>>> We should think about another way to build documentation. In the ideal
>>> world, building documentation should use the same environment as
>>> checking documentation on CI. Adding this step to Travis can further
>>> reduce our development opportunities.
>>> Discussion on Slack about it:
>>> https://apache-airflow.slack.com/archives/CJ1LVREHX/p1561756652021900
>>>
>>> It is worth thinking about the fact that our project will soon have a
>>> website and our documentation will also be available in many
>>> languages. Currently, talks are taking place with the design studio
>>> and developers who can make these websites ;-)
>>>
>>> https://lists.apache.org/thread.html/982c7baa06742ad722f2baa0db53ad99aea6c26b14b7d6d4aa522677@%3Cdev.airflow.apache.org%3E
>>> We should provide an environment that will allow you to build a
>>> website and documentation. At best, these tasks should be combined. I
>>> hope that we will be able to create a website that will be a real
>>> support for the community on current events, so it will be updated
>>> frequently.
>>>
>>> It seems to me that the project will grow. If we now have problems
>>> with Travis, then the significance of these problems in the future can
>>> only grow. Now we have a chance to provide a stable infrastructure for
>>> the project for a long time.
>>>
>>> I would like to share another situation which was not pleasant for me.
>>> Recently I wanted to send >10 PRs, but because of Travis, I had to wait
>>> for the weekend to send the changes. If I had sent my changes during the
>>> week, I would have blocked the queue for a few hours. Although I did it
>>> over the weekend, I still got the message that the queue was blocked on
>>> Travis by my jobs.
>>>
>>> On Tue, Jul 23, 2019 at 6:12 PM Jarek Potiuk <jarek.pot...@polidea.com>
>>> wrote:
>>> >
>>> > Hello Everyone,
>>> >
>>> > I prepared a short doc where I described the general architecture of the
>>> > solution I imagine we can deploy fairly quickly - having GitLab CI support
>>> > and Google-provided funding for GCP resources.
>>> >
>>> > I am going to start working on a Proof-Of-Concept soon, but before I start
>>> > doing it, I would like to get some comments and opinions on the proposed
>>> > approach. I discussed the basic approach with my friend Kamil, who works at
>>> > GitLab and is a CI maintainer, and this is what we think will be
>>> > achievable in a fairly short time.
>>> >
>>> >
>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-23+Migrate+out+of+Travis+CI
>>> >
>>> > I am happy to discuss details and make changes to the proposal - we can
>>> > discuss it here or as comments in the document.
>>> >
>>> > Let's see what people think about it and if we get to some consensus we
>>> > might want to cast a vote (or maybe go via lazy consensus as this is
>>> > something we should have rather quickly)
>>> >
>>> > Looking forward to your comments!
>>> >
>>> > J.
>>> >
>>> > --
>>> >
>>> > Jarek Potiuk
>>> > Polidea <https://www.polidea.com/> | Principal Software Engineer
>>> >
>>> > M: +48 660 796 129 <+48660796129>
>>> > [image: Polidea] <https://www.polidea.com/>
>>>
>>
>>
>> --
>>
>> Jarek Potiuk
>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>
>> M: +48 660 796 129 <+48660796129>
>> [image: Polidea] <https://www.polidea.com/>
>>
>>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
