It's now confirmed that the problem is on the Travis CI side: I re-ran the commit from before the new CI was introduced (with a small cherry-picked doc fix related to the recent Sphinx dependency update) and it fails in exactly the same way (memory and CPU problems): https://travis-ci.org/apache/airflow/builds/562450592.

For now I cannot do much except wait for INFRA's response (and work on a GitLab CI replacement for Travis). I recommend bringing some popcorn. It's going to be an interesting one to watch.

J.

On Tue, Jul 23, 2019 at 9:43 AM Jarek Potiuk <[email protected]> wrote:

> It's now pretty consistent and happens pretty much every time using the
> old build system - for example here:
> https://travis-ci.org/apache/airflow/builds/562435992.
>
> I will cancel all PR builds and disable automated PR builds on Travis
> until we solve the problem, as they are pointless: new PRs will simply
> queue and fail constantly.
>
> I opened a critical infrastructure ticket:
> https://issues.apache.org/jira/browse/INFRA-18787 and I am running some
> additional tests - I am running the builds from a commit before the new CI
> so that I can see whether another change since then could have caused it.
>
> J.
>
> On Tue, Jul 23, 2019 at 8:55 AM Jarek Potiuk <[email protected]>
> wrote:
>
>> Update 2: I can confirm that the same memory/resource-related issues
>> happen in my Travis CI fork with the changes reverted :(
>> https://travis-ci.org/potiuk/airflow/builds/562430507 . I will escalate
>> it to Travis/Apache infrastructure.
>>
>> On Tue, Jul 23, 2019 at 8:35 AM Jarek Potiuk <[email protected]>
>> wrote:
>>
>>> Update: it looks like it's Travis's problem: I reverted the CI changes
>>> and we have the same CPU problem in the old build:
>>> https://travis-ci.org/potiuk/airflow/jobs/562430517 .
>>>
>>> On Tue, Jul 23, 2019 at 8:32 AM Jarek Potiuk <[email protected]>
>>> wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> We've started to experience some random failures on Travis related to
>>>> a lack of resources: those are either Out of Memory errors or a lack
>>>> of CPUs to run Kubernetes builds.
>>>>
>>>> I tried to rerun those, thinking it was an intermittent error. It
>>>> started happening yesterday and I have not seen it before, so I rather
>>>> doubt it is related to the latest changes.
>>>>
>>>> But I do not want to risk everyone being blocked, so I am now testing
>>>> on my own fork whether reverting the latest CI changes helps. I will
>>>> let you know and will revert if I find that the old CI works in a
>>>> stable way.
>>>>
>>>> In the meantime, I will cancel all outstanding builds that are
>>>> blocking our queue and will test both the old CI and the new CI in our
>>>> fork :( (the Travis queue limit is not helping).
>>>>
>>>> Can you please hold off on rebasing/pushing new PRs until I check it?
>>>>
>>>> Example failures:
>>>>
>>>>    - OSError: [Errno 12] Cannot allocate memory (
>>>>    https://travis-ci.org/apache/airflow/jobs/562395978)
>>>>    - [ERROR NumCPU]: the number of available CPUs 1 is less than the
>>>>    required 2 (https://travis-ci.org/apache/airflow/jobs/562395978)
>>>>
>>>> J.
>>>>
>>>> --
>>>>
>>>> Jarek Potiuk
>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>
>>>> M: +48 660 796 129 <+48660796129>
>>>> [image: Polidea] <https://www.polidea.com/>
