I haven't seen any comments on this thread, so we are going to move forward
with the change.
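
For anyone who wants to sanity-check the capacity figures quoted below
(15 suites at 32 GiB per pod versus 40 suites at 14 GiB per pod), here is a
rough sketch of the arithmetic. The pod sizes and suite counts come from the
quoted message. The host count and per-host RAM are purely hypothetical
values picked because they happen to reproduce those counts, and the function
names are made up for illustration; none of this describes the actual CI
hardware.

```python
def pods_per_host(host_ram_gib: int, pod_ram_gib: int) -> int:
    """Whole pods that fit on one host; leftover RAM stays unused."""
    return host_ram_gib // pod_ram_gib


def total_capacity(host_count: int, host_ram_gib: int, pod_ram_gib: int) -> int:
    """Suites that can run in parallel across identical hosts."""
    return host_count * pods_per_host(host_ram_gib, pod_ram_gib)


# Figures taken from the quoted message.
OLD_POD_GIB, NEW_POD_GIB = 32, 14
OLD_CAPACITY, NEW_CAPACITY = 15, 40

# ASSUMPTION: hypothetical host layout, not the real infrastructure.
HOSTS, HOST_RAM_GIB = 5, 120

# Gain implied by the quoted suite counts.
quoted_gain = NEW_CAPACITY / OLD_CAPACITY
# Gain one would expect from the pod sizes alone, ignoring per-host rounding.
size_ratio = OLD_POD_GIB / NEW_POD_GIB

if __name__ == "__main__":
    print(f"quoted gain: {quoted_gain:.2f}x, pod size ratio: {size_ratio:.2f}x")
    print("old:", total_capacity(HOSTS, HOST_RAM_GIB, OLD_POD_GIB))
    print("new:", total_capacity(HOSTS, HOST_RAM_GIB, NEW_POD_GIB))
```

The quoted gain (about 2.67x) is larger than the plain pod size ratio (about
2.29x). One possible explanation, under the assumed layout, is that smaller
pods leave less RAM stranded on each host.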

On Mon, 2 Sep 2019 at 09:03, Barak Korren <bkor...@redhat.com> wrote:

> Adding Evgeny and Shirly, who are AFAIK the owners of the metrics suite.
>
> On Sun, 1 Sep 2019 at 17:07, Barak Korren <bkor...@redhat.com> wrote:
>
>> If you have been using or monitoring any OST suites recently, you may have
>> noticed we've been suffering from long delays in allocating CI hardware
>> resources for running OST suites. I'd like to briefly discuss the reasons
>> behind this, what we are planning to do to resolve it, and the implications
>> of those actions for the owners of big suites.
>>
>> As you might know, we moved a while ago from running each OST suite
>> on its own dedicated server to running them inside containers managed by
>> OpenShift. That allowed us to run multiple OST suites on the same
>> bare-metal host, which in turn increased our overall capacity by 50% while
>> still allowing us to free up hardware for accommodating the kubevirt
>> project on our CI hardware.
>>
>> Our infrastructure is currently built in a way where we use the exact
>> same POD specification (and therefore resource settings) for all suites.
>> Making it more flexible at this point would require significant code
>> changes we are not likely to make. What this means is that we need to make
>> sure our PODs have enough resources to run the most demanding suites. It
>> also means we waste some resources when running less demanding ones.
>>
>> Given the set of OST suites we have ATM, we sized our PODs to allocate
>> 32 GiB of RAM. Given the servers we have, this means we can run 15 suites
>> in parallel. This was sufficient for a while, but given increasing
>> demand, and the expectation for it to increase further once we introduce
>> the patch gating features we've been working on, we must find a way to
>> significantly increase our suite-running capacity.
>>
>> We have measured the amount of RAM required by each suite and came to the
>> conclusion that for the vast majority of suites, we could settle for PODs
>> that allocate only 14 GiB of RAM. If we make that change, we would be able
>> to run a total of 40 suites at a time, almost tripling our current capacity.
>>
>> The downside of making this change is that our STDCI V2 infrastructure
>> will no longer be able to run suites that require more than 14 GiB of RAM.
>> This effectively means it would no longer be possible to run these suites
>> from OST's check-patch job or from the OST manual job.
>>
>> The list of suites that would be affected follows. The suite owners, as
>> documented in the CI configuration, have been added as "to" recipients of
>> this message:
>>
>>    - hc-basic-suite-4.3
>>    - hc-basic-suite-master
>>    - metrics-suite-4.3
>>
>> Since we're aware people would still like to be able to work with the
>> bigger suites, we will leverage the nightly suite invocation jobs to enable
>> them to be run in the CI infra. We will support the following use cases:
>>
>>    - *Periodically running the suite on the latest oVirt packages* - this
>>    will be done by the nightly job, as it is today
>>    - *Running the suite to test changes to the suite's code* - while
>>    currently this is done automatically by check-patch, in the future it
>>    would have to be done by manually triggering the nightly job and
>>    setting the REFSPEC parameter to point to the examined patch
>>    - *Triggering the suite manually* - this would be done by triggering
>>    the suite-specific nightly job (as opposed to the general OST manual job)
>>
>> The patches listed below implement the changes outlined above:
>>
>>    - 102757 <https://gerrit.ovirt.org/102757> nightly-system-tests: big
>>    suits -> big containers
>>    - 102771 <https://gerrit.ovirt.org/102771>: stdci: Drop `big` suits
>>    from check-patch
>>
>> We know that making the changes we presented will make things a little
>> less convenient for users and maintainers of the big suites, but we believe
>> the benefits of having vastly increased execution capacity for all other
>> suites outweigh those shortcomings.
>>
>> We would like to hear all relevant comments and questions from the suite
>> owners and other interested parties, especially if you think we should not
>> carry out the changes we propose.
>> Please take the time to respond on this thread, or on the linked patches.
>>
>> Thanks,
>>
>> --
>> Barak Korren
>> RHV DevOps team , RHCE, RHCi
>> Red Hat EMEA
>> redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
>>
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted