To avoid blocking other patches on CQ, I've sent [1], which doubles the
amount of space on the iSCSI SD (with the patch it will have 40GB).

As a side note, we use the same configuration on the master suite, which
may explain why we don't see the issue there.

[1] https://gerrit.ovirt.org/#/c/95922/
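
For anyone who wants to check the sparse-file point made further down the
thread (a larger storage domain should cost nothing until data is written),
here is a minimal Python sketch. It assumes a Linux host and a filesystem
that supports sparse files. The helper names and the backing file name are
made up for illustration and are not part of OST or lago.

```python
import os
import tempfile

GIB = 1024 ** 3


def create_sparse_file(path, size_bytes):
    """Create a file with the given apparent size without writing any data."""
    with open(path, "wb") as f:
        # truncate() extends the file; on filesystems that support sparse
        # files, no data blocks are allocated for the resulting hole.
        f.truncate(size_bytes)


def apparent_and_allocated(path):
    """Return (apparent size in bytes, allocated bytes) for path."""
    st = os.stat(path)
    # st_blocks counts 512-byte units on POSIX systems; it is not
    # available on Windows.
    return st.st_size, st.st_blocks * 512


def demo(size_bytes=40 * GIB):
    """Create a hypothetical LUN backing file in a temp dir and report usage."""
    with tempfile.TemporaryDirectory() as tmp:
        backing = os.path.join(tmp, "lun0.img")  # hypothetical file name
        create_sparse_file(backing, size_bytes)
        return apparent_and_allocated(backing)
```

Calling demo() should return an apparent size of 40 GiB and an allocated
size far below that, if the filesystem behaves as assumed. The allocation
only grows once something writes into the file.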

On Sun, Dec 2, 2018 at 5:41 PM Gal Ben Haim <gbenh...@redhat.com> wrote:

> Below you can find 2 jobs, one that succeeded and one that failed on the
> iSCSI issue.
> Both were triggered by unrelated patches.
>
> Success -
> https://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/3546/
> Failure -
> https://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/3544/
>
>
> On Sun, Dec 2, 2018 at 2:37 PM Gal Ben Haim <gbenh...@redhat.com> wrote:
>
>> Raz, thanks for the investigation.
>> I'll send a patch to increase the LUN size.
>>
>> On Sun, Dec 2, 2018 at 1:27 PM Nir Soffer <nsof...@redhat.com> wrote:
>>
>>> On Sun, Dec 2, 2018, 10:44 Raz Tamir <rata...@redhat.com> wrote:
>>>
>>>> After some analysis, I think the bug we are seeing here is
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1588061
>>>> This applies to suspend/resume and also to a snapshot with memory.
>>>> Following the steps and considering that the iSCSI storage domain is
>>>> only 20GB, this should be the reason for reaching ~4GB free space.
>>>>
>>>
>>>
>>> OST configuration should change so that it will not fail because of such
>>> bugs.
>>>
>>
>> I disagree. The purpose of OST is to catch bugs, not to cover them up.
>>
>>>
>>> iSCSI storage can be created using sparse files, which consume no
>>> resources until you write to the LVs, so having a 100G storage domain
>>> costs nothing.
>>>
>>
>> OST uses sparse files.
>>
>>>
>>> Nir
>>>
>>>
>>>> On Fri, Nov 30, 2018 at 10:01 PM Raz Tamir <rata...@redhat.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Nov 30, 2018, 21:57 Ryan Barry <rba...@redhat.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Nov 30, 2018 at 2:31 PM Raz Tamir <rata...@redhat.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Nov 30, 2018, 19:33 Dafna Ron <d...@redhat.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> This mail is to provide the current status of CQ and allow people
>>>>>>>> to review status before and after the weekend.
>>>>>>>> Please refer to the colour map below for further information on the
>>>>>>>> meaning of the colours.
>>>>>>>>
>>>>>>>> *CQ-4.2*: RED (#1)
>>>>>>>>
>>>>>>>> I checked the last date on which ovirt-engine and vdsm passed and
>>>>>>>> moved packages to tested, as they are the bigger projects, and it was
>>>>>>>> 27-11-2018.
>>>>>>>>
>>>>>>>> We have been having sporadic failures for most of the projects on
>>>>>>>> test check_snapshot_with_memory.
>>>>>>>> We have deduced that this is caused by a code regression in
>>>>>>>> storage, based on the following:
>>>>>>>> 1. Evgheni and Gal helped debug this issue to rule out lago and
>>>>>>>> infra issues as the cause of failure, and both determined the issue is
>>>>>>>> a code regression, most likely in storage.
>>>>>>>> 2. The failure only happens on the 4.2 branch.
>>>>>>>> 3. The failure itself is that a VM cannot run due to low disk space in
>>>>>>>> the storage domain, and we cannot see any failures which would leave
>>>>>>>> any leftovers in the storage domain.
>>>>>>>>
>>>>>>> Can you please share the link to the execution?
>>>>>>>
>>>>>>
>>>>>> Here's an example of one run:
>>>>>> https://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/3550/
>>>>>>
>>>>>> The iSCSI storage domain starts emitting warnings about low storage
>>>>>> space immediately after removing the VmPool, but it's possible that the
>>>>>> storage domain is already filling up before that because of an earlier
>>>>>> call that is still running, possibly the VM import.
>>>>>>
>>>>> Thanks Ryan, I'll try to help with debugging this issue.
>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>> Dan and Ryan are actively involved in trying to find the regression,
>>>>>>>> but the consensus is that this is a storage-related regression and
>>>>>>>> *we are having a problem getting the storage team to join us in
>>>>>>>> debugging the issue.*
>>>>>>>>
>>>>>>>> I prepared a patch to skip the test in case we cannot get
>>>>>>>> cooperation from the storage team and resolve this regression in the
>>>>>>>> next few days:
>>>>>>>> https://gerrit.ovirt.org/#/c/95889/
>>>>>>>>
>>>>>>>> *CQ-Master:* YELLOW (#1)
>>>>>>>>
>>>>>>>> We have failures which CQ is still bisecting, and until it's done we
>>>>>>>> cannot point to any specific failing projects.
>>>>>>>>
>>>>>>>>
>>>>>>>> Happy week!
>>>>>>>> Dafna
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -------------------------------------------------------------------------------------------------------------------
>>>>>>>> COLOUR MAP
>>>>>>>>
>>>>>>>> Green = job has been passing successfully
>>>>>>>>
>>>>>>>> ** green for more than 3 days may suggest we need a review of our
>>>>>>>> test coverage
>>>>>>>>
>>>>>>>>
>>>>>>>>    1. 1-3 days       GREEN (#1)
>>>>>>>>    2. 4-7 days       GREEN (#2)
>>>>>>>>    3. Over 7 days    GREEN (#3)
>>>>>>>>
>>>>>>>>
>>>>>>>> Yellow = intermittent failures for different projects but no
>>>>>>>> lasting or current regressions
>>>>>>>>
>>>>>>>> ** intermittent would be a healthy project as we expect a number of
>>>>>>>> failures during the week
>>>>>>>>
>>>>>>>> ** I will not report any of the solved failures or regressions.
>>>>>>>>
>>>>>>>>
>>>>>>>>    1. Solved job failures    YELLOW (#1)
>>>>>>>>    2. Solved regressions     YELLOW (#2)
>>>>>>>>
>>>>>>>>
>>>>>>>> Red = job has been failing
>>>>>>>>
>>>>>>>> ** Active Failures. The colour will change based on the amount of
>>>>>>>> time the project/s has been broken. Only active regressions would be
>>>>>>>> reported.
>>>>>>>>
>>>>>>>>
>>>>>>>>    1. 1-3 days       RED (#1)
>>>>>>>>    2. 4-7 days       RED (#2)
>>>>>>>>    3. Over 7 days    RED (#3)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Ryan Barry
>>>>>>
>>>>>> Associate Manager - RHV Virt/SLA
>>>>>>
>>>>>> rba...@redhat.com    M: +16518159306     IM: rbarry
>>>>>> <https://red.ht/sig>
>>>>>>
>>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> Raz Tamir
>>>> Manager, RHV QE
>>>> _______________________________________________
>>>> Devel mailing list -- devel@ovirt.org
>>>> To unsubscribe send an email to devel-le...@ovirt.org
>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>>> oVirt Code of Conduct:
>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>> List Archives:
>>>> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/6EFAA4LR743GLDGGNVCK2PEOHL7USLB7/
>>>>
>>>
>>
>>
>> --
>> *GAL bEN HAIM*
>> RHV DEVOPS
>>
>
>
> --
> *GAL bEN HAIM*
> RHV DEVOPS
>


-- 
*GAL bEN HAIM*
RHV DEVOPS
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/MP277EZWHCFEHDHFSENQZWIVDXTLAP3I/
