Re: [PATCH] iotests: Remove 030 from the auto group

2020-09-23 Thread Thomas Huth
On 23/09/2020 20.18, Alberto Garcia wrote:
> On Fri 04 Sep 2020 10:25:13 AM CEST, Kevin Wolf wrote:
>>> Test 030 is still occasionally failing in the CI ... so for the
>>> time being, let's disable it in the "auto" group. We can add it
>>> back once it got more stable.
>>>
>>> Signed-off-by: Thomas Huth 
>>
>> I would rather just disable this one test function as 030 is a pretty
>> important one that tends to catch bugs.
>>
>>>  I just saw the problem here:
>>>   https://cirrus-ci.com/task/5449330930745344?command=main#L6482
>>>  and Peter hit it a couple of weeks ago:
>>>   https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg00136.html
>>
>> I wonder how this can still happen. The test should have more than
>> enough time to complete now. Except if the throttling doesn't work as
>> expected.
>>
>> I can't seem to reproduce this even if I add rather long delays. After
>> 40 seconds, all jobs have moved either by 512k (which is STREAM_CHUNK)
>> or not at all.
> 
> I also don't understand how this can fail... I assume the test is not
> running for that long in the cases when it fails, right?

Hard to say ... the problem only occurs occasionally, and I've never
seen it happen "live", only in the CI logs after the job has failed. I
guess you'd have to print timestamps in the code and then submit a lot
of jobs to the CI systems that are sensitive to this problem (e.g.
Cirrus and Travis) to find out...
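
A minimal sketch of the timestamp idea, not taken from test 030 itself:
a small helper that prefixes each message with the seconds elapsed since
the test started. The helper name `stamp`, the message text and the
choice of stderr are all assumptions made for illustration.

```python
import sys
import time


def stamp(message, start, now=None):
    """Return message prefixed with the seconds elapsed since start.

    `now` can be passed explicitly so the formatting is easy to check;
    by default the monotonic clock is used.
    """
    now = time.monotonic() if now is None else now
    return f"[{now - start:8.3f}s] {message}"


if __name__ == "__main__":
    start = time.monotonic()
    # Hypothetical trace point; a real test would log around the job
    # operations it wants to time.
    print(stamp("changed job speed", start), file=sys.stderr)
```

Comparing such lines between a passing and a failing CI run would show
whether the failing runs really take much longer.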

 Thomas




Re: [PATCH] iotests: Remove 030 from the auto group

2020-09-23 Thread Alberto Garcia
On Fri 04 Sep 2020 10:25:13 AM CEST, Kevin Wolf wrote:
>> Test 030 is still occasionally failing in the CI ... so for the
>> time being, let's disable it in the "auto" group. We can add it
>> back once it got more stable.
>> 
>> Signed-off-by: Thomas Huth 
>
> I would rather just disable this one test function as 030 is a pretty
> important one that tends to catch bugs.
>
>>  I just saw the problem here:
>>   https://cirrus-ci.com/task/5449330930745344?command=main#L6482
>>  and Peter hit it a couple of weeks ago:
>>   https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg00136.html
>
> I wonder how this can still happen. The test should have more than
> enough time to complete now. Except if the throttling doesn't work as
> expected.
>
> I can't seem to reproduce this even if I add rather long delays. After
> 40 seconds, all jobs have moved either by 512k (which is STREAM_CHUNK)
> or not at all.

I also don't understand how this can fail... I assume the test is not
running for that long in the cases when it fails, right?

Berto



Re: [PATCH] iotests: Remove 030 from the auto group

2020-09-04 Thread Thomas Huth
On 04/09/2020 12.38, Max Reitz wrote:
> On 04.09.20 12:14, Thomas Huth wrote:
>> On 04/09/2020 10.25, Kevin Wolf wrote:
>>> On 04.09.2020 at 07:57, Thomas Huth wrote:
>>>> Test 030 is still occasionally failing in the CI ... so for the
>>>> time being, let's disable it in the "auto" group. We can add it
>>>> back once it got more stable.
>>>>
>>>> Signed-off-by: Thomas Huth 
>>>
>>> I would rather just disable this one test function as 030 is a pretty
>>> important one that tends to catch bugs.
>>
>> Ok, ... should it always get disabled, or shall we try to come up with
>> some magic checks so that it only gets disabled in the CI pipelines (...
>> though I don't have a clue how to check for Peter's merge test
>> environment...)?
> 
> I suppose we could let check-block.sh set some environment variable.

Sounds like a plan! I'll try to cook a patch.
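
A rough sketch of how the environment-variable approach might look on
the Python side, assuming check-block.sh exports a variable before
running the tests. The variable name `IOTESTS_AUTO_RUN`, the helper
`should_skip` and the test names are hypothetical; the thread does not
settle on any of them.

```python
import os
import unittest

# Hypothetical name: the thread does not say which variable
# check-block.sh would export.
CI_ENV_VAR = "IOTESTS_AUTO_RUN"


def should_skip(environ=os.environ):
    """Return True if the (assumed) variable is set to a non-empty value."""
    return bool(environ.get(CI_ENV_VAR))


class ExampleTests(unittest.TestCase):
    # Only this one test function is disabled; the rest of the test
    # file keeps running in every environment.
    @unittest.skipIf(should_skip(), "unreliable when run from check-block.sh")
    def test_flaky_case(self):
        self.assertTrue(True)

    def test_other_case(self):
        self.assertTrue(True)
```

With this shape, running the test file directly is unaffected, and only
runs started through the script that sets the variable skip the one
function.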

 Thomas




Re: [PATCH] iotests: Remove 030 from the auto group

2020-09-04 Thread Max Reitz
On 04.09.20 12:14, Thomas Huth wrote:
> On 04/09/2020 10.25, Kevin Wolf wrote:
>> On 04.09.2020 at 07:57, Thomas Huth wrote:
>>> Test 030 is still occasionally failing in the CI ... so for the
>>> time being, let's disable it in the "auto" group. We can add it
>>> back once it got more stable.
>>>
>>> Signed-off-by: Thomas Huth 
>>
>> I would rather just disable this one test function as 030 is a pretty
>> important one that tends to catch bugs.
> 
> Ok, ... should it always get disabled, or shall we try to come up with
> some magic checks so that it only gets disabled in the CI pipelines (...
> though I don't have a clue how to check for Peter's merge test
> environment...)?

I suppose we could let check-block.sh set some environment variable.

Max





Re: [PATCH] iotests: Remove 030 from the auto group

2020-09-04 Thread Kevin Wolf
On 04.09.2020 at 12:14, Thomas Huth wrote:
> On 04/09/2020 10.25, Kevin Wolf wrote:
> > On 04.09.2020 at 07:57, Thomas Huth wrote:
> >> Test 030 is still occasionally failing in the CI ... so for the
> >> time being, let's disable it in the "auto" group. We can add it
> >> back once it got more stable.
> >>
> >> Signed-off-by: Thomas Huth 
> > 
> > I would rather just disable this one test function as 030 is a pretty
> > important one that tends to catch bugs.
> 
> Ok, ... should it always get disabled, or shall we try to come up with
> some magic checks so that it only gets disabled in the CI pipelines (...
> though I don't have a clue how to check for Peter's merge test
> environment...)?

Maybe we can detect whether we're run as part of the "auto" group and
skip the test then (as in QMPTestCase.case_skip)?
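
A sketch of what a skip decided at run time could look like, using only
the standard unittest module. The `case_skip` method below is a local
stand-in that merely wraps `skipTest()`, and the `run_as_auto_group`
flag is hypothetical: how the harness would detect the "auto" group is
exactly the open question here.

```python
import unittest


class ExampleTests(unittest.TestCase):
    # Hypothetical flag; how it would get set is left open.
    run_as_auto_group = False

    def case_skip(self, reason):
        # Local stand-in for the helper mentioned above.
        self.skipTest(reason)

    def test_flaky_case(self):
        if self.run_as_auto_group:
            self.case_skip('unreliable in the "auto" group')
        self.assertTrue(True)


def run(auto):
    """Run the example case with the flag set and return the result."""
    ExampleTests.run_as_auto_group = auto
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The check happens inside the test function, so the decision is made
when the case runs rather than when the module is imported.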

Kevin




Re: [PATCH] iotests: Remove 030 from the auto group

2020-09-04 Thread Thomas Huth
On 04/09/2020 10.25, Kevin Wolf wrote:
> On 04.09.2020 at 07:57, Thomas Huth wrote:
>> Test 030 is still occasionally failing in the CI ... so for the
>> time being, let's disable it in the "auto" group. We can add it
>> back once it got more stable.
>>
>> Signed-off-by: Thomas Huth 
> 
> I would rather just disable this one test function as 030 is a pretty
> important one that tends to catch bugs.

Ok, ... should it always get disabled, or shall we try to come up with
some magic checks so that it only gets disabled in the CI pipelines (...
though I don't have a clue how to check for Peter's merge test
environment...)?

 Thomas




Re: [PATCH] iotests: Remove 030 from the auto group

2020-09-04 Thread Max Reitz
On 04.09.20 07:57, Thomas Huth wrote:
> Test 030 is still occasionally failing in the CI ... so for the
> time being, let's disable it in the "auto" group. We can add it
> back once it got more stable.
> 
> Signed-off-by: Thomas Huth 
> ---
>  I just saw the problem here:
>   https://cirrus-ci.com/task/5449330930745344?command=main#L6482
>  and Peter hit it a couple of weeks ago:
>   https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg00136.html
> 
>  tests/qemu-iotests/group | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks, applied to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block





Re: [PATCH] iotests: Remove 030 from the auto group

2020-09-04 Thread Max Reitz
On 04.09.20 10:31, Max Reitz wrote:
> On 04.09.20 07:57, Thomas Huth wrote:
>> Test 030 is still occasionally failing in the CI ... so for the
>> time being, let's disable it in the "auto" group. We can add it
>> back once it got more stable.
>>
>> Signed-off-by: Thomas Huth 
>> ---
>>  I just saw the problem here:
>>   https://cirrus-ci.com/task/5449330930745344?command=main#L6482
>>  and Peter hit it a couple of weeks ago:
>>   https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg00136.html
>>
>>  tests/qemu-iotests/group | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Thanks, applied to my block branch:
> 
> https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Or maybe not O:)





Re: [PATCH] iotests: Remove 030 from the auto group

2020-09-04 Thread Kevin Wolf
On 04.09.2020 at 07:57, Thomas Huth wrote:
> Test 030 is still occasionally failing in the CI ... so for the
> time being, let's disable it in the "auto" group. We can add it
> back once it got more stable.
> 
> Signed-off-by: Thomas Huth 

I would rather just disable this one test function as 030 is a pretty
important one that tends to catch bugs.

>  I just saw the problem here:
>   https://cirrus-ci.com/task/5449330930745344?command=main#L6482
>  and Peter hit it a couple of weeks ago:
>   https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg00136.html

I wonder how this can still happen. The test should have more than
enough time to complete now. Except if the throttling doesn't work as
expected.

I can't seem to reproduce this even if I add rather long delays. After
40 seconds, all jobs have moved either by 512k (which is STREAM_CHUNK)
or not at all.

What is interesting is that in both cases it's stream-node8, which is
the job streaming from node6 to node8, and node8 is the top-level node.
It's also the last job to be changed to full speed, so all others did
succeed before.

Kevin




[PATCH] iotests: Remove 030 from the auto group

2020-09-03 Thread Thomas Huth
Test 030 is still occasionally failing in the CI ... so for the
time being, let's disable it in the "auto" group. We can add it
back once it got more stable.

Signed-off-by: Thomas Huth 
---
 I just saw the problem here:
  https://cirrus-ci.com/task/5449330930745344?command=main#L6482
 and Peter hit it a couple of weeks ago:
  https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg00136.html

 tests/qemu-iotests/group | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 5cad015231..f084061a16 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -51,7 +51,7 @@
 027 rw auto quick
 028 rw backing quick
 029 rw auto quick
-030 rw auto backing
+030 rw backing
 031 rw auto quick
 032 rw auto quick
 033 rw auto quick
-- 
2.18.2